Open Access. Powered by Scholars. Published by Universities.®

Bioinformatics Commons

Electronic Theses and Dissertations

Articles 31 - 57 of 57

Full-Text Articles in Bioinformatics

Applying Machine Learning Algorithms For The Analysis Of Biological Sequences And Medical Records, Shaopeng Gu Jan 2019

Modern sequencing technology has revolutionized genomic research and triggered explosive growth in DNA, RNA, and protein sequence data. Inferring structure and function from biological sequences is a fundamentally important task in genomics and proteomics. With the development of statistical and machine learning methods, an integrated and user-friendly tool containing state-of-the-art data mining methods is needed. Here, we propose SeqFea-Learn, a comprehensive Python pipeline that integrates multiple steps (feature extraction, dimensionality reduction, feature selection, and construction of predictive models based on machine learning and deep learning approaches) to analyze sequences. We used enhancers, RNA N6-methyladenosine sites and …


Impact Of Seasonal And Host-Related Factors On The Intestinal Microbiome And Cestode Community Of Sorex Cinereus And Sorex Monticola, Katelyn D. Cranmer Jan 2019

The intestinal microbiome of mammals plays a significant role in host health and response to environmental stimuli and can include both beneficial native bacteria and parasitic worms. In this study, I examined the intestinal cestode and bacterial communities of two closely related species of shrew, Sorex monticola and Sorex cinereus, over a six-month period in 2016. Specimens were collected approximately every three weeks from May to October from the Sangre de Cristo Mountains in Cowles, New Mexico. A total of 79 shrews were prepared with the gastrointestinal tracts removed and flash-frozen in liquid nitrogen. An additional …


Functional Consequence Of Psat1 Association On Pkm2'S Inherent Enzymatic Activity., Alexis Avidan Vega Dec 2018

Pyruvate kinase M2 (PKM2) is predominantly found in tumors, where it allows the cancer cell to adapt to metabolic conditions through allosteric regulation of its activity. We recently discovered that phosphoserine aminotransferase 1 (PSAT1) associates with and activates PKM2. Here, I sought to affirm PSAT1's ability to increase PKM2 activity through kinetic and association studies of wild-type or mutant PKM2 enzymes. I demonstrate that His-tagged WT and mutant PKM2 enzymes are active, exhibit different kinetics, yet cannot be activated by PSAT1. Comparison studies using untagged WT PKM2 suggest that inclusion of the His-tag disrupts PSAT1 association. In support, pull-down strategies …


Thermal And Microbial Effects On Brown Macroalgae: Heat Acclimation And The Biodiversity Of The Microbiome, Charlotte Tc Quigley Nov 2018

This dissertation examines the effects of stress on brown algal biology at a macroscopic scale, by examining whole aquaculture crops, and at a microscopic scale, by examining the macroalgal microbiome, across the vertical stress gradient of the intertidal zone and across the latitudes of their biogeographic ranges. Thermal stress negatively affected seedstock gametophytes of the kelp Alaria esculenta isolated from northern and southern locations in Maine. However, previous thermal stress had a positive effect on growth of the next-generation sporophytes. Alaria esculenta has potential as a kelp crop in Maine’s sea vegetable aquaculture sector and implementing this protocol may allow …


Bayesian Analytical Approaches For Metabolomics : A Novel Method For Molecular Structure-Informed Metabolite Interaction Modeling, A Novel Diagnostic Model For Differentiating Myocardial Infarction Type, And Approaches For Compound Identification Given Mass Spectrometry Data., Patrick J. Trainor Aug 2018

Metabolomics, the study of small molecules in biological systems, has enjoyed great success in enabling researchers to examine disease-associated metabolic dysregulation and has been utilized for the discovery of biomarkers of disease and phenotypic states. In spite of recent technological advances in the analytical platforms utilized in metabolomics and the proliferation of tools for the analysis of metabolomics data, significant challenges in metabolomics data analyses remain. In this dissertation, we present three of these challenges and Bayesian methodological solutions for each. In the first part we develop a new methodology to serve as a basis for making higher order inferences in metabolomics, …


Deciphering The Role Of Human Arylamine N-Acetyltransferase 1 (Nat1) In Breast Cancer Cell Metabolism Using A Systems Biology Approach., Samantha Marie Carlisle Aug 2018

Background: Human arylamine N-acetyltransferase 1 (NAT1) is a phase II xenobiotic metabolizing enzyme found in almost all tissues. NAT1 can additionally hydrolyze acetyl-coenzyme A (acetyl-CoA) in the absence of an arylamine substrate. NAT1 expression varies inter-individually and is elevated in several cancers including estrogen receptor positive (ER+) breast cancers. Additionally, multiple studies have shown the knockdown of NAT1, by both small molecule inhibition and siRNA methods, in breast cancer cells leads to decreased invasive ability and proliferation and decreased anchorage-independent colony formation. However, the exact mechanism by which NAT1 expression affects cancer risk and progression remains unclear. Additionally, consequences …


Investigating The Interaction Of Monoamines And Diel Rhythmicity On Anti-Predator Behavior In An Orb-Weaving Spider, Larinioides Cornutus (Araneae: Araneae), Rebecca Wilson Aug 2018

Circadian rhythms are ubiquitous among organisms, influencing a wide array of physiological processes and behaviors including aggression. While many neurophysiological mechanisms are involved in the regulation of aggressive behaviors, relatively few studies have investigated the underlying components involved in the interplay between circadian rhythms and aggression. Spiders are an ideal model system for studying circadian regulation of aggression as they are ecologically both predators and prey. Recent studies have revealed that a nocturnal orb-weaving spider, Larinioides cornutus, exhibits a diel and circadian rhythm in anti-predator behavior (i.e. boldness) that can be manipulated by administration of octopamine (OA) and serotonin (5- …


Changes In The Microbial Community Of Lubomirskia Baicalensis Affected By Red Sponge Disease, Colin Rorex May 2018

Lake Baikal is the oldest known lake and a unique ecosystem, home to several species of freshwater sponge. A disease outbreak affecting the dominant species, Lubomirskia baicalensis, was recently reported. The cause of the disease has not been determined, but one current hypothesis is that an increase in methane concentration is correlated with the disease outbreak. This pilot study characterized the microbiomes of sick and healthy sponges through the use of 16S rRNA sequencing. Sick sponge microbiomes shared a conserved group of taxa while the healthy sponge microbiomes had greater diversity. Indicator species analysis identified two significant …


Region Based Gene Expression Via Reanalysis Of Publicly Available Microarray Data Sets., Ernur Saka May 2018

A DNA microarray is a high-throughput technology used to identify relative gene expression. One of the most widely used platforms is the Affymetrix® GeneChip® technology which detects gene expression levels based on probe sets composed of a set of twenty-five nucleotide probes designed to hybridize with specific gene targets. Given a particular Affymetrix® GeneChip® platform, the design of the probes is fixed. However, the method of analysis is dynamic in nature due to the ability to annotate and group probes into uniquely defined groupings. This is particularly important since publicly available repositories of microarray datasets, such as ArrayExpress and NCBI’s …


Tissue Specificity Of Sex-Biased Gene Expression And The Development Of Sexual Dimorphism, Albert K. Chung Jan 2018

One prominent form of phenotypic diversity in nature is the dramatic difference between males and females within a single species. A central genetic obstacle which must be overcome is that two distinct phenotypes must be produced from a single, shared genome. One genetic mechanism that is of particular import that would allow sexes to overcome the limitation of a shared genome is sex-specific regulation of gene expression. Although sex-biased gene expression is generally predicted to increase over ontogeny as male and female phenotypes diverge, this pattern should be most pronounced in tissues that contribute to the most extreme aspects of …


Transcriptome Analysis Of Root Development In Wheat Triticum Aestivum Using High Throughtput Sequencing Technologies, Ghana Challa Jan 2018

Roots provide plants with water, nutrients and anchorage from soil. Most of our knowledge of the molecular mechanisms of root development comes from the dicot model plant Arabidopsis, and very few studies have been done in monocot crop systems like rice, maize, and wheat. We are studying the very short root (VSR) phenotype in wheat, and the lack of a sequenced reference genome in wheat prompted us to sequence and assemble the root transcriptome of the reference cultivar Chinese Spring (CS). A root transcriptome was assembled from the sequenced reads generated from root tip and the mature root tissues of CS. Approximately 169 million reads were …


Functional Data Analysis Methods For Predicting Disease Status., Sarah Kendrick Dec 2017

Introduction: Differential scanning calorimetry (DSC) is used to determine thermally-induced conformational changes of biomolecules within a blood plasma sample. Recent research has indicated that DSC curves (or thermograms) may have different characteristics based on disease status and, thus, may be useful as a monitoring and diagnostic tool for some diseases. Since thermograms are curves measured over a range of temperature values, they are often considered as functional data. In this dissertation we propose and apply functional data analysis (FDA) techniques to analyze DSC data from the Lupus Family Registry and Repository (LFRR). The aim is to develop FDA methods to …


Expression Profiling Of Non-Coding Rna By Environmental Interactions In Innate Immunity, Jacob R. Longfellow Aug 2017

Cystic fibrosis (CF) is a genetic disorder that affects 30,000 people in the United States and currently has no cure. Although CF affects all of the body’s systems, it is largely characterized as a lung disease. CF is caused by a mutation in both copies of the gene for cystic fibrosis transmembrane conductance regulator (CFTR). A mutation in the CFTR gene leads to improper movement of chloride ions and water into the airways, which dysregulates the airway surface liquid volume and composition. Individuals with CF are prone to lung infections due to inefficient bacterial clearance and by the age of …


Algorithms For Automated Assignment Of Solution-State And Solid-State Protein Nmr Spectra., Andrey Smelter Aug 2017

Protein nuclear magnetic resonance spectroscopy (Protein NMR) is an invaluable analytical technique for studying protein structure, function, and dynamics. There are two major types of NMR spectroscopy that are used for investigation of protein structure: solution-state and solid-state NMR. Solution-state NMR spectroscopy is typically applied to proteins of small and medium size that are soluble in water. Solid-state NMR spectroscopy is suited to proteins that are insoluble in water. In the vast majority of NMR-based protein studies, the first step after experiment optimization is the assignment of protein resonances via the association of chemical shift values to specific atoms in …


Novel Statistical Approaches For Missing Values In Truncated High-Dimensional Metabolomics Data With A Detection Threshold., Jasmit Sureshkumar Shah May 2017

Despite considerable advances in high throughput technology over the last decade, new challenges have emerged related to the analysis, interpretation, and integration of high-dimensional data. The arrival of omics datasets has contributed to the rapid improvement of systems biology, which seeks the understanding of complex biological systems. Metabolomics is an emerging omics field, where mass spectrometry technologies generate high dimensional datasets. As this area advances, better analysis methods are required to provide correct and adequate results. While in other omics sectors such as genomics or proteomics there has been and continues to be critical understanding …


Structure-Function Analysis And Characterization Of Metalloproteins., Sen Yao Aug 2016

Metalloproteins are proteins that can bind at least one metal ion as a cofactor. They utilize metal ions for a variety of biological purposes, and are essential for all domains of life. Because metalloproteins are involved in these processes across all domains of life, how proteins coordinate metal ions for different biochemical functions is of great relevance to understanding how these biological processes are implemented. One of the most important aspects of metal binding is its coordination geometry (CG), which often implies functional activities. Most of the current studies are based on the assumption of previously reported …


Integrated Analysis Of Mirna/Mrna Expression And Gene Methylation Using Sparse Canonical Correlation Analysis., Dake Yang May 2016

MicroRNAs (miRNAs) are a large class of small endogenous non-coding RNA molecules (18-25 nucleotides in length) which regulate expression of genes post-transcriptionally. While a variety of algorithms exist for determining the targets of miRNAs, they are generally based on sequence information and frequently produce lists consisting of thousands of genes. Canonical correlation analysis (CCA) is a multivariate statistical method that can be used to find linear relationships between two data sets, and here we apply CCA to find the linear combination of differentially expressed miRNAs and their corresponding target genes having maximal negative correlation. Due to the high dimensionality, sparse …


Expression Of Genes For Peptide/Protein Hormones And Their Cognate Receptors In Breast Carcinomas As Biomarkers Predicting Risk Of Recurrence., Michael Wesley Daniels May 2016

Certain hormones and/or receptors influencing normal cellular pathways were detected in breast cancers. The hypothesis is that gene subsets predict risk of breast carcinoma recurrence in patients with primary disease. Gene expression of 55 hormones and 73 receptors was determined by microarray with LCM-procured carcinoma cells of 247 de-identified biopsies. Univariate and multivariate Cox regressions were determined using expression levels of each hormone/receptor gene, individually or as a pair. Significant genes derived for each subset were analyzed to predict risk of cancer recurrence with 1000 LASSO training/test sets. A 14-gene molecular signature was identified for predicting clinical outcome without regard …


A Hierarchical Graph For Nucleotide Binding Domain 2, Samuel Kakraba May 2015

One of the most prevalent inherited diseases is cystic fibrosis. This disease is caused by a mutation in a membrane protein, the cystic fibrosis transmembrane conductance regulator (CFTR). CFTR is known to function as a chloride channel that regulates the viscosity of mucus that lines the ducts of a number of organs. Generally, most of the prevalent mutations of CFTR are located in one of two nucleotide binding domains, namely, the nucleotide binding domain 1 (NBD1). However, some mutations in nucleotide binding domain 2 (NBD2) can equally cause cystic fibrosis. In this work, a hierarchical graph is built for NBD2. …


Optcluster : An R Package For Determining The Optimal Clustering Algorithm And Optimal Number Of Clusters., Michael N. Sekula May 2015

Determining the best clustering algorithm and ideal number of clusters for a particular dataset is a fundamental difficulty in unsupervised clustering analysis. In biological research, data generated from Next Generation Sequencing technology and microarray gene expression data are becoming more and more common, so new tools and resources are needed to group such high dimensional data using clustering analysis. Different clustering algorithms can group data very differently. Therefore, there is a need to determine the best groupings in a given dataset using the most suitable clustering algorithm for that data. This paper presents the R package optCluster as an efficient …


Summary Of Survival Analysis With Sas Procedures., Derek Duane Childers 1990- May 2015

The research conducted for this thesis was performed to summarize some of the most commonly used survival analysis techniques as well as to create one macro that will provide the solutions for these techniques. Some of the techniques that this thesis focuses on are survival and hazard functions, mean and median survival times, life table, log rank test, proportional hazards/model building, and competing risk. To further analyze these survival analysis techniques I will use the Bone Marrow Transplantation for Leukemia dataset. This trial consists of patients with either acute myelocytic leukemia (AML, 99 patients) or acute lymphoblastic leukemia (ALL, 38 patients). There …


Statistical Methods For Assessing Treatment Effects For Observational Studies., Kristopher C. Gardner 1984- May 2014

Though randomized clinical trials (RCTs) are the gold standard for comparing treatments, they are often infeasible, exclude clinically important subjects, or represent an idealized medical setting rather than real practice. Observational data provide an opportunity to study practice-based evidence, but also present challenges for analysis. Traditional statistical methods which are suitable for RCTs may be inadequate for observational studies. In this project, four of the most popular statistical methods for observational studies (ANCOVA, propensity score matching, regression with the propensity score as a covariate, and instrumental variables (IV)) are investigated through application to MarketScan insurance claims data. …


Molecular Phylogenetic Relationships Of North American Dermacentor Ticks Using Mitochondrial Gene Sequences, Kayla L. Perry Jan 2014

Dermacentor is a recently evolved genus of hard ticks (Family Ixodidae) that includes 36 known species worldwide. Despite the importance of Dermacentor species as vectors of human and animal disease, the systematics of the genus remain largely unresolved. This study focuses on phylogenetic relationships of the eight North American Nearctic Dermacentor species: D. albipictus, D. variabilis, D. occidentalis, D. halli, D. parumapertus, D. hunteri, and D. andersoni, and the recently re-established species D. kamshadalus, as well as two of the Neotropical Dermacentor species, D. nitens and D. dissimilis (both formerly Anocentor). We sequenced portions of the mitochondrial …


A Pairwise Feature Selection Method For Gene Data Using Information Gain, Tian Gui Jan 2014

Current classification practice has limitations when applied to gene expression microarray data. For example, the robustness of top scoring pairs does not extend to some datasets with small sample sizes, and the gene set with the best discrimination power may not involve a combination of genes. Hence, it is necessary to construct a discriminative and stable classifier that generates highly informative gene sets. Not all features will be active in a biological process, so a good feature selector should be robust with respect to noise and outliers; the challenge is to select the …


Compound Identification Using Penalized Linear Regression., Ruiqi Liu May 2013

In this study, we propose a new method for compound identification using penalized linear regression. Compound identification is often achieved by matching the experimental mass spectra to the mass spectra stored in a reference library based on mass spectral similarity. In the context of linear regression, the response variable is an experimental mass spectrum (i.e., the query) and all the compounds in the reference library are the independent variables. However, the number of compounds in the reference library is much larger than the range of m/z values, so the data become high-dimensional and suffer from singularity. For …


Protein Kinases: Structure Modeling, Inhibition, And Protein-Protein Interactions, Khaled M. Elokely Jan 2013

Human protein kinases belong to a large and diverse enzyme family that contains more than 500 members. Deregulation of protein kinases is associated with many disorders, and this is why protein kinases are attractive targets for drug discovery. Due to the high conservation of the ATP binding pocket among this family, designing specific and/or selective inhibitors against certain member(s) is challenging. Several studies have been conducted on protein kinases to validate them as suitable drug targets. Although there are numerous target-validated protein kinases, the efforts to develop small molecule inhibitors have so far led to only a limited number of …


Effects Of Habitat Quality On Reproduction In Two Georgia Populations Of Gopherus Polyphemus, Jaqueline W. Entz Jan 2009

Author's Abstract: The purpose of this study was to examine differences in maternal investment by examining variation in the habitat structure and reproductive parameters for two populations of Gopherus polyphemus in Southeast GA. Both habitat structure and reproductive parameters for these populations are known from a previous study, thus this study expands upon the previous one and addresses four main questions. (1) Has habitat quality changed in the past ten years within and between population sites? (2) Could a change of habitat have affected female morphology or female reproductive parameters within or between populations? (3) Is female body size shaping …