Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Bioinformatics

Theses/Dissertations

2016

Articles 1 - 19 of 19

Full-Text Articles in Physical Sciences and Mathematics

A Framework For The Statistical Analysis Of Mass Spectrometry Imaging Experiments, Kyle Bemis Dec 2016

Open Access Dissertations

Mass spectrometry (MS) imaging is a powerful investigation technique for a wide range of biological applications such as molecular histology of tissue, whole body sections, and bacterial films, and biomedical applications such as cancer diagnosis. MS imaging visualizes the spatial distribution of molecular ions in a sample by repeatedly collecting mass spectra across its surface, resulting in complex, high-dimensional imaging datasets. Two of the primary goals of statistical analysis of MS imaging experiments are classification (for supervised experiments), i.e. assigning pixels to pre-defined classes based on their spectral profiles, and segmentation (for unsupervised experiments), i.e. assigning pixels to newly …


Stage-Specific Predictive Models For Cancer Survivability, Elham Sagheb Hossein Pour Dec 2016

Theses and Dissertations

Survivability of cancer strongly depends on the stage of the cancer. In most previous works, machine learning survivability prediction models for a particular cancer were trained and evaluated together on all stages of the cancer. In this work, we trained and evaluated survivability prediction models for five major cancers, both together on all stages and separately for every stage. We named these models joint and stage-specific models, respectively. The results obtained for the cancers we investigated reveal that the best model to predict the survivability of the cancer for one specific stage is the model which is specifically built for that …


Network Inference Driven Drug Discovery, Gergely Zahoránszky-Kőhalmi, Tudor I. Oprea MD, PhD, Cristian G. Bologa PhD, Subramani Mani MD, PhD, Oleg Ursu PhD Nov 2016

Biomedical Sciences ETDs

The application of rational drug design principles in the era of network pharmacology requires the investigation of drug-target and target-target interactions in order to design new drugs. The presented research was aimed at developing novel computational methods that enable the efficient analysis of complex biomedical data and promote hypothesis generation in the context of translational research. The three chapters of the dissertation relate to various segments of the drug discovery and development process.

The first chapter introduces the integrated predictive drug discovery platform „SmartGraph”. The novel collaborative-filtering based algorithm „Target Based Recommender (TBR)” was developed in the framework of this …


Algorithms For Glycan Structure Identification With Tandem Mass Spectrometry, Weiping Sun Sep 2016

Electronic Thesis and Dissertation Repository

Glycosylation is a frequently observed post-translational modification (PTM) of proteins. It has been estimated that over half of eukaryotic proteins in nature are glycoproteins. Glycoprotein analysis plays a vital role in drug preparation. Thus, characterization of glycans that are linked to proteins has become necessary in glycoproteomics. Mass spectrometry has become an effective analytical technique for glycoproteomics analysis because of its high throughput and sensitivity. The large amount of spectral data collected in a mass spectrometry experiment makes manual interpretation impossible and requires effective computational approaches for automated analysis. Different algorithmic solutions have been proposed to address the challenges in glycoproteomics …


Bayesian Networks To Assess The Newborn Stool Microbiome, William E. Bennett Jr. Aug 2016

McKelvey School of Engineering Theses & Dissertations

In human stool, a large population of bacterial genes and transcripts from hundreds of genera coexist with host genes and transcripts. Assessments of the metagenome and transcriptome are particularly challenging, since there is a great deal of sequence overlap among related species and related genes. We sequenced the total RNA content from stool samples in a neonate using previously-described methods. We then performed stepwise alignment of different populations of RNA sequence reads to different indices, including ribosomal databases, the human genome, and all sequenced bacterial genomes. Each pool of RNA at each alignment step was subjected to compression to assess …


Metabolomics Approaches To Decipher The Antibacterial Mechanisms Of Yerba Mate (Ilex Paraguariensis) Against Staphylococcus Aureus And Salmonella Enterica Serovar Typhimurium, Caroline Sue Rempe Aug 2016

Doctoral Dissertations

The increasing prevalence of drug-resistant pathogens is an urgent problem that requires novel methods of bacterial control. Plant extracts inhibit bacterial pathogens and could contain antibacterial compounds with novel mechanisms of action. Yerba mate, a common South American beverage made from Ilex paraguariensis, has antibiotic activity against a broad range of bacterial pathogens. In this work, an attempt was first made to characterize the antibacterial source of an aqueous yerba mate extract by generating a series of extract fractions, collecting GC-MS and antibacterial activity profiles, and then ranking the hundreds of compounds by their presence in fractions with high antibacterial …


mHealth Technology: Towards A New Persuasive Mobile Application For Caregivers That Addresses Motivation And Usability, Suboh M. Alkhushayni Aug 2016

Theses and Dissertations

With the increasing use of mobile technologies and smartphones, new methods of promoting personal health have been developed. For example, there is now software for recording and tracking one's exercise activity or blood pressure. Even though there are already many of these services, the mobile health field still presents many opportunities for new research.

One apparent area of need would be software to support the efforts of caregivers for the elderly, especially those who suffer from multiple chronic conditions, such as cognitive impairment, chronic heart failure or diabetes. Very few mobile applications (apps) have been created that target caregivers of …


Protein Residue-Residue Contact Prediction Using Stacked Denoising Autoencoders, Joseph Bailey Luttrell IV Aug 2016

Honors Theses

Protein residue-residue contact prediction is one of many areas of bioinformatics research that aims to assist researchers in the discovery of structural features of proteins. Predicting the existence of such structural features can provide a starting point for studying the tertiary structures of proteins. This has the potential to be useful in applications such as drug design where tertiary structure predictions may play an important role in approximating the interactions between drugs and their targets without expending the monetary resources necessary for preliminary experimentation. Here, four different methods involving deep learning, support vector machines (SVMs), and direct coupling analysis were …


Gene Set Enrichment And Projection: A Computational Tool For Knowledge Discovery In Transcriptomes, Karl Douglas Stamm Jul 2016

Dissertations (1934 -)

Explaining the mechanism behind a genetic disease involves two phases: collecting and analyzing data associated with the disease, then interpreting those data in the context of biological systems. The objective of this dissertation was to develop a method of integrating complementary datasets surrounding any single biological process, with the goal of presenting the response to a signal in terms of a set of downstream biological effects. This dissertation specifically tests the hypothesis that computational projection methods overlaid with domain expertise can direct research towards relevant systems-level signals underlying complex genetic disease. To this end, I developed a software algorithm named …


Machine Learning Methods For Brain Image Analysis, Ahmed Fakhry Jul 2016

Computer Science Theses & Dissertations

Understanding how the brain functions and quantifying compound interactions between complex synaptic networks inside the brain remain some of the most challenging problems in neuroscience. Lack or abundance of data, a shortage of manpower, and the heterogeneity of data from various species all add complexity to an already perplexing problem. Vast amounts of brain data need to be processed automatically, yet with an accuracy close to manual human-level performance. These automated methods essentially need to generalize well to be able to accommodate data from different species. Also, novel approaches and techniques are becoming …


A Computational Framework For Learning From Complex Data: Formulations, Algorithms, And Applications, Wenlu Zhang Jul 2016

Computer Science Theses & Dissertations

Many real-world processes are dynamically changing over time. As a consequence, the observed complex data generated by these processes also evolve smoothly. For example, in computational biology, the expression data matrices are evolving, since gene expression controls are deployed sequentially during development in many biological processes. Investigations into the spatial and temporal gene expression dynamics are essential for understanding the regulatory biology governing development. In this dissertation, I mainly focus on two types of complex data: genome-wide spatial gene expression patterns in the model organism fruit fly and Allen Brain Atlas mouse brain data. I provide a framework to explore …


Using The KDJ As A Trading Strategy On Biotech Companies, Shijie Zha May 2016

Theses

Mean Reversion is the most commonly used model in quantitative trading. This model is associated with several factors, such as the ma5 and ma10 lines. These factors are the most significant in stock markets. However, the disadvantages of this model are lag and inaccuracy.

In this research, we collect historical and current stock data with a web crawler, analyze the quantitative data, and build a new model involving the KDJ. Taking biotech companies marketed in the United States and B-shares marketed in China as the research subjects, the result shows increased profits compared with the Mean Reversion model. It also shows …


Gene Network Understanding And Analysis, Maria E. Somoza May 2016

Theses

A gene regulatory network (GRN) is a collection of regulators that interact with each other in the cell to govern the gene expression levels of mRNA and proteins. These regulators can be DNA, RNA, proteins, or their complexes. Transcriptional gene regulation is an important mechanism whose in-depth study can lead to various practical applications and a greater understanding of how organisms control their cellular behavior. Two of the most widely studied organisms in gene regulatory network research are Mycobacterium tuberculosis and Corynebacterium glutamicum ATCC 13032.

Gene co-expression networks are of biological interest due to co-expressed genes which are …


Integrated Analysis Of miRNA/mRNA Expression And Gene Methylation Using Sparse Canonical Correlation Analysis., Dake Yang May 2016

Electronic Theses and Dissertations

MicroRNAs (miRNAs) are a large class of small endogenous non-coding RNA molecules (18-25 nucleotides in length) which regulate expression of genes post-transcriptionally. While a variety of algorithms exist for determining the targets of miRNAs, they are generally based on sequence information and frequently produce lists consisting of thousands of genes. Canonical correlation analysis (CCA) is a multivariate statistical method that can be used to find linear relationships between two data sets, and here we apply CCA to find the linear combination of differentially expressed miRNAs and their corresponding target genes having maximal negative correlation. Due to the high dimensionality, sparse …


Assessing Accuracies And Improving Efficiency For Segmentation-Based RNA Secondary Structure Prediction Methods, Gerardo A. Cardenas Jan 2016

Open Access Theses & Dissertations

RNA secondary structure prediction has become an important area of interest in biology and medicine because it helps in understanding the mechanisms of many biological processes such as gene regulation and viral replication, and in designing RNA-based therapies to treat various diseases such as cancers and AIDS. Different thermodynamics-based computational algorithms for RNA structure prediction exist, and have been used to help understand disease mechanisms and design treatments. However, most of the computational tools that can predict complex pseudoknot structures have a sequence length limitation of a few hundred nucleotide bases due to their high demands on computer resources. Yet, …


Identifying Non-Classical Active Sites As A Tool For Enzyme Inhibition, Marisol Serrano Jan 2016

Open Access Theses & Dissertations

Chagas disease, caused by the parasite Trypanosoma cruzi, is an endemic life-threatening disease that affects mainly the heart. It remains the leading cause of heart failure in Latin American countries. Since current treatments against this parasite are highly toxic and somewhat ineffective, novel and more efficacious types of interventions are desired. Cruzain, identified as the major cathepsin for T. cruzi, plays a major role in the parasite's life cycle, making this enzyme very attractive for potential trypanocidal drug discovery. The recombinant cruzain is synthesized as a zymogenic pro-protein (PCZN) which possesses a pro domain and a catalytic domain. In this …


Exploring Factors Influencing Information Technology Portfolio Selection Process In Government-Funded Bioinformatics Projects, Braulio J. Cabral Jan 2016

Walden Dissertations and Doctoral Studies

In 2012, the National Cancer Institute's (NCI) Board of Scientific Advisors (BSA) conducted a review of the Center for Biomedical Informatics and Information Technology's (CBIIT) bioinformatics program. The BSA suggested that the lack of a formal project selection process made it difficult to determine the alignment of projects with the mission of the organization. The problem addressed by this study was that CBIIT did not have an in-depth understanding of the project selection process and the factors influencing the process. The purpose of this study was to understand the project selection process at CBIIT. The research methodology was an exploratory …


Resolving Gnetum Evolutionary History, Angela Mcfadden Jan 2016

All Master's Theses

Gnetum are non-flowering seed plants of the tropics, indigenous to South America, Africa, and Asia. This group of about 40 species is fascinating to botanists because it shares distinctive morphological characteristics with flowering plants, such as broad leaves, woody stems, and flower-like strobili. There are still questions surrounding the relationships within the genus Gnetum. With that in mind, I focused my work on generating phylogenetic hypotheses, using two molecular data sets: a concatenation of over 60 different chloroplast genes (66,815 base pairs), and the whole chloroplast genome (128,772 base pairs). This allowed me to compare the two phylogenies …


Evaluating And Improving The Efficiency Of Software And Algorithms For Sequence Data Analysis, Hugh L. Eaves Jan 2016

Theses and Dissertations

With the ever-growing size of sequence data sets, data processing and analysis are an increasingly large portion of the time and money spent on nucleic acid sequencing projects. Correspondingly, the performance of the software and algorithms used to perform that analysis has a direct effect on the time and expense involved. Although the analytical methods are widely varied, certain types of software and algorithms are applicable to a number of areas. Targeting improvements to these common elements has the potential for wide reaching rewards. This dissertation research consisted of several projects to characterize and improve upon the efficiency of several …