Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Theses/Dissertations

2017

Bioinformatics

Articles 1 - 29 of 29

Full-Text Articles in Entire DC Network

Applying Computational Solutions For Solving Problems In Mammalian Gene Family Evolution And Single Cell Gene Expression Analysis, Ajay Obla Dec 2017

Doctoral Dissertations

Archival abstract submitted


Focus: A Graph Approach For Data-Mining And Domain-Specific Assembly Of Next Generation Sequencing Data, Julia Sommer Dec 2017

Theses & Dissertations

Next Generation Sequencing (NGS) has emerged as a key technology leading to revolutionary breakthroughs in numerous biomedical research areas. These technologies produce millions to billions of short DNA reads that represent a small fraction of the original target DNA sequence. These short reads contain little information individually but are produced at a high coverage of the original sequence such that many reads overlap. Overlap relationships allow for the reads to be linearly ordered and merged by computational programs called assemblers into long stretches of contiguous sequence called contigs that can be used for research applications. Although the assembly of the …


Identification Of Prognostic Cancer Biomarkers Through The Application Of RNA-Seq Technologies And Bioinformatics, Nathan Wong Dec 2017

McKelvey School of Engineering Theses & Dissertations

MicroRNAs (miRNAs) are short single-stranded RNAs that function as the guide sequence of the post-transcriptional regulatory process known as the RNA-induced silencing complex (RISC), which targets mRNA sequences for degradation through complementary binding to the guide miRNA. Changes in miRNA expression have been reported as correlated with numerous biological processes, including embryonic development, cellular differentiation, and disease manifestation. In the latter case, dysregulation has been observed in response to infection by human papillomavirus (HPV), which has also been established as both oncogenic in cervical cancers and oropharyngeal cancers and favorable for overall patient survival after tumor formation. The identification of …


Fungi Of Forests: Examining The Diversity Of Root-Associated Fungi And Their Responses To Acid Deposition, Donald Jay Nelsen Dec 2017

Graduate Theses and Dissertations

The global importance of forests is difficult to overestimate, given their role in oxygen production, ecological roles in nutrient cycling and supporting numerous living species, and economic value for industry and as recreational zones. Fitness of the forest-forming trees strongly depends on microbial communities associated with tree roots. In particular, fungi impact tree fitness: mycorrhizal species provide water and nutrients for the trees in exchange for C, endophytic fungi play key roles in host defense against pathogenic organisms, and saprotrophic fungi decompose dead organic matter and facilitate nutrient cycling. In addition, pathogenic fungal species strongly affect forest fitness. Despite their importance, …


Computational Identification Of Noncoding Driver Mutations Based On Impact On RNA Processing, Kevin Zhu Dec 2017

Dissertations & Theses (Open Access)

Despite the prevalence of mutations in the noncoding regions of the DNA, their effects on cancer development remain largely uninvestigated. This is especially evident when compared to coding mutations, which have been relatively well-studied and, in certain cases, been identified as driver mutations for cancer. Recent studies, however, have identified noncoding mutations that frequently appear in certain types of cancer, which may be evidence that those mutations are important to cancer development. Nonetheless, the role of noncoding mutations in cancer remains unclear. A potential vector for understanding this mechanism is through observing the relation between noncoding mutations and functional RNA …


Software For Sequence Analysis Of Variants In Functional Screening Libraries And Personalized Genome Files, Jacklyn Michelle Newsome Dec 2017

UNLV Theses, Dissertations, Professional Papers, and Capstones

Detailed knowledge of protein function is critical for both the study of protein interactions and the development of drugs which target specific proteins. Currently, there are few techniques that directly examine protein function. The techniques that are available are time consuming and can only address one variant of a protein at a time. Our laboratory has designed 3 high throughput protein function screens. We hypothesize that these will address this shortfall.

The first screen is the Chimeric Minimotif Decoy (CMD) Assay. For this screen, we constructed red fluorescent proteins with one or more C-terminal minimotifs. Minimotifs are short, contiguous amino …


Error Correction And De Novo Genome Assembly Of DNA Sequencing Data, Michael Z. Molnar Nov 2017

Electronic Thesis and Dissertation Repository

The ability to obtain the genetic code of any species has caused a revolution in biological sciences. Current technologies are capable of sequencing short pieces of DNA with very high quality. These short pieces of DNA determine the sequence of bases in the genome of any species. This information is key in understanding many of the aspects of how life functions.

The accuracy of sequencing is extremely important since the differences between individuals of the same species are caused by very few changes. All sequencing technologies make errors, and before the data can be used for downstream applications it is …


Biased Genetic Screen Identifies Novel Genes Involved In Antiviral Defense, Tianyun Long Nov 2017

LSU Doctoral Dissertations

RNA interference (RNAi) mediates a potent antiviral response across kingdoms. In Caenorhabditis elegans nematodes, antiviral RNAi requires a virus sensor that is conserved in mammals and is amplified by secondary small interfering RNAs that are produced in a Dicer-independent manner.

To better understand worm antiviral RNAi, I carried out a biased genetic screen, aiming to identify novel antiviral RNAi genes. To speed up the gene discovery process, the reporter worms used for this genetic screen were engineered to contain extra copies of 4 known antiviral RNAi genes. Therefore, genetic alleles derived from these 4 genes will be automatically rejected during …


Bioinformatics And Next Generation Sequencing: Applications Of Arthropod Genomes, Zaichao Zhang Sep 2017

Electronic Thesis and Dissertation Repository

Over the past decade, Next Generation Sequencing (NGS) technology has been broadly applied in many areas such as genomics, medical diagnosis, biotechnology, virology, biological systematics, forensic biology, and anthropology. Taken together, it has offered us brilliant insights into life sciences. Most of the work presented in this thesis describes NGS applications in genome assembly, genome annotation, and comparative genomics, using arthropods as case studies: (1) by sequencing and analyzing the genomes of three Tetranychus spider mites with three completely different feeding behaviors, we uncovered genomic signature variations indicative of pest adaptations; (2) we sequenced, assembled and annotated five …


Machine Learning Based Protein Sequence To (Un)Structure Mapping And Interaction Prediction, Sumaiya Iqbal Aug 2017

University of New Orleans Theses and Dissertations

Proteins are the fundamental macromolecules within a cell that carry out most of the biological functions. The computational study of protein structure and function, using machine learning and data analytics, is essential to advancing life-science research, given the fast-growing volume of biological data and the extensive complexities involved in analyzing it to discover meaningful insights. Mapping of a protein’s primary sequence is not limited to its structure; we extend it to the disordered component known as Intrinsically Disordered Proteins or Regions in proteins (IDPs/IDRs), and hence the involved dynamics, which help us explain complex interaction within a cell that …


Computational Interrogation Of Transcriptional And Post-Transcriptional Mechanisms Regulating Dendritic Development, Surajit Bhattacharya Aug 2017

Biology Dissertations

The specification and modulation of cell-type specific dendritic morphologies play a pivotal role in nervous system development, connectivity, structural plasticity, and function. Regulation of gene expression is controlled by a wide variety of cellular and molecular mechanisms, of which two major types are transcription factors (TFs) and microRNAs (miRNAs). In Drosophila, the dendritic complexity of dendritic arborization (da) sensory neurons of the peripheral nervous system is known to be regulated by two transcription factors, Cut and Knot, although much remains unknown about the molecular mechanisms and regulatory networks via which they regulate the final arbor shape through spatio-temporal modulation of …


Histone Modification ChIP-Seq Algorithm Engineering And High Performance Bioinformatics Graphics And Analysis Software For Computational Epigenetics, Bohdan Bohdanovich Khomtchouk Jul 2017

Open Access Dissertations

Novel algorithm design, implementation, and optimization in histone modification ChIP-seq analysis of broad chromatin mark data is the subject of part I of this dissertation, focusing on data-driven precision medicine computational strategies for mapping ChIP-seq peaks to genomic features (and biological function) as well as coverage island analysis of low-sample size ChIP-seq experiments within individual biological replicates. Part II of this dissertation focuses on novel algorithm design, implementation, and analysis of high performance visualization techniques for histone modification ChIP-seq data using static and interactive biological gene expression heatmaps.


An Integrated Bioinformatic/Experimental Approach For Discovering Novel Type II Polyketides Encoded In Actinobacterial Genomes, Wubin Gao Jul 2017

Chemistry and Chemical Biology ETDs

Discovery of new natural products (NPs) is critical both for disease treatment and crop protection. Numerous NP biosynthetic gene clusters (BGCs) in sequenced microbial genomes allow identification of new NPs through genome mining. Developing an integrated bioinformatic/experimental approach for discovering novel type II polyketides (PK-IIs) facilitates investigation of this family of NPs in an efficient, systematic way. Here, we developed an approach to analyze ketosynthase α/β (KSα/β) gene sequences to predict PK-II core structures, allowing us to target novel PK-II BGCs either from isolated genomic DNA or genomes from the NCBI databank, and to isolate novel PK-IIs produced by these …


The Population Genomics Of Human MicroRNA Gene Copy Number Variation, Julianne Murphy Jun 2017

Biology

Copy number variation (CNV) is a class of small structural variation defined as loci that vary in their number of copies between individuals due to duplication or deletion. CNV is pervasive in the human genome and can influence phenotype. However, little is known about CNV of genes encoding regulatory microRNAs (miRNAs). We developed a computational method based on variation in read depth to estimate miRNA copy number. This approach was used to quantify the copy number of 1,805 miRNA loci across 161 Yoruban (YRI) and European (CEU) genomes. The vast majority of autosomal miRNA encoding genes were present at a …


Population Genetics Of Freshwater Pearl Mussel (Margaritifera Margaritifera) In Central Massachusetts And Implications For Conservation, Stefanie Farrington Jun 2017

Biology

The freshwater pearl mussel Margaritifera margaritifera is an ecologically important, globally endangered species, yet little is known about biodiversity and population genetics in North American populations. We focused our study on M. margaritifera from central and eastern Massachusetts, USA, to better understand the historical impact of damming and habitat fragmentation on local population structure and genetic diversity. In order to examine the local population genetics of M. margaritifera, we generated ~300 informative single nucleotide polymorphisms (SNPs) from 59 individuals across 6 geographic locations, using the RAD-seq approach. We also gleaned genotypes from publicly available RNA-seq data of 23 French M. margaritifera samples. …


Population Genomics Reveals Loss Of Odorant Receptor Gene Repertoire During Polar Bear (Ursus Maritimus) Evolution, Natalya Katerina Specian Jun 2017

Biology

The polar bear (Ursus maritimus) and brown bear (Ursus arctos) are a recently diverged species pair but are morphologically, behaviorally, and physiologically distinct. These phenotypic differences reflect adaptation to local environments. Previous research aiming to identify the genetic underpinnings of adaptive traits focused mainly on single nucleotide polymorphisms (SNPs), while ignoring copy number variation (CNV). CNV refers to loci that vary in their number of copies between individuals due to duplication or deletion. Here, we computationally predicted whole genome copy number profiles across 17 polar bear and 9 brown bear genomes using FREEC. We identified hundreds of genes overlapping …


Predicting Pancreatic Cancer Using Support Vector Machine, Akshay Bodkhe May 2017

Master's Projects

This report presents an approach to predict pancreatic cancer using the Support Vector Machine classification algorithm. The research objective of this project is to predict pancreatic cancer from genomic data alone, from clinical data alone, and from a combination of genomic and clinical data. We have used real genomic data having 22,763 samples and 154 features per sample. We have also created synthetic clinical data having 400 samples and 7 features per sample in order to assess the prediction accuracy of clinical data alone. To validate the hypothesis, we have combined synthetic clinical data with a subset of features from real genomic data. In our results, we observed that …


Repeaterator: A Tool For Visualizing DNA Repeat Motifs In Actinobacteriophage Genomes, Grant A. Rybnicky May 2017

Senior Honors Projects, 2010-2019

Horizontal gene transfer plays a large role in microbial genetic diversity. Bacteriophages can mediate diversity within their hosts through transduction, the uptake and dispersal of microbial host DNA between bacterial hosts. However, bacteriophages themselves experience horizontal gene transfer through mobile genetic elements and recombination. Unlike their hosts, bacteriophages cannot easily be mapped onto a phylogenetic tree as they do not all possess a common trait like the 16S rRNA gene. However, their genomes are typically small enough to be analyzed using […] There are tools to compare bacteriophages, such as Gepard and Phamerator, that compare nucleotide identity across entire bacteriophage genomes. However, …


Utilization Of Phylogenetic Analysis Methods To Understand Deep Phylogeny, Jeffrey M. O'Brien May 2017

Master's Theses

This thesis describes two analyses conducted to learn more about the early evolution of life and characteristics of the last universal common ancestor (LUCA). In chapter one, tree metric analysis methods were employed to determine if protein families identified in a previous analysis should be attributed to LUCA. It was found that many of the protein families identified by the previous analysis were likely the result of methodological errors, and either should not be attributed to LUCA or do not represent the full scope of diversity within the given protein family. Chapter two presents data from an ATP synthase catalytic …


The Microbial Ecology Of Bacterial Lignocellulosic Degradation In The Ocean, Hannah Laing Yee Woo May 2017

Doctoral Dissertations

The overarching theme of my dissertation is to study the role of bacteria in lignocellulose degradation. In recent years, more research has investigated the biodegradability of lignocellulose for biofuel production. The components of the lignocellulosic plant cell wall are considered intrinsically recalcitrant due to their structure. However, we hypothesize that these components are not intrinsically recalcitrant but their biodegradation is contingent on the environmental conditions, particularly the bacterial diversity. We believe bacteria will become especially important in lignocellulose degradation in conditions that are unfavorable for white-rot fungi. Therefore, we investigated the potential for lignin degradation by bacteria in the ocean …


Novel Statistical Approaches For Missing Values In Truncated High-Dimensional Metabolomics Data With A Detection Threshold., Jasmit Sureshkumar Shah May 2017

Electronic Theses and Dissertations

Despite considerable advances in high throughput technology over the last decade, new challenges have emerged related to the analysis, interpretation, and integration of high-dimensional data. The arrival of omics datasets has contributed to the rapid improvement of systems biology, which seeks the understanding of complex biological systems. Metabolomics is an emerging omics field, where mass spectrometry technologies generate high dimensional datasets. As this area advances, better analysis methods are needed to provide correct and adequate results. While in other omics sectors such as genomics or proteomics there has been and continues to be critical understanding …


Development And Evaluation Of Machine Learning Algorithms For Biomedical Applications, Turki Talal Turki Apr 2017

Dissertations

Gene network inference and drug response prediction are two important problems in computational biomedicine. The former helps scientists better understand the functional elements and regulatory circuits of cells. The latter helps a physician gain full understanding of the effective treatment on patients. Both problems have been widely studied, though current solutions are far from perfect. More research is needed to improve the accuracy of existing approaches.

This dissertation develops machine learning and data mining algorithms, and applies these algorithms to solve the two important biomedical problems. Specifically, to tackle the gene network inference problem, the dissertation proposes (i) new techniques …


Genomic Evaluation Of Male Reproductive Adaptations And Responses To Dehydration In Peromyscus Eremicus (Cactus Mouse), Lauren Kordonowy Jan 2017

Doctoral Dissertations

Research elucidating the genetic architecture of physiological mechanisms enabling survival and reproduction in extreme environments is becoming prominent in evolutionary biology. The desert, in particular, poses numerous challenges for its endemic species, and mammals (and often, rodents) have been the focus for survival adaptations pertaining to water-limitation. However, desert rodent adaptation research has focused predominantly on survival, while potential physiological reproductive adaptations to dehydration have received less attention, aside from research evaluating water as a reproductive cue. The fact that we do not know the physiological mechanisms enabling reproduction during dehydration is surprising, as desert rodents must possess adaptations to successfully …


Integrative Pathway Analysis Pipeline For miRNA And mRNA Data, Diana Mabel Diaz Herrera Jan 2017

Wayne State University Theses

The identification of pathways that are involved in a particular phenotype helps us understand the underlying biological processes. Traditional pathway analysis techniques aim to infer the impact on individual pathways using only mRNA levels. However, recent studies showed that gene expression alone is unable to capture the whole picture of biological phenomena. At the same time, microRNAs (miRNAs) are newly discovered gene regulators that have been shown to play an important role in diagnosis and prognosis for different types of diseases. Current pathway analysis techniques do not take miRNAs into consideration. In this project, we investigate the effect of integrating miRNA …


Network Analytics For The miRNA Regulome And miRNA-Disease Interactions, Joseph Jayakar Nalluri Jan 2017

Theses and Dissertations

miRNAs are non-coding RNAs of approx. 22 nucleotides in length that inhibit gene expression at the post-transcriptional level. By virtue of this gene regulation mechanism, miRNAs play a critical role in several biological processes and patho-physiological conditions, including cancers. miRNA behavior is a result of a multi-level complex interaction network involving miRNA-mRNA, TF-miRNA-gene, and miRNA-chemical interactions; hence the precise patterns through which a miRNA regulates a certain disease(s) are still elusive. Herein, I have developed an integrative genomics method/pipeline to (i) build a miRNA regulomics and data analytics repository, (ii) create/model these interactions into networks and use optimization techniques, motif …


The Paladin Suite: Multifaceted Characterization Of Whole Metagenome Shotgun Sequences, Anthony Stephen Westbrook Jan 2017

Master's Theses and Capstones

Whole metagenome shotgun sequencing is a powerful approach for assaying many aspects of microbial communities, including the functional and symbiotic potential of each contributing community member. The research community currently lacks tools that efficiently align DNA reads against protein references, the technique necessary for constructing functional profiles. This thesis details the creation of PALADIN – a novel modification of the Burrows-Wheeler Aligner that provides orders-of-magnitude improved efficiency by directly mapping in protein space. In addition to performance considerations, utilizing PALADIN and associated tools as the foundation of metagenomic pipelines also allows for novel characterization and downstream analysis.

The accuracy and …


Metabolomic Profiling Of Chiari Malformation Type I: Comparison Of Bioinformatic Programs For Untargeted Analysis, Hunter W. Korsmo Jan 2017

Williams Honors College, Honors Research Projects

Chiari Malformation Type I is a neurodegenerative trait that can result from disease or be acquired. Metabolomic analysis was done on normal pressure hydrocephalus (NPH) and Chiari CSF samples using LC-MS and multiple bioinformatic programs. After analysis with multiple programs, we were able to analyze the stringency of the statistical algorithms used by each program and determined qualities that are shared between programs that offer multiple details. We identified dysregulation in glucuronated metabolites in CSF of Chiari versus NPH. Using LC-MS, we established the experimental MS/MS of glucuronic acid in an attempt to identify similarities in mass-to-charge features primarily identified. We could not …


Structure-Based Prediction Of Protein-Protein Interaction Networks Across Proteomes, Surabhi Maheshwari Jan 2017

LSU Doctoral Dissertations

Protein-protein interactions (PPIs) orchestrate virtually all cellular processes; therefore, their exhaustive exploration is essential for the comprehensive understanding of cellular networks. Significant efforts have been devoted to expanding the coverage of the proteome-wide interaction space at the molecular level. A number of experimental techniques have been developed to discover PPIs; however, these approaches have limitations such as the high costs and long durations of experiments, noisy data sets, and often a high false positive rate and inter-study discrepancies. Given experimental limitations, computational methods are becoming increasingly important for detection and structural characterization of PPIs. In that regard, we have developed a …


Identification Of Novel Sleep Related Genes From Large Scale Phenotyping Experiments In Mice, Shreyas Joshi Jan 2017

Theses and Dissertations--Biology

Humans spend a third of their lives sleeping but very little is known about the physiological and genetic mechanisms controlling sleep. Increased data from sleep phenotyping studies in mouse and other species, genetic crosses, and gene expression databases can all help improve our understanding of the process. Here, we present analysis of our own sleep data from the large-scale phenotyping program at The Jackson Laboratory (JAX), to identify the best gene candidates and phenotype predictors for influencing sleep traits.

The original knockout mouse project (KOMP) was a worldwide collaborative effort to produce embryonic stem (ES) cell lines with one of …