Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Iowa State University

Bioinformatics

Articles 1 - 30 of 358

Full-Text Articles in Entire DC Network

Pyrpipe: A Python Package For RNA-Seq Workflows, Urminder Singh, Jing Li, Arun Seetharam, Eve Syrkin Wurtele Mar 2020

Genetics, Development and Cell Biology Publications

Implementing RNA-Seq analysis pipelines is challenging as data gets bigger and more complex. With the availability of terabytes of RNA-Seq data and continuous development of analysis tools, there is a pressing requirement for frameworks that allow for fast and efficient development, modification, sharing and reuse of workflows. Scripting is often used, but it has many challenges and drawbacks. We have developed a Python package, python RNA-Seq Pipeliner (pyrpipe), that enables straightforward development of flexible, reproducible and easy-to-debug computational pipelines purely in Python, in an object-oriented manner. pyrpipe provides high level APIs to popular RNA-Seq tools. Pipelines can be customized by ...


De Novo Transcriptome Of Phakopsora Pachyrhizi Uncovers Putative Effector Repertoire During Infection, Manjula G. Elmore, Sagnik Banerjee, Kerry F. Pedley, Amy Ruck, Steven A. Whitham Jan 2020

Plant Pathology and Microbiology Publications

Phakopsora pachyrhizi, which causes Asian soybean rust (ASR), secretes effector proteins to manipulate host immunity and promote disease. To date, only a small number of effectors have been identified from transcriptome studies. To obtain a more comprehensive understanding of P. pachyrhizi candidate secreted effector proteins (CSEPs), we sequenced the transcriptome using two next-generation sequencing technologies. Short-read Illumina RNA-Seq data was used for reducing base-calling errors for long-read PacBio Iso-Seq. After initial de novo assemblies for RNA-seq and error correction of transcripts for Iso-Seq followed by filtering, we obtained 8,528, 27,647, 26,895, and 17,141 non-plant, non-soybean transcripts ...


A Meta-Analysis Of Global Fungal Distribution Reveals Climate-Driven Patterns, Tomáš Větrovský, Petr Kohout, Martin Kopecký, Antonin Machac, Matěj Man, Barbara Doreen Bahnmann, Vendula Brabcová, Jinlyung Choi, Lenka Meszárošová, Zander Rainier Human, Clémentine Lepinay, Salvador Lladó, Rubén López-Mondéjar, Tijana Martinović, Tereza Mašínová, Daniel Morais, Diana Navrátilová, Iñaki Odriozola, Martina Štursová, Karel Švec, Vojtěch Tláskal, Michaela Urbanová, Joe Wan, Lucia Žifčáková, Adina Howe, Joshua Ladau, Kabir Gabriel Peay, David Storch, Jan Wild, Petr Baldrian Nov 2019

Agricultural and Biosystems Engineering Publications

The evolutionary and environmental factors that shape fungal biogeography are incompletely understood. Here, we assemble a large dataset consisting of previously generated mycobiome data linked to specific geographical locations across the world. We use this dataset to describe the distribution of fungal taxa and to look for correlations with different environmental factors such as climate, soil and vegetation variables. Our meta-study identifies climate as an important driver of different aspects of fungal biogeography, including the global distribution of common fungi as well as the composition and diversity of fungal communities. In our analysis, fungal diversity is concentrated at high latitudes ...


Investigating The Dispersal Of Antibiotic Resistance Associated Genes From Manure Application To Soil And Drainage Waters In Simulated Agricultural Farmland Systems, Schuyler D. Smith, Phillip Colgan, Fan Yang, Elizabeth L. Rieke, Michelle L. Soupir, Thomas B. Moorman, Heather K. Allen, Adina Howe Sep 2019

Agricultural and Biosystems Engineering Publications

Manure from animals that have been treated with antibiotics is often used to fertilize agricultural soils and its application has previously been shown to enrich for genes associated with antibiotic resistance in agroecosystems. To investigate the magnitude of this effect, we designed a column experiment simulating manure-treated agricultural soil that utilizes artificial subsurface drainage to determine the duration and extent to which this type of manure fertilization impacts the set of genes associated with antibiotic resistance in drainage water. We classified ARGs in manure-treated drainage effluent water by their source of origin. Overall, we found that 61% and 7% of the ...


The Gene Sculpt Suite: A Set Of Tools For Genome Editing, Carla M. Mann, Gabriel Martínez-Gálvez, Jordan M. Welker, Wesley A. Wierson, Hirotaka Ata, Maira P. Almeida, Karl J. Clark, Jeffrey J. Essner, Maura Mcgrail, Stephen C. Ekker, Drena Dobbs Jul 2019

Genetics, Development and Cell Biology Publications

The discovery and development of DNA-editing nucleases (Zinc Finger Nucleases, TALENs, CRISPR/Cas systems) has given scientists the ability to precisely engineer or edit genomes as never before. Several different platforms, protocols and vectors for precision genome editing are now available, leading to the development of supporting web-based software. Here we present the Gene Sculpt Suite (GSS), which comprises three tools: (i) GTagHD, which automatically designs and generates oligonucleotides for use with the GeneWeld knock-in protocol; (ii) MEDJED, a machine learning method, which predicts the extent to which a double-stranded DNA break site will utilize the microhomology-mediated repair pathway; and ...


Coupling Dynamics And Evolutionary Information With Structure To Identify Protein Regulatory And Functional Binding Sites, Sambit Kumar Mishra, Gaurav Kandoi, Robert L. Jernigan May 2019

Biochemistry, Biophysics and Molecular Biology Publications

Binding sites in proteins can be either specifically functional binding sites (active sites) that bind specific substrates with high affinity or regulatory binding sites (allosteric sites), that modulate the activity of functional binding sites through effector molecules. Owing to their significance in determining protein function, the identification of protein functional and regulatory binding sites is widely acknowledged as an important biological problem. In this work, we present a novel binding site prediction method, AR-Pred (Active and Regulatory site Prediction), which supplements protein geometry, evolutionary and physicochemical features with information about protein dynamics to predict putative active and allosteric site residues ...


Stress Response To CO2 Deprivation By Arabidopsis Thaliana In Plant Cultures, Souvik Banerjee, Oskar Siemianowski, Meiling Liu, Kara R. Lind, Xinchun Tian, Dan Nettleton, Ludovico Cademartiri Mar 2019

Statistics Publications

After being the standard plant propagation protocol for decades, cultures of Arabidopsis thaliana sealed with Parafilm remain common today out of practicality, habit, or necessity (as in co-cultures with microorganisms). Despite concerns over the aeration of these cultures, no investigation has explored the CO2 transport inside these cultures and its effect on the plants. As a result, it has been impossible to assess whether Parafilm seals used today or in thousands of older papers in the literature constitute a treatment, and whether this treatment could potentially affect the study of other treatments. For the first time we report the CO2 concentrations in Parafilm-sealed cultures ...


Transcriptional And Chemical Changes In Soybean Leaves In Response To Long-Term Aphid Colonization, Jessica D. Hohenstein, Matthew Studham, Adam Klein, Nik Kovinich, Kia Barry, Young-Jin Lee, Gustavo C. Macintosh Mar 2019

Chemistry Publications

Soybean aphids (Aphis glycines Matsumura) are specialized insects that feed on soybean (Glycine max) phloem sap. Transcriptome analyses have shown that resistant soybean plants mount a fast response that limits aphid feeding and population growth. Conversely, defense responses in susceptible plants are slower and it is hypothesized that aphids block effective defenses in the compatible interaction. Unlike other pests, aphids can colonize plants for long periods of time; yet the effect on the plant transcriptome after long-term aphid feeding has not been analyzed for any plant–aphid interaction. We analyzed the susceptible and resistant (Rag1) transcriptome response to aphid feeding ...


Algorithms For Synteny-Based Phylostratigraphy And Gene Origin Classification, Zebulun Arendsee Jan 2019

Graduate Theses and Dissertations

With every newly sequenced species we discover hundreds of novel protein coding genes. Many of these "orphan" genes have been experimentally proven to have dramatic functions in development, sexual dimorphism, pathogen resistance, and social traits like symbiosis. Whereas in the past, researchers viewed genes as the product of continuous variation acting on ancient material, we now know that novel genes may arise de novo from non-genic sequence. Thus evolutionary experimentation is not limited to tweaking existing genes or their regulatory patterns. Any orphan genes that arose in the distant past should appear today as lineage-specific genes (or gene families). The ...


Genetics And Transcriptomics Of Host Response To PRRS In Nursery Pigs, Qian Dong Jan 2019

Graduate Theses and Dissertations

The overall objective of this dissertation was to investigate the genetic and molecular mechanisms of host response to porcine reproductive and respiratory syndrome (PRRS) virus (PRRSV), and to identify biomarkers in pigs to improve host response to PRRS and reduce PRRSV persistence in pigs. Because pigs that are only infected with PRRSV rarely exist in the industry, host transcriptome responses to vaccination with a PRRS modified-live virus (MLV) and to co-infection with PRRS and porcine circovirus type 2b (PCV2b), with or without prior vaccination, were also investigated. The first study reported in this dissertation was designed to investigate mechanisms of PRRSV ...


Machine Learning Tools For mRNA Isoform Function Prediction, Gaurav Kandoi Jan 2019

Graduate Theses and Dissertations

This dissertation is focused on improving mRNA isoform characterization in terms of functional networks, function prediction and tissue-specificity. There are three major challenges in solving these problems. The first is the unavailability of mRNA isoform level functional data, which is required to develop machine learning tools. Moreover, the available data, even at the gene level, doesn’t include all genes, further complicating the matter. The second challenge is the lack of information about tissue-specificity in functional databases such as Gene Ontology, Kyoto Encyclopedia of Genes and Genomes and UniProt. The third challenge is the lack of mRNA isoform level “ground ...


Applications Of Machine Learning To Solve Biological Puzzles, Carla M. Mann Jan 2019

Graduate Theses and Dissertations

The era of “big data” has led to the generation of more biological data than any human could hope to process. This flood of data has necessitated the development of computational methods to assist in analysis, and has made it possible to begin to model complex biological systems. Machine learning methods represent one avenue for modeling, and allow for the identification of intricate and often cryptic sequence signals underlying many biological processes.

In this dissertation, I present two machine learning models, RPIDisorder and MEDJED, which were developed to predict RNA-protein interaction partners (RPIPs) and DNA double-strand break (DSB) repair by ...


Hierarchical Phylogeny Construction, Anindya Das Jan 2019

Graduate Theses and Dissertations

Construction of a phylogenetic tree for a number of species from their genome sequences is very important for understanding the evolutionary history of those species. Rapid improvements in DNA sequencing technology have generated sequence data for a huge number of similar isolates with a wide range of single nucleotide polymorphism (SNP) rates, where the SNP rate among some isolates can be thousands of times lower than among others. Genome sequences of this kind are difficult for the existing methods because the subtree(s) (or clade) consisting of species or isolates with very low SNP rates may have a very low level ...


The Evolution Of The Mitochondrial Proteome In Animals, Viraj Muthye Jan 2019

Graduate Theses and Dissertations

Mitochondria are subcellular organelles in eukaryotes which possess their own genome. While they are most well-known for their role in energy metabolism via oxidative phosphorylation, research has shown that mitochondria are involved in diverse critical cellular functions like Fe/S cluster biosynthesis, apoptosis, signaling, etc. In mammals, over 1,500 proteins carry out these functions in the mitochondria. A small portion of these proteins (~1%) is contributed by the mitochondrial genome, whereas the vast majority (~99%) are encoded in the nuclear genome and transported into the organelle. This set of nuclear-encoded mitochondrial proteins is defined as the "mitochondrial proteome". The ...


Distinct Teosinte Hybrid Zones And Genomic Architectures Of Hybridization, David Edward Hufnagel Jan 2019

Graduate Theses and Dissertations

Hybridization is a major force of evolution and has profound consequences due to increased heterozygosity and the creation of novel allele combinations. Hybrid zones form when allopatric taxa meet in secondary contact and hybridize. These allele combinations can be random, but sometimes alleles are retained nonrandomly leading to genomic architectures of hybridization. Genomic architectures of hybridization can be the product of natural selection, especially when present at significantly higher levels than expected by chance. Here we used three SNP data sets to identify hybrids in four genotypically distinct putative hybrid zones with unique environments. We identified genomic architectures of hybridization ...


Transfer Learning Towards Combating Antibiotic Resistance, Md Nafiz Hamid Jan 2019

Graduate Theses and Dissertations

Transfer learning with deep neural networks has revolutionized the fields of computer vision and natural language processing in the last decade. This is especially significant for fields such as biology where we usually have small labeled data but an abundance of unlabeled data. Using abundant unlabeled data to enhance performance on a small labeled dataset is the hallmark of transfer learning. In this dissertation, I tap into the potential of transfer learning to solve critical problems in the antibiotic resistance domain. Antibiotic resistance occurs when bacteria gain functionality to thwart mechanisms through which antibiotics work to kill or inhibit bacteria ...


Searching For The Origin Of Protein Conformational Changes: Protein Responses To Specific Forces In Simulations, Yuan Wang Jan 2019

Graduate Theses and Dissertations

It is widely accepted that the structure of a protein and its motions are critical for a protein’s function, and that protein functions are usually accompanied by highly specific conformational changes. However, in many cases it is still unclear how the details of motion relate to a protein’s functionality and especially what causes conformational changes, despite having a significant number of proteins with multiple experimentally determined conformations. Here we investigate the conformational changes in proteins by collecting ensembles of different conformations of the same protein structure and simulating the application of external forces originating from exothermic chemical reactions ...


Interrogating The Development Of Enteric Nervous System In Zebrafish Using Transcriptomics, Sweta Roy-Carson Jan 2019

Graduate Theses and Dissertations

The enteric nervous system (ENS) is the set of neurons that control the activity of the gastrointestinal system. These activities include secretion of digestive juices, absorption of food, and motility of the gut. The enteric neurons are derived from the neural crest cells (NCC) which migrate to the gut during development. We have only sparse knowledge of the genes and the signaling pathways that are involved in the migration, specification, and differentiation of the enteric neurons from neural crest precursors. Malfunction in any of these processes hampers normal ENS development and can result in a variety of ...


Domain-Specific Language And Infrastructure For Genomics, Hamid Bagheri Jan 2019

Graduate Theses and Dissertations

Creating a scalable computational infrastructure to analyze the wealth of information contained in data repositories is difficult due to significant barriers in organizing, extracting and analyzing relevant data. Shared data science infrastructures are needed to efficiently process and parse data contained in large data repositories. This thesis introduces Boag, Boa for genomics. The main features of Boa are inspired from existing languages for data-intensive computing and can easily integrate data from biological data repositories.

As a proof of concept, Boag has been implemented to analyze RefSeq's 153,848 annotation (GFF) and assembly (FASTA) file metadata. Boa provides a massive ...


In-Silico Guided Identification Of Ciliogenesis Candidate Genes In A Non-Conventional Animal Model, Natalia I. Acevedo Luna Jan 2019

Graduate Theses and Dissertations

The annelid Platynereis dumerilii is increasingly used as a model organism for developmental comparative studies due to its phylogenetic position and the accessibility of embryos that exhibit a stereotypic cleavage pattern and invariant cell lineages with predictable cell fates. To develop this unconventional model, we established PdumBase, a comprehensive database and intuitive online user interface based on stage-specific transcriptomic data that allows genome-wide identification of gene families contributing to particular biological processes during early developmental stages. One such important biological process is ciliogenesis, the formation of cilia, organelles associated with a variety of cellular roles such as ...


Arabidopsis Bioinformatics Resources: The Current State, Challenges, And Priorities For The Future, Colleen Doherty, Justin Walley, Eve Wurtele, Et Al. Jan 2019

Genetics, Development and Cell Biology Publications

Effective research, education, and outreach efforts by the Arabidopsis thaliana community, as well as other scientific communities that depend on Arabidopsis resources, depend vitally on easily available and publicly‐shared resources. These resources include reference genome sequence data and an ever‐increasing number of diverse data sets and data types. TAIR (The Arabidopsis Information Resource) and Araport (originally named the Arabidopsis Information Portal) are community informatics resources that provide tools, data, and applications to the more than 30,000 researchers worldwide who use either Arabidopsis as a primary system of study or data derived from Arabidopsis in their work. Four ...


Crowdsourcing Image Analysis For Plant Phenomics To Generate Ground Truth Data For Machine Learning, Naihui Zhou, Zachary D. Siegel, Scott Zarecor, Nigel Lee, Darwin A. Campbell, Carson M. Andorf, Dan Nettleton, Carolyn J. Lawrence-Dill, Baskar Ganapathysubramanian, Jonathan W. Kelly, Iddo Friedberg Jul 2018

Mechanical Engineering Publications

The accuracy of machine learning tasks critically depends on high quality ground truth data. Therefore, in many cases, producing good ground truth data involves trained professionals; however, this can be costly in time, effort, and money. Here we explore the use of crowdsourcing to generate a large amount of training data of good quality. We explore an image analysis task involving the segmentation of corn tassels from images taken in a field setting. We investigate the accuracy, speed and other quality metrics when this task is performed by students for academic credit, Amazon MTurk workers, and Master Amazon MTurk ...


Protein Dynamic Communities From Elastic Network Models Align Closely To The Communities Defined By Molecular Dynamics, Sambit Kumar Mishra, Robert L. Jernigan Jun 2018

Biochemistry, Biophysics and Molecular Biology Publications

Dynamic communities in proteins comprise the cohesive structural units that individually exhibit rigid body motions. These can correspond to structural domains, but are usually smaller parts that move with respect to one another in a protein’s internal motions, key to its functional dynamics. Previous studies emphasized their importance for understanding the nature of ligand-induced allosteric regulation. These studies reported that mutations to key community residues can hinder transmission of allosteric signals among the communities. Usually molecular dynamics (MD) simulations (~100 ns or longer) have been used to identify the communities, a demanding task for larger proteins. In the present ...


Response To Persistent ER Stress In Plants: A Multiphasic Process That Transitions Cells From Prosurvival Activities To Cell Death, Renu Srivastava, Zhaoxia Li, Giulia Russo, Jie Tang, Ran Bi, Usha Muppirala, Sivanandan Chudalayandi, Andrew J. Severin, Mingze He, Samuel I. Vaitkevicius, Carolyn J. Lawrence-Dill, Peng Liu, Ann E. Stapleton, Diane C. Bassham, Federica Brandizzi, Stephen H. Howell May 2018

Office of Biotechnology Publications

The unfolded protein response (UPR) is a highly conserved response that protects plants from adverse environmental conditions. The UPR is elicited by endoplasmic reticulum (ER) stress, in which unfolded and misfolded proteins accumulate within the ER. Here, we induced the UPR in maize (Zea mays) seedlings to characterize the molecular events that occur over time during persistent ER stress. We found that a multiphasic program of gene expression was interwoven among other cellular events, including the induction of autophagy. One of the earliest phases involved the degradation by regulated IRE1-dependent RNA degradation (RIDD) of RNA transcripts derived from a family ...


Comparisons Of Protein Dynamics From Experimental Structure Ensembles, Molecular Dynamics Ensembles, And Coarse-Grained Elastic Network Models, Kannan Sankar, Sambit K. Mishra, Robert L. Jernigan Jan 2018

Biochemistry, Biophysics and Molecular Biology Publications

Predicting protein motions is important for bridging the gap between protein structure and function. With growing numbers of structures of the same, or closely related proteins becoming available, it is now possible to understand more about the intrinsic dynamics of a protein with principal component analysis (PCA) of the motions apparent within ensembles of experimental structures. In this paper, we compare the motions extracted from experimental ensembles of 50 different proteins with the modes of motion predicted by several types of coarse-grained elastic network models (ENMs) which additionally take into account more details of either the protein geometry or the ...
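As a rough illustration of what PCA on a structural ensemble involves, the following is a minimal sketch, not the authors' implementation. It assumes the structures are already superimposed and share the same atoms in the same order; the function name `ensemble_pca` and the synthetic ensemble are hypothetical and exist only for this example.

```python
import numpy as np

def ensemble_pca(coords):
    """PCA of an ensemble of pre-superimposed structures.

    coords: array of shape (n_structures, n_atoms, 3).
    Returns (modes, explained): modes has one principal component per row
    (length 3 * n_atoms); explained holds the fraction of variance per mode.
    """
    n_structures = coords.shape[0]
    # Flatten each structure into a single 3N-dimensional vector.
    flat = coords.reshape(n_structures, -1)
    # Remove the mean structure so the PCs describe deviations from it.
    centered = flat - flat.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal components.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variances = s**2 / (n_structures - 1)
    explained = variances / variances.sum()
    return vt, explained

# Synthetic ensemble (assumption): one dominant motion plus small noise.
rng = np.random.default_rng(0)
n_structures, n_atoms = 20, 5
mean_structure = rng.normal(size=(n_atoms, 3))
direction = rng.normal(size=(n_atoms, 3))
direction /= np.linalg.norm(direction)
amplitudes = rng.normal(scale=2.0, size=n_structures)
noise = rng.normal(scale=0.05, size=(n_structures, n_atoms, 3))
ensemble = mean_structure + amplitudes[:, None, None] * direction + noise

modes, explained = ensemble_pca(ensemble)
# Overlap (absolute dot product) between the first PC and the planted motion.
overlap = abs(modes[0] @ direction.ravel())
```

The same kind of overlap can be computed between a principal component and a mode predicted by a model, which is one common way to quantify how well two descriptions of motion agree.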


Visualization Methods For Genealogical And RNA-Sequencing Studies: Pertinence, Software, And Applications, Lindsay Rutter Jan 2018

Graduate Theses and Dissertations

As is the case in many fields, biological disciplines are now facing the challenges of increasingly large and complex data. Biologists must now process and meaningfully interpret a deluge of data, and one necessary approach toward accomplishing this goal is through the use of visualization. Ultimately, the objective of developing visualization tools for biological data is to provide biologists with enhanced insight into the processes within organelles, cells, organs, and even whole organisms. R is a free interpretive programming language for statistical computing and graphics. It is widely used by statisticians to develop statistical software and data analysis tools, and ...


Shared Data Science Infrastructure For Genomics Data, Hamid Bagheri, Usha Muppirala, Andrew J. Severin, Hridesh Rajan Jan 2018

Office of Biotechnology Publications

Creating a computational infrastructure to analyze the wealth of information contained in data repositories that scales well is difficult due to significant barriers in organizing, extracting and analyzing relevant data. Shared Data Science Infrastructures like Boa can be used to more efficiently process and parse data contained in large data repositories. The main features of Boa are inspired from existing languages for data intensive computing and can easily integrate data from biological data repositories. Here, we present an implementation of Boa for Genomic research (BoaG) on a relatively small data repository: RefSeq's 97,716 annotation (GFF) and assembly (FASTA ...


Dissecting Complex Phenotypes Via Multiple Transcriptome-Based GWAS, Hung-Ying Lin Jan 2018

Graduate Theses and Dissertations

Genome-wide association studies (GWAS) have been widely used to detect QTLs based on linkage disequilibrium (LD) relationships between SNPs and QTLs. However, in conventional GWAS, false positive results cause serious concerns. In this research, we developed three different transcriptome-based GWAS approaches which are complementary to conventional SNP-based GWAS. The ability of these three methods to identify trait-associated genes is supported by cross-validation, transposon knockout mutants, and the analysis of gene regulatory networks. In summary, we provide novel methods of detecting trait-associated loci to further understand the complex gene regulatory systems which will benefit plants, animals, and ...


Experimental And Computational Methods To Assign Gene Function To Maize Genes, Wimalanathan Kokulapalan Jan 2018

Graduate Theses and Dissertations

Maize is an important crop species and is the highest produced cereal crop in the world as well as a model species for genetics and genomics research. For this reason, researchers have been very successful in translating understanding of basic biological processes into improved crops for over 100 years. Maize researchers have a long history of utilizing genetic techniques to dissect the function of genes that control biological processes. Characterizing and cloning mutants precisely defines gene function but is a slow process that can take years to accomplish. Alternatively, computational methods provide a faster way to assign predicted function to ...


Using Evolutionary Covariance To Infer Protein Sequence-Structure Relationships, Kejue Jia Jan 2018

Graduate Theses and Dissertations

During the last half century, a deep knowledge of the actions of proteins has emerged from a broad range of experimental and computational methods. This means that there are now many opportunities for understanding how the varieties of proteins affect larger scale behaviors of organisms, in terms of phenotypes and diseases. It is broadly acknowledged that sequence, structure and dynamics are the three essential components for understanding proteins. Learning about the relationships among protein sequence, structure and dynamics becomes one of the most important steps for understanding the mechanisms of proteins. Together with the rapid growth in the efficiency of ...