Comparative Phylogeographic, Population Genomic, And Selection Inference With Development Of Hierarchical Co-Demographic Models, 2017 The Graduate Center, City University of New York
Comparative Phylogeographic, Population Genomic, And Selection Inference With Development Of Hierarchical Co-Demographic Models, Alexander Xue
All Graduate Works by Year: Dissertations, Theses, and Capstone Projects
Comparing demographic histories across assemblages of populations, species, and sister pairs has been a focus in phylogeography since its inception. Initial approaches utilized organelle genetic data and involved qualitative comparisons of genetic patterns for evaluating hypotheses of shared evolutionary responses to past environmental changes. This endeavor has progressed with coalescent model-based statistical techniques and advances in next-generation sequencing, yet there remains a need for methods that can analyze aggregated genomic-scale data from non-model organisms within a unified framework that considers individual taxon uncertainty and variance. To this end, the aggregate site frequency spectrum (aSFS), an expansion of the site frequency ...
Regulatory RNA: Session Introduction, 2017 Iowa State University
Regulatory RNA: Session Introduction, Drena Dobbs, Steven E. Brenner, Vasant G. Honavar, Robert L. Jernigan, Alain Laederach, Quaid Morris
Advances in both experimental and computational approaches to genome-wide analysis of RNA transcripts have dramatically expanded our understanding of the ubiquitous and diverse roles of regulatory non-coding RNAs. This conference session includes presentations exploring computational approaches for detecting regulatory RNAs in RNA-Seq data, for analyzing in vivo CLIP data on RNA-protein interactions, and for predicting interfacial residues involved in RNA-protein recognition in RNA–protein complexes and interaction networks.
Elastic Network Models Capture The Motions Apparent Within Ensembles Of RNA Structures, 2017 Iowa State University
Elastic Network Models Capture The Motions Apparent Within Ensembles Of RNA Structures, Michael T. Zimmermann, Robert L. Jernigan
The role of structure and dynamics in mechanisms for RNA becomes increasingly important. Computational approaches using simple dynamics models have been successful at predicting the motions of proteins and are often applied to ribonucleo-protein complexes but have not been thoroughly tested for well-packed nucleic acid structures. In order to characterize a true set of motions, we investigate the apparent motions from 16 ensembles of experimentally determined RNA structures. These indicate a relatively limited set of motions that are captured by a small set of principal components (PCs). These limited motions closely resemble the motions computed from low frequency normal modes ...
Prediction Of RNA Binding Sites In Proteins From Amino Acid Sequence, 2017 Iowa State University
Prediction Of RNA Binding Sites In Proteins From Amino Acid Sequence, Michael Terribilini, Jae-Hyung Lee, Changhui Yan, Robert L. Jernigan, Vasant Honavar, Drena Dobbs
RNA–protein interactions are vitally important in a wide range of biological processes, including regulation of gene expression, protein synthesis, and replication and assembly of many viruses. We have developed a computational tool for predicting which amino acids of an RNA binding protein participate in RNA–protein interactions, using only the protein sequence as input. RNABindR was developed using machine learning on a validated nonredundant data set of interfaces from known RNA–protein complexes in the Protein Data Bank. It generates a classifier that captures primary sequence signals sufficient for predicting which amino acids in a given protein are located ...
Predicting DNA-Binding Sites Of Proteins From Amino Acid Sequence, 2017 Utah State University
Predicting DNA-Binding Sites Of Proteins From Amino Acid Sequence, Changhui Yan, Michael Terribilini, Feihong Wu, Robert L. Jernigan, Drena Dobbs, Vasant Honavar
Understanding the molecular details of protein-DNA interactions is critical for deciphering the mechanisms of gene regulation. We present a machine learning approach for the identification of amino acid residues involved in protein-DNA interactions.
We start with a Naïve Bayes classifier trained to predict whether a given amino acid residue is a DNA-binding residue based on its identity and the identities of its sequence neighbors. The input to the classifier consists of the identities of the target residue and 4 sequence neighbors on each side of the target residue. The classifier is trained and evaluated (using leave-one-out cross-validation) on ...
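The windowed classifier described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the padding symbol, the add-one smoothing, the `'1'`/`'0'` label encoding, and the names `windows` and `WindowNaiveBayes` are all assumptions introduced here, and the leave-one-out evaluation is omitted.

```python
import math
from collections import Counter, defaultdict

WINDOW = 4   # neighbors on each side of the target residue, as in the abstract
PAD = "-"    # padding symbol for sequence ends (an assumption)
ALPHABET = "ACDEFGHIKLMNPQRSTVWY" + PAD


def windows(seq, k=WINDOW):
    """Return one window of length 2k+1 per residue, padded at both ends."""
    padded = PAD * k + seq + PAD * k
    return [padded[i:i + 2 * k + 1] for i in range(len(seq))]


class WindowNaiveBayes:
    """Categorical Naive Bayes over window positions (hypothetical helper)."""

    def fit(self, seqs, labels):
        # labels: one string per sequence, '1' = binding, '0' = non-binding
        self.class_counts = Counter()
        self.pos_counts = defaultdict(Counter)  # (class, position) -> residue counts
        for seq, lab in zip(seqs, labels):
            for win, y in zip(windows(seq), lab):
                self.class_counts[y] += 1
                for pos, aa in enumerate(win):
                    self.pos_counts[(y, pos)][aa] += 1
        return self

    def log_posterior(self, win, y):
        # log prior plus the sum of per-position log likelihoods (add-one smoothing)
        total = sum(self.class_counts.values())
        score = math.log(self.class_counts[y] / total)
        for pos, aa in enumerate(win):
            count = self.pos_counts[(y, pos)][aa]
            score += math.log((count + 1) / (self.class_counts[y] + len(ALPHABET)))
        return score

    def predict(self, seq):
        classes = sorted(self.class_counts)
        return "".join(
            max(classes, key=lambda y: self.log_posterior(win, y))
            for win in windows(seq)
        )
```

Each residue is scored independently from its own nine-residue window, so a prediction for a new protein needs only its sequence string.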
Identifying Interaction Sites In "Recalcitrant" Proteins: Predicted Protein And RNA Binding Sites In Rev Proteins Of HIV-1 And EIAV Agree With Experimental Data, Michael Terribilini, Jae-Hyung Lee, Changhui Yan, Robert L. Jernigan, Susan Carpenter, Vasant Honavar, Drena Dobbs
Protein-protein and protein-nucleic acid interactions are vitally important for a wide range of biological processes, including regulation of gene expression, protein synthesis, and replication and assembly of many viruses. We have developed machine learning approaches for predicting which amino acids of a protein participate in its interactions with other proteins and/or nucleic acids, using only the protein sequence as input. In this paper, we describe an application of classifiers trained on datasets of well-characterized protein-protein and protein-RNA complexes for which experimental structures are available. We apply these classifiers to the problem of predicting protein and RNA binding sites ...
RNABindR: A Server For Analyzing And Predicting RNA-Binding Sites In Proteins, 2017 Iowa State University
RNABindR: A Server For Analyzing And Predicting RNA-Binding Sites In Proteins, Michael Terribilini, Jeffry D. Sander, Jae-Hyung Lee, Peter Zaback, Robert L. Jernigan, Vasant Honavar, Drena Dobbs
Understanding interactions between proteins and RNA is key to deciphering the mechanisms of many important biological processes. Here we describe RNABindR, a web-based server that identifies and displays RNA-binding residues in known protein–RNA complexes and predicts RNA-binding residues in proteins of unknown structure. RNABindR uses a distance cutoff to identify which amino acids contact RNA in solved complex structures (from the Protein Data Bank) and provides a labeled amino acid sequence and a Jmol graphical viewer in which RNA-binding residues are displayed in the context of the three-dimensional structure. Alternatively, RNABindR can use a Naive Bayes classifier trained on ...
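The distance-cutoff labeling mentioned in this abstract might look something like the sketch below. It is an illustration under stated assumptions, not RNABindR's code: the 5.0 Å default cutoff, the input layout, the `+`/`-` label characters, and the function names `binding_residues` and `label_sequence` are all hypothetical.

```python
import math


def binding_residues(protein_atoms, rna_atoms, cutoff=5.0):
    """Return ids of residues with any atom within `cutoff` of any RNA atom.

    protein_atoms: iterable of (residue_id, (x, y, z))
    rna_atoms:     iterable of (x, y, z)
    cutoff:        distance threshold in angstroms (assumed value)
    """
    rna_atoms = list(rna_atoms)
    hits = set()
    for res_id, xyz in protein_atoms:
        if res_id in hits:
            continue  # residue already marked by one of its other atoms
        if any(math.dist(xyz, r) <= cutoff for r in rna_atoms):
            hits.add(res_id)
    return hits


def label_sequence(residue_ids, hits):
    """Mark each residue '+' (contacts RNA) or '-' (does not)."""
    return "".join("+" if r in hits else "-" for r in residue_ids)
```

This brute-force version compares every protein atom with every RNA atom; a spatial index would be the usual choice for large complexes.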
Predicting Binding Sites Of Hydrolase-Inhibitor Complexes By Combining Several Methods, 2017 Iowa State University
Predicting Binding Sites Of Hydrolase-Inhibitor Complexes By Combining Several Methods, Taner Z. Sen, Andrzej Kloczkowski, Robert L. Jernigan, Changhui Yan, Vasant Honavar, Kai-Ming Ho, Cai-Zhuang Wang, Yungok Ihm, Haibo Cao, Xun Gu, Drena Dobbs
Protein-protein interactions play a critical role in protein function. Completion of many genomes is being followed rapidly by major efforts to identify interacting protein pairs experimentally in order to decipher the networks of interacting, coordinated-in-action proteins. Identification of protein-protein interaction sites and detection of specific amino acids that contribute to the specificity and the strength of protein interactions is an important problem with broad applications ranging from rational drug design to the analysis of metabolic and signal transduction networks.
In order to increase the power of predictive methods for protein-protein interaction sites, we have developed a consensus methodology ...
Characterization Of Protein–Protein Interfaces, 2017 Utah State University
Characterization Of Protein–Protein Interfaces, Changhui Yan, Feihong Wu, Robert L. Jernigan, Drena Dobbs, Vasant Honavar
We analyze the characteristics of protein–protein interfaces using the largest datasets available from the Protein Data Bank (PDB). We start with a comparison of interfaces with protein cores and noninterface surfaces. The results show that interfaces differ from protein cores and non-interface surfaces in residue composition, sequence entropy, and secondary structure. Since interfaces, protein cores, and non-interface surfaces have different solvent accessibilities, it is important to investigate whether the observed differences are due to the differences in solvent accessibility or differences in functionality. We separate out the effect of solvent accessibility by comparing interfaces with a set of residues ...
Machine Learning Based Protein Sequence To (Un)Structure Mapping And Interaction Prediction, 2017 University of New Orleans, New Orleans
Machine Learning Based Protein Sequence To (Un)Structure Mapping And Interaction Prediction, Sumaiya Iqbal
University of New Orleans Theses and Dissertations
Proteins are the fundamental macromolecules within a cell that carry out most of the biological functions. The computational study of protein structure and its functions, using machine learning and data analytics, is elemental in advancing life-science research due to the fast-growing biological data and the extensive complexities involved in their analyses towards discovering meaningful insights. Mapping of a protein’s primary sequence is not limited to its structure; we extend it to the protein's disordered component, known as Intrinsically Disordered Proteins or Regions in proteins (IDPs/IDRs), and hence the involved dynamics, which help us explain complex interaction within a ...
Transcriptomic Differentiation Underlying Marine‐To‐Freshwater Transitions In The South American Silversides Odontesthes Argentinensis And O. Bonariensis (Atheriniformes), 2017 George Washington University
Transcriptomic Differentiation Underlying Marine‐To‐Freshwater Transitions In The South American Silversides Odontesthes Argentinensis And O. Bonariensis (Atheriniformes), Lily Hughes, Gustavo Somoza, Bryan Nguyen, James Bernot, Mariano Gonzalez-Castro, Juan Martin Diaz De Astarloa, Guillermo Orti
Computational Biology Institute
Salinity gradients are critical habitat determinants for freshwater organisms. Silverside fishes in the genus Odontesthes have recently and repeatedly transitioned from marine to freshwater habitats, overcoming a strong ecological barrier. Genomic and transcriptomic changes involved in this kind of transition are only known for a few model species. We present new data and analyses of gene expression and microbiome composition in the gills of two closely related silverside species, marine O. argentinensis and freshwater O. bonariensis and find more than three thousand transcripts differentially expressed, with osmoregulatory/ion transport genes and immune genes showing very different expression patterns across species ...
Strand-Specific Libraries For High Throughput RNA Sequencing (RNA-Seq) Prepared Without Poly(A) Selection, 2017 University of Massachusetts Medical School
Strand-Specific Libraries For High Throughput RNA Sequencing (RNA-Seq) Prepared Without Poly(A) Selection, Zhao Zhang, William E. Theurkauf, Zhiping Weng, Phillip D. Zamore
BACKGROUND: High throughput DNA sequencing technology has enabled quantification of all the RNAs in a cell or tissue, a method widely known as RNA sequencing (RNA-Seq). However, non-coding RNAs such as rRNA are highly abundant and can consume >70% of sequencing reads. A common approach is to extract only polyadenylated mRNA; however, such approaches are blind to RNAs with short or no poly(A) tails, leading to an incomplete view of the transcriptome. Another challenge of preparing RNA-Seq libraries is to preserve the strand information of the RNAs. DESIGN: Here, we describe a procedure for preparing RNA-Seq libraries from 1 ...
Engineering And Verifying Requirements For Programmable Self-Assembling Nanomachines, 2017 Iowa State University
Engineering And Verifying Requirements For Programmable Self-Assembling Nanomachines, Robyn Lutz, Jack Lutz, James Lathrop, Titus Klinge, Eric Henderson, Divita Mathur, Dalia Abo Sheasha
We propose an extension of van Lamsweerde's goal-oriented requirements engineering to the domain of programmable DNA nanotechnology. This is a domain in which individual devices (agents) are at most a few dozen nanometers in diameter. These devices are programmed to assemble themselves from molecular components and perform their assigned tasks. The devices carry out their tasks in the probabilistic world of chemical kinetics, so they are individually error-prone. However, the number of devices deployed is roughly on the order of a nanomole (a 6 followed by fourteen 0s), and some goals are achieved when enough of these agents achieve ...
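The "6 followed by fourteen 0s" figure in this abstract follows from Avogadro's number, as the short check below shows. The variable names are illustrative and not from the paper.

```python
AVOGADRO = 6.02214076e23  # entities per mole (exact SI value)
NANO = 1e-9               # one nanomole expressed in moles

# Number of devices in one nanomole: about 6.02e14
devices_per_nanomole = AVOGADRO * NANO

# A count of this size has 15 digits: a leading 6 followed by fourteen more
digit_count = len(str(int(devices_per_nanomole)))
```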
Automated Requirements Analysis For A Molecular Watchdog Timer, 2017 Iowa State University
Automated Requirements Analysis For A Molecular Watchdog Timer, Samuel J. Ellis, Eric R. Henderson, Titus H. Klinge, James I. Lathrop, Jack H. Lutz, Robyn R. Lutz, Divita Mathur, Andrew S. Miner
Dynamic systems in DNA nanotechnology are often programmed using a chemical reaction network (CRN) model as an intermediate level of abstraction. In this paper, we design and analyze a CRN model of a watchdog timer, a device commonly used to monitor the health of a safety-critical system. Our process uses incremental design practices with goal-oriented requirements engineering, software verification tools, and custom software to help automate the software engineering process. The watchdog timer comprises three components: an absence detector, a threshold filter, and a signal amplifier. These components are separately designed and verified, and only then composed ...
Requirements Analysis For A Product Family Of DNA Nanodevices, 2017 Iowa State University
Requirements Analysis For A Product Family Of DNA Nanodevices, Robyn R. Lutz, Jack H. Lutz, James I. Lathrop, Titus H. Klinge, Divita Mathur, D. M. Stull, Taylor G. Bergquist, Eric R. Henderson
DNA nanotechnology uses the information processing capabilities of nucleic acids to design self-assembling, programmable structures and devices at the nanoscale. Devices developed to date have been programmed to implement logic circuits and neural networks, capture or release specific molecules, and traverse molecular tracks and mazes. Here we investigate the use of requirements engineering methods to make DNA nanotechnology more productive, predictable, and safe. We use goal-oriented requirements modeling to identify, specify, and analyze a product family of DNA nanodevices, and we use PRISM model checking to verify both common properties across the family and properties that are specific to individual ...
CRISPR/Cas9-Mediated Genome Editing Induces Exon Skipping By Alternative Splicing Or Exon Deletion, 2017 University of Massachusetts Medical School
CRISPR/Cas9-Mediated Genome Editing Induces Exon Skipping By Alternative Splicing Or Exon Deletion, Haiwei Mou, Jordan L. Smith, Lingtao Peng, Jill Moore, Xiao-Ou Zhang, Chun-Qing Song, Ankur Sheel, Deniz M. Ozata, Yingxiang Li, Charles P. Emerson Jr., Erik J. Sontheimer, Melissa J. Moore, Zhiping Weng, Wen Xue
Program in Bioinformatics and Integrative Biology Publications and Presentations
CRISPR is widely used to disrupt gene function by inducing small insertions and deletions. Here, we show that some single-guide RNAs (sgRNAs) can induce exon skipping or large genomic deletions that delete exons. For example, CRISPR-mediated editing of beta-catenin exon 3, which encodes an autoinhibitory domain, induces partial skipping of the in-frame exon and nuclear accumulation of beta-catenin. A single sgRNA can induce small insertions or deletions that partially alter splicing or unexpected larger deletions that remove exons. Exon skipping adds to the unexpected outcomes that must be accounted for, and perhaps taken advantage of, in CRISPR experiments.
RNAseq Analysis Of The Drosophila Response To The Entomopathogenic Nematode Steinernema., 2017 George Washington University
RNAseq Analysis Of The Drosophila Response To The Entomopathogenic Nematode Steinernema., Shruti Yadav, Sean Daugherty, Amol Carl Shetty, Ioannis Eleftherianos
Computational Biology Institute
Drosophila melanogaster is an outstanding model to study the molecular and functional basis of host-pathogen interactions. Currently, microbial infections in D. melanogaster are well understood; however, our understanding of the response of flies to nematode infections is still in its infancy. Here, we have used the potent parasitic nematode Steinernema carpocapsae, which lives in mutualism with its endosymbiotic bacteria Xenorhabdus nematophila, to examine the transcriptomic basis of the interaction between D. melanogaster and entomopathogenic nematodes. We have employed next-generation RNA sequencing (RNAseq) to investigate the transcriptomic profile of D. melanogaster larvae in response to infection by S. carpocapsae symbiotic (carrying ...
Surveillance For Sulfadoxine-Pyrimethamine Resistant Malaria Parasites In The Lake And Southern Zones, Tanzania, Using Pooling And Next-Generation Sequencing, Jeremiah M. Ngondi, Nicholas J. Hathaway, Jeffrey A. Bailey, Julie Gutman
Program in Bioinformatics and Integrative Biology Publications and Presentations
BACKGROUND: Malaria in pregnancy (MiP) remains a major public health challenge in areas of high malaria transmission. Intermittent preventive treatment in pregnancy (IPTp) with sulfadoxine-pyrimethamine (SP) is recommended to prevent the adverse consequences of MiP. The effectiveness of SP for IPTp may be reduced in areas where the dhps581 mutation (a key marker of high level SP resistance) is found; this mutation was previously reported to be common in the Tanga Region of northern Tanzania, but there are limited data from other areas. The frequency of molecular markers of SP resistance was investigated in malaria parasites from febrile patients at ...
NCBI-BLAST Programs Optimization On XSEDE Resources For Sustainable Aquaculture, 2017 Iowa State University
NCBI-BLAST Programs Optimization On XSEDE Resources For Sustainable Aquaculture, Arun S. Seetharam, Antonio Gomez, Catherine M. Purcell, John R. Hyde, Philip D. Blood, Andrew J. Severin
The development of genomic resources of non-model organisms is now becoming commonplace as the cost of sequencing continues to decrease. The Genome Informatics Facility in collaboration with the Southwest Fisheries Science Center (SWFSC), NOAA is creating these resources for sustainable aquaculture in Seriola lalandi. Gene prediction and annotation, common steps in the pipeline to generate genomic resources, are computationally intensive and time-consuming. In our steps to create genomic resources for Seriola lalandi, we found BLAST to be one of our most rate-limiting steps. Therefore, we took advantage of our XSEDE Extended Collaborative Support Services (ECSS) to ...
CRISPR-Cas9 Nuclear Dynamics And Target Recognition In Living Cells, 2017 University of Massachusetts Medical School
CRISPR-Cas9 Nuclear Dynamics And Target Recognition In Living Cells, Hanhui Ma, Li-Chun Tu, Ardalan Naseri, Maximiliaan Huisman, Shaojie Zhang, David Grünwald, Thoru Pederson
The bacterial CRISPR-Cas9 system has been repurposed for genome engineering, transcription modulation, and chromosome imaging in eukaryotic cells. However, the nuclear dynamics of clustered regularly interspaced short palindromic repeats (CRISPR)-associated protein 9 (Cas9) guide RNAs and target interrogation are not well defined in living cells. Here, we deployed a dual-color CRISPR system to directly measure the stability of both Cas9 and guide RNA. We found that Cas9 is essential for guide RNA stability and that the nuclear Cas9-guide RNA complex levels limit the targeting efficiency. Fluorescence recovery after photobleaching measurements revealed that single mismatches in the guide RNA seed ...