Open Access. Powered by Scholars. Published by Universities.®

Genetics and Genomics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 18 of 18

Full-Text Articles in Genetics and Genomics

A Polyglot Approach To Bioinformatics Data Integration: A Phylogenetic Analysis Of Hiv-1, Steven Reisman, Thomas Hatzopoulos, Konstantin Laufer, George K. Thiruvathukal, Catherine Putonti Oct 2017

A Polyglot Approach To Bioinformatics Data Integration: A Phylogenetic Analysis Of Hiv-1, Steven Reisman, Thomas Hatzopoulos, Konstantin Laufer, George K. Thiruvathukal, Catherine Putonti

Konstantin Läufer

As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 …


A Polyglot Approach To Bioinformatics Data Integration: Phylogenetic Analysis Of Hiv-1, Steven Reisman, Catherine Putonti, George K. Thiruvathukal, Konstantin Läufer Oct 2017

A Polyglot Approach To Bioinformatics Data Integration: Phylogenetic Analysis Of Hiv-1, Steven Reisman, Catherine Putonti, George K. Thiruvathukal, Konstantin Läufer

Konstantin Läufer

RNA-interference has potential therapeutic use against HIV-1 by targeting highly-functional mRNA sequences that contribute to the virulence of the virus. Empirical work has shown that within cell lines, all of the HIV-1 genes are affected by RNAi-induced gene silencing. While promising, inherent in this treatment is the fact that RNAi sequences must be highly specific. HIV, however, mutates rapidly, leading to the evolution of viral escape mutants. In fact, such strains are under strong selection to include mutations within the targeted region, evading the RNAi therapy and thus increasing the virus’ fitness in the host. Taking a phylogenetic approach, we …


Phagephisher: A Pipeline For The Discovery Of Covert Viral Sequences In Complex Genomic Datasets, Thomas Hatzopoulos, Siobhan C. Watkins, Catherine Putonti Sep 2017

Phagephisher: A Pipeline For The Discovery Of Covert Viral Sequences In Complex Genomic Datasets, Thomas Hatzopoulos, Siobhan C. Watkins, Catherine Putonti

Catherine Putonti

Obtaining meaningful viral information from large sequencing datasets presents unique challenges distinct from prokaryotic and eukaryotic sequencing efforts. The difficulties surrounding this issue can be ascribed in part to the genomic plasticity of viruses themselves as well as the scarcity of existing information in genomic databases. The open-source software PhagePhisher (http://www.putonti-lab.com/phagephisher) has been designed as a simple pipeline to extract relevant information from complex and mixed datasets, and will improve the examination of bacteriophages, viruses, and virally related sequences, in a range of environments. Key aspects of the software include speed and ease of use; PhagePhisher can be used with …


Hash-Map-Eradicator: Filtering Non-Target Sequences From Next Generation Sequencing Reads, Jonathon Brenner, Catherine Putonti Sep 2017

Hash-Map-Eradicator: Filtering Non-Target Sequences From Next Generation Sequencing Reads, Jonathon Brenner, Catherine Putonti

Catherine Putonti

Contemporary DNA sequencing technologies are continuously increasing throughput at ever decreasing costs. Moreover, due to recent advances in sequencing technology new platforms are emerging. As such computational challenges persist. The average read length possible has taken a giant leap forward with the PacBio and Nanopore solutions. Regardless of the platform used, impurities within the DNA preparation of the sample - be it from unintentional contaminants or pervasive symbiots - remains an issue. We have developed a new tool, HAsh-MaP-ERadicator (HAMPER), for the detection and removal of non-target, contaminating DNA sequences. Integrating hash-based and mapping-based strategies, HAMPER is both memory and …


A Polyglot Approach To Bioinformatics Data Integration: A Phylogenetic Analysis Of Hiv-1, Steven Reisman, Thomas Hatzopoulos, Konstantin Laufer, George K. Thiruvathukal, Catherine Putonti Sep 2017

A Polyglot Approach To Bioinformatics Data Integration: A Phylogenetic Analysis Of Hiv-1, Steven Reisman, Thomas Hatzopoulos, Konstantin Laufer, George K. Thiruvathukal, Catherine Putonti

Catherine Putonti

As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 …


Clusters Of Alpha Satellite On Human Chromosome 21 Are Dispersed Far Onto The Short Arm And Lack Ancient Layers, William Ziccardi, Chongjian Zhao, Valery Shepelev, Lev Uralsky, Ivan Alexandrov, Tatyana Andreeva, Evgeny Rogaev, Christopher Bun, Emily Miller, Catherine Putonti, Jeffrey Doering Sep 2017

Clusters Of Alpha Satellite On Human Chromosome 21 Are Dispersed Far Onto The Short Arm And Lack Ancient Layers, William Ziccardi, Chongjian Zhao, Valery Shepelev, Lev Uralsky, Ivan Alexandrov, Tatyana Andreeva, Evgeny Rogaev, Christopher Bun, Emily Miller, Catherine Putonti, Jeffrey Doering

Catherine Putonti

Human alpha satellite (AS) sequence domains that currently function as centromeres are typically flanked by layers of evolutionarily older AS that presumably represent the remnants of earlier primate centromeres. Studies on several human chromosomes reveal that these older AS arrays are arranged in an age gradient, with the oldest arrays farthest from the functional centromere and arrays progressively closer to the centromere being progressively younger. The organization of AS on human chromosome 21 (HC21) has not been well-characterized. We have used newly available HC21 sequence data and an HC21p YAC map to determine the size, organization, and location of the …


Finding Function In The Unknown, Kelly Boyd, Emma Highland, Amanda Misch, Amber Hu, Sushma Reddy, Catherine Putonti Sep 2017

Finding Function In The Unknown, Kelly Boyd, Emma Highland, Amanda Misch, Amber Hu, Sushma Reddy, Catherine Putonti

Catherine Putonti

Through high-throughput RNA sequencing (RNAseq), transcriptomes for a single cell, tissue, or organism(s) can be ascertained at a high resolution. While a number of bioinformatic tools have been developed for transcriptome analyses, significant challenges exist for studies of non-model organisms. Without a reference sequence available, raw reads must first be assembled de novo followed by the tedious task of BLAST searches and data mining for functional information. We have created a pipeline, PyRanger, to automate this process. The pipeline includes functionality to assess a single transcriptome and also facilitate comparative transcriptomic studies.


Genomes Of Gardnerella Strains Reveal An Abundance Of Prophages Within The Bladder Microbiome, Kema Malki, Jason W. Shapiro, Travis Kyle Price, Evann Elizabeth Hilt, Krystal Thomas-White, Trina Sircar, Amy B. Rosenfeld, Michael J. Zilliox, Alan J. Wolfe, Catherine Putonti Sep 2017

Genomes Of Gardnerella Strains Reveal An Abundance Of Prophages Within The Bladder Microbiome, Kema Malki, Jason W. Shapiro, Travis Kyle Price, Evann Elizabeth Hilt, Krystal Thomas-White, Trina Sircar, Amy B. Rosenfeld, Michael J. Zilliox, Alan J. Wolfe, Catherine Putonti

Catherine Putonti

Bacterial surveys of the vaginal and bladder human microbiota have revealed an abundance of many similar bacterial taxa. As the bladder was once thought to be sterile, the complex interactions between microbes within the bladder have yet to be characterized. To initiate this process, we have begun sequencing isolates, including the clinically relevant genus Gardnerella. Herein, we present the genomic sequences of four Gardnerella strains isolated from the bladders of women with symptoms of urgency urinary incontinence; these are the first Gardnerella genomes produced from this niche. Congruent to genomic characterization of Gardnerella isolates from the reproductive tract, isolates from …


Bacteriophages Isolated From Lake Michigan Demonstrate Broad Host-Range Across Several Bacterial Phyla, Kema Malki, Alex Kula, Katherine Bruder, Emily Sible, Thomas Hatzopoulos, Stephanie Steidel, Siobhan C. Watkins, Catherine Putonti Sep 2017

Bacteriophages Isolated From Lake Michigan Demonstrate Broad Host-Range Across Several Bacterial Phyla, Kema Malki, Alex Kula, Katherine Bruder, Emily Sible, Thomas Hatzopoulos, Stephanie Steidel, Siobhan C. Watkins, Catherine Putonti

Catherine Putonti

BACKGROUND:

The study of bacteriophages continues to generate key information about microbial interactions in the environment. Many phenotypic characteristics of bacteriophages cannot be examined by sequencing alone, further highlighting the necessity for isolation and examination of phages from environmental samples. While much of our current knowledge base has been generated by the study of marine phages, freshwater viruses are understudied in comparison. Our group has previously conducted metagenomics-based studies samples collected from Lake Michigan - the data presented in this study relate to four phages that were extracted from the same samples.

FINDINGS:

Four phages were extracted from Lake Michigan …


A Polyglot Approach To Bioinformatics Data Integration: Phylogenetic Analysis Of Hiv-1, Steven Reisman, Catherine Putonti, George K. Thiruvathukal, Konstantin Läufer Sep 2017

A Polyglot Approach To Bioinformatics Data Integration: Phylogenetic Analysis Of Hiv-1, Steven Reisman, Catherine Putonti, George K. Thiruvathukal, Konstantin Läufer

Catherine Putonti

RNA-interference has potential therapeutic use against HIV-1 by targeting highly-functional mRNA sequences that contribute to the virulence of the virus. Empirical work has shown that within cell lines, all of the HIV-1 genes are affected by RNAi-induced gene silencing. While promising, inherent in this treatment is the fact that RNAi sequences must be highly specific. HIV, however, mutates rapidly, leading to the evolution of viral escape mutants. In fact, such strains are under strong selection to include mutations within the targeted region, evading the RNAi therapy and thus increasing the virus’ fitness in the host. Taking a phylogenetic approach, we …


Uploading Data To The Ncbi Sra Database, Ray A. Enke Jun 2017

Uploading Data To The Ncbi Sra Database, Ray A. Enke

Ray Enke Ph.D.

This in class exercise focuses on uploading FASTQ files sequencing data to the NCBI SRA database


Strand-Specific Libraries For High Throughput Rna Sequencing (Rna-Seq) Prepared Without Poly(A) Selection, Zhao Zhang, William E. Theurkauf, Zhiping Weng, Phillip D. Zamore Jun 2017

Strand-Specific Libraries For High Throughput Rna Sequencing (Rna-Seq) Prepared Without Poly(A) Selection, Zhao Zhang, William E. Theurkauf, Zhiping Weng, Phillip D. Zamore

Zhao Zhang

BACKGROUND: High throughput DNA sequencing technology has enabled quantification of all the RNAs in a cell or tissue, a method widely known as RNA sequencing (RNA-Seq). However, non-coding RNAs such as rRNA are highly abundant and can consume >70% of sequencing reads. A common approach is to extract only polyadenylated mRNA; however, such approaches are blind to RNAs with short or no poly(A) tails, leading to an incomplete view of the transcriptome. Another challenge of preparing RNA-Seq libraries is to preserve the strand information of the RNAs. DESIGN: Here, we describe a procedure for preparing RNA-Seq libraries from 1 to …


Using Phylogenetically-Informed Annotation (Pia) To Search For Light-Interacting Genes In Transcriptomes From Non-Model Organisms, Daniel I. Speiser, M. Sabrina Pankey, Alexander K. Zaharoff, Barbara A. Battelle, Heather D. Bracken-Grissom, Jesse W. Breinholt, Seth M. Bybee, Thomas W. Cronin, Anders Garm, Annie R. Lindgren, Nipam H. Patel, Megan L. Porter, Meredith E. Protas, Anja S. Rivera, Jeanne M. Serb, Kirk S. Zigler, Keith A. Crandall, Todd H. Oakley Jan 2017

Using Phylogenetically-Informed Annotation (Pia) To Search For Light-Interacting Genes In Transcriptomes From Non-Model Organisms, Daniel I. Speiser, M. Sabrina Pankey, Alexander K. Zaharoff, Barbara A. Battelle, Heather D. Bracken-Grissom, Jesse W. Breinholt, Seth M. Bybee, Thomas W. Cronin, Anders Garm, Annie R. Lindgren, Nipam H. Patel, Megan L. Porter, Meredith E. Protas, Anja S. Rivera, Jeanne M. Serb, Kirk S. Zigler, Keith A. Crandall, Todd H. Oakley

Meredith Protas

Background: Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because of the need to re-calculate trees for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to …


A Polyglot Approach To Bioinformatics Data Integration: A Phylogenetic Analysis Of Hiv-1, Steven Reisman, Thomas Hatzopoulos, Konstantin Laufer, George K. Thiruvathukal, Catherine Putonti Jan 2017

A Polyglot Approach To Bioinformatics Data Integration: A Phylogenetic Analysis Of Hiv-1, Steven Reisman, Thomas Hatzopoulos, Konstantin Laufer, George K. Thiruvathukal, Catherine Putonti

George K. Thiruvathukal

As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 …


Statistical Contributions To Bioinformatics: Design, Modeling, Structure Learning, And Integration, Jeffrey S. Morris, Veera Baladandayuthapani Dec 2016

Statistical Contributions To Bioinformatics: Design, Modeling, Structure Learning, And Integration, Jeffrey S. Morris, Veera Baladandayuthapani

Jeffrey S. Morris

The advent of high-throughput multi-platform genomics technologies providing whole-genome molecular summaries of biological samples has revolutionalized biomedical research. These technologies yield highly structured big data, whose analysis poses significant quantitative challenges. The field of Bioinformatics has emerged to deal with these challenges, and is comprised of many quantitative and biological scientists working together to eectively process these data and extract the treasure trove of information they contain. Statisticians, with their deep understanding of variability and uncertainty quantification, play a key role in these efforts. In this article, we attempt to summarize some of the key contributions of statisticians to bioinformatics, …


Genomics Rna-Seq Analysis Part 1-Cyverse De, Fastqc, Trimmomatic (Updated 11/17), Ray A. Enke, Melika Rahmani-Mofrad Dec 2016

Genomics Rna-Seq Analysis Part 1-Cyverse De, Fastqc, Trimmomatic (Updated 11/17), Ray A. Enke, Melika Rahmani-Mofrad

Ray Enke Ph.D.

This in class exercise is designed to teach students about the basic features of CyVerse Discovery Environment (DE) as well as how to run FastQC and Trimmomatic steps of a general eukaryotic RNA-seq analysis pipeline using CyVerse DE apps.


Genomics Rna-Seq Analysis Part 2_ Kallisto Indexing And Quantification (Updated 11/17), Ray A. Enke, Melika Rahmani-Mofrad Dec 2016

Genomics Rna-Seq Analysis Part 2_ Kallisto Indexing And Quantification (Updated 11/17), Ray A. Enke, Melika Rahmani-Mofrad

Ray Enke Ph.D.

This in class exercise is a hands on activity designed to teach students about how to run Kallisto indexing quantification using CyVerse DE apps as part of a eukaryotic RNA-seq analysis pipeline.


Genomics Rna-Seq Analysis Part 3-Sleuth Data Visualization (Updated 11/17), Ray A. Enke, Scott Schumacker Dec 2016

Genomics Rna-Seq Analysis Part 3-Sleuth Data Visualization (Updated 11/17), Ray A. Enke, Scott Schumacker

Ray Enke Ph.D.

This in class exercise is a hands on activity designed to teach students about how to run Sleuth statistical modeling and RStudio data visualization package using Kallisto pseudoalignment output files as part of a eukaryotic RNA-seq analysis pipeline.