Open Access. Powered by Scholars. Published by Universities.®

Genomics Commons

Articles 1 - 6 of 6

Full-Text Articles in Genomics

These Are Not The K-Mers You Are Looking For: Efficient Online K-Mer Counting Using A Probabilistic Data Structure, Qingpeng Zhang, Jason Pell, Rosangela Canino-Koning, Adina Chuang Howe, C. Titus Brown Jul 2014

Adina Howe

K-mer abundance analysis is widely used for many purposes in nucleotide sequence analysis, including data preprocessing for de novo assembly, repeat detection, and sequencing coverage estimation. We present the khmer software package for fast and memory-efficient online counting of k-mers in sequencing data sets. Unlike previous methods based on data structures such as hash tables, suffix arrays, and trie structures, khmer relies entirely on a simple probabilistic data structure, a Count-Min Sketch. The Count-Min Sketch permits online updating and retrieval of k-mer counts in memory, which is necessary to support online k-mer analysis algorithms. On sparse data sets this …
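
To make the data structure named in this abstract concrete, here is a minimal, generic Count-Min Sketch applied to k-mer counting. It is an illustrative sketch only, not khmer's implementation: the class name, the `kmers` helper, the table dimensions, and the choice of salted BLAKE2b hashing are all assumptions made for this example.

```python
import hashlib


class CountMinSketch:
    """Generic Count-Min Sketch (illustrative; not khmer's implementation)."""

    def __init__(self, width=1000, depth=4):
        # width and depth are arbitrary example values.
        self.width = width
        self.depth = depth
        self.tables = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One independent hash per row, obtained by salting BLAKE2b with the row index.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item):
        # Online update: increment one counter in every row.
        for row, col in self._cells(item):
            self.tables[row][col] += 1

    def count(self, item):
        # Collisions can only inflate counters, so the minimum across rows
        # is an estimate that never falls below the true count.
        return min(self.tables[row][col] for row, col in self._cells(item))


def kmers(seq, k):
    """Yield every overlapping substring of length k (hypothetical helper)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


sketch = CountMinSketch()
for kmer in kmers("ACGTACGT", 4):
    sketch.add(kmer)

# "ACGT" occurs twice in the example sequence; the estimate is at least 2.
print(sketch.count("ACGT"))
```

Because counters are shared between items, the estimate may overcount but never undercounts; wider tables lower the chance of collisions at the cost of more memory.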


Revealing The Bacterial Butyrate Synthesis Pathways By Analyzing (Meta)Genomic Data, Marius Vital, Adina Chuang Howe, James M. Tiedje Apr 2014

Adina Howe

Butyrate-producing bacteria have recently gained attention, since they are important for a healthy colon and when altered contribute to emerging diseases, such as ulcerative colitis and type II diabetes. This guild is polyphyletic and cannot be accurately detected by 16S rRNA gene sequencing. Consequently, approaches targeting the terminal genes of the main butyrate-producing pathway have been developed. However, since additional pathways exist and alternative, newly recognized enzymes catalyzing the terminal reaction have been described, previous investigations are often incomplete. We undertook a broad analysis of butyrate-producing pathways and individual genes by screening 3,184 sequenced bacterial genomes from the Integrated Microbial …


The Genome And Developmental Transcriptome Of The Strongylid Nematode Haemonchus Contortus, Erich M. Schwarz, Pasi K. Korhonen, Bronwyn E. Campbell, Neil D. Young, Aaron R. Jex, Abdul Jabbar, Ross S. Hall, Alinda Mondal, Adina C. Howe, Jason Pell, Andreas Hofmann, Peter R. Boag, Xing-Quan Zhu, T. Ryan Gregory, Alex Loukas, Brian A. Williams, Igor Antoshechkin, C. Titus Brown, Paul W. Sternberg, Robin B. Gasser Aug 2013

Adina Howe

Background: The barber's pole worm, Haemonchus contortus, is one of the most economically important parasites of small ruminants worldwide. Although this parasite can be controlled using anthelmintic drugs, resistance against most drugs in common use has become a widespread problem. We provide a draft of the genome and the transcriptomes of all key developmental stages of H. contortus to support biological and biotechnological research areas of this and related parasites. Results: The draft genome of H. contortus is 320 Mb in size and encodes 23,610 protein-coding genes. On a fundamental level, we elucidate transcriptional alterations taking place throughout the life …


A Reference-Free Algorithm For Computational Normalization Of Shotgun Sequencing Data, C. Titus Brown, Adina Howe, Qingpeng Zhang, Alexis B. Pyrkosz, Timothy H. Brom May 2012

Adina Howe

Deep shotgun sequencing and analysis of genomes, transcriptomes, amplified single-cell genomes, and metagenomes has enabled investigation of a wide range of organisms and ecosystems. However, sampling variation in short-read data sets and high sequencing error rates of modern sequencers present many new computational challenges in data interpretation. These challenges have led to the development of new classes of mapping tools and de novo assemblers. These algorithms are challenged by the continued improvement in sequencing throughput. We here describe digital normalization, a single-pass computational algorithm that systematizes coverage in shotgun sequencing data sets, thereby decreasing sampling variation, discarding redundant data, …


Illumina Sequencing Artifacts Revealed By Connectivity Analysis Of Metagenomic Datasets, Adina Chuang Howe, Jason Pell, Rosangela Canino-Koning, Rachel Mackelprang, Susanna Tringe, Janet Jansson, James M. Tiedje, C. Titus Brown Jan 2012

Adina Howe

Sequencing errors and biases in metagenomic datasets affect coverage-based assemblies and are often ignored during analysis. Here, we analyze read connectivity in metagenomes and identify the presence of problematic and likely a-biological connectivity within metagenome assembly graphs. Specifically, we identify highly connected sequences which join a large proportion of reads within each real metagenome. These sequences show position-specific bias in shotgun reads, suggestive of sequencing artifacts, and are only minimally incorporated into contigs by assembly. The removal of these sequences prior to assembly results in similar assembly content for most metagenomes and enables the use of graph partitioning to decrease …


Assembling Large, Complex Environmental Metagenomes, Adina Howe, Janet Jansson, Stephanie A. Malfatti, James M. Tiedje, C. Titus Brown Jan 2012

Adina Howe

The large volumes of sequencing data required to sample complex environments deeply pose new challenges to sequence analysis approaches. De novo metagenomic assembly effectively reduces the total amount of data to be analyzed but requires significant computational resources. We apply two pre-assembly filtering approaches, digital normalization and partitioning, to make large metagenome assemblies more computationally tractable. Using a human gut mock community dataset, we demonstrate that these methods result in assemblies nearly identical to assemblies from unprocessed data. We then assemble two large soil metagenomes from matched Iowa corn and native prairie soils. The predicted functional content and phylogenetic …