Open Access. Powered by Scholars. Published by Universities.®

Genomics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 2 of 2

Full-Text Articles in Genomics

These Are Not The K-Mers You Are Looking For: Efficient Online K-Mer Counting Using A Probabilistic Data Structure, Qingpeng Zhang, Jason Pell, Rosangela Canino-Koning, Adina Chuang Howe, C. Titus Brown Jul 2014

These Are Not The K-Mers You Are Looking For: Efficient Online K-Mer Counting Using A Probabilistic Data Structure, Qingpeng Zhang, Jason Pell, Rosangela Canino-Koning, Adina Chuang Howe, C. Titus Brown

Adina Howe

K-mer abundance analysis is widely used for many purposes in nucleotide sequence analysis, including data preprocessing for de novo assembly, repeat detection, and sequencing coverage estimation. We present the khmer software package for fast and memory efficient online counting of k-mers in sequencing data sets. Unlike previous methods based on data structures such as hash tables, suffix arrays, and trie structures, khmer relies entirely on a simple probabilistic data structure, a Count-Min Sketch. The Count-Min Sketch permits online updating and retrieval of k-mer counts in memory which is necessary to support online k-mer analysis algorithms. On sparse data sets this …


Revealing The Bacterial Butyrate Synthesis Pathways By Analyzing (Meta)Genomic Data, Marius Vital, Adina Chuang Howe, James M. Tiedje Apr 2014

Revealing The Bacterial Butyrate Synthesis Pathways By Analyzing (Meta)Genomic Data, Marius Vital, Adina Chuang Howe, James M. Tiedje

Adina Howe

Butyrate-producing bacteria have recently gained attention, since they are important for a healthy colon and when altered contribute to emerging diseases, such as ulcerative colitis and type II diabetes. This guild is polyphyletic and cannot be accurately detected by 16S rRNA gene sequencing. Consequently, approaches targeting the terminal genes of the main butyrate-producing pathway have been developed. However, since additional pathways exist and alternative, newly recognized enzymes catalyzing the terminal reaction have been described, previous investigations are often incomplete. We undertook a broad analysis of butyrate-producing pathways and individual genes by screening 3,184 sequenced bacterial genomes from the Integrated Microbial …