Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

Genetics and Genomics

Selected Works

Iowa State University

2012

Full-Text Articles in Life Sciences

A Reference-Free Algorithm For Computational Normalization Of Shotgun Sequencing Data, C. Titus Brown, Adina Howe, Qingpeng Zhang, Alexis B. Pyrkosz, Timothy H. Brom, May 2012

Adina Howe

Deep shotgun sequencing and analysis of genomes, transcriptomes, amplified single-cell genomes, and metagenomes has enabled investigation of a wide range of organisms and ecosystems. However, sampling variation in short-read data sets and high sequencing error rates of modern sequencers present many new computational challenges in data interpretation. These challenges have led to the development of new classes of mapping tools and de novo assemblers. These algorithms are challenged by the continued improvement in sequencing throughput. Here we describe digital normalization, a single-pass computational algorithm that systematizes coverage in shotgun sequencing data sets, thereby decreasing sampling variation, discarding redundant data, …
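
The single-pass idea in the abstract above can be illustrated with a minimal Python sketch. The abstract is truncated and does not spell out the decision rule, so the details here are assumptions: a read's coverage is estimated as the median count of its k-mers seen so far, and the read is kept only while that estimate is below a cutoff. The names `digital_normalize`, `kmers`, `k`, and `cutoff` are hypothetical, and an exact in-memory counter stands in for whatever counting structure a real implementation would use.

```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Return every overlapping substring of length k in the read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def digital_normalize(reads, k=4, cutoff=2):
    """Keep a read only if its estimated coverage is still below `cutoff`.

    Assumption: coverage is estimated as the median count of the read's
    k-mers among the reads kept so far. Each read is examined once.
    """
    counts = Counter()  # exact k-mer counts; a simplification for illustration
    kept = []
    for read in reads:
        ks = kmers(read, k)
        if not ks:
            continue  # read is shorter than k, so no estimate is possible
        if median(counts[km] for km in ks) < cutoff:
            kept.append(read)
            counts.update(ks)  # only kept reads contribute to the counts
    return kept


if __name__ == "__main__":
    reads = ["ACGTACGT"] * 5 + ["TTTTGGGG"]
    print(digital_normalize(reads, k=4, cutoff=2))
```

In this toy input the redundant copies of the first read are discarded once its k-mers are well represented, while the single low-coverage read is retained. Real data would call for a much larger `k` and `cutoff` and a memory-efficient counter.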


Illumina Sequencing Artifacts Revealed By Connectivity Analysis Of Metagenomic Datasets, Adina Chuang Howe, Jason Pell, Rosangela Canino-Koning, Rachel Mackelprang, Susanna Tringe, Janet Jansson, James M. Tiedje, C. Titus Brown, Jan 2012

Adina Howe

Sequencing errors and biases in metagenomic datasets affect coverage-based assemblies and are often ignored during analysis. Here, we analyze read connectivity in metagenomes and identify the presence of problematic and likely a-biological connectivity within metagenome assembly graphs. Specifically, we identify highly connected sequences which join a large proportion of reads within each real metagenome. These sequences show position-specific bias in shotgun reads, suggestive of sequencing artifacts, and are only minimally incorporated into contigs by assembly. The removal of these sequences prior to assembly results in similar assembly content for most metagenomes and enables the use of graph partitioning to decrease …


Assembling Large, Complex Environmental Metagenomes, Adina Howe, Janet Jansson, Stephanie A. Malfatti, James M. Tiedje, C. Titus Brown, Jan 2012

Adina Howe

The large volumes of sequencing data required to sample complex environments deeply pose new challenges to sequence analysis approaches. De novo metagenomic assembly effectively reduces the total amount of data to be analyzed but requires significant computational resources. We apply two pre-assembly filtering approaches, digital normalization and partitioning, to make large metagenome assemblies more computationally tractable. Using a human gut mock community dataset, we demonstrate that these methods result in assemblies nearly identical to assemblies from unprocessed data. We then assemble two large soil metagenomes from matched Iowa corn and native prairie soils. The predicted functional content and phylogenetic …