Open Access. Powered by Scholars. Published by Universities.®

Genetics and Genomics Commons

Articles 1 - 30 of 35

Full-Text Articles in Genetics and Genomics

Investigating The Impact Of Transcription On Mutation Rates, Sarah Patterson Dec 2023

Theses and Dissertations

tRNA genes are highly transcribed and perform one of the most fundamental cellular functions. Although a universal pattern observed across all three domains of life is that highly transcribed genes tend to evolve slowly, tRNA genes have been shown previously to evolve rapidly. This rapid sequence evolution could result from relaxed selection, increased mutation rate, or a combination of both. Here, we use mutation-accumulation line sequencing data to show that tRNA genes accumulate more mutations than other gene types. Our results indicate that this elevated mutation rate is a consequence of both elevated transcription-associated mutagenesis and a lack of transcription-coupled …


The Detection Of Putative Recessive Lethal Haplotypes In Irish Sheep Populations, Rory Mcauley Nov 2023

ORBioM (Open Research BioSciences Meeting)

In livestock populations, recessive lethal alleles are a known contributor to poor reproductive performance due to embryonic death in homozygous individuals. Despite their lethal effect in the recessive form, these alleles may be maintained at high frequencies among carrier animals because of their positive pleiotropic effects on economically important traits. Although several such recessive alleles have been identified in cattle and pig populations, limited studies have been completed in sheep, and none within Irish sheep populations. Genotype data for 69,034 animals from five major Irish sheep breeds genotyped on a variety of panels was available for this study. Only animals …


The Genomics Of Autism-Related Genes Il1rapl1 And Il1rapl2: Insights Into Their Cortical Distribution, Cell-Type Specificity, And Developmental Trajectories, Jacob Weaver Apr 2023

MUSC Theses and Dissertations

Neuropsychiatric disorders have a significant impact on modern society. These disorders affect a large percentage of the population: schizophrenia has a worldwide prevalence of 1% and autism spectrum disorders (ASD) affect 1 in 59 school-aged children in the US. There is substantial evidence that most neuropsychiatric disorders have a genetic component. Thus, with the advent of high-throughput sequencing, much effort has gone into identifying genetic variants associated with these disorders. The emerging picture from these studies is a complex one where hundreds of genes with small effects interact with a varied landscape of common variants to result in disease. …


Methods And Tools To Improve Performance Of Plant Genome Analysis, Drew Ferrell Aug 2022

Theses and Dissertations

Multi-omics data analysis and integration facilitates hypothesis building toward an understanding of genes and pathway responses driven by environments. Methods designed to estimate and analyze gene expression, with regard to treatments or conditions, can be leveraged to understand gene-level responses in the cell. However, genes often interact and signal within larger structures such as pathways and networks. Complex studies guided toward describing dynamic genetic pathways and networks require algorithms or methods designed for inference based on gene interactions and related topologies. Classes of algorithms and methods may be integrated into generalized workflows for comparative genomics studies, as multi-omics …


A Genomic Investigation Of Divergence Between Tuna Species, Pavel V. Dimens Aug 2022

Dissertations

Effective management and conservation of marine pelagic fishes is heavily dependent on a robust understanding of their population structure, their evolutionary history, and the delineation of appropriate management units. The Yellowfin tuna (Thunnus albacares) and the Blackfin tuna (Thunnus atlanticus) are two exploited epipelagic marine species with overlapping ranges in the tropical and sub-tropical Atlantic Ocean. This work analyzed genome-wide genetic variation of both species in the Atlantic basin to investigate the occurrence of population subdivision and adaptive variation. A de novo assembly of the Blackfin tuna genome was generated using Illumina paired-end sequencing data and …


Development Of Graphical Models And Statistical Physics Motivated Approaches To Genomic Investigations, Yashwanth Lagisetty Aug 2022

Dissertations & Theses (Open Access)

Identifying genes involved in disease pathology has been a goal of genomic research since the early days of the field. However, as technology improves and the body of research grows, we are faced with more questions than answers. Among these is the pressing matter of our incomplete understanding of the genetic underpinnings of complex diseases. Many hypotheses offer explanations as to why direct and independent analyses of variants, as done in genome-wide association studies (GWAS), may not fully elucidate disease genetics. These range from pointing out flaws in statistical testing to invoking the complex dynamics of epigenetic processes. In the …


Mechanisms By Which Xenorhabdus Nematophila Interacts With Hosts Using Integrated -Omics Approaches, Nicholas C. Mucci May 2022

Doctoral Dissertations

Nearly all organisms exist in proximity to microbes. These microbes perform most of the essential metabolic processes necessary for homeostasis, forming the nearly hidden support system of Earth. Microbial symbiosis, which is defined as the long-term physical association between host and microbes, relies on communication between the microbial community and their host organism. These interactions among higher order organisms (such as animals, plants, and fungi) and their bacteria link metabolic processes between interkingdom consortia. Many questions on microbial behavior within a host remain poorly understood, such as the colonization efficiency among different microbial species, or how environmental context changes their …


An Investigation Of Epigenetic Mechanisms Driving The Biology Of Head And Neck Squamous Cell Carcinoma, Scot Carson Callahan May 2022

Dissertations & Theses (Open Access)

Head and neck squamous cell carcinoma (HNSCC) is the 6th most common cancer worldwide and is associated with significant morbidity and mortality. To date, the majority of work in the field has focused on genomic alterations such as mutations and copy number alterations. However, the clinical success of targeted therapies that exploit known genomic alterations, such as EGFR mutations, has remained mixed. Over the past decade, the importance of epigenetic regulators has come to the forefront, with the realization that many of these genes are mutated in cancer. Despite this realization, the role of epigenetics in regulating tumorigenesis, progression and …


Impact Of Intratumor Heterogeneity And The Tumor Microenvironment In Shaping Tumor Evolution And Response To Therapy, Akash Mitra Jun 2021

Dissertations & Theses (Open Access)

Intratumor heterogeneity (ITH) is a crucial challenge in cancer treatment. The genotypic and phenotypic heterogeneity underlying diverse cancer types leads to subclonal variation, which may result in mixed or failed response to therapy. The heterogeneity at the tumor level, along with the tumor microenvironment (TME), often shapes tumor evolution and ultimately clinical outcome. Given that modern treatment paradigms increasingly expose patients with metastatic disease to multiple treatment modalities through the course of their disease, there exists a need to characterize robust and predictive biomarkers of response to therapy. In order to accurately characterize tumor evolution, we need to account for …


Analysis Of Subtelomeric Rextal Assemblies Using Quast, Tunazzina Islam, Desh Ranjan, Mohammad Zubair, Eleanor Young, Ming Xiao, Harold Riethman Jan 2021

Computer Science Faculty Publications

Genomic regions of high segmental duplication content and/or structural variation have led to gaps and misassemblies in the human reference sequence, and are refractory to assembly from whole-genome short-read datasets. Human subtelomere regions are highly enriched in both segmental duplication content and structural variations, and as a consequence are both impossible to assemble accurately and highly variable from individual to individual. Recently, we developed a pipeline for improved region-specific assembly called Regional Extension of Assemblies Using Linked-Reads (REXTAL). In this study, we evaluate REXTAL and genome-wide assembly (Supernova) approaches on 10X Genomics linked-reads data sets partitioned and barcoded using the …


Composition And Homology In The Taxonomic Classification Of Escherichia Coli, Tanya Irani Jan 2021

Theses and Dissertations (Comprehensive)

As new techniques have been introduced, specifically the possibility of complete genome sequencing, better methods of defining bacterial species have also been proposed. One of the most recently proposed methods, using bioinformatic techniques, is to calculate the average nucleotide identity (ANI) between the homologous genome segments of different isolates. Another method for species discrimination that has been tested successfully is the similarity of DNA compositional signatures. However, in a recent update, DNA signatures split the available Escherichia coli complete genomes into three groups. To check if this result was consistent with such genomes belonging to different species, we tested methods …


3d Genome Architecture Under Stress: A Survey Of Ionizing Radiation, Progeria, And Osmotic Stress, Jacob Tyler Sanders Dec 2020

Doctoral Dissertations

The human nucleus contains 2 meters of DNA which is intricately folded into a three-dimensional (3D) structure. It has become increasingly clear that this 3D structure plays an important role in the expression of genes. Proper gene expression is necessary for cellular homeostasis, cell state, and response to environmental/physical perturbations. Faithful repair of DNA damage is necessary to prevent genomic aberrations, such as translocations, which may lead to misregulation of gene expression. Hi-C, a sequencing technique that labels proximal chromatin interactions, provides a clearer picture of how the genome is spatially organized within the nucleus. Here, we discuss the …


Investigation Of Proliferation Suppressors In Genetic Fitness Screens, Walter Frank Lenoir IV Dec 2020

Dissertations & Theses (Open Access)

Innovation of CRISPR gene-editing technology has provided scientists genome manipulation tools that allowed rapid advancement of scientific capabilities and thus improved our ability to systematically study mammalian genetic functional profiles. Genome-wide CRISPR knockout screens conducted in collections of human cell lines can knock out genes at multiple loci, and have provided new insights into functional roles for independent genes. This method has launched massive efforts in looking across genetic backgrounds for context specific genetic vulnerabilities within cancer. Much of the research effort thus far has been spent on optimizing phenotype distinctions between essential, genes required for cell fitness, and non-essential, …


Genome-Wide Systems Genetics Of Alcohol Consumption And Dependence, Kristin Mignogna Jan 2019

Theses and Dissertations

Widely effective treatment for alcohol use disorder is not yet available, because the exact biological mechanisms that underlie this disorder are not completely understood. One way to gain a better understanding of these mechanisms is to examine the genetic frameworks that contribute to the risk for developing this disorder. This dissertation examines genetic association data in combination with gene expression networks in the brain to identify functional groups of genes associated with alcohol consumption and dependence.

The first study took advantage of the behavioral complexity of human samples, and experimental capabilities provided by mouse models, by co-analyzing gene expression networks …


Intraspecific Variation In Dehydration Tolerance: Insights From The Tropical Plant Marchantia Inflexa, Rose A. Marks Jan 2019

Theses and Dissertations--Biology

Plants are threatened by global change, increasing variability in weather patterns, and associated abiotic stress. Consequently, there is an urgent need to enhance our ability to predict plant community dynamics, shifts in species distributions, and physiological responses to environmental challenges. By building a fundamental understanding of plant stress tolerance, it may be possible to protect the ecological services, economic industries, and communities that depend on plants. Dehydration tolerance (DhT) is an important mechanism of water stress tolerance with promising translational applications. Here, I take advantage of natural variation in DhT to gain a deeper insight into this complex trait. In addition, …


Gene-Based Association Study For Lipid Traits In Diverse Cohorts Implicates Bace1 And Sidt2 Regulation In Triglyceride Levels, Angela Andaleon, Lauren S. Mogil, Heather Wheeler Jan 2018

Bioinformatics Faculty Publications

Plasma lipid levels are risk factors for cardiovascular disease, a leading cause of death worldwide. While many studies have been conducted on lipid genetics, they mainly focus on Europeans and thus their transferability to diverse populations is unclear. We performed SNP- and gene-level genome-wide association studies (GWAS) of four lipid traits in cohorts from Nigeria and the Philippines and compared them to the results of larger, predominantly European meta-analyses. Two previously implicated loci met genome-wide significance in our SNP-level GWAS in the Nigerian cohort, rs34065661 in CETP associated with HDL cholesterol (P = 9.0 × 10⁻¹⁰) and …


Bayesian Prediction Intervals For Assessing P-Value Variability In Prospective Replication Studies, Olga A. Vsevolozhskaya, Gabriel Ruiz, Dmitri Zaykin Dec 2017

Biostatistics Faculty Publications

Increased availability of data and accessibility of computational tools in recent years have created an unprecedented upsurge of scientific studies driven by statistical analysis. Limitations inherent to statistics impose constraints on the reliability of conclusions drawn from data, so misuse of statistical methods is a growing concern. Hypothesis and significance testing, and the accompanying P-values are being scrutinized as representing the most widely applied and abused practices. One line of critique is that P-values are inherently unfit to fulfill their ostensible role as measures of credibility for scientific hypotheses. It has also been suggested that while P-values …


Integrative Cancer Immunogenomic Analysis Of Serial Melanoma Biopsies Reveals Correlates Of Response And Resistance To Sequential Ctla-4 And Pd-1 Blockade Treatment, Whijae Roh Dec 2017

Dissertations & Theses (Open Access)

Melanoma is the most malignant form of skin cancer. The five-year survival rate for metastatic melanoma is 19.9%. Although targeted therapies with BRAF and MEK inhibitors were developed for melanoma, resistance to therapy is inevitable. Immune checkpoint blockade, which reverses the suppression of the immune system, on the other hand, has shown a durable response in 20-30% of patients with metastatic melanoma. However, more predictive and robust biomarkers of response to this therapy are still needed, and resistance mechanisms remain incompletely understood. To address this, we examined a cohort of metastatic melanoma patients treated with sequential checkpoint blockade against cytotoxic …


A Longitudinal Cline Characterizes The Genetic Structure Of Human Populations In The Tibetan Plateau, Choongwon Jeong, Benjamin M. Peter, Buddha Basnyat, Maniraj Neupane, Geoff Childs, Sienna Craig, John Novembre, Anna Di Rienzo Apr 2017

Dartmouth Scholarship

Indigenous populations of the Tibetan plateau have attracted much attention for their good performance at extreme high altitude. Most genetic studies of Tibetan adaptations have used genetic variation data at the genome scale, while genetic inferences about their demography and population structure are largely based on uniparental markers. To provide genome-wide information on population structure, we analyzed new and published data of 338 individuals from indigenous populations across the plateau in conjunction with worldwide genetic variation data. We found a clear signal of genetic stratification across the east-west axis within Tibetan samples. Samples from more eastern locations …


Characterization Of A Large Vertebrate Genome And Homomorphic Sex Chromosomes In The Axolotl, Ambystoma Mexicanum, Melissa Keinath Jan 2017

Theses and Dissertations--Biology

Changes in the structure, content and morphology of chromosomes accumulate over evolutionary time and contribute to cell, developmental and organismal biology. The axolotl (Ambystoma mexicanum) is an important model for studying these changes because: 1) it provides important phylogenetic perspective for reconstructing the evolution of vertebrate genomes and amphibian karyotypes, 2) its genome has evolved to a large size (~10X larger than human) but has maintained gene orders, and 3) it possesses potentially young sex chromosomes that have not undergone extensive differentiation in the structure that is typical of many other vertebrate sex chromosomes (e.g. mammalian XY chromosomes …


Gene Discovery In Mendelian And Complex Diseases, Sali Farhan Aug 2016

Electronic Thesis and Dissertation Repository

Through the Finding of Rare Disease Genes in Canada (FORGE Canada) initiative, individuals affected with rare Mendelian diseases were clinically ascertained with a goal of identifying the genetic origin of their disease. Herein, I describe the methods for identifying the genetic basis of four Mendelian diseases. The application of next generation sequencing led to the discovery of non-synonymous variation in the DNA of individuals affected by rare diseases. The effects of the candidate variants were assessed using a series of functional experiments to complement the human genetics data. The variants observed in patients’ cells are extremely rare, were consistently predicted …


Development Of An In Silico Kir Genotyping Algorithm And Its Application To Population And Cancer Immunogenetic Analyses, Howard Rosoff Aug 2016

Dissertations & Theses (Open Access)

Gene content determination and variant calling in the complex KIR genomic region are useful for immune system function analysis, pathogenesis and disease risk factor elucidation, immunotherapy development, evolutionary investigations, and human migration modeling. Sequence-specific oligonucleotide and sequence-specific primer PCR methods are the de facto standards for KIR presence/absence identification, but the current platforms are unsuitable for SNP calling, impractical for KIR typing large cohorts of DNA samples, and inapplicable for typing repositories in which sequence data, but not cells or cell analytes, are available. Alternative typing methods, such as in silico sequence-based typing, can address the problems associated with amplicon-based …


Functional Car Models For Spatially Correlated Functional Datasets, Lin Zhang, Veerabhadran Baladandayuthapani, Hongxiao Zhu, Keith A. Baggerly, Tadeusz Majewski, Bogdan Czerniak, Jeffrey S. Morris Jan 2016

Jeffrey S. Morris

We develop a functional conditional autoregressive (CAR) model for spatially correlated data for which functions are collected on areal units of a lattice. Our model performs functional response regression while accounting for spatial correlations with potentially nonseparable and nonstationary covariance structure, in both the space and functional domains. We show theoretically that our construction leads to a CAR model at each functional location, with spatial covariance parameters varying and borrowing strength across the functional domain. Using basis transformation strategies, the nonseparable spatial-functional model is computationally scalable to enormous functional datasets, generalizable to different basis functions, and can be used on …


Employing Limited Next Generation Sequence Data For The Development Of Genetic Loci Of Phylogenetic And Population Genetic Utility, Lauren Evenstone Jul 2015

FIU Electronic Theses and Dissertations

Massively parallel high-throughput sequencers are transforming scientific research by reducing the cost and time necessary to sequence entire genomes. The goal of this project is to produce preliminary genome assemblies of calliphorid flies using Life Technologies’ Ion Torrent sequencing and Illumina’s MiSeq sequencing. I located, assembled, and annotated a novel mitochondrial genome for one such fly, the little-studied Chrysomya pacifica, which is central to one hypothesis about blow fly evolution. With sequencing data from Chrysomya megacephala, its forensically relevant sister species, much insight can be gained by alignments, sequence and protein analysis, and many more tools …


Ordinal Probit Wavelet-Based Functional Models For Eqtl Analysis, Mark J. Meyer, Jeffrey S. Morris, Craig P. Hersh, Jarret D. Morrow, Christoph Lange, Brent A. Coull Jan 2015

Jeffrey S. Morris

Current methods for conducting expression Quantitative Trait Loci (eQTL) analysis are limited in scope to pairwise association testing between a single nucleotide polymorphism (SNP) and an expression probe set in a region around a gene of interest, thus ignoring the inherent between-SNP correlation. To determine association, p-values are then typically adjusted using Plug-in False Discovery Rate. As many SNPs are interrogated in the region and multiple probe sets taken, the current approach requires the fitting of a large number of models. We propose to remedy this by introducing a flexible function-on-scalar regression that models the genome as a functional outcome. The …


A Systems Biology Approach To Detect Eqtls Associated With Mirna And Mrna Co-Expression Networks In The Nucleus Accumbens Of Chronic Alcoholic Patients, Mohammed Mamdani Jan 2014

Theses and Dissertations

Alcohol Dependence (AD) is a chronic substance use disorder with moderate heritability (60%). Linkage and genome-wide association studies (GWAS) have implicated a number of loci; however, the molecular mechanisms underlying AD are unclear. Advances in systems biology allow genome-wide expression data to be integrated with genetic data to detect expression quantitative trait loci (eQTL), polymorphisms that regulate gene expression levels, influence phenotypes and are significantly enriched among validated genetic signals for many commonly studied traits including AD.

We integrated genome-wide mRNA and miRNA expression data with genotypic data from the nucleus accumbens (NAc), a major addiction-related brain region, of 36 …


Small Rna Expression During Programmed Rearrangement Of A Vertebrate Genome, Joseph R. Herdy III Jan 2014

Theses and Dissertations--Biology

The sea lamprey (Petromyzon marinus) undergoes programmed genome rearrangements (PGRs) during embryogenesis that result in the deletion of ~0.5 Gb of germline DNA from the somatic lineage. The underlying mechanism of these rearrangements remains largely unknown. miRNAs (microRNAs) and piRNAs (PIWI interacting RNAs) are two classes of small noncoding RNAs that play important roles in early vertebrate development, including differentiation of cell lineages, modulation of signaling pathways, and clearing of maternal transcripts. Here, I utilized next generation sequencing to determine the temporal expression of miRNAs, piRNAs, and other small noncoding RNAs during the first five days of lamprey …


On The Origin Of Phenotypic Variation: Novel Technologies To Dissect Molecular Determinants Of Phenotype, Francesco Vallania Dec 2013

All Theses and Dissertations (ETDs)

This thesis describes the conception, design, and development of novel computational tools, theoretical models, and experimental techniques applied to the dissection of molecular factors underlying phenotypic variation. The first part of my work is focused on finding rare genetic variants in pooled DNA samples, leading to the development of a novel set of algorithms, SNPseeker and SPLINTER, applied to next-generation sequencing data. The second part of my work describes the creation of a reporter system for DNA methylation for the purpose of dissecting the genetic contribution of tissue-specific patterns of DNA methylation across the genome. Finally, the last part of …


Pathoscope: Species Identification And Strain Attribution With Unassembled Sequencing Data., Owen E Francis, Matthew Bendall, Solaiappan Manimaran, Changjin Hong, Nathan L Clement, Eduardo Castro-Nallar, Quinn Snell, G Bruce Schaalje, Mark J Clement, Keith A Crandall, W Evan Johnson Oct 2013

Computational Biology Institute

Emerging next-generation sequencing technologies have revolutionized the collection of genomic data for applications in bioforensics, biosurveillance, and for use in clinical settings. However, to make the most of these new data, new methodology needs to be developed that can accommodate large volumes of genetic data in a computationally efficient manner. We present a statistical framework to analyze raw next-generation sequence reads from purified or mixed environmental or targeted infected tissue samples for rapid species identification and strain attribution against a robust database of known biological agents. Our method, Pathoscope, capitalizes on a Bayesian statistical framework that accommodates information on sequence …


Phage Cluster Relationships Identified Through Single Gene Analysis., Kyle C Smith, Eduardo Castro-Nallar, Joshua Nb Fisher, Donald P Breakwell, Julianne H Grose, Sandra H Burnett Jun 2013

Computational Biology Institute

BACKGROUND: Phylogenetic comparison of bacteriophages requires whole genome approaches such as dotplot analysis, genome pairwise maps, and gene content analysis. Currently mycobacteriophages, a highly studied phage group, are categorized into related clusters based on the comparative analysis of whole genome sequences. With the recent explosion of phage isolation, a simple method for phage cluster prediction would facilitate analysis of crude or complex samples without whole genome isolation and sequencing. The hypothesis of this study was that mycobacteriophage-cluster prediction is possible using comparison of a single, ubiquitous, semi-conserved gene. Tape Measure Protein (TMP) was selected to test the hypothesis because it …