Open Access. Powered by Scholars. Published by Universities.®

Computational Biology Commons

2016

Articles 1 - 30 of 41

Full-Text Articles in Computational Biology

Novel Models Of Visual Topographic Map Alignment In The Superior Colliculus., Ruben A Tikidji-Hamburyan, Tarek A El-Ghazawi, Jason W. Triplett Dec 2016

Pediatrics Faculty Publications

The establishment of precise neuronal connectivity during development is critical for sensing the external environment and informing appropriate behavioral responses. In the visual system, many connections are organized topographically, which preserves the spatial order of the visual scene. The superior colliculus (SC) is a midbrain nucleus that integrates visual inputs from the retina and primary visual cortex (V1) to regulate goal-directed eye movements. In the SC, topographically organized inputs from the retina and V1 must be aligned to facilitate integration. Previously, we showed that retinal input instructs the alignment of V1 inputs in the SC in a manner dependent on …


Punctuated Evolution Within A Eurythermic Genus (Mesenchytraeus) Of Segmented Worms: Genetic Modification Of The Glacier Ice Worm F1F0 ATP Synthase, Shirley A. Lang Dec 2016

Graduate School of Biomedical Sciences Theses and Dissertations

Segmented worms (Annelida) are among the most successful animal inhabitants of extreme environments worldwide. An unusual group of Mesenchytraeus worms endemic to the Pacific Northwest of North America occupy geographically proximal ecozones ranging from low elevation temperate rainforests to high altitude glaciers. Along this altitudinal transect, Mesenchytraeus representatives from disparate habitat types were collected and subjected to deep mitochondrial and nuclear phylogenetic analyses. Evidence presented here employing modern bioinformatic analyses (i.e., maximum likelihood, Bayesian inference, multi-species coalescent) supports a Mesenchytraeus “explosion” in the upper Miocene (5-10 million years ago) that gave rise to ice, snow and terrestrial worms, derived from …


Design Of Novel Ion Channel Modulators, Vladimir Yarov-Yarovoy Nov 2016

Science Seminar Series

Function and modulation of neuronal sodium channels are critical for the neuromodulation of electrical excitability and synaptic transmission in neurons - the basis for many aspects of signal transduction, learning, memory and physiological regulation. Mutations in neuronal voltage-gated sodium channel genes are responsible for various human neurological disorders. Furthermore, human neuronal voltage-gated sodium channels are primary targets of therapeutic drugs used as local anesthetics and for treatment of neurological and cardiac disorders. Yarov-Yarovoy's lab is working on rational design of novel therapeutically useful blockers of voltage-gated sodium channels for treatment of pain and epilepsy. Serious, chronic pain affects at least …


Sequence Annotation & Designing Gene-Specific qPCR Primers (Computational), Ray A. Enke Oct 2016

Ray Enke Ph.D.

This class-tested protocol will guide students through the steps for the following activities:
  • Obtaining and annotating genomic DNA and mRNA sequence information
  • Designing primers for quantitative PCR (qPCR) analysis of a cDNA library


qPCR Primer Standard Curve Assay (Wet Lab) + KEGG Pathway Analysis (Computational), Ray A. Enke Oct 2016

Ray Enke Ph.D.

This class-tested protocol will guide students through the steps for the following activities:
  • analyzing qPCR standard curve data to determine primer efficiency
  • analyzing differential gene expression experimental qPCR data
  • applying KEGG pathway analysis of selected candidate genes
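
The first activity in the list above, determining primer efficiency from standard-curve data, is commonly done by regressing Cq on log10 of the template quantity and converting the slope to an efficiency with E = 10^(-1/slope) - 1. The sketch below illustrates that calculation only; the function name and the dilution-series values are hypothetical and are not taken from the protocol itself.

```python
import math

def primer_efficiency(quantities, cq_values):
    """Return (slope, efficiency) from a qPCR standard curve.

    Fits Cq = slope * log10(quantity) + intercept by ordinary least squares,
    then converts the slope to amplification efficiency E = 10**(-1/slope) - 1.
    """
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cq_values))
    slope = sxy / sxx
    efficiency = 10 ** (-1 / slope) - 1
    return slope, efficiency

# Hypothetical 10-fold dilution series (illustrative values only)
quantities = [1e5, 1e4, 1e3, 1e2, 1e1]
cq_values = [15.0, 18.32, 21.64, 24.96, 28.28]
slope, eff = primer_efficiency(quantities, cq_values)
print(f"slope={slope:.2f}, efficiency={eff:.1%}")
```

A slope near -3.32 corresponds to an efficiency near 100%, that is, a doubling of product per cycle.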


Comparative Population Genomics And Speciation Of Snakes Across The North American Deserts, Edward A. Myers Sep 2016

Dissertations, Theses, and Capstone Projects

Understanding the process of speciation is of central interest to evolutionary biologists. Speciation can be studied using a phylogeographic approach, by identifying regions that promote lineage divergence, addressing whether speciation has occurred with gene flow, and when extended to multiple taxa, addressing if the same patterns of speciation are shared across codistributed groups with different ecologies. Here I examine the comparative phylogeographic histories and population genomics of thirteen snake taxa that are widely distributed and co-occur across the arid southwest of North America. I first quantify the degree to which these species groups have a shared history of population divergence …


Identification Of Control Targets In Boolean Molecular Network Models Via Computational Algebra, David Murrugarra, Alan Veliz-Cuba, Boris Aguilar, Reinhard Laubenbacher Sep 2016

Mathematics Faculty Publications

Background: Many problems in biomedicine and other areas of the life sciences can be characterized as control problems, with the goal of finding strategies to change a disease or otherwise undesirable state of a biological system into another, more desirable, state through an intervention, such as a drug or other therapeutic treatment. The identification of such strategies is typically based on a mathematical model of the process to be altered through targeted control inputs. This paper focuses on processes at the molecular level that determine the state of an individual cell, involving signaling or gene regulation. The mathematical model type …


RNA2DNAlign: Nucleotide Resolution Allele Asymmetries Through Quantitative Assessment Of RNA And DNA Paired Sequencing Data., Mercedeh Movassagh, Nawaf Alomran, Prakriti Mudvari, Merve Dede, Cem Dede, Kamran Kowsari, Paula Restrepo, Edmund Cauley, Sonali Bahl, Muzi Li, Wesley Waterhouse, Krasimira Tsaneva-Atanasova, Nathan Edwards, Anelia Horvath Aug 2016

Biochemistry and Molecular Medicine Faculty Publications

We introduce RNA2DNAlign, a computational framework for quantitative assessment of allele counts across paired RNA and DNA sequencing datasets. RNA2DNAlign is based on quantitation of the relative abundance of variant and reference read counts, followed by binomial tests for genotype and allelic status at SNV positions between compatible sequences. RNA2DNAlign detects positions with differential allele distribution, suggesting asymmetries due to regulatory/structural events. Based on the type of asymmetry, RNA2DNAlign outlines positions likely to be implicated in RNA editing, allele-specific expression or loss, somatic mutagenesis or loss-of-heterozygosity (the first three also in a tumor-specific setting). We applied RNA2DNAlign on 360 matching …
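
As a rough illustration of the kind of binomial test on variant and reference read counts that the abstract above mentions, the following sketch computes an exact two-sided p-value for one SNV position. It is not RNA2DNAlign's implementation: the function name, the null variant fraction of 0.5 (a heterozygous site with balanced alleles), and the read counts are all assumptions made for the example.

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k under Binomial(n, p).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))

# Hypothetical position: 18 variant reads and 2 reference reads
variant_reads, reference_reads = 18, 2
p_value = binomial_two_sided_p(variant_reads, variant_reads + reference_reads)
print(f"p = {p_value:.6f}")
```

A small p-value here would flag the position as departing from the balanced-allele expectation; how such calls are combined across RNA and DNA datasets is beyond this sketch.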


Modeling And Analysis Of Germ Layer Formations Using Finite Dynamical Systems, Alexander Garza, Megan Eberle, Eric A. Eager Aug 2016

Spora: A Journal of Biomathematics

The development of an embryo from a fertilised egg to a multicellular organism proceeds through numerous steps, with the formation of the three germ layers (endoderm, mesoderm, ectoderm) being one of the first. In this paper we study the mesendoderm (the tissue that collectively gives rise to both mesoderm and endoderm) gene regulatory network for two species, Xenopus laevis and the axolotl (Ambystoma mexicanum) using Boolean networks. We find that previously-established bistability found in these networks can be reproduced using this Boolean framework, provided that some assumptions used in previously-published differential equations models are relaxed. We conclude by discussing our …
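
To make the notion of bistability in a Boolean network concrete, here is a minimal sketch using a hypothetical two-gene mutual-inhibition toy network under synchronous updates. It is not the mesendoderm network studied in the paper; the update rule and function names are assumptions chosen only to show how fixed points can be enumerated.

```python
from itertools import product

def step(state):
    """Synchronous update of a toy network: each gene is ON iff the other is OFF."""
    a, b = state
    return (int(not b), int(not a))

def fixed_points(update, n_genes):
    """Enumerate all 2**n_genes states and keep those the update leaves unchanged."""
    return [s for s in product((0, 1), repeat=n_genes) if update(s) == s]

print(fixed_points(step, 2))
```

The two fixed points, (0, 1) and (1, 0), are the two stable expression states of this toy network; the remaining two states swap into each other and form a cycle rather than settling.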


Gene Discovery In Mendelian And Complex Diseases, Sali Farhan Aug 2016

Electronic Thesis and Dissertation Repository

Through the Finding of Rare Disease Genes in Canada (FORGE Canada) initiative, individuals affected with rare Mendelian diseases were clinically ascertained with a goal of identifying the genetic origin of their disease. Herein, I describe the methods for identifying the genetic basis of four Mendelian diseases. The application of next generation sequencing led to the discovery of non-synonymous variation in the DNA of individuals affected by rare diseases. The effects of the candidate variants were assessed using a series of functional experiments to complement the human genetics data. The variants observed in patients’ cells are extremely rare, were consistently predicted …


A Gene-Based Association Method For Mapping Traits Using Reference Transcriptome Data, Eric R. Gamazon, Heather Wheeler, Kaanan P. Shah, Sahar V. Mozaffari, Keston Aquino-Michaels, Robert J. Carroll, Anne E. Eyler, Joshua C. Denny, Dan L. Nicolae, Nancy J. Cox, Hae Kyung Im Aug 2016

Heather Wheeler

Genome-wide association studies (GWAS) have identified thousands of variants robustly associated with complex traits. However, the biological mechanisms underlying these associations are, in general, not well understood. We propose a gene-based association method called PrediXcan that directly tests the molecular mechanisms through which genetic variation affects phenotype. The approach estimates the component of gene expression determined by an individual’s genetic profile and correlates ‘imputed’ gene expression with the phenotype under investigation to identify genes involved in the etiology of the phenotype. Genetically regulated gene expression is estimated using whole-genome tissue-dependent prediction models trained with reference transcriptome data sets. PrediXcan enjoys …
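
The two steps the abstract describes, estimating the genetically determined component of expression and then correlating it with the phenotype, can be sketched as follows. This is an illustration under stated assumptions, not the PrediXcan software: the linear weighted-dosage form of the prediction model, the SNP weights, the genotype dosages and the phenotype values are all hypothetical.

```python
from math import sqrt

def impute_expression(dosages, weights):
    """Predicted expression per individual as a weighted sum of SNP dosages."""
    return [sum(w * d for w, d in zip(weights, row)) for row in dosages]

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical weights for two SNPs, as if learned from a reference transcriptome
weights = [0.5, -0.2]
# Hypothetical allele dosages (0, 1 or 2) for four individuals
dosages = [[0, 2], [1, 1], [2, 0], [1, 2]]
# Hypothetical phenotype measurements for the same individuals
phenotype = [1.0, 2.0, 3.2, 1.9]

imputed = impute_expression(dosages, weights)
print(imputed, pearson(imputed, phenotype))
```

In practice the association step would use a regression with covariates and a significance test across many genes; a plain correlation is used here only to keep the sketch short.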


Optimization Of A Genomic Editing System Using CRISPR/Cas9-Induced Site-Specific Gene Integration, Jillian L. McCool Ms., Nick Hum, Gabriela G. Loots Aug 2016

STAR Program Research Presentations

The CRISPR-Cas system is an adaptive immune system found in bacteria which helps protect against the invasion of other microorganisms. This system induces double stranded breaks at precise genomic loci (1) in which repairs are initiated and insertions of a target are completed in the process. This mechanism can be used in eukaryotic cells in combination with sgRNAs (1) as a tool for genome editing. By using this CRISPR-Cas system, in addition to the “safe harbor locus,” ROSAβ26, the incorporation of a target gene into a site that is not susceptible to gene silencing effects can be achieved through few …


Data Development And Analysis Pathways For Marine Mammals And Turtles: Creating A User Interface, Sarina Fernandez, Warren Asfazadour, Eric Archer, Lisa Komoroske Aug 2016

STAR Program Research Presentations

A major obstacle in genetic research is developing streamlined methods for analyzing large amounts of data. The statistical computer programming language R provides users with the ability to develop packages containing specific functions in order to create more accessible data analysis pipelines. However, writing code in R can still be intimidating to those with little to no coding experience. Fortunately, the R package shiny provides a framework for developing web applications based on R functions. Using shiny, we developed a user-friendly web application containing functions of the R package strataG. The strataG package contains several functions for summarizing genetic data …


Incremental Phylogenetics By Repeated Insertions: An Evolutionary Tree Algorithm, Peter Revesz, Zhiqiang Li Aug 2016

School of Computing: Faculty Publications

We introduce the idea of constructing hypothetical evolutionary trees using an incremental algorithm that inserts species one-by-one into the current evolutionary tree. The method of incremental phylogenetics by repeated insertions leads to an algorithm that can be used on DNA, RNA and amino acid sequences. According to experimental results on both synthetic and biological data, the new algorithm generates more accurate evolutionary trees than the UPGMA and the Neighbor Joining algorithms.


Development Of An In Silico KIR Genotyping Algorithm And Its Application To Population And Cancer Immunogenetic Analyses, Howard Rosoff Aug 2016

Dissertations & Theses (Open Access)

Gene content determination and variant calling in the complex KIR genomic region are useful for immune system function analysis, pathogenesis and disease risk factor elucidation, immunotherapy development, evolutionary investigations, and human migration modeling. Sequence-specific oligonucleotide and sequence-specific primer PCR methods are the de facto standards for KIR presence/absence identification, but the current platforms are unsuitable for SNP calling, impractical for KIR typing large cohorts of DNA samples, and inapplicable for typing repositories in which sequence data, but not cells or cell analytes, are available. Alternative typing methods, such as in silico sequence-based typing, can address the problems associated with amplicon-based …


Ten Simple Rules For Taking Advantage Of Git And GitHub, Yasset Perez-Riverol, Laurent Gatto, Rui Wang, Timo Sachsenberg, Julian Uszkoreit, Felipe Da Veiga Leprevost, Christian Fufezan, Tobias Ternent, Stephen J. Eglen, Daniel S. Katz, Tom J. Pollard, Alexander Konovalov, Robert M. Flight, Kai Blin, Juan Antonio Vizcaíno Jul 2016

Molecular and Cellular Biochemistry Faculty Publications

No abstract provided.


Machine Learning Meta-Analysis Of Large Metagenomic Datasets: Tools And Biological Insight, Edoardo Pasolli, Duy Tin Truong, Faizan Malik, Levi Waldron, Nicola Segata Jul 2016

Publications and Research

Shotgun metagenomic analysis of the human associated microbiome provides a rich set of microbial features for prediction and biomarker discovery in the context of human diseases and health conditions. However, the use of such high-resolution microbial features presents new challenges, and validated computational tools for learning tasks are lacking. Moreover, classification rules have scarcely been validated in independent studies, posing questions about the generality and generalization of disease-predictive models across cohorts. In this paper, we comprehensively assess approaches to metagenomics-based prediction tasks and for quantitative assessment of the strength of potential microbiome-phenotype associations. We develop a computational framework for prediction …


Comparative Genomics, Transcriptomics, And Physiology Distinguish Symbiotic From Free-Living Chlorella Strains, Cristian F. Quispe, Olivia Sonderman, Maya Khasin, Wayne R. Riekhof, James L. Van Etten, Kenneth Nickerson Jul 2016

Kenneth Nickerson Papers

Most animal–microbe symbiotic interactions must be advantageous to the host and provide nutritional benefits to the endosymbiont. When the host provides nutrients, it can gain the capacity to control the interaction, promote self-growth, and increase its fitness. Chlorella-like green algae engage in symbiotic relationships with certain protozoans, a partnership that significantly impacts the physiology of both organisms. Consequently, it is often challenging to grow axenic Chlorella cultures after isolation from the host because they are nutrient fastidious and often susceptible to virus infection. We hypothesize that the establishment of a symbiotic relationship resulted in natural selection for nutritional and metabolic …


Identification Of Zika Virus And Dengue Virus Dependency Factors Using Functional Genomics, George Savidis, William M. McDougall, Paul Meraner, Jill Perreira, Jocelyn M. Portmann, Gaia Trincucci, Sinu P. John, Aaron M. Aker, Nicholas Renzette, Douglas R. Robbins, Zhiru Guo, Sharone Green, Timothy F. Kowalik, Abraham L. Brass Jun 2016

Sharone Green

The flaviviruses dengue virus (DENV) and Zika virus (ZIKV) are severe health threats with rapidly expanding ranges. To identify the host cell dependencies of DENV and ZIKV, we completed orthologous functional genomic screens using RNAi and CRISPR/Cas9 approaches. The screens recovered the ZIKV entry factor AXL as well as multiple host factors involved in endocytosis (RAB5C and RABGEF), heparin sulfation (NDST1 and EXT1), and transmembrane protein processing and maturation, including the endoplasmic reticulum membrane complex (EMC). We find that both flaviviruses require the EMC for their early stages of infection. Together, these studies generate a high-confidence, systems-wide view of human-flavivirus …


Identification, Characterization, And Life Cycle Of Intein-Associated Homing Endonucleases, Joshua J. Skydel Jun 2016

Honors Scholar Theses

Inteins are molecular parasites that have been identified in unicellular organisms from the three domains of life. The intein self-excises following translation of the host gene, and therefore incurs a fitness cost for its carrier. The symbiotic state of the intein to its host is dependent on the presence or absence of a homing endonuclease domain, which facilitates horizontal transfer of the molecule. Identification of this domain provides information on the evolutionary history of the intein, as well as patterns of horizontal gene transfer in microbial communities. I have therefore developed Hidden Markov Models (HMMs) to identify homing endonuclease domains …


Using Hadoop To Identify False Positives In Bacterial Strain Typing From DNA Fingerprints, Colin C. Adams Jun 2016

Computer Science and Software Engineering

Pyroprinting is a novel technique used by the Department of Biological Sciences to obtain “fingerprints” from the DNA of E. coli isolates in order to categorize them into strains. To determine the number of false positives that occur in the pyroprinting process, isolates with the same pyroprints needed to be sequenced to see if their underlying alleles match. If they do match, this shows they are indeed the same strain and are a true positive. If the alleles don’t match, they are different strains and are a false positive. To do this, 100 isolates with nucleotide identifiers were sequenced. Over …


Phylogenetic Analysis Of Human Cytomegalovirus pUS27 And pUS28: Ascertaining An Independent Or Linked Evolutionary History, Jessica A. Scarborough May 2016

Undergraduate Honors Theses

Human cytomegalovirus (HCMV) is a widespread pathogen that is particularly skilled at evading immune detection and defense mechanisms, largely due to extensive co-evolution with its host’s immune system. One aspect of this co-evolution involves the acquisition of four virally encoded GPCR chemokine receptor homologs, products of the US27, US28, UL33 and UL78 genes. G protein-coupled receptors (GPCR) are the largest family of cell surface proteins, found in organisms from yeast to humans. In this research, phylogenetic analysis was used to investigate the origins of the US27 and US28 genes, which are adjacent in the viral genome. The results indicate that …


In Silico Driven Metabolic Engineering Towards Enhancing Biofuel And Biochemical Production, Richard Adam Thompson May 2016

Doctoral Dissertations

The development of a secure and sustainable energy economy is likely to require the production of fuels and commodity chemicals in a renewable manner. There has been renewed interest in biological commodity chemical production recently, in particular focusing on non-edible feedstocks. The fields of metabolic engineering and synthetic biology have arisen in the past 20 years to address the challenge of chemical production from biological feedstocks. Metabolic modeling is a powerful tool for studying the metabolism of an organism and predicting the effects of metabolic engineering strategies. Various techniques have been developed for modeling cellular metabolism, with the underlying principle …


Computational Identification Of Terpene Synthase Genes And Their Evolutionary Analysis, Qidong Jia May 2016

Doctoral Dissertations

Terpenoids, the largest and most structurally and functionally diverse class of natural compounds on earth, are mostly synthesized by plants to be involved in various plant-environment interactions. Some terpenoids are classified as primary metabolites essential for plant growth and development. Terpene synthases (TPSs), the key enzymes for terpenoid biosynthesis, are the major determinant of the tremendous diversity of terpenoid carbon skeletons. The TPS genes represent a mid-size family of about 30-100 functional genes in almost all major sequenced plant genomes. TPSs are also found in fungi and bacteria, but microbial TPS genes share low levels of sequence similarity and …


Accurate Mutation Annotation And Functional Prediction Enhance The Applicability Of -Omics Data In Precision Medicine, Tenghui Chen May 2016

Dissertations & Theses (Open Access)

Clinical sequencing has been recognized as an effective approach for enhancing the accuracy and efficiency of cancer patient management and thereby achieving the goals of personalized therapy. However, the accuracy of large scale sequencing data in clinics has been constrained by many different aspects, such as clinical detection, annotation and interpretation of the variants that are observed in clinical sequencing data. In my Ph.D. thesis work, I mainly investigated how to comprehensively and efficiently apply high dimensional -omics data to enhance the capability of precision cancer medicine. Following this motivation, my dissertation has been focused on two important topics in …


Investigating Metastatic Lineage In Colorectal Cancer By Single Cell DNA Sequencing, Marco Leung May 2016

Dissertations & Theses (Open Access)

Metastasis is the primary cause of human cancer deaths. Patients with metastatic colorectal cancer (mCRC) show only an 11% 5-year survival rate, compared to those without local or distant metastases (92% 5-year survival rate). Understanding the CRC tumor evolution may provide valuable insights on how to improve treatment in patients with mCRC. However, the genomic basis of metastasis has been difficult to study, in part due to the extensive intratumor heterogeneity at both the primary and metastatic tumor sites, and the low frequency of subclones with metastatic potential. Previous studies have applied conventional bulk next-generation sequencing (NGS) methods, which have …


Detecting Gene-Gene Interactions Using A Permutation-Based Random Forest Method, Jing Li, James D. Malley, Angeline S. Andrew, Margaret R. Karagas, Jason H. Moore Apr 2016

Dartmouth Scholarship

Identifying gene-gene interactions is essential to understand disease susceptibility and to detect genetic architectures underlying complex diseases. Here, we aimed at developing a permutation-based methodology relying on a machine learning method, random forest (RF), to detect gene-gene interactions. Our approach, called permuted random forest (pRF), identified the top interacting single nucleotide polymorphism (SNP) pairs by estimating how much the power of a random forest classification model is influenced by removing pairwise interactions.


FastPop: A Rapid Principal Component Derived Method To Infer Intercontinental Ancestry Using Genetic Data, Yafang Li, Jinyoung Byun, Guoshuai Cai, Xiangjun Xiao, Younghun Han, Olivier Cornelis, James E. Dinulos, Joe Dennis, Douglas Easton, Ivan Gorlov, Michael F. Seldin, Christopher I. Amos Mar 2016

Dartmouth Scholarship

Identifying subpopulations within a study and inferring intercontinental ancestry of the samples are important steps in genome wide association studies. Two software packages are widely used in analysis of substructure: Structure and Eigenstrat. Structure assigns each individual to a population by using a Bayesian method with multiple tuning parameters. It requires considerable computational time when dealing with thousands of samples and lacks the ability to create scores that could be used as covariates. Eigenstrat uses a principal component analysis method to model all sources of sampling variation. However, it does not readily provide information directly relevant to ancestral origin; the …


PhagePhisher: A Pipeline For The Discovery Of Covert Viral Sequences In Complex Genomic Datasets, Thomas Hatzopoulos, Siobhan C. Watkins, Catherine Putonti Mar 2016

Bioinformatics Faculty Publications

Obtaining meaningful viral information from large sequencing datasets presents unique challenges distinct from prokaryotic and eukaryotic sequencing efforts. The difficulties surrounding this issue can be ascribed in part to the genomic plasticity of viruses themselves as well as the scarcity of existing information in genomic databases. The open-source software PhagePhisher (http://www.putonti-lab.com/phagephisher) has been designed as a simple pipeline to extract relevant information from complex and mixed datasets, and will improve the examination of bacteriophages, viruses, and virally related sequences, in a range of environments. Key aspects of the software include speed and ease of use; PhagePhisher can be used with …


Nbs1 ChIP-Seq Identifies Off-Target DNA Double-Strand Breaks Induced By AID In Activated Splenic B Cells, Lyne Khair, Richard E. Baker, Erin K. Linehan, Carol E. Schrader, Janet Stavnezer Feb 2016

Janet M. Stavnezer

Activation-induced cytidine deaminase (AID) is required for initiation of Ig class switch recombination (CSR) and somatic hypermutation (SHM) of antibody genes during immune responses. AID has also been shown to induce chromosomal translocations, mutations, and DNA double-strand breaks (DSBs) involving non-Ig genes in activated B cells. To determine what makes a DNA site a target for AID-induced DSBs, we identify off-target DSBs induced by AID by performing chromatin immunoprecipitation (ChIP) for Nbs1, a protein that binds DSBs, followed by deep sequencing (ChIP-Seq). We detect and characterize hundreds of off-target AID-dependent DSBs. Two types of tandem repeats are highly enriched within …