Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Articles 1 - 30 of 35

Full-Text Articles in Entire DC Network

DFHiC: A Dilated Full Convolution Model To Enhance The Resolution Of Hi-C Data, Bin Wang, Kun Liu, Yaohang Li, Jianxin Wang Jan 2023

Computer Science Faculty Publications

Motivation: Hi-C technology has been the most widely used chromosome conformation capture (3C) experiment that measures the frequency of all paired interactions in the entire genome, which makes it a powerful tool for studying the 3D structure of the genome. The fineness of the constructed genome structure depends on the resolution of Hi-C data. However, because high-resolution Hi-C data require deep sequencing and thus high experimental cost, most available Hi-C data are low-resolution. Hence, it is essential to enhance the quality of Hi-C data by developing effective computational methods.

Results: In this work, we propose …


The Low Abundance Of CpG In The SARS-CoV-2 Genome Is Not An Evolutionarily Signature Of ZAP, Ali Afrasiabi, Hamid Alinejad-Rokny, Azad Khosh, Mostafa Rahnama, Nigel Lovell, Zhenming Xu, Diako Ebrahimi Feb 2022

Plant Pathology Faculty Publications

The zinc finger antiviral protein (ZAP) is known to restrict viral replication by binding to the CpG rich regions of viral RNA, and subsequently inducing viral RNA degradation. This enzyme has recently been shown to be capable of restricting SARS-CoV-2. These data have led to the hypothesis that the low abundance of CpG in the SARS-CoV-2 genome is due to an evolutionary pressure exerted by the host ZAP. To investigate this hypothesis, we performed a detailed analysis of many coronavirus sequences and ZAP RNA binding preference data. Our analyses showed neither evidence for an evolutionary pressure acting specifically on CpG …


Identifying The Cell Composition And Clonal Diversity Of Supratentorial Ependymoma Using Single Cell RNA-Sequencing, James He May 2021

University Scholar Projects

Ependymoma is a primary solid tumor of the central nervous system. Supratentorial ependymoma (ST-EPN), a subtype of ependymoma, is driven by an oncogenic fusion between the ZFTA and RELA genes in 70% of cases. We introduced this fusion into neural progenitor cells of mouse embryos via in utero electroporation of a non-viral binary piggyBac transposon system containing ZFTA-RELA. From preliminary data in the LoTurco lab, inducing the expression of ZFTA-RELA in different neural progenitor cells produces tumors of varying lethality and cellular composition. To define the cellular composition and subclonal diversity of ST-EPN tumors, we used single cell RNA-sequencing to …


Analysis Of Subtelomeric REXTAL Assemblies Using QUAST, Tunazzina Islam, Desh Ranjan, Mohammad Zubair, Eleanor Young, Ming Xiao, Harold Riethman Jan 2021

Computer Science Faculty Publications

Genomic regions of high segmental duplication content and/or structural variation have led to gaps and misassemblies in the human reference sequence, and are refractory to assembly from whole-genome short-read datasets. Human subtelomere regions are highly enriched in both segmental duplication content and structural variations, and as a consequence are both impossible to assemble accurately and highly variable from individual to individual. Recently, we developed a pipeline for improved region-specific assembly called Regional Extension of Assemblies Using Linked-Reads (REXTAL). In this study, we evaluate REXTAL and genome-wide assembly (Supernova) approaches on 10X Genomics linked-reads data sets partitioned and barcoded using the …


Evolutionary And Ecological Determinism Of Host Specificity In Arthropod Parasites, Joseph Levey Apr 2020

UCARE Research Products

Understanding why some diseases infect more species than others is crucial for predicting where and when disease will spread, which can inform the management of wildlife, agriculture, and human health. Currently, large scale patterns of host-parasite dynamics are being studied to understand where to look for and how to manage emerging human diseases (Leroy 2005; Benelli 2018). Previous research has used the Global Mammal Parasite Database (GMPD) to look at host breadth—the number and diversity of species a pathogen can infect—for various groups of parasites, e.g. helminths, arthropods, fungi, etc., from a host-centric perspective (Stephens et al. 2017; Park et …


Gene-Based Association Study For Lipid Traits In Diverse Cohorts Implicates BACE1 And SIDT2 Regulation In Triglyceride Levels, Angela Andaleon, Lauren S. Mogil, Heather Wheeler Jan 2018

Bioinformatics Faculty Publications

Plasma lipid levels are risk factors for cardiovascular disease, a leading cause of death worldwide. While many studies have been conducted on lipid genetics, they mainly focus on Europeans and thus their transferability to diverse populations is unclear. We performed SNP- and gene-level genome-wide association studies (GWAS) of four lipid traits in cohorts from Nigeria and the Philippines and compared them to the results of larger, predominantly European meta-analyses. Two previously implicated loci met genome-wide significance in our SNP-level GWAS in the Nigerian cohort, rs34065661 in CETP associated with HDL cholesterol (P = 9.0 × 10−10) and …


A Longitudinal Cline Characterizes The Genetic Structure Of Human Populations In The Tibetan Plateau, Choongwon Jeong, Benjamin M. Peter, Buddha Basnyat, Maniraj Neupane, Geoff Childs, Sienna Craig, John Novembre, Anna Di Rienzo Apr 2017

Dartmouth Scholarship

Indigenous populations of the Tibetan plateau have attracted much attention for their good performance at extreme high altitude. Most genetic studies of Tibetan adaptations have used genetic variation data at the genome scale, while genetic inferences about their demography and population structure are largely based on uniparental markers. To provide genome-wide information on population structure, we analyzed new and published data of 338 individuals from indigenous populations across the plateau in conjunction with worldwide genetic variation data. We found a clear signal of genetic stratification across the east-west axis within Tibetan samples. Samples from more eastern locations …


Systems Level Analysis Of Systemic Sclerosis Shows A Network Of Immune And Profibrotic Pathways Connected With Genetic Polymorphisms, J. Matthew Mahoney, Jaclyn Taroni, Viktor Martyanov, Tammara A. A. Wood, Casey S. Greene, Patricia A. Pioli, Monique E. Hinchcliff, Michael L. Whitfield Jan 2015

Dartmouth Scholarship

Systemic sclerosis (SSc) is a rare systemic autoimmune disease characterized by skin and organ fibrosis. The pathogenesis of SSc and its progression are poorly understood. The SSc intrinsic gene expression subsets (inflammatory, fibroproliferative, normal-like, and limited) are observed in multiple clinical cohorts of patients with SSc. Analysis of longitudinal skin biopsies suggests that a patient's subset assignment is stable over 6-12 months. Genetically, SSc is multi-factorial with many genetic risk loci for SSc generally and for specific clinical manifestations. Here we identify the genes consistently associated with the intrinsic subsets across three independent cohorts, show the relationship between these genes …


Phenotypic Robustness And The Assortativity Signature Of Human Transcription Factor Networks, Dov A. Pechenick, Joshua L. Payne, Jason H. Moore Aug 2014

Dartmouth Scholarship

Many developmental, physiological, and behavioral processes depend on the precise expression of genes in space and time. Such spatiotemporal gene expression phenotypes arise from the binding of sequence-specific transcription factors (TFs) to DNA, and from the regulation of nearby genes that such binding causes. These nearby genes may themselves encode TFs, giving rise to a transcription factor network (TFN), wherein nodes represent TFs and directed edges denote regulatory interactions between TFs. Computational studies have linked several topological properties of TFNs - such as their degree distribution - with the robustness of a TFN's gene expression phenotype to genetic and environmental …


Structural Features Of The Pseudomonas Fluorescens Biofilm Adhesin LapA Required For LapG-Dependent Cleavage, Biofilm Formation, And Cell Surface Localization, Chelsea D. Boyd, T. Jarrod Smith, Sofiane El-Kirat-Chatel, Peter D. Newell, Yves F. Dufrêne, George A. O'Toole May 2014

Dartmouth Scholarship

The localization of the LapA protein to the cell surface is a key step required by Pseudomonas fluorescens Pf0-1 to irreversibly attach to a surface and form a biofilm. LapA is a member of a diverse family of predicted bacterial adhesins, and although lacking a high degree of sequence similarity, family members do share common predicted domains. Here, using mutational analysis, we determine the significance of each domain feature of LapA in relation to its export and localization to the cell surface and function in biofilm formation. Our previous work showed that the N terminus of LapA is required for …


Creating A Package In R, Brit Schneiders, Eric Archer Aug 2013

STAR Program Research Presentations

In a time of increasingly efficient technology and data production, scientists are producing data faster than it can be analyzed. Therefore, user accessibility to data analysis is becoming more and more critical. In general, researchers have a set of raw data and want an efficient means to their final analysis. A package serves as that means by creating a set of functions and making them accessible to the user. Often, a user has a small piece of code to run (a single R script, for example), and that script requires the use of certain functions, which are contained in a …


Key Genes For Modulating Information Flow Play A Temporal Role As Breast Tumor Coexpression Networks Are Dynamically Rewired By Letrozole, Nadia M. Penrod, Jason H. Moore May 2013

Dartmouth Scholarship

Genes do not act in isolation but instead as part of complex regulatory networks. To understand how breast tumors adapt to the presence of the drug letrozole at the molecular level, it is necessary to consider how the expression levels of genes in these networks change relative to one another. Using transcriptomic data generated from sequential tumor biopsy samples, taken at diagnosis, following 10-14 days and following 90 days of letrozole treatment, and a pairwise partial correlation statistic, we build temporal gene coexpression networks. We characterize the structure of each network and identify genes that hold prominent positions for maintaining …


Transcriptomic Characterization Of A Synergistic Genetic Interaction During Carpel Margin Meristem Development In Arabidopsis Thaliana, April N. Wynn, Elizabeth E. Rueschhoff, Robert G. Franks Oct 2011

Biological Sciences Research

In flowering plants the gynoecium is the female reproductive structure. In Arabidopsis thaliana, ovules initiate within the developing gynoecium from meristematic tissue located along the margins of the floral carpels. When fertilized, the ovules will develop into seeds. SEUSS (SEU) and AINTEGUMENTA (ANT) encode transcriptional regulators that are critical for the proper formation of ovules from the carpel margin meristem (CMM). The synergistic loss of ovule initiation observed in the seu ant double mutant suggests that SEU and ANT share overlapping functions during CMM development. However, the molecular mechanism underlying this synergistic interaction is unknown. Using …


Minimum Description Length Measures Of Evidence For Enrichment, Zhenyu Yang, David R. Bickel Dec 2010

COBRA Preprint Series

In order to functionally interpret differentially expressed genes or other discovered features, researchers seek to detect enrichment in the form of overrepresentation of discovered features associated with a biological process. Most enrichment methods treat the p-value as the measure of evidence using a statistical test such as the binomial test, Fisher's exact test or the hypergeometric test. However, the p-value is not interpretable as a measure of evidence apart from adjustments in light of the sample size. As a measure of evidence supporting one hypothesis over the other, the Bayes factor (BF) overcomes this drawback of the p-value but lacks …


Powerful SNP Set Analysis For Case-Control Genome Wide Association Studies, Michael C. Wu, Peter Kraft, Michael P. Epstein, Deanne M. Taylor, Stephen J. Chanock, David J. Hunter, Xihong Lin May 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


A Decomposition Of The Pure Parsimony Problem, Allen Holder, Thomas M. Langley Aug 2009

Mathematical Sciences Technical Reports (MSTR)

We partially order a collection of genotypes so that we can represent the problem of inferring the least number of haplotypes in terms of substructures we call g-lattices. This representation allows us to prove that if the genotypes partition into chains with certain structure, then the NP-Hard problem can be solved efficiently. Even without the specified structure, the decomposition shows how to separate the underlying integer programming model into smaller models.


Ab Initio Exon Definition Using An Information Theory-Based Approach, Peter K. Rogan Mar 2009

Biochemistry Publications

Transcribed exons in genes are joined together at donor and acceptor splice sites precisely and efficiently to generate mRNAs capable of being translated into proteins. The sequence variability in individual splice sites can be modeled using Shannon information theory. In the laboratory, the degree of individual splice site use is inferred from the structures of mRNAs and their relative abundance. These structures can be predicted using a bipartite information theory framework that is guided by current knowledge of biological mechanisms for exon recognition. We present the results of this analysis for the complete dataset of all expressed human exons.
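
As a rough illustration of how Shannon information can describe sequence variability at aligned sites, the sketch below computes the per-position information content (2 bits minus the column entropy) for a handful of DNA strings. The function name `position_information` and the four toy sequences are hypothetical and chosen only for illustration; the calculation omits any small-sample correction and is not the bipartite framework described in the abstract.

```python
import math
from collections import Counter


def position_information(sites):
    """Return the information content in bits for each column of aligned DNA sites.

    Assumes a uniform four-letter background, so a fully conserved
    column scores 2 bits and a fully random column scores 0 bits.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")

    info = []
    for i in range(length):
        counts = Counter(s[i] for s in sites)  # base counts in column i
        total = sum(counts.values())
        # Shannon entropy of the observed base frequencies in this column
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        info.append(2.0 - entropy)
    return info


# Hypothetical toy alignment, not real splice-site data
toy_sites = ["GTAA", "GTGA", "GTAC", "GTGT"]
bits = position_information(toy_sites)
print(bits)
print(sum(bits))  # total information across the site
```

Summing the per-position values gives a single score for the aligned set, under the simplifying assumption that positions contribute independently.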


Sparse Linear Discriminant Analysis For Simultaneous Testing For The Significance Of A Gene Set/Pathway And Gene Selection, Michael C. Wu, Lingson Zhang, Zhaoxi Wang, David C. Christiani, Xihong Lin Jan 2009

Harvard University Biostatistics Working Paper Series

No abstract provided.


Estimation And Testing For The Effect Of A Genetic Pathway On A Disease Outcome Using Logistic Kernel Machine Regression Via Logistic Mixed Models, Dawei Liu, Debashis Ghosh, Xihong Lin Jun 2008

Harvard University Biostatistics Working Paper Series

No abstract provided.


A Powerful And Flexible Multilocus Association Test For Quantitative Traits, Lydia Coulter Kwee, Dawei Liu, Xihong Lin, Debashis Ghosh, Michael P. Epstein Jun 2008

Harvard University Biostatistics Working Paper Series

No abstract provided.


MicroRNAs And The Advent Of Vertebrate Morphological Complexity, Alysha M. Heimberg, Lorenzo F. Sempere, Vanessa N. Moy, Phillip C. J. Donoghue, Kevin J. Peterson Feb 2008

Dartmouth Scholarship

The causal basis of vertebrate complexity has been sought in genome duplication events (GDEs) that occurred during the emergence of vertebrates, but evidence beyond coincidence is wanting. MicroRNAs (miRNAs) have recently been identified as a viable causal factor in increasing organismal complexity through the action of these ≈22-nt noncoding RNAs in regulating gene expression. Because miRNAs are continuously being added to animalian genomes, and, once integrated into a gene regulatory network, are strongly conserved in primary sequence and rarely secondarily lost, their evolutionary history can be accurately reconstructed. Here, using a combination of Northern analyses and genomic searches, we show …


Assessing Population Level Genetic Instability Via Moving Average, Samuel Mcdaniel, Rebecca Betensky, Tianxi Cai Nov 2007

Harvard University Biostatistics Working Paper Series

No abstract provided.


A Novel Ensemble Learning Method For De Novo Computational Identification Of DNA Binding Sites, Arijit Chakravarty, Jonathan M. Carlson, Radhika S. Khetani, Robert H H. Gross Jul 2007

Dartmouth Scholarship

Despite the diversity of motif representations and search algorithms, the de novo computational identification of transcription factor binding sites remains constrained by the limited accuracy of existing algorithms and the need for user-specified input parameters that describe the motif being sought. Results: We present a novel ensemble learning method, SCOPE, that is based on the assumption that transcription factor binding sites belong to one of three broad classes of motifs: non-degenerate, degenerate and gapped motifs. SCOPE employs a unified scoring metric to combine the results from three motif finding algorithms each aimed at the discovery of one of these classes of motifs. …


Assessment Of A CGH-Based Genetic Instability, David A. Engler, Yiping Shen, J F. Gusella, Rebecca A. Betensky Jul 2007

Harvard University Biostatistics Working Paper Series

No abstract provided.


Survival Analysis With Large Dimensional Covariates: An Application In Microarray Studies, David A. Engler, Yi Li Jul 2007

Harvard University Biostatistics Working Paper Series

Use of microarray technology often leads to high-dimensional and low-sample size data settings. Over the past several years, a variety of novel approaches have been proposed for variable selection in this context. However, only a small number of these have been adapted for time-to-event data where censoring is present. Among standard variable selection methods shown both to have good predictive accuracy and to be computationally efficient is the elastic net penalization approach. In this paper, adaptation of the elastic net approach is presented for variable selection both under the Cox proportional hazards model and under an accelerated failure time …


Power Boosting In Genome-Wide Studies Via Methods For Multivariate Outcomes, Mary J. Emond Feb 2007

UW Biostatistics Working Paper Series

Whole-genome studies are becoming a mainstay of biomedical research. Examples include expression array experiments, comparative genomic hybridization analyses and large case-control studies for detecting polymorphism/disease associations. The tactic of applying a regression model to every locus to obtain test statistics is useful in such studies. However, this approach ignores potential correlation structure in the data that could be used to gain power, particularly when a Bonferroni correction is applied to adjust for multiple testing. In this article, we propose using regression techniques for misspecified multivariate outcomes to increase statistical power over independence-based modeling at each locus. Even when the outcome …


Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh Nov 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Structural Inference In Transition Measurement Error Models For Longitudinal Data, Wenqin Pan, Xihong Lin, Donglin Zeng Aug 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Nonparametric Regression Using Local Kernel Estimating Equations For Correlated Failure Time Data, Zhangsheng Yu, Xihong Lin Aug 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Causal Inference In Hybrid Intervention Trials Involving Treatment Choice, Qi Long, Rod Little, Xihong Lin Aug 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.