Open Access. Powered by Scholars. Published by Universities.®

Computational Biology Commons

Articles 1 - 17 of 17

Full-Text Articles in Computational Biology

DNA Methylation-Based Epigenetic Biomarkers In Cell-Type Deconvolution And Tumor Tissue Of Origin Identification, Ze Zhang Dec 2023

Dartmouth College Ph.D Dissertations

DNA methylation is an epigenetic modification that regulates gene expression and is essential to establishing and preserving cellular identity. Genome-wide DNA methylation arrays provide a standardized and cost-effective approach to measuring DNA methylation. When combined with a cell-type reference library, DNA methylation measures allow the assessment of underlying cell-type proportions in heterogeneous mixtures. This approach, known as DNA methylation deconvolution or methylation cytometry, offers a standardized and cost-effective method for evaluating cell-type proportions. While this approach has succeeded in discerning cell types in various human tissues like blood, brain, tumors, skin, breast, and buccal swabs, the existing methods have major …
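As a rough illustration of how a cell-type reference library can turn methylation measures into cell-type proportions, the sketch below fits a mixture as a non-negative combination of reference profiles. It is a generic constrained least-squares example, not the method developed in the dissertation; the function name `estimate_proportions`, the toy reference values, and the projected-gradient solver are all assumptions made for illustration.

```python
# Minimal sketch of reference-based deconvolution, assuming the mixture's
# methylation value at each CpG is a proportion-weighted sum of the
# cell-type reference values. All numbers below are made up.

def estimate_proportions(reference, mixture, steps=2000):
    """Estimate cell-type proportions by projected gradient descent.

    reference[i][j]: methylation (beta value) of CpG i in cell type j.
    mixture[i]:      methylation (beta value) of CpG i in the mixed sample.
    """
    n_cpgs = len(reference)
    n_types = len(reference[0])
    # Step size from the squared Frobenius norm, an upper bound on the
    # largest eigenvalue of R^T R, so the gradient steps cannot diverge.
    step = 1.0 / sum(v * v for row in reference for v in row)
    p = [1.0 / n_types] * n_types
    for _ in range(steps):
        # residual = R p - b
        residual = [
            sum(r * q for r, q in zip(row, p)) - b
            for row, b in zip(reference, mixture)
        ]
        # Gradient of 0.5 * ||R p - b||^2 is R^T residual.
        grad = [
            sum(reference[i][j] * residual[i] for i in range(n_cpgs))
            for j in range(n_types)
        ]
        # Gradient step, then clip to keep every proportion non-negative.
        p = [max(0.0, q - step * g) for q, g in zip(p, grad)]
    # Rescale so the estimates sum to one (a simplification, not an exact
    # projection onto the simplex).
    total = sum(p)
    return [q / total for q in p] if total > 0 else p


# Hypothetical library: 4 CpGs (rows) x 2 cell types (columns).
REFERENCE = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.7],
]
# Mixture built from 30% of the first cell type and 70% of the second.
MIXTURE = [0.34, 0.38, 0.66, 0.55]

proportions = estimate_proportions(REFERENCE, MIXTURE)
print([round(q, 3) for q in proportions])
```

In practice the CpGs in the library are chosen because they differ strongly between cell types, which is what makes the fit well conditioned.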


Tracing Evolution Of Gene Transfer Agents Using Comparative Genomics, Roman Kogay Nov 2023

Dartmouth College Ph.D Dissertations

Accumulating evidence suggests that viruses and their components can be domesticated by their hosts, equipping the hosts with convenient molecular toolkits for various functions. One such domesticated system is the Gene Transfer Agents (GTAs), which are produced by some bacteria and archaea. GTAs morphologically resemble small phage-like particles and contain random fragments of their host genome. They are produced by only a small fraction of the microbial population and are released through lysis of the host cell. Bioinformatic analyses suggest that GTAs are especially abundant in the taxonomic class Alphaproteobacteria, where they are vertically inherited and evolve …


Genome-Scale Methylation Analysis In Blood And Tumor Identifies Immune Profile, Age Acceleration, And DNA Methylation Alterations Associated With Bladder Cancer Outcomes, Ji-Qing Chen Aug 2023

Dartmouth College Ph.D Dissertations

Bladder cancer patients receive frequent screening due to the high tumor recurrence rate (more than 60%). Conventional monitoring relies on cystoscopy, which is highly invasive and, with frequent follow-up, increases patient morbidity and the burden on the health care system. As a result, there is an urgent need to explore novel markers related to the outcomes of bladder cancer. Immune profiles have been associated with cancer outcomes and may have the potential to be biomarkers for outcomes management. However, little work has been conducted to investigate the associations of immune cell profiles with bladder cancer outcomes. Here, I utilized the …


Cell-Typing And Interaction Analysis Of The Immune Compartment Of The Tumor Microenvironment Using High-Resolution Omics Modalities, Courtney Taylor Schiebout Apr 2023

Dartmouth College Ph.D Dissertations

Single-cell RNA-sequencing (scRNA-seq) has provided a new frontier for the investigation of complex tissues. One ideal candidate for the utilization of this method is the tumor microenvironment (TME). The TME is often host to a complex set of cell populations and behaviors that can be highly influential for cancer inhibition or progression. This is especially true of the immune compartment of the TME: the presence of certain types of immune cells in the TME and their expression profiles can significantly affect cancer prognosis in some cases. By providing individual cell-level gene expression data, scRNA-seq can be highly informative for characterizing …


Characterization Of Cell Type-Specific Molecular Heterogeneity In Cancer Using Multi-Omic Approaches, Min Kyung Lee Jan 2023

Dartmouth College Ph.D Dissertations

Tumors are composed of heterogeneous cell types, each with its own molecular profile. Recent advances in single cell genomics technologies have begun to increase our understanding of the molecular heterogeneity that exists in tumors, with particular focus on gene expression and chromatin accessibility profiles. However, due to limitations in methods for certain sample types and the high cost of single cell genomics, bulk tumor molecular profiling has been and remains widely used. In addition, other facets of single cell epigenomic profiling, particularly methylation and hydroxymethylation, remain underexplored. Thus, investigations to understand the cell type-specific epigenetic heterogeneity and the cooperation …


Detecting Gene-Gene Interactions Using A Permutation-Based Random Forest Method, Jing Li, James D. Malley, Angeline S. Andrew, Margaret R. Karagas, Jason H. Moore Apr 2016

Dartmouth Scholarship

Identifying gene-gene interactions is essential to understand disease susceptibility and to detect genetic architectures underlying complex diseases. Here, we aimed to develop a permutation-based methodology relying on a machine learning method, random forest (RF), to detect gene-gene interactions. Our approach, called permuted random forest (pRF), identified the top interacting single nucleotide polymorphism (SNP) pairs by estimating how much the power of a random forest classification model is influenced by removing pairwise interactions.


FastPop: A Rapid Principal Component Derived Method To Infer Intercontinental Ancestry Using Genetic Data, Yafang Li, Jinyoung Byun, Guoshuai Cai, Xiangjun Xiao, Younghun Han, Olivier Cornelis, James E. Dinulos, Joe Dennis, Douglas Easton, Ivan Gorlov, Michael F. Seldin, Christopher I. Amos Mar 2016

Dartmouth Scholarship

Identifying subpopulations within a study and inferring intercontinental ancestry of the samples are important steps in genome-wide association studies. Two software packages are widely used in analysis of substructure: Structure and Eigenstrat. Structure assigns each individual to a population by using a Bayesian method with multiple tuning parameters. It requires considerable computational time when dealing with thousands of samples and lacks the ability to create scores that could be used as covariates. Eigenstrat uses a principal component analysis method to model all sources of sampling variation. However, it does not readily provide information directly relevant to ancestral origin; the …


Identifying Gene-Gene Interactions That Are Highly Associated With Body Mass Index Using Quantitative Multifactor Dimensionality Reduction (QMDR), Rishika De, Shefali S. Verma, Fotios Drenos, Emily R. Holzinger Dec 2015

Dartmouth Scholarship

Despite heritability estimates of 40–70% for obesity, less than 2% of its variation is explained by the loci associated with Body Mass Index (BMI) that have been identified so far. Epistasis, or gene-gene interaction, is a plausible source of part of the missing heritability of BMI. Using genotypic data from 18,686 individuals across five study cohorts (ARIC, CARDIA, FHS, CHS, MESA), we filtered single nucleotide polymorphisms (SNPs) using two parallel approaches. SNPs were filtered either on the strength of their main-effect association with BMI, or on the number of knowledge sources supporting a specific SNP-SNP interaction in …


Integrated Assessment Of Predicted MHC Binding And Cross-Conservation With Self Reveals Patterns Of Viral Camouflage, Lu He, Anne S. De Groot, Andres H. Gutierrez, William D. Martin, Lenny Moise, Chris Bailey-Kellogg Mar 2014

Dartmouth Scholarship

Immune recognition of foreign proteins by T cells hinges on the formation of a ternary complex sandwiching a constituent peptide of the protein between a major histocompatibility complex (MHC) molecule and a T cell receptor (TCR). Viruses have evolved means of "camouflaging" themselves, avoiding immune recognition by reducing the MHC and/or TCR binding of their constituent peptides. Computer-driven T cell epitope mapping tools have been used to evaluate the degree to which particular viruses have used this means of avoiding immune response, but most such analyses focus on MHC-facing "agretopes". Here we set out a new means of evaluating the …


How Long Is A Piece Of Loop?, Yoonjoo Choi, Sumeet Agarwal, Charlotte M. Deane Feb 2013

Dartmouth Scholarship

Loops are irregular structures which connect two secondary structure elements in proteins. They often play important roles in function, including enzyme reactions and ligand binding. Despite their importance, their structure remains difficult to predict. Most protein loop structure prediction methods sample local loop segments and score them. In particular, protein loop classifications and database search methods depend heavily on local properties of loops. Here we examine the distance between a loop's end points (span). We find that the distribution of loop span appears to be independent of the number of residues in the loop; in other words, the separation between …
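The span defined above is a single distance, so it is simple to compute once coordinates are available. The sketch below is a minimal illustration, assuming each residue is represented by one 3D point (for example its C-alpha atom) and that the end points are the first and last residues of the loop; the abstract does not state which atoms the authors used, and the function name `loop_span` and the coordinates are hypothetical.

```python
import math


def loop_span(residue_coords):
    """Return the straight-line distance between a loop's two end points.

    residue_coords: sequence of (x, y, z) tuples, one per residue, in
    chain order. Units follow the input (angstroms for PDB-style data).
    """
    if len(residue_coords) < 2:
        raise ValueError("a loop needs at least two residues to have a span")
    return math.dist(residue_coords[0], residue_coords[-1])


# Hypothetical four-residue loop; only the first and last points matter.
EXAMPLE_LOOP = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (5.0, 3.0, 1.0),
    (3.0, 4.0, 0.0),
]

print(loop_span(EXAMPLE_LOOP))
```

Because only the two end points enter the calculation, loops with very different residue counts can share the same span, which is the quantity whose distribution the abstract describes.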


Gene Ontology Analysis Of Pairwise Genetic Associations In Two Genome-Wide Studies Of Sporadic ALS, Nora Chung Kim, Peter C. Andrews, Folkert W. Asselbergs, H Robert Frost, Scott M. Williams, Brent T. Harris, Cynthia Read, Kathleen D. Askland, Jason H. Moore Jul 2012

Dartmouth Scholarship

It is increasingly clear that common human diseases have a complex genetic architecture characterized by both additive and nonadditive genetic effects. The goal of the present study was to determine whether patterns of both additive and nonadditive genetic associations aggregate in specific functional groups as defined by the Gene Ontology (GO).


Planning Combinatorial Disulfide Cross-Links For Protein Fold Determination, Fei Xiong, Alan M Friedman, Chris Bailey-Kellogg Nov 2011

Dartmouth Scholarship

Fold recognition techniques take advantage of the limited number of overall structural organizations, and have become increasingly effective at identifying the fold of a given target sequence. However, in the absence of sufficient sequence identity, it remains difficult for fold recognition methods to always select the correct model. While a native-like model is often among a pool of highly ranked models, it is not necessarily the highest-ranked one, and the model rankings depend sensitively on the scoring function used. Structure elucidation methods can then be employed to decide among the models based on relatively rapid biochemical/biophysical experiments.


Evolving Hard Problems: Generating Human Genetics Datasets With A Complex Etiology, Daniel S Himmelstein, Casey S Greene, Jason H Moore Jul 2011

Dartmouth Scholarship

Background: A goal of human genetics is to discover genetic factors that influence individuals' susceptibility to common diseases. Most common diseases are thought to result from the joint failure of two or more interacting components instead of single component failures. This greatly complicates both the task of selecting informative genetic variants and the task of modeling interactions between them. We and others have previously developed algorithms to detect and model the relationships between these genetic factors and disease. Previously these methods have been evaluated with datasets simulated according to pre-defined genetic models.


Optimization Algorithms For Functional Deimmunization Of Therapeutic Proteins, Andrew S. Parker, Wei Zheng, Karl E. Griswold, Chris Bailey-Kellogg Apr 2010

Dartmouth Scholarship

To develop protein therapeutics from exogenous sources, it is necessary to mitigate the risks of eliciting an anti-biotherapeutic immune response. A key aspect of the response is the recognition and surface display by antigen-presenting cells of epitopes, short peptide fragments derived from the foreign protein. Thus, developing minimal-epitope variants represents a powerful approach to deimmunizing protein therapeutics. Critically, mutations selected to reduce immunogenicity must not interfere with the protein's therapeutic activity.


Multifactor Dimensionality Reduction Analysis Identifies Specific Nucleotide Patterns Promoting Genetic Polymorphisms, Eric Arehart, Scott Gleim, Bill White, John Hwa, Jason H. Moore Mar 2009

Dartmouth Scholarship

The fidelity of DNA replication serves as the nidus for both genetic evolution and genomic instability fostering disease. Single nucleotide polymorphisms (SNPs) constitute greater than 80% of the genetic variation between individuals. A new theory regarding DNA replication fidelity has emerged in which selectivity is governed by base-pair geometry through interactions between the selected nucleotide, the complementary strand, and the polymerase active site. We hypothesize that specific nucleotide combinations in the flanking regions of SNP fragments are associated with mutation.


A Novel Ensemble Learning Method For De Novo Computational Identification Of DNA Binding Sites, Arijit Chakravarty, Jonathan M. Carlson, Radhika S. Khetani, Robert H. Gross Jul 2007

Dartmouth Scholarship

Despite the diversity of motif representations and search algorithms, the de novo computational identification of transcription factor binding sites remains constrained by the limited accuracy of existing algorithms and the need for user-specified input parameters that describe the motif being sought. Results: We present a novel ensemble learning method, SCOPE, that is based on the assumption that transcription factor binding sites belong to one of three broad classes of motifs: non-degenerate, degenerate and gapped motifs. SCOPE employs a unified scoring metric to combine the results from three motif finding algorithms each aimed at the discovery of one of these classes of motifs. …


Bounded Search For De Novo Identification Of Degenerate Cis-Regulatory Elements, Jonathan M. Carlson, Arijit Chakravarty, Radhika S. Khetani, Robert H. Gross May 2006

Dartmouth Scholarship

The identification of statistically overrepresented sequences in the upstream regions of coregulated genes should theoretically permit the identification of potential cis-regulatory elements. However, in practice many cis-regulatory elements are highly degenerate, precluding the use of an exhaustive word-counting strategy for their identification. While numerous methods exist for inferring base distributions using a position weight matrix, recent studies suggest that the independence assumptions inherent in the model, as well as the inability to reach a global optimum, limit this approach.
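To make the "exhaustive word-counting strategy" concrete, the sketch below counts every exact word of length k across a set of upstream sequences. It is only an illustration of plain word counting, not the bounded search the paper proposes; the function name `count_words` and the toy sequences are assumptions, and judging statistical overrepresentation would additionally need a background model that is not shown here.

```python
from collections import Counter


def count_words(sequences, k):
    """Count every exact length-k word across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Slide a window of width k along the sequence, one base at a time.
        for start in range(len(seq) - k + 1):
            counts[seq[start:start + k]] += 1
    return counts


# Hypothetical upstream regions of three coregulated genes.
UPSTREAM = ["ACGTGACG", "TTACGTGA", "ACGTGTTT"]

word_counts = count_words(UPSTREAM, 5)
print(word_counts.most_common(2))
```

Exact counting treats `ACGTG` and a one-base variant such as `ACATG` as unrelated words, so the occurrences of a degenerate element are split across many low counts. That is the limitation the abstract points to.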