Open Access. Powered by Scholars. Published by Universities.®

Genetics and Genomics Commons

Series

Algorithms

Articles 31 - 46 of 46

Full-Text Articles in Genetics and Genomics

Planning Combinatorial Disulfide Cross-Links For Protein Fold Determination, Fei Xiong, Alan M Friedman, Chris Bailey-Kellogg Nov 2011

Dartmouth Scholarship

Fold recognition techniques take advantage of the limited number of overall structural organizations, and have become increasingly effective at identifying the fold of a given target sequence. However, in the absence of sufficient sequence identity, it remains difficult for fold recognition methods to always select the correct model. While a native-like model is often among a pool of highly ranked models, it is not necessarily the highest-ranked one, and the model rankings depend sensitively on the scoring function used. Structure elucidation methods can then be employed to decide among the models based on relatively rapid biochemical/biophysical experiments.


Comparison Of Four ChIP-Seq Analytical Algorithms Using Rice Endosperm H3K27 Trimethylation Profiling Data, Brandon M. Malone, Feng Tan, Susan M. Bridges, Zhaohua Peng Sep 2011

Bagley College of Engineering Publications and Scholarship

Chromatin immunoprecipitation coupled with high-throughput DNA sequencing (ChIP-Seq) has emerged as a powerful tool for genome-wide profiling of the binding sites of proteins associated with DNA, such as histones and transcription factors. However, no peak calling program has gained consensus acceptance by the scientific community as the preferred tool for ChIP-Seq data analysis. Analyzing the large data sets generated by ChIP-Seq studies remains highly challenging for most molecular biology laboratories. Here we profile H3K27me3 enrichment sites in rice young endosperm using the ChIP-Seq approach and analyze the data using four peak calling algorithms (FindPeaks, PeakSeq, USeq, and MACS). Comparison …


Evolving Hard Problems: Generating Human Genetics Datasets With A Complex Etiology, Daniel S Himmelstein, Casey S Greene, Jason H Moore Jul 2011

Dartmouth Scholarship

Background: A goal of human genetics is to discover genetic factors that influence individuals' susceptibility to common diseases. Most common diseases are thought to result from the joint failure of two or more interacting components instead of single component failures. This greatly complicates both the task of selecting informative genetic variants and the task of modeling interactions between them. We and others have previously developed algorithms to detect and model the relationships between these genetic factors and disease. Previously these methods have been evaluated with datasets simulated according to pre-defined genetic models.


Identifying Functional Relationships Within Sets Of Co-Expressed Genes By Combining Upstream Regulatory Motif Analysis And Gene Expression Information, Viktor Martyanov, Robert H. Gross Nov 2010

Dartmouth Scholarship

Existing clustering approaches for microarray data do not adequately differentiate between subsets of co-expressed genes. We devised a novel approach that integrates expression and sequence data in order to generate functionally coherent and biologically meaningful subclusters of genes. Specifically, the approach clusters co-expressed genes on the basis of similar content and distributions of predicted statistically significant sequence motifs in their upstream regions.


Improved IBD Detection Using Incomplete Haplotype Information, Giulio Genovese, Gregory Leibon, Martin R. Pollak, Daniel N. Rockmore Jun 2010

Dartmouth Scholarship

The availability of high density genetic maps and genotyping platforms has transformed human genetic studies. The use of these platforms has enabled population-based genome-wide association studies. However, in inheritance-based studies, current methods do not take full advantage of the information present in such genotyping analyses. In this paper we describe an improved method for identifying genetic regions shared identical-by-descent (IBD) from recent common ancestors. This method improves existing methods by taking advantage of phase information even if it is less than fully accurate or missing. We present an analysis of how using phase information increases the accuracy of IBD detection …


Optimization Algorithms For Functional Deimmunization Of Therapeutic Proteins, Andrew S. Parker, Wei Zheng, Karl E. Griswold, Chris Bailey-Kellogg Apr 2010

Dartmouth Scholarship

To develop protein therapeutics from exogenous sources, it is necessary to mitigate the risks of eliciting an anti-biotherapeutic immune response. A key aspect of the response is the recognition and surface display by antigen-presenting cells of epitopes, short peptide fragments derived from the foreign protein. Thus, developing minimal-epitope variants represents a powerful approach to deimmunizing protein therapeutics. Critically, mutations selected to reduce immunogenicity must not interfere with the protein's therapeutic activity.


Identifying Protein Complexes From Interaction Networks Based On Clique Percolation And Distance Restriction, Jianxin Wang, Binbin Liu, Min Li, Yi Pan Jan 2010

Computer Science Faculty Publications

Background: Identification of protein complexes in large interaction networks is crucial for understanding principles of cellular organization and predicting protein functions, which is one of the most important issues in the post-genomic era. Each protein might belong to multiple protein complexes in real protein-protein interaction networks. Identifying overlapping protein complexes from protein-protein interaction networks is an important research topic.

Result: As an effective algorithm for identifying overlapping module structures, the clique percolation method (CPM) has a wide range of applications in social and biological networks. However, the recognition accuracy of CPM is low. Furthermore, CPM is unfit to …
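
To illustrate how the clique percolation method mentioned in this abstract can place one protein in several modules, here is a minimal sketch of plain CPM only. It does not include the distance restriction the authors propose, and the function names (`k_cliques`, `clique_percolation`) and the toy network are hypothetical, chosen for illustration. It assumes the standard CPM rule that two k-cliques belong to the same community when they share k-1 nodes.

```python
from itertools import combinations


def k_cliques(adj, k):
    """Return every k-node clique as a frozenset (brute force; toy graphs only)."""
    nodes = sorted(adj)
    return [
        frozenset(c)
        for c in combinations(nodes, k)
        if all(v in adj[u] for u, v in combinations(c, 2))
    ]


def clique_percolation(edges, k=3):
    """Plain CPM: merge k-cliques that share k-1 nodes into communities."""
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    cliques = k_cliques(adj, k)
    parent = list(range(len(cliques)))  # union-find over clique indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Two k-cliques are adjacent when they overlap in at least k-1 nodes.
    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    # Each connected group of cliques forms one (possibly overlapping) community.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy network: two triangles that share only protein "C".
edges = [("A", "B"), ("B", "C"), ("A", "C"),
         ("C", "D"), ("D", "E"), ("C", "E")]
communities = clique_percolation(edges, k=3)
print(communities)
```

In this toy network the two triangles share a single node, so they stay separate communities and "C" appears in both. The brute-force clique search is exponential in k and is meant only for small examples.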


Spatially Uniform ReliefF (SURF) For Computationally-Efficient Filtering Of Gene-Gene Interactions, Casey S. Greene, Nadia M. Penrod, Jeff Kiralis, Jason H. Moore Sep 2009

Dartmouth Scholarship

Genome-wide association studies are becoming the de facto standard in the genetic analysis of common human diseases. Given the complexity and robustness of biological networks, such diseases are unlikely to be the result of single points of failure but instead likely arise from the joint failure of two or more interacting components. The hope in genome-wide screens is that these points of failure can be linked to single nucleotide polymorphisms (SNPs) which confer disease susceptibility. Detecting interacting variants that lead to disease in the absence of single-gene effects is difficult, however, and methods to exhaustively analyze sets of these variants …


Minimum Criteria For DNA Damage-Induced Phase Advances In Circadian Rhythms, Christian I. Hong, Judit Zámborszky, Attila Csikász-Nagy May 2009

Dartmouth Scholarship

Robust oscillatory behaviors are common features of circadian and cell cycle rhythms. These cyclic processes, however, behave distinctively in terms of their periods and phases in response to external influences such as light, temperature, nutrients, etc. Nevertheless, several links have been found between these two oscillators. Cell division cycles gated by the circadian clock have been observed since the late 1950s. On the other hand, ionizing radiation (IR) treatments cause cells to undergo a DNA damage response, which leads to phase shifts (mostly advances) in circadian rhythms. Circadian gating of the cell cycle can be attributed to the cell cycle …


Multifactor Dimensionality Reduction Analysis Identifies Specific Nucleotide Patterns Promoting Genetic Polymorphisms, Eric Arehart, Scott Gleim, Bill White, John Hwa, Jason H. Moore Mar 2009

Dartmouth Scholarship

The fidelity of DNA replication serves as the nidus for both genetic evolution and genomic instability fostering disease. Single nucleotide polymorphisms (SNPs) constitute greater than 80% of the genetic variation between individuals. A new theory regarding DNA replication fidelity has emerged in which selectivity is governed by base-pair geometry through interactions between the selected nucleotide, the complementary strand, and the polymerase active site. We hypothesize that specific nucleotide combinations in the flanking regions of SNP fragments are associated with mutation.


Multi-Break Rearrangements And Breakpoint Re-Uses: From Circular To Linear Genomes, Max A. Alekseyev Nov 2008

Faculty Publications

Multi-break rearrangements break a genome into multiple fragments and further glue them together in a new order. While 2-break rearrangements represent standard reversals, fusions, fissions, and translocations, 3-break rearrangements represent a natural generalization of transpositions. Alekseyev and Pevzner (2007a, 2008a) studied multi-break rearrangements in circular genomes and further applied them to the analysis of chromosomal evolution in mammalian genomes. In this paper, we extend these results to the more difficult case of linear genomes. In particular, we give lower bounds for the rearrangement distance between linear genomes and for the breakpoint re-use rate as functions of the number and proportion …


A Novel Ensemble Learning Method For De Novo Computational Identification Of DNA Binding Sites, Arijit Chakravarty, Jonathan M. Carlson, Radhika S. Khetani, Robert H. Gross Jul 2007

Dartmouth Scholarship

Despite the diversity of motif representations and search algorithms, the de novo computational identification of transcription factor binding sites remains constrained by the limited accuracy of existing algorithms and the need for user-specified input parameters that describe the motif being sought. Results: We present a novel ensemble learning method, SCOPE, that is based on the assumption that transcription factor binding sites belong to one of three broad classes of motifs: non-degenerate, degenerate and gapped motifs. SCOPE employs a unified scoring metric to combine the results from three motif finding algorithms each aimed at the discovery of one of these classes of motifs. …


Bounded Search For De Novo Identification Of Degenerate Cis-Regulatory Elements, Jonathan M. Carlson, Arijit Chakravarty, Radhika S. Khetani, Robert H. Gross May 2006

Dartmouth Scholarship

The identification of statistically overrepresented sequences in the upstream regions of coregulated genes should theoretically permit the identification of potential cis-regulatory elements. However, in practice many cis-regulatory elements are highly degenerate, precluding the use of an exhaustive word-counting strategy for their identification. While numerous methods exist for inferring base distributions using a position weight matrix, recent studies suggest that the independence assumptions inherent in the model, as well as the inability to reach a global optimum, limit this approach.


Dissecting Trait Heterogeneity: A Comparison Of Three Clustering Methods Applied To Genotypic Data, Tricia A. Thornton-Wells, Jason H. Moore, Jonathan L. Haines Apr 2006

Dartmouth Scholarship

Background: Trait heterogeneity, which exists when a trait has been defined with insufficient specificity such that it is actually two or more distinct traits, has been implicated as a confounding factor in traditional statistical genetics of complex human disease. In the absence of detailed phenotypic data collected consistently in combination with genetic data, unsupervised computational methodologies offer the potential for discovering underlying trait heterogeneity. The performance of three such methods – Bayesian Classification, Hypergraph-Based Clustering, and Fuzzy k-Modes Clustering – appropriate for categorical data was compared. Also tested was the ability of these methods …


GPNN: Power Studies And Applications Of A Neural Network Method For Detecting Gene-Gene Interactions In Studies Of Human Disease, Alison A. Motsinger, Stephen L. Lee, George Mellick, Marylyn D. Ritchie Jan 2006

Dartmouth Scholarship

The identification and characterization of genes that influence the risk of common, complex multifactorial disease primarily through interactions with other genes and environmental factors remains a statistical and computational challenge in genetic epidemiology. We have previously introduced a genetic programming optimized neural network (GPNN) as a method for optimizing the architecture of a neural network to improve the identification of gene combinations associated with disease risk. The goal of this study was to evaluate the power of GPNN for identifying high-order gene-gene interactions. We were also interested in applying GPNN to a real data analysis in Parkinson's disease.


Principal Component Analysis For Predicting Transcription-Factor Binding Motifs From Array-Derived Data, Yunlong Liu, Matthew P Vincenti, Hiroki Yokota Nov 2005

Dartmouth Scholarship

The responses to interleukin 1 (IL-1) in human chondrocytes constitute a complex regulatory mechanism, where multiple transcription factors interact combinatorially with transcription-factor binding motifs (TFBMs). In order to select a critical set of TFBMs from genomic DNA information and array-derived data, an efficient algorithm to solve a combinatorial optimization problem is required. Although computational approaches based on evolutionary algorithms are commonly employed, an analytical algorithm would be useful to predict TFBMs at nearly no computational cost and evaluate varying modelling conditions. Singular value decomposition (SVD) is a powerful method to derive primary components of a given matrix. Applying SVD …