Open Access. Powered by Scholars. Published by Universities.®

Genomics Commons

Articles 1 - 4 of 4

Full-Text Articles in Genomics

Transcription Factor Binding Site Clusters Identify Target Genes With Similar Tissue-Wide Expression And Buffer Against Mutations., Peter Rogan, Ruipeng Lu Jan 2019

Biochemistry Publications

Background: The distribution and composition of cis-regulatory modules composed of transcription factor (TF) binding site (TFBS) clusters in promoters substantially determine gene expression patterns and TF targets. TF knockdown experiments have revealed that TF binding profiles and gene expression levels are correlated. We use TFBS features within accessible promoter intervals to predict genes with similar tissue-wide expression patterns and TF targets using machine learning (ML). Methods: Bray-Curtis similarity was used to identify genes with correlated expression patterns across 53 tissues. TF targets from knockdown experiments were also analyzed by this approach to set up the ML framework. TFBSs were …
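The Bray-Curtis similarity named in the Methods can be illustrated with a minimal sketch. It uses the standard definition, 1 - Σ|x_i - y_i| / Σ(x_i + y_i), applied to two non-negative expression vectors with one value per tissue. The function name, the example values, and the handling of two all-zero vectors are assumptions made for illustration; the abstract does not say how the authors implemented the measure.

```python
def bray_curtis_similarity(x, y):
    """Bray-Curtis similarity between two non-negative expression profiles.

    x, y: sequences of expression values, one entry per tissue.
    Returns 1.0 for identical profiles and 0.0 for profiles with no overlap.
    """
    if len(x) != len(y):
        raise ValueError("profiles must cover the same tissues")
    if any(v < 0 for v in x) or any(v < 0 for v in y):
        raise ValueError("expression values must be non-negative")

    total = sum(x) + sum(y)
    if total == 0:
        # Assumption: two all-zero profiles are treated as identical.
        return 1.0

    difference = sum(abs(a - b) for a, b in zip(x, y))
    return 1.0 - difference / total


# Hypothetical expression values for two genes across three tissues.
gene_a = [1.0, 2.0, 3.0]
gene_b = [2.0, 2.0, 2.0]
similarity = bray_curtis_similarity(gene_a, gene_b)
```

In the study's setting each vector would hold 53 values, one per tissue; three are used here only to keep the example readable.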


Computational Modelling Of Human Transcriptional Regulation By An Information Theory-Based Approach, Ruipeng Lu Apr 2018

Electronic Thesis and Dissertation Repository

ChIP-seq experiments can identify the genome-wide binding site motifs of a transcription factor (TF) and determine its sequence specificity. Multiple algorithms have been developed to derive TF binding site (TFBS) motifs from ChIP-seq data, including the entropy minimization-based Bipad, which can derive both contiguous and bipartite motifs. Prior studies applying these algorithms to ChIP-seq data analyzed only a small number of top peaks with the highest signal strengths, biasing the resulting position weight matrices (PWMs) towards consensus-like, strong binding sites; nor did they derive bipartite motifs, which prevented accurate modelling of the binding behavior of dimeric TFs.

This thesis presents a novel …


Validation Of Predicted mRNA Splicing Mutations Using High-Throughput Transcriptome Data, Coby Viner, Stephanie Dorman, Ben Shirley, Peter Rogan Jan 2014

Biochemistry Publications

Interpretation of variants present in complete genomes or exomes reveals numerous sequence changes, only a fraction of which are likely to be pathogenic. Mutations have traditionally been inferred from allele frequencies and inheritance patterns in such data. Variants predicted to alter mRNA splicing can be validated by manual inspection of transcriptome sequencing data; however, this approach is intractable for large datasets. These abnormal mRNA splicing patterns are characterized by reads demonstrating exon skipping, cryptic splice site use, high levels of intron inclusion, or combinations of these properties. We present Veridical, an in silico method for the automatic validation …


Array-Based Genomic Diversity Measures Portray Mus Musculus Phylogenetic And Genealogical Relationships, And Detect Genetic Variation Among C57BL/6J Mice And Between Tissues Of The Same Mouse, Susan T. Eitutis Jul 2013

Electronic Thesis and Dissertation Repository

Mouse models lack affordable genomic technologies, slowing the identification of candidate variants contributing to complex phenotypes. The Mouse Diversity Genotyping Array (MDGA) is a low-cost, high-resolution platform permitting genomic diversity assessment. Using a validated list of >500,000 single nucleotide polymorphisms (SNPs), we applied the first comprehensive analysis of SNP differences to detect genetic distance across 362 Mus musculus samples. Genetic distance measured between distantly and closely related mice correlates with known phylogeny and genealogy. Variation detected between C57BL/6J mice is consistent with previous reports of variants within this strain. Putative genetic variation detected between and within tissues indicates somatic …