Full-Text Articles in Genomics
Computational Modelling Of Human Transcriptional Regulation By An Information Theory-Based Approach, Ruipeng Lu
Electronic Thesis and Dissertation Repository
ChIP-seq experiments can identify the genome-wide binding site motifs of a transcription factor (TF) and determine its sequence specificity. Multiple algorithms have been developed to derive TF binding site (TFBS) motifs from ChIP-seq data, including the entropy minimization-based Bipad, which can derive both contiguous and bipartite motifs. Prior studies applying these algorithms to ChIP-seq data analyzed only a small number of top peaks with the highest signal strengths, biasing the resulting position weight matrices (PWMs) towards consensus-like, strong binding sites. They also did not derive bipartite motifs, which prevented accurate modelling of the binding behavior of dimeric TFs.
This thesis presents a novel …
Discovery And Validation Of Information Theory-Based Transcription Factor And Cofactor Binding Site Motifs, Ruipeng Lu, Eliseos J Mucaki, Peter K Rogan
Biochemistry Publications
Data from ChIP-seq experiments can be used to derive the genome-wide binding specificities of transcription factors (TFs) and other regulatory proteins. We analyzed 765 ENCODE ChIP-seq peak datasets of 207 human TFs with a novel motif discovery pipeline based on recursive, thresholded entropy minimization. This approach obviates the need to compensate for skewed nucleotide composition, distinguishes true binding motifs from noise, quantifies the strengths of individual binding sites based on computed affinity, and detects adjacent cofactor binding sites that coordinate with the targets of the primary, immunoprecipitated TFs. We obtained contiguous and bipartite information theory-based position weight matrices (iPWMs) for 93 sequence-specific TFs, …
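To illustrate what "quantifies the strengths of individual binding sites" can mean for an information theory-based PWM, the sketch below builds an information weight matrix from a set of aligned sites and scores a candidate site in bits. It is a minimal illustration, not the authors' pipeline: it performs no entropy minimization or motif discovery, it assumes the sites are already aligned and of equal length, it uses a maximum of 2 bits per position (equiprobable bases), and it substitutes a simple pseudocount for any small-sample correction. The function names, the `pseudocount` parameter, and the example sequences are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"


def information_weight_matrix(sites, pseudocount=0.5):
    """Return one dict per position mapping base -> weight in bits.

    weight(b, l) = 2 + log2 f(b, l), where f(b, l) is the pseudocount-smoothed
    frequency of base b at position l. The pseudocount is an assumption made
    here so that unseen bases get a finite (negative) weight.
    """
    if not sites:
        raise ValueError("at least one aligned site is required")
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")

    total = len(sites) + 4 * pseudocount
    weights = []
    for position in range(length):
        counts = Counter(site[position] for site in sites)
        weights.append(
            {b: 2.0 + math.log2((counts[b] + pseudocount) / total) for b in BASES}
        )
    return weights


def site_information(weights, site):
    """Sum the per-position weights of one candidate site (in bits)."""
    if len(site) != len(weights):
        raise ValueError("site length does not match the matrix")
    return sum(column[base] for column, base in zip(weights, site))


def mean_information(weights, sites):
    """Average information of the sites, a rough summary of motif strength."""
    return sum(site_information(weights, s) for s in sites) / len(sites)


# Hypothetical aligned sites, used only to exercise the functions.
example_sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
example_weights = information_weight_matrix(example_sites)
consensus_bits = site_information(example_weights, "TGACTCA")
unrelated_bits = site_information(example_weights, "ACGTACG")
```

Under these assumptions, a site resembling the aligned set scores higher than an unrelated sequence, and a site with negative total information would be treated as a non-site. Scanning a peak sequence amounts to applying `site_information` to each window of the matrix length.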