Open Access. Powered by Scholars. Published by Universities.®

Genetics and Genomics Commons

Articles 1 - 5 of 5

Full-Text Articles in Genetics and Genomics

Prioritizing Protein Complexes Implicated In Human Diseases By Network Optimization., Yong Chen, Thibault Jacquemin, Shuyan Zhang, Rui Jiang Sep 2019

Yong Chen

BACKGROUND: Detecting associations between protein complexes and human inherited diseases is of great importance for understanding disease mechanisms. Dysfunction of a protein complex usually arises from disturbance of its members and consequently results in certain diseases. Although individual disease proteins have been widely predicted, computational methods for systematically investigating disease-related protein complexes are still lacking.

RESULTS: We propose a method, MAXCOM, for the prioritization of candidate protein complexes. MAXCOM performs a maximum information flow algorithm to optimize relationships between a query disease and candidate protein complexes through a heterogeneous network that is constructed by combining protein-protein interactions …


CLIMP: Clustering Motifs Via Maximal Cliques With Parallel Computing Design., Shaoqiang Zhang, Yong Chen Sep 2019

Yong Chen

A set of conserved binding sites recognized by a transcription factor is called a motif. Motifs can be found by many comparative genomics applications that identify over-represented segments. Moreover, when numerous putative motifs are predicted from a collection of genome-wide data, their similarity data can be represented as a large graph in which these motifs are connected to one another. An efficient clustering algorithm is therefore needed to group the motifs that belong together, separate the motifs that belong to different groups, and discard spurious ones. In this work, a new motif …


FisherMP: Fully Parallel Algorithm For Detecting Combinatorial Motifs From Large ChIP-Seq Datasets., Shaoqiang Zhang, Ying Liang, Xiangyun Wang, Zhengchang Su, Yong Chen Sep 2019

Yong Chen

Detecting binding motifs of combinatorial transcription factors (TFs) from chromatin immunoprecipitation sequencing (ChIP-seq) experiments is an important and challenging computational problem for understanding gene regulation. Although a number of motif-finding algorithms have been presented, most are either time-consuming or have sub-optimal accuracy when processing large-scale datasets. In this article, we present a fully parallelized algorithm for detecting combinatorial motifs from ChIP-seq datasets, using Fisher's combined method and an OpenMP parallel design. Large-scale validation on both synthetic data and 350 ChIP-seq datasets from the ENCODE database showed that FisherMP is not only very fast on large datasets, but also …


FisherMP: Fully Parallel Algorithm For Detecting Combinatorial Motifs From Large ChIP-Seq Datasets., Shaoqiang Zhang, Ying Liang, Xiangyun Wang, Zhengchang Su, Yong Chen Jun 2019

Faculty Scholarship for the College of Science & Mathematics

Detecting binding motifs of combinatorial transcription factors (TFs) from chromatin immunoprecipitation sequencing (ChIP-seq) experiments is an important and challenging computational problem for understanding gene regulation. Although a number of motif-finding algorithms have been presented, most are either time-consuming or have sub-optimal accuracy when processing large-scale datasets. In this article, we present a fully parallelized algorithm for detecting combinatorial motifs from ChIP-seq datasets, using Fisher's combined method and an OpenMP parallel design. Large-scale validation on both synthetic data and 350 ChIP-seq datasets from the ENCODE database showed that FisherMP is not only very fast on large datasets, but also …


Incorporating Pathway Information Into Feature Selection Towards Better Performed Gene Signatures, Suyan Tian, Chi Wang, Bing Wang Apr 2019

Biostatistics Faculty Publications

Feature selection is of critical importance for analyzing gene expression data with sophisticated grouping structures and for extracting hidden patterns from such data. It is well known that genes do not function in isolation but rather work together within various metabolic, regulatory, and signaling pathways. A method that takes the biological knowledge contained within these pathways into account is a pathway-based algorithm. Studies have demonstrated that a pathway-based method usually outperforms its gene-based counterpart, in which no biological knowledge is considered. In this article, pathway-based feature selection methods are first divided into three major categories, namely, pathway-level selection, …