Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Full-Text Articles in Entire DC Network

Integrative Missing Value Estimation For Microarray Data, Jianjun Hu, H. Li, M. S. Waterman, X. J. Zhou Jan 2006

Faculty Publications

Background

Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples. In fact, more than 80% of the time-series datasets in the Stanford Microarray Database contain fewer than eight samples.

Results

We present the integrative Missing Value Estimation method (iMISS), which incorporates information from multiple reference microarray datasets to improve missing value estimation. For each gene with missing data, we derive a consistent neighbor-gene list by taking reference data sets …
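The abstract is truncated before it explains how reference datasets shape the neighbor-gene list, so the sketch below does not attempt iMISS itself. It illustrates only the general idea of neighbor-based missing value estimation on a single dataset: a plain k-nearest-neighbor baseline. The function name `knn_impute`, the parameter `k`, and the root-mean-square distance over shared columns are all assumptions made for illustration, not details taken from the paper.

```python
import math

def knn_impute(matrix, k=2):
    """Fill NaN entries in a genes x samples matrix (list of lists).

    Hypothetical single-dataset baseline, not the iMISS method: each missing
    value is replaced by the mean of that column over the k nearest genes.
    """
    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        missing = [c for c, v in enumerate(row) if math.isnan(v)]
        if not missing:
            continue
        candidates = []
        for j, other in enumerate(matrix):
            # A neighbor must be observed in every column we need to fill.
            if j == i or any(math.isnan(other[c]) for c in missing):
                continue
            shared = [c for c, v in enumerate(row)
                      if not math.isnan(v) and not math.isnan(other[c])]
            if not shared:
                continue
            # Root-mean-square distance over columns observed in both genes.
            dist = math.sqrt(
                sum((row[c] - other[c]) ** 2 for c in shared) / len(shared)
            )
            candidates.append((dist, j))
        candidates.sort()
        neighbors = [j for _, j in candidates[:k]]
        if neighbors:
            for c in missing:
                filled[i][c] = sum(matrix[j][c] for j in neighbors) / len(neighbors)
    return filled

example = [
    [1.0, 2.0, math.nan],   # gene with a missing sample
    [1.0, 2.0, 3.0],
    [1.0, 2.0, 5.0],
    [10.0, 10.0, 10.0],     # distant gene, should not be chosen when k=2
]
result = knn_impute(example, k=2)
```

With few samples per gene, the distance above is computed from very few columns, which is one way to see why the Background section describes small datasets as a weak point for methods of this kind.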


Emd: An Ensemble Algorithm For Discovering Regulatory Motifs In Dna Sequences, Jianjun Hu, Y. D. Yang, D. Kihara Jan 2006

Faculty Publications

Background

Understanding gene regulatory networks has become one of the central research problems in bioinformatics. Over the past thirty years, more than thirty algorithms have been proposed to identify DNA regulatory sites. However, the prediction accuracy of these algorithms is still quite low. Ensemble algorithms have emerged as an effective strategy in bioinformatics for improving prediction accuracy by exploiting the synergetic prediction capability of multiple algorithms.

Results

We propose a novel clustering-based ensemble algorithm named EMD for de novo motif discovery, which combines multiple predictions from multiple runs of one or more base component algorithms. The ensemble approach is …
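The abstract is cut off before it describes how EMD groups predictions, so the following is not the EMD procedure. It is a minimal sketch of one simple way to combine site predictions from several runs: per-position voting followed by merging adjacent positions into intervals. The function name `consensus_sites` and the parameters `width` and `min_votes` are hypothetical, and the sketch assumes every run reports start positions on the same single sequence.

```python
from collections import Counter

def consensus_sites(runs, width, min_votes):
    """Combine predicted motif sites from several runs on one sequence.

    runs: list of lists of predicted start positions, one list per run.
    Returns inclusive (start, end) intervals covered by >= min_votes runs.
    Hypothetical voting scheme for illustration, not the EMD algorithm.
    """
    votes = Counter()
    for starts in runs:
        covered = set()
        for s in starts:
            covered.update(range(s, s + width))
        # Each run contributes at most one vote per sequence position.
        votes.update(covered)

    kept = sorted(p for p, v in votes.items() if v >= min_votes)

    # Merge consecutive positions into contiguous intervals.
    intervals = []
    for p in kept:
        if intervals and p == intervals[-1][1] + 1:
            intervals[-1][1] = p
        else:
            intervals.append([p, p])
    return [(a, b) for a, b in intervals]

# Three runs roughly agree on a site near position 10; one run also
# reports an isolated site at 40 that no other run supports.
sites = consensus_sites([[10], [12], [11, 40]], width=6, min_votes=2)
```

Raising `min_votes` trades sensitivity for precision: isolated predictions from a single run are dropped, while regions that several runs agree on survive.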