Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2002

University of Massachusetts Amherst

Erin M. Conlon

Full-Text Articles in Physical Sciences and Mathematics

Statistical Issues In The Clustering Of Gene Expression Data, Darlene R. Goldstein, Debashis Ghosh, Erin M. Conlon Jan 2002

Erin M. Conlon

This paper illustrates some of the problems that can arise in any data set when clustering samples of gene expression profiles. These include a potentially strong dependence of the results on the choice of clustering algorithm, a further dependence on which genes and samples are included in the clustering (for example, whether or not to include control samples), and the difficulty of assessing the validity of the resulting grouping. We also demonstrate the use of Cox regression as a tool for identifying genes that influence survival.