Open Access. Powered by Scholars. Published by Universities.®

Life Sciences Commons

Bioinformatics

University of Massachusetts - Amherst

Selected Works

Full-Text Articles in Life Sciences

Sample Size And Statistical Power Considerations In High-Dimensionality Data Settings: A Comparative Study Of Classification Algorithms, Yu Guo, Armin Garber, Raji Balasubramanian, Sep 2010

Raji Balasubramanian

Background: Data generated using ‘omics’ technologies are characterized by high dimensionality, where the number of features measured per subject vastly exceeds the number of subjects in the study. In this paper, we consider issues relevant to the design of biomedical studies in which the goal is the discovery of a subset of features and an associated algorithm that can predict a binary outcome, such as disease status. We compare the performance of four commonly used classifiers (K-Nearest Neighbors, Prediction Analysis for Microarrays, Random Forests and Support Vector Machines) in high-dimensionality data settings. We evaluate the effects of varying levels of …