Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Applied Statistics

COBRA

2014

Full-Text Articles in Physical Sciences and Mathematics

A Scalable Supervised Subsemble Prediction Algorithm, Stephanie Sapp, Mark J. van der Laan, Apr 2014

U.C. Berkeley Division of Biostatistics Working Paper Series

Subsemble is a flexible ensemble method that partitions a full dataset into subsets of observations, fits the same algorithm on each subset, and uses a tailored form of V-fold cross-validation to construct a prediction function that combines the subset-specific fits with a second metalearner algorithm. Previous work studied the performance of Subsemble with randomly created subsets and showed that such Subsembles often predict better than the underlying algorithm fit just once on the full dataset. Since the final Subsemble estimator varies depending on the data used to create the subset-specific fits, different strategies for …
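
The procedure summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration of one reading of the abstract, not the authors' implementation. Every name in it is hypothetical, including `fit_line`, `fit_metalearner`, and `subsemble`. The underlying algorithm (a one-dimensional least-squares line), the metalearner (a lightly ridge-penalized linear combination without intercept), the random partition, and the simulated data are all assumptions chosen to keep the example self-contained.

```python
import random


def fit_line(xs, ys):
    """Underlying learner (assumed for illustration): 1-D least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx > 0 else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def solve(a, b):
    """Solve a small dense system a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for r in reversed(range(n)):
        w[r] = (m[r][n] - sum(m[r][c] * w[c] for c in range(r + 1, n))) / m[r][r]
    return w


def fit_metalearner(z, y, ridge=1e-6):
    """Metalearner (assumed): least-squares weights with a tiny ridge penalty for stability."""
    j = len(z[0])
    a = [[sum(row[p] * row[q] for row in z) + (ridge if p == q else 0.0)
          for q in range(j)] for p in range(j)]
    b = [sum(row[p] * yi for row, yi in zip(z, y)) for p in range(j)]
    return solve(a, b)


def subsemble(xs, ys, n_subsets=3, n_folds=4, seed=0):
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    # Partition the observations into disjoint, randomly created subsets.
    subsets = [idx[j::n_subsets] for j in range(n_subsets)]
    # Split every subset into V folds; fold v of the data is the union over subsets.
    folds = [[s[v::n_folds] for v in range(n_folds)] for s in subsets]

    z, target = [], []
    for v in range(n_folds):
        # Refit the learner on each subset with that subset's fold v held out.
        fits = []
        for j in range(n_subsets):
            train = [i for u in range(n_folds) if u != v for i in folds[j][u]]
            fits.append(fit_line([xs[i] for i in train], [ys[i] for i in train]))
        # Every held-out observation gets one cross-validated prediction per subset fit.
        for j in range(n_subsets):
            for i in folds[j][v]:
                z.append([f(xs[i]) for f in fits])
                target.append(ys[i])

    # The metalearner is trained on cross-validated predictions only.
    weights = fit_metalearner(z, target)
    # The final predictor combines fits on the complete subsets with those weights.
    full_fits = [fit_line([xs[i] for i in s], [ys[i] for i in s]) for s in subsets]
    return lambda x: sum(w * f(x) for w, f in zip(weights, full_fits))


# Simulated data (an assumption): y = 2x + 1 plus Gaussian noise.
data_rng = random.Random(1)
xs = [data_rng.uniform(0, 10) for _ in range(120)]
ys = [2 * x + 1 + data_rng.gauss(0, 0.3) for x in xs]
predict = subsemble(xs, ys)
print(predict(5.0))
```

The point of the design is that the metalearner never sees a prediction for an observation that was used to produce it, which is what the cross-validation step provides. Replacing the random shuffle with another partitioning rule changes the subset-specific fits and therefore the final estimator.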