Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Resampling-Based Multiple Testing: Asymptotic Control Of Type I Error And Applications To Gene Expression Data, Katherine S. Pollard, Mark J. Van Der Laan, Jun 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

We define a general statistical framework for multiple hypothesis testing and show that the correct null distribution for the test statistics is obtained by projecting the true distribution of the test statistics onto the space of mean-zero distributions. For common choices of test statistics (based on an asymptotically linear parameter estimator), this distribution is asymptotically multivariate normal with mean zero and the covariance of the vector influence curve for the parameter estimator. This test statistic null distribution can be estimated by applying the non-parametric or parametric bootstrap to correctly centered test statistics. We prove that this bootstrap-estimated null …
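The bootstrap step described in this abstract can be illustrated with a minimal Python sketch. It assumes a simple setting that is not taken from the paper: the test statistics are T_j = sqrt(n) * (mean of column j), rows are resampled with replacement (non-parametric bootstrap), and each bootstrap statistic is centered at its own bootstrap mean so that the estimated null distribution has mean zero. The names `bootstrap_null` and `max_t_adjusted_p` are hypothetical, and the single-step max-|T| adjustment is one common way to use such a null distribution, shown as an assumption rather than as the authors' procedure.

```python
import math
import random
import statistics


def bootstrap_null(data, n_boot=500, seed=0):
    """Return (observed, null) for statistics T_j = sqrt(n) * mean of column j.

    `null` is a list of n_boot rows; each row holds one centered bootstrap
    replicate of the whole vector of test statistics.
    """
    rng = random.Random(seed)
    n = len(data)
    m = len(data[0])

    def stats(rows):
        return [math.sqrt(n) * statistics.fmean(r[j] for r in rows) for j in range(m)]

    observed = stats(data)

    # Non-parametric bootstrap: resample whole rows so that the dependence
    # between the m statistics is preserved in every replicate.
    raw = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        raw.append(stats(resample))

    # Centering (assumption of this sketch): subtract each statistic's
    # bootstrap mean, so every column of `null` has mean zero.
    centers = [statistics.fmean(row[j] for row in raw) for j in range(m)]
    null = [[row[j] - centers[j] for j in range(m)] for row in raw]
    return observed, null


def max_t_adjusted_p(observed, null):
    """Single-step max-|T| adjusted p-values (illustrative choice)."""
    max_abs = [max(abs(z) for z in row) for row in null]
    return [sum(mx >= abs(t) for mx in max_abs) / len(max_abs) for t in observed]


# Usage with simulated data: column 0 has true mean 0, column 1 has true mean 3.
rng = random.Random(1)
data = [[rng.gauss(0.0, 1.0), rng.gauss(3.0, 1.0)] for _ in range(30)]
observed, null = bootstrap_null(data)
adjusted = max_t_adjusted_p(observed, null)
```

Resampling whole rows rather than individual columns keeps the joint distribution of the statistics intact, which is what lets the maximum over statistics be used for a joint cutoff.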


Simple Parallel Statistical Computing In R, Anthony Rossini, Luke Tierney, Na Li, Mar 2003

UW Biostatistics Working Paper Series

Theoretically, many modern statistical procedures are trivial to parallelize. However, practical deployment of a parallelized implementation that is robust and runs reliably on different computational cluster configurations and environments is far from trivial. We present a framework for the R statistical computing language that provides a simple yet powerful programming interface to a computational cluster. This interface allows the development of R functions that distribute independent computations across the nodes of the computational cluster. The resulting framework allows statisticians to obtain significant speed-ups for some computations at little additional development cost. The particular implementation can be deployed in heterogeneous computing …
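The idea of distributing independent computations across workers can be sketched in Python. This is an analogy under stated assumptions, not the R framework the abstract describes: local threads from the standard library's `concurrent.futures` stand in for cluster nodes, and the names `parallel_apply` and `bootstrap_mean` are hypothetical. Each task carries its own seed, so the results do not depend on which worker runs which task.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def parallel_apply(func, tasks, workers=4, executor_cls=ThreadPoolExecutor):
    """Apply func to each task independently; results keep the order of tasks.

    Threads are used here only as a stand-in for cluster nodes. Another
    executor class with the same interface could be passed instead.
    """
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(func, tasks))


def bootstrap_mean(task):
    """One independent unit of work: a bootstrap replicate of the sample mean."""
    seed, sample = task
    rng = random.Random(seed)  # per-task generator: output is independent of scheduling
    resample = [sample[rng.randrange(len(sample))] for _ in sample]
    return statistics.fmean(resample)


sample = [2.1, 3.4, 1.9, 4.2, 2.8, 3.3, 2.5, 3.9]
tasks = [(seed, sample) for seed in range(200)]
replicates = parallel_apply(bootstrap_mean, tasks)
```

Because every task is self-contained (data plus seed), the same list of tasks gives the same replicates whether it is run serially or in parallel, which makes a parallel run easy to check against a serial one.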