Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2003

Statistics and Probability

U.C. Berkeley Division of Biostatistics Working Paper Series

Censored data

Full-Text Articles in Physical Sciences and Mathematics

Loss-Based Estimation With Cross-Validation: Applications To Microarray Data Analysis And Motif Finding, Sandrine Dudoit, Mark J. Van Der Laan, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, Siew Leng Teng Dec 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions, with typically unknown and intricate correlation patterns among variables. Addressing these inference questions satisfactorily requires: (i) an intensive and thorough search of the parameter space to generate good candidate estimators, (ii) an approach for selecting an optimal estimator among these candidates, and (iii) a method for reliably assessing the performance of the resulting estimator. We propose a unified loss-based methodology for estimator construction, selection, and performance assessment with cross-validation. In this approach, the parameter of interest is defined as the risk minimizer for a suitable …
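As a rough illustration of steps (i) and (ii), the sketch below builds a few candidate estimators and picks the one with the smallest cross-validated risk. Everything specific here is an assumption made for the example and is not taken from the paper: the squared error loss, the k-nearest-neighbor candidates, the simulated data, and the helper names `knn_predict`, `cv_risk`, and `select_by_cv`.

```python
import random

def knn_predict(train, x, k):
    """Average the outcomes of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def cv_risk(data, k, n_folds=5):
    """V-fold cross-validated risk of a k-NN regressor (squared error loss)."""
    total = 0.0
    for fold in range(n_folds):
        valid = data[fold::n_folds]  # every n_folds-th point is held out
        train = [p for i, p in enumerate(data) if i % n_folds != fold]
        total += sum((y - knn_predict(train, x, k)) ** 2 for x, y in valid)
    return total / len(data)

def select_by_cv(data, candidates, n_folds=5):
    """Return the candidate k with the smallest cross-validated risk."""
    risks = {k: cv_risk(data, k, n_folds) for k in candidates}
    best = min(risks, key=risks.get)
    return best, risks

# Simulated data (hypothetical): y = x^2 + Gaussian noise.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(-1.0, 1.0)
    data.append((x, x * x + rng.gauss(0.0, 0.1)))

best_k, risks = select_by_cv(data, candidates=[1, 5, 20, 160])
print("selected k:", best_k)
print("cross-validated risks:", risks)
```

The cross-validated risk of the selected candidate tends to be optimistic as a performance estimate, so step (iii) would typically need a further held-out sample or an outer layer of cross-validation.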


Tree-Based Multivariate Regression And Density Estimation With Right-Censored Data, Annette M. Molinaro, Sandrine Dudoit, Mark J. Van Der Laan Sep 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

We propose a unified strategy for estimator construction, selection, and performance assessment in the presence of censoring. This approach is entirely driven by the choice of a loss function for the full (uncensored) data structure and can be stated in terms of the following three main steps. (1) Define the parameter of interest as the minimizer of the expected loss, or risk, for a full data loss function chosen to represent the desired measure of performance. Map the full data loss function into an observed (censored) data loss function having the same expected value and leading to an efficient estimator …
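One simple way to map a full data loss into an observed data loss with the same expected value is inverse probability of censoring weighting (IPCW). The sketch below is only an illustration of that idea under loud assumptions: the abstract is truncated, so the mapping used in the paper may differ; the censoring survival function is taken as known (in practice it would be estimated); censoring is independent of the survival time; and the simulated distributions and names such as `cens_surv` and `ipcw_loss` are hypothetical.

```python
import math
import random

CENS_RATE = 0.5  # assumed: censoring time C ~ Exponential(rate 0.5)

def cens_surv(t):
    """P(C >= t), treated as known for this sketch."""
    return math.exp(-CENS_RATE * t)

def full_loss(t, psi):
    """Full (uncensored) data loss: squared error."""
    return (t - psi) ** 2

def ipcw_loss(obs_time, delta, psi):
    """Observed data loss: zero if censored, reweighted full loss otherwise."""
    return delta * full_loss(obs_time, psi) / cens_surv(obs_time)

# Simulate survival times T ~ Uniform(0, 2) and independent censoring.
rng = random.Random(1)
full, observed = [], []
for _ in range(20000):
    t = rng.uniform(0.0, 2.0)
    c = rng.expovariate(CENS_RATE)
    full.append(t)
    observed.append((min(t, c), 1 if t <= c else 0))

# The two empirical risks should be close at any fixed psi.
psi = 1.0
full_risk = sum(full_loss(t, psi) for t in full) / len(full)
ipcw_risk = sum(ipcw_loss(t, d, psi) for t, d in observed) / len(observed)

# Minimizer of the IPCW squared error risk over constants: a weighted mean.
weights = [d / cens_surv(t) for t, d in observed]
ipcw_mean = sum(w * t for w, (t, _) in zip(weights, observed)) / sum(weights)

# Averaging only the uncensored times ignores censoring and is biased downward.
naive_mean = sum(t for t, d in observed if d) / sum(d for _, d in observed)

print("full-data risk:", full_risk, "IPCW risk:", ipcw_risk)
print("IPCW mean:", ipcw_mean, "naive mean:", naive_mean)
```

The weighting works because, given the survival time, an observation is uncensored with probability `cens_surv(t)`, so dividing by that probability restores the full data expectation.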


A Semiparametric Model Selection Criterion With Applications To The Marginal Structural Model, M. Alan Brookhart, Mark J. Van Der Laan Mar 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

Estimators for the parameter of interest in semiparametric models often depend on a guessed model for the nuisance parameter. The choice of the model for the nuisance parameter can affect both the finite sample bias and efficiency of the resulting estimator of the parameter of interest. In this paper we propose a finite sample criterion based on cross-validation that can be used to select a nuisance parameter model from a list of candidate models. We show that the expected value of this criterion is minimized by the nuisance parameter model that yields the estimator of the parameter of interest with …