Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Articles 1 - 7 of 7

Full-Text Articles in Statistics and Probability

Targeted Maximum Likelihood Learning, Mark J. Van Der Laan, Daniel Rubin Oct 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

Suppose one observes a sample of independent and identically distributed observations from a particular data generating distribution. Suppose that one has available an estimate of the density of the data generating distribution, such as a maximum likelihood estimator according to a given or data-adaptively selected model. Suppose that one is concerned with estimation of a particular pathwise differentiable Euclidean parameter. A substitution estimator evaluating the parameter of the density estimator is typically too biased and might not even converge at the parametric rate: that is, the density estimator was targeted to be a good estimator of the density and …


Diagnosing Bias In The Inverse Probability Of Treatment Weighted Estimator Resulting From Violation Of Experimental Treatment Assignment, Yue Wang, Maya L. Petersen, David Bangsberg, Mark J. Van Der Laan Sep 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

Inverse probability of treatment weighting (IPTW) is frequently used to estimate the causal effects of treatments and interventions. The consistency of the IPTW estimator relies not only on the well-recognized assumption of no unmeasured confounders (Sequential Randomization Assumption or SRA), but also on the assumption of experimentation in the assignment of treatment (Experimental Treatment Assignment or ETA). In finite samples, violations of the ETA assumption can occur simply by chance; certain treatments become rare or non-existent for certain strata of the population. Such practical violations of the ETA assumption occur frequently in real data, and can result in significant …
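To make the role of the ETA assumption concrete, the following is a minimal sketch of a point-treatment IPTW estimate of a treatment-specific mean, not the diagnostic proposed in the paper. It assumes a single discrete covariate stratum per subject and estimates the treatment mechanism by within-stratum proportions; the function name `iptw_mean` and the toy records are hypothetical. Strata in which nobody received the treatment of interest are reported, since they contribute no weighted observations and are exactly the practical ETA violations described above.

```python
from collections import defaultdict


def iptw_mean(records, treatment=1):
    """IPTW estimate of the mean outcome under `treatment`.

    records: list of (stratum, treatment_received, outcome) tuples.
    Returns (estimate, strata_with_no_treated_subjects).
    Assumption: the treatment mechanism g(a | stratum) is estimated by the
    empirical proportion of subjects receiving `treatment` in each stratum.
    """
    n_by_stratum = defaultdict(int)
    n_treated = defaultdict(int)
    for stratum, received, _ in records:
        n_by_stratum[stratum] += 1
        if received == treatment:
            n_treated[stratum] += 1

    # Strata where the treatment never occurs: a practical ETA violation.
    violations = sorted(s for s in n_by_stratum if n_treated[s] == 0)

    total = 0.0
    for stratum, received, outcome in records:
        if received == treatment:
            g = n_treated[stratum] / n_by_stratum[stratum]
            total += outcome / g  # weight by inverse treatment probability
    return total / len(records), violations


# Hypothetical toy data: (stratum, treatment, outcome).
balanced = [
    ("young", 1, 2.0), ("young", 0, 1.0),
    ("old", 1, 4.0), ("old", 1, 4.0), ("old", 0, 3.0), ("old", 0, 3.0),
]
estimate, violations = iptw_mean(balanced)

# Same design, but no "old" subject was treated.
sparse = [
    ("young", 1, 2.0), ("young", 0, 1.0),
    ("old", 0, 3.0), ("old", 0, 3.0),
]
sparse_estimate, sparse_violations = iptw_mean(sparse)
```

In the second data set the "old" stratum contributes nothing to the weighted sum, so the estimate silently reflects only the "young" stratum; flagging such strata is one simple way to notice that the estimate may be biased.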


Extending Marginal Structural Models Through Local, Penalized, And Additive Learning, Daniel Rubin, Mark J. Van Der Laan Sep 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

Marginal structural models (MSMs) allow one to form causal inferences from data, by specifying a relationship between a treatment and the marginal distribution of a corresponding counterfactual outcome. Following their introduction in Robins (1997), MSMs have typically been fit after assuming a semiparametric model, and then estimating a finite dimensional parameter. van der Laan and Dudoit (2003) proposed to instead view MSM fitting not as a task of semiparametric parameter estimation, but of nonparametric function approximation. They introduced a class of causal effect estimators based on mapping loss functions suitable for the unavailable counterfactual data to those suitable for the …


Statistical Learning Of Origin-Specific Statically Optimal Individualized Treatment Rules, Mark J. Van Der Laan, Maya L. Petersen Sep 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

Consider a longitudinal observational or controlled study in which one collects data over time on n randomly sampled subjects. The time-dependent process one observes on each randomly sampled subject contains time-dependent covariates, time-dependent treatment actions, and an outcome process or single final outcome of interest. A statically optimal individualized treatment rule (as introduced in van der Laan, Petersen & Joffe (2005), Petersen & van der Laan (2006)) is an (unknown) treatment rule that, at any point in time, conditions on a user-supplied subset of the past and computes the future static treatment regimen that maximizes a (conditional) mean future outcome …


Doubly Robust Censoring Unbiased Transformations, Daniel Rubin, Mark J. Van Der Laan Jun 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

We consider random design nonparametric regression when the response variable is subject to right censoring. Following the work of Fan and Gijbels (1994), a common approach to this problem is to apply what has been termed a censoring unbiased transformation to the data to obtain surrogate responses, and then enter these surrogate responses with covariate data into standard smoothing algorithms. Existing censoring unbiased transformations generally depend on either the conditional survival function of the response of interest, or that of the censoring variable. We show that a mapping introduced in another statistical context is in fact a censoring unbiased transformation …
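As an illustration of the general idea of a censoring unbiased transformation, the sketch below implements the simple inverse-probability-of-censoring weighted surrogate, which depends only on the survival function of the censoring variable. It is not the doubly robust mapping studied in the paper. It assumes the censoring time is independent of the response and covariates and that its survival function G is known; the names `ipcw_transform` and `exp_censor_survival` and the exponential censoring model are hypothetical. Under those assumptions the surrogate has the same conditional mean as the uncensored response, so it can be passed to a standard smoother.

```python
import math


def ipcw_transform(observed_time, uncensored, censor_survival):
    """Surrogate response: observed_time / G(observed_time) if uncensored, else 0.

    censor_survival: function t -> P(C > t) for the censoring time C
    (assumed known here; in practice it would have to be estimated).
    """
    if not uncensored:
        return 0.0
    g = censor_survival(observed_time)
    if g <= 0.0:
        raise ValueError("censoring survival probability must be positive")
    return observed_time / g


def exp_censor_survival(t, rate=0.1):
    """Hypothetical censoring model: exponential with the given rate."""
    return math.exp(-rate * t)


# (observed time, uncensored indicator) pairs, purely illustrative.
data = [(2.0, True), (5.0, False)]
surrogates = [ipcw_transform(t, d, exp_censor_survival) for t, d in data]
```

Uncensored responses are inflated to compensate for the censored ones, which are set to zero; large observed times with small G therefore receive large weights, one reason transformations that also use the response's conditional survival function are of interest.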


A Method To Increase The Power Of Multiple Testing Procedures Through Sample Splitting, Daniel Rubin, Sandrine Dudoit, Mark J. Van Der Laan Jun 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

Consider the standard multiple testing problem where many hypotheses are to be tested, each hypothesis is associated with a test statistic, and large test statistics provide evidence against the null hypotheses. One proposal to provide probabilistic control of Type-I errors is the use of procedures ensuring that the expected number of false positives does not exceed a user-supplied threshold. Among such multiple testing procedures, we derive the "most powerful" method, meaning the test statistic cutoffs that maximize the expected number of true positives. Unfortunately, these optimal cutoffs depend on the true unknown data generating distribution, so they could never be used …
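For orientation, the sketch below shows the simplest procedure in this class: a single common cutoff that keeps the expected number of false positives at or below a user-supplied threshold. It is a baseline, not the optimal hypothesis-specific cutoffs derived in the paper. It assumes each test statistic is marginally standard normal under its null hypothesis; the function name `common_cutoff` and the numbers used are hypothetical. With m tests and cutoff c, the expected number of false positives is at most m times the null tail probability P(Z > c), so c is chosen to make that bound equal to the threshold.

```python
from statistics import NormalDist


def common_cutoff(num_tests, max_expected_false_positives):
    """Common cutoff c with num_tests * P(Z > c) = max_expected_false_positives.

    Assumption: every test statistic is N(0, 1) under its null hypothesis.
    """
    if not 0 < max_expected_false_positives < num_tests:
        raise ValueError("threshold must lie strictly between 0 and num_tests")
    tail = max_expected_false_positives / num_tests
    return NormalDist().inv_cdf(1.0 - tail)


# Illustrative setting: 1000 hypotheses, at most 1 expected false positive.
cutoff = common_cutoff(1000, 1.0)
bound = 1000 * (1.0 - NormalDist().cdf(cutoff))
```

The bound holds regardless of dependence between the test statistics because it uses only linearity of expectation, which is also why a common cutoff can be conservative and leaves room for more powerful choices.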


Multiple Tests Of Association With Biological Annotation Metadata, Sandrine Dudoit, Sunduz Keles, Mark J. Van Der Laan Mar 2006

U.C. Berkeley Division of Biostatistics Working Paper Series

We propose a general and formal statistical framework for the multiple tests of associations between known fixed features of a genome and unknown parameters of the distribution of variable features of this genome in a population of interest. The known fixed gene-annotation profiles, corresponding to the fixed features of the genome, may concern Gene Ontology (GO) annotation, pathway membership, regulation by particular transcription factors, nucleotide sequences, or protein sequences. The unknown gene-parameter profiles, corresponding to the variable features of the genome, may be, for example, regression coefficients relating genome-wide transcript levels or DNA copy numbers to possibly censored biological and …