Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 14 of 14

Full-Text Articles in Physical Sciences and Mathematics

TMLE For Marginal Structural Models Based On An Instrument, Boriska Toth, Mark J. Van Der Laan Jun 2016

U.C. Berkeley Division of Biostatistics Working Paper Series

We consider estimation of a causal effect of a possibly continuous treatment when treatment assignment is potentially subject to unmeasured confounding, but an instrumental variable is available. Our focus is on estimating heterogeneous treatment effects, so that the treatment effect can be a function of an arbitrary subset of the observed covariates. One setting where this framework is especially useful is with clinical outcomes. Allowing the causal dose-response curve to depend on a subset of the covariates, we define our parameter of interest to be the projection of the true dose-response curve onto a user-supplied working marginal structural model. We …


One-Step Targeted Minimum Loss-Based Estimation Based On Universal Least Favorable One-Dimensional Submodels, Mark J. Van Der Laan, Susan Gruber Mar 2016

U.C. Berkeley Division of Biostatistics Working Paper Series

Consider a study in which one observes n independent and identically distributed random variables whose probability distribution is known to be an element of a particular statistical model, and one is concerned with estimation of a particular real-valued pathwise differentiable target parameter of this data probability distribution. The targeted maximum likelihood estimator (TMLE) is an asymptotically efficient substitution estimator obtained by constructing a so-called least favorable parametric submodel through an initial estimator with score, at zero fluctuation of the initial estimator, that spans the efficient influence curve, and iteratively maximizing the corresponding parametric likelihood until no more updates …


A Generally Efficient Targeted Minimum Loss Based Estimator, Mark J. Van Der Laan Dec 2015

U.C. Berkeley Division of Biostatistics Working Paper Series

Suppose we observe n independent and identically distributed observations of a finite dimensional bounded random variable. This article is concerned with the construction of an efficient targeted minimum loss-based estimator (TMLE) of a pathwise differentiable target parameter based on a realistic statistical model.

The canonical gradient of the target parameter at a particular data distribution will depend on the data distribution through an infinite-dimensional nuisance parameter, which can be defined as the minimizer of the expectation of a loss function (e.g., log-likelihood loss). For many models and target parameters the nuisance parameter can be split into two components, …


One-Step Targeted Minimum Loss-Based Estimation Based On Universal Least Favorable One-Dimensional Submodels, Mark J. Van Der Laan Jun 2015

U.C. Berkeley Division of Biostatistics Working Paper Series

Consider a study in which one observes n independent and identically distributed random variables whose probability distribution is known to be an element of a particular statistical model, and one is concerned with estimation of a particular real-valued pathwise differentiable target parameter of this data probability distribution. The canonical gradient of the pathwise derivative of the target parameter, also called the efficient influence curve, defines an asymptotically efficient estimator as an estimator that is asymptotically linear with influence curve equal to the efficient influence curve. The targeted maximum likelihood estimator is a two-stage estimator obtained by constructing a so …


Optimal Dynamic Treatments In Resource-Limited Settings, Alexander R. Luedtke, Mark J. Van Der Laan Jan 2015

U.C. Berkeley Division of Biostatistics Working Paper Series

A dynamic treatment rule (DTR) is a treatment rule which assigns treatments to individuals based on (a subset of) their measured covariates. An optimal DTR is the DTR which maximizes the population mean outcome. Previous works in this area have assumed that treatment is an unlimited resource so that the entire population can be treated if this strategy maximizes the population mean outcome. We consider optimal DTRs in settings where the treatment resource is limited so that there is a maximum proportion of the population which can be treated. We give a general closed-form expression for an optimal stochastic DTR …


Entering The Era Of Data Science: Targeted Learning And The Integration Of Statistics And Computational Data Analysis, Mark J. Van Der Laan, Richard J.C.M. Starmans Jul 2014

U.C. Berkeley Division of Biostatistics Working Paper Series

This outlook article will appear in Advances in Statistics and it reviews the research of Dr. van der Laan's group on Targeted Learning, a subfield of statistics that is concerned with the construction of data adaptive estimators of user-supplied target parameters of the probability distribution of the data and corresponding confidence intervals, aiming to only rely on realistic statistical assumptions. Targeted Learning fully utilizes the state of the art in machine learning tools, while still preserving the important identity of statistics as a field that is concerned with both accurate estimation of the true target parameter value and assessment of …


Statistical Inference For Data Adaptive Target Parameters, Mark J. Van Der Laan, Alan E. Hubbard, Sara Kherad Pajouh Jun 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Suppose one observes n i.i.d. copies of a random variable with a probability distribution that is known to be an element of a particular statistical model. In order to define our statistical target we partition the sample into V equal-size subsamples, and use this partitioning to define V splits into an estimation sample (one of the V subsamples) and a corresponding complementary parameter-generating sample that is used to generate a target parameter. For each of the V parameter-generating samples, we apply an algorithm that maps the sample into a target parameter mapping which represents the statistical target parameter generated by that parameter-generating …
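The sample-splitting scheme described in this abstract can be illustrated with a minimal Python sketch. It assumes a random, near-equal partition of the indices; the function name `v_fold_splits` and the `seed` argument are hypothetical and do not come from the paper, and the sketch covers only the partitioning step, not the target parameter or its inference.

```python
import random


def v_fold_splits(n, V, seed=0):
    """Partition indices 0..n-1 into V near-equal subsamples.

    Returns a list of V pairs (parameter_generating, estimation), where the
    estimation sample is one subsample and the parameter-generating sample
    is its complement.
    """
    if not 1 < V <= n:
        raise ValueError("V must satisfy 1 < V <= n")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # random assignment to subsamples
    folds = [indices[v::V] for v in range(V)]
    splits = []
    for v in range(V):
        estimation = sorted(folds[v])
        parameter_generating = sorted(
            i for u in range(V) if u != v for i in folds[u]
        )
        splits.append((parameter_generating, estimation))
    return splits


if __name__ == "__main__":
    for generating, estimation in v_fold_splits(n=10, V=5):
        print("parameter-generating:", generating, "estimation:", estimation)
```

Each observation appears in exactly one estimation sample and in the parameter-generating sample of every other split.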


Statistical Inference When Using Data Adaptive Estimators Of Nuisance Parameters, Mark J. Van Der Laan Nov 2012

U.C. Berkeley Division of Biostatistics Working Paper Series

To be concrete, we focus on estimation of the treatment-specific mean, controlling for all measured baseline covariates, based on observing n independent and identically distributed copies of a random variable consisting of baseline covariates, a subsequently assigned binary treatment, and a final outcome. The statistical model only assumes possible restrictions on the conditional distribution of treatment, given the covariates, the so-called propensity score. Estimators of the treatment-specific mean involve estimation of the propensity score and/or estimation of the conditional mean of the outcome, given the treatment and covariates. In order to make these estimators asymptotically …
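As a rough illustration of the two nuisance parameters named in this abstract, the following Python sketch computes a plug-in estimate (based on the conditional mean of the outcome) and an inverse-probability-weighted estimate (based on the propensity score) of the treatment-specific mean for treatment level 1. It assumes a single discrete baseline covariate and simple stratum-wise empirical estimates; the function name and the data layout are hypothetical, and this is not the estimator developed in the paper.

```python
from collections import defaultdict


def treatment_specific_mean(data):
    """Estimate E[E[Y | A=1, W]] from records (w, a, y).

    w is a discrete covariate value, a is a binary treatment, y is the outcome.
    Returns (plug_in, ipw), both using stratum-wise empirical estimates.
    """
    n = len(data)
    count_w = defaultdict(int)          # number of units with W = w
    count_w_treated = defaultdict(int)  # number of treated units with W = w
    sum_y_treated = defaultdict(float)  # sum of outcomes among treated, W = w
    for w, a, y in data:
        count_w[w] += 1
        if a == 1:
            count_w_treated[w] += 1
            sum_y_treated[w] += y
    for w in count_w:
        if count_w_treated[w] == 0:
            raise ValueError(f"no treated units in stratum {w!r}")

    # Plug-in: average the estimated outcome regression over the covariates.
    plug_in = sum(
        (count_w[w] / n) * (sum_y_treated[w] / count_w_treated[w])
        for w in count_w
    )
    # IPW: weight treated outcomes by the inverse estimated propensity score.
    ipw = sum(
        y / (count_w_treated[w] / count_w[w]) for w, a, y in data if a == 1
    ) / n
    return plug_in, ipw


if __name__ == "__main__":
    records = [(0, 1, 1.0), (0, 0, 0.0), (0, 1, 3.0),
               (0, 0, 1.0), (1, 1, 5.0), (1, 0, 2.0)]
    print(treatment_specific_mean(records))
```

With these saturated stratum-wise estimates the two numbers coincide algebraically; they generally differ once the propensity score or the outcome regression is estimated with smoothing or data-adaptive methods, which is the setting the abstract is concerned with.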


Adaptive Matching In Randomized Trials And Observational Studies, Mark J. Van Der Laan, Laura Balzer, Maya L. Petersen Jul 2012

U.C. Berkeley Division of Biostatistics Working Paper Series

In many randomized and observational studies the allocation of treatment among a sample of n independent and identically distributed units is a function of the covariates of all sampled units. As a result, the treatment labels among the units are possibly dependent, complicating estimation and posing challenges for statistical inference. For example, cluster randomized trials frequently sample communities from some target population, construct matched pairs of communities from those included in the sample based on some metric of similarity in baseline community characteristics, and then randomly allocate a treatment and a control intervention within each matched pair. In this case, …


Nonparametric Population Average Models: Deriving The Form Of Approximate Population Average Models Estimated Using Generalized Estimating Equations, Alan E. Hubbard, Mark J. Van Der Laan Jun 2009

U.C. Berkeley Division of Biostatistics Working Paper Series

For estimating regressions for repeated measures outcome data, a popular choice is a population average model estimated by generalized estimating equations (GEE). We review in this report the derivation of the robust inference (sandwich-type estimator of the standard error). In addition, we present formally how the approximation of a misspecified working population average model relates to the true model and in turn how to interpret the results of such a misspecified model.
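The sandwich-type standard error mentioned above can be sketched in the simplest possible case. The Python code below assumes an intercept-only working model with an independence working correlation, so the GEE estimate is the pooled mean and the robust variance is the sum of squared cluster-level residual totals divided by the squared number of observations. The function name is hypothetical, no small-sample correction is applied, and this is an illustration rather than the derivation given in the report.

```python
import math


def gee_mean_with_sandwich_se(clusters):
    """Pooled mean and cluster-robust (sandwich) standard error.

    clusters: list of lists, each inner list holding the repeated outcome
    measurements for one independent unit.
    """
    total_n = sum(len(c) for c in clusters)
    if total_n == 0:
        raise ValueError("no observations")
    # Solves the estimating equation: sum over clusters of (y_ij - mu) = 0.
    mean = sum(sum(c) for c in clusters) / total_n
    # "Bread": derivative of the estimating function, here the total count.
    bread = total_n
    # "Meat": squared cluster-level sums of residuals.
    meat = sum(sum(y - mean for y in c) ** 2 for c in clusters)
    variance = meat / bread ** 2
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    mean, se = gee_mean_with_sandwich_se([[1.0, 3.0], [2.0, 2.0], [5.0]])
    print(mean, se)
```

Summing residuals within a cluster before squaring is what lets the variance estimate reflect within-cluster correlation without modeling it correctly.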


Resampling-Based Multiple Hypothesis Testing With Applications To Genomics: New Developments In The R/Bioconductor Package Multtest, Houston N. Gilbert, Katherine S. Pollard, Mark J. Van Der Laan, Sandrine Dudoit Apr 2009

U.C. Berkeley Division of Biostatistics Working Paper Series

The multtest package is a standard Bioconductor package containing a suite of functions useful for executing, summarizing, and displaying the results from a wide variety of multiple testing procedures (MTPs). In addition to many popular MTPs, the central methodological focus of the multtest package is the implementation of powerful joint multiple testing procedures. Joint MTPs are able to account for the dependencies between test statistics by effectively making use of (estimates of) the test statistics' joint null distribution. To this end, two additional bootstrap-based estimates of the test statistics' joint null distribution have been developed for use in the …
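To make concrete how a joint procedure uses an estimated joint null distribution, here is a minimal Python sketch of single-step maxT adjusted p-values, one standard joint MTP. It is not the multtest API and not one of the null-distribution estimators the abstract refers to: the function name is hypothetical, and `null_draws` is assumed to be an already-computed matrix of resampled test statistics, one row per resample and one column per hypothesis.

```python
def single_step_maxt_adjusted_pvalues(observed, null_draws):
    """Single-step maxT adjusted p-values for two-sided tests.

    observed:   list of observed test statistics, one per hypothesis.
    null_draws: list of rows; each row is one draw from the estimated joint
                null distribution of all test statistics.
    """
    if not null_draws:
        raise ValueError("need at least one null draw")
    if any(len(row) != len(observed) for row in null_draws):
        raise ValueError("each null draw must have one entry per hypothesis")
    # Taking the maximum across hypotheses within a draw is what carries the
    # dependence between test statistics into the adjustment.
    max_abs = [max(abs(z) for z in row) for row in null_draws]
    n_draws = len(max_abs)
    return [
        sum(1 for m in max_abs if m >= abs(t)) / n_draws for t in observed
    ]


if __name__ == "__main__":
    draws = [[0.5, -1.0], [2.0, 0.1], [-0.3, 0.2], [1.5, -2.5]]
    print(single_step_maxt_adjusted_pvalues([2.2, 0.9], draws))
```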


Locally Efficient Estimation Of Regression Parameters Using Current Status Data, Chris Andrews, Mark J. Van Der Laan, James M. Robins Sep 2002

U.C. Berkeley Division of Biostatistics Working Paper Series

In biostatistics applications interest often focuses on the estimation of the distribution of a time-variable T. If one only observes whether or not T exceeds an observed monitoring time C, then the data structure is called current status data, also known as interval censored data, case I. We consider this data structure extended to allow the presence of both time-independent covariates and time-dependent covariate processes that are observed until the monitoring time. We assume that the monitoring process satisfies coarsening at random.

Our goal is to estimate the regression parameter beta of the regression model T = Z*beta+epsilon where the …


Bivariate Current Status Data, Mark J. Van Der Laan, Nicholas P. Jewell Sep 2002

U.C. Berkeley Division of Biostatistics Working Paper Series

In many applications, it is of interest to estimate a bivariate distribution of two survival random variables. Complete observation of such random variables is often not possible. If one only observes whether or not each of the individual survival times exceeds a common observed monitoring time C, then the data structure is referred to as bivariate current status data (Wang and Ding, 2000). For such data, we show that the identifiable part of the joint distribution is represented by three univariate cumulative distribution functions, namely the two marginal cumulative distribution functions, and the bivariate cumulative distribution function evaluated on the …


Estimation Of The Bivariate Survival Function With Generalized Bivariate Right Censored Data Structures, Sunduz Keles, Mark J. Van Der Laan, James M. Robins Aug 2002

U.C. Berkeley Division of Biostatistics Working Paper Series

We propose a bivariate survival function estimator for a general right-censored data structure that includes a time-dependent covariate process. First, an initial estimator that generalizes Dabrowska's (1988) estimator is introduced. We obtain this estimator by a general methodology of constructing estimating functions in censored data models. The initial estimator is guaranteed to improve on Dabrowska's estimator and remains consistent and asymptotically linear under informative censoring schemes if the censoring mechanism is estimated consistently. We then construct an orthogonalized estimating function which results in a more robust and efficient estimator than our initial estimator. A simulation study demonstrates the …