Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Biostatistics

COBRA

U.C. Berkeley Division of Biostatistics Working Paper Series

2011

Articles 1 - 14 of 14

Full-Text Articles in Physical Sciences and Mathematics

Estimation Of A Non-Parametric Variable Importance Measure Of A Continuous Exposure, Antoine Chambaz, Pierre Neuvial, Mark J. Van Der Laan Oct 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

We define a new measure of variable importance of an exposure on a continuous outcome, accounting for potential confounders. The exposure features a reference level x0 with positive mass and a continuum of other levels. To estimate this measure, we fully develop the semi-parametric methodology of targeted minimum loss estimation (TMLE) [van der Laan & Rubin, 2006; van der Laan & Rose, 2011]. We cover the whole spectrum of its theoretical study (convergence of the iterative procedure at the core of the TMLE methodology; consistency and asymptotic normality of the estimator), practical implementation, simulation …


Targeted Minimum Loss Based Estimation Of An Intervention Specific Mean Outcome, Mark J. Van Der Laan, Susan Gruber Aug 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Targeted minimum loss based estimation (TMLE) provides a template for the construction of semiparametric locally efficient double robust substitution estimators of the target parameter of the data generating distribution in a semiparametric censored data or causal inference model based on a sample of independent and identically distributed copies from this data generating distribution (van der Laan and Rubin (2006), van der Laan (2008), van der Laan and Rose (2011)). TMLE requires 1) writing the target parameter as a particular mapping from a typically infinite dimensional parameter of the probability distribution of the unit data structure into the parameter space, 2) …


Population Intervention Causal Effects Based On Stochastic Interventions, Ivan Diaz Munoz, Mark J. Van Der Laan Aug 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Estimating the causal effect of an intervention on a population typically involves defining parameters in a nonparametric structural equation model (NPSEM; Pearl, 2000) in which the treatment or exposure is deterministically assigned in a static or dynamic way. We define a new causal parameter that takes into account the fact that intervention policies can result in stochastically assigned exposures. The statistical parameter that identifies the causal parameter of interest is established. Inverse probability of treatment weighting (IPTW), augmented IPTW (A-IPTW), and targeted maximum likelihood estimators (TMLE) are developed. A simulation study is performed to demonstrate the properties of these …


Targeted Maximum Likelihood Estimation Of Natural Direct Effect, Wenjing Zheng, Mark J. Van Der Laan Jul 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

In many causal inference problems, one is interested in the direct causal effect of an exposure on an outcome of interest that is not mediated by certain intermediate variables. Robins and Greenland (1992) and Pearl (2000) formalized the definition of two types of direct effects (natural and controlled) under the counterfactual framework. Since then, identifiability conditions for these effects have been studied extensively. By contrast, considerably fewer efforts have been invested in the estimation problem of the natural direct effect. In this article, we propose a semiparametric efficient, multiply robust estimator for the natural direct effect of a binary treatment …


Targeted Minimum Loss Based Estimation Based On Directly Solving The Efficient Influence Curve Equation, Paul Chaffee, Mark J. Van Der Laan Jul 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Applying targeted maximum likelihood estimation to longitudinal data can be computationally intensive. As the number of time points and/or number of intermediate factors grows, the computational resources consumed by these algorithms likewise increase. Different TMLE algorithms have different computational speeds and implementation challenges; there may also be efficiency differences among the corresponding estimators. The algorithm we describe here proceeds by solving the empirical efficient influence curve equation directly using numerical computation methods, rather than indirectly (by solving a score equation), which is the usual route. We believe that this estimator is the simplest of the TMLE procedures to implement in …


Targeted Methods For Finding Quantitative Trait Loci, Hui Wang, Sherri Rose, Mark J. Van Der Laan Jul 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Conventional genetic mapping methods typically assume parametric models with Gaussian errors, and obtain parameter estimates through maximum likelihood estimation. We propose a general semiparametric model to map quantitative trait loci (QTL) in experimental crosses. In contrast with widely-used interval mapping (IM) derived methods, our model requires fewer assumptions and also accommodates various machine learning algorithms. Estimation using both targeted maximum likelihood and collaborative targeted maximum likelihood methods is compared to a composite interval mapping (CIM) approach. We demonstrate with simulations and real data analyses that, on average, our semiparametric targeted learning approach produces less biased QTL effect estimates than those …


Targeted Maximum Likelihood Estimation Of Conditional Relative Risk In A Semi-Parametric Regression Model, Cathy Tuglus, Kristin E. Porter, Mark J. Van Der Laan Jun 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

The conditional relative risk is an important measure in medical and epidemiological studies when the outcome of interest is binary (i.e. disease vs. no disease). When the outcome is common, estimation of conditional relative risk and related parameters can be problematic, especially when the exposure or covariates are continuous. We propose a new estimation procedure based on targeted maximum likelihood methodology that targets the parameters relating to the conditional relative risk for common outcomes under a log-linear, or multiplicative, semi-parametric model. In this paper, we present three possible targeted maximum likelihood estimators for relative risk parameters implied by such a …


Super Learner Based Conditional Density Estimation With Application To Marginal Structural Models, Ivan Diaz Munoz, Mark J. Van Der Laan Jun 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

In this paper we present a histogram-like estimator of a conditional density that uses super learner cross-validation to estimate the histogram probabilities, as well as the optimal number and position of the bins. This estimator is an alternative to kernel density estimators when the dimension of the problem is large. We demonstrate its applicability to estimation of Marginal Structural Model (MSM) parameters in which an initial estimator of the treatment mechanism is needed. MSM estimation based on the proposed density estimator results in less biased estimates than estimation based on a misspecified parametric model.


A General Implementation Of TMLE For Longitudinal Data Applied To Causal Inference In Survival Analysis, Ori M. Stitelman, Victor De Gruttola, Mark J. Van Der Laan Apr 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

In many randomized controlled trials the outcome of interest is a time to event, and one measures on each subject baseline covariates and time-dependent covariates until the subject either drops out, the time to event is observed, or the end of study is reached. The goal of such a study is to assess the causal effect of the treatment on the survival curve. Standard methods (e.g., Kaplan-Meier estimator, Cox proportional hazards) ignore the available baseline and time-dependent covariates, and are therefore biased if the drop-out is affected by these covariates, and are always inefficient. We present a targeted maximum likelihood estimator of …


Targeted Minimum Loss Based Estimator That Outperforms A Given Estimator, Susan Gruber, Mark J. Van Der Laan Apr 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Targeted minimum loss based estimation (TMLE) provides a template for the construction of semiparametric locally efficient double robust substitution estimators of the target parameter of the data generating distribution in a semiparametric censored data or causal inference model (van der Laan and Rubin (2006), van der Laan (2008), van der Laan and Rose (2011)). In this article we demonstrate how to construct a TMLE that also satisfies the property that it is at least as efficient as a user-supplied asymptotically linear estimator. For the sake of illustration we focus on estimation of the additive average causal effect of a point …


The Relative Performance Of Targeted Maximum Likelihood Estimators, Kristin E. Porter, Susan Gruber, Mark J. Van Der Laan, Jasjeet S. Sekhon Apr 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

There is an active debate in the literature on censored data about the relative performance of model based maximum likelihood estimators, IPCW-estimators, and a variety of double robust semiparametric efficient estimators. Kang and Schafer (2007) demonstrate the fragility of double robust and IPCW-estimators in a simulation study with positivity violations. They focus on a simple missing data problem with covariates where one desires to estimate the mean of an outcome that is subject to missingness. Responses by Robins et al. (2007), Tsiatis and Davidian (2007), Tan (2007a) and Ridgeway and McCaffrey (2007) further explore the challenges faced by double robust …
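The simple missing-data problem mentioned in this abstract (estimating the mean of an outcome that is subject to missingness) can be illustrated with a minimal sketch. The data-generating process below is hypothetical and is not the Kang and Schafer design; the response probability is assumed known, whereas in practice it would be estimated. The sketch only contrasts a complete-case mean with a stabilized inverse-probability-weighted mean.

```python
import math
import random


def expit(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def compare_estimators(n=20000, seed=0):
    """Return (complete-case mean, stabilized IPW mean) for a toy problem.

    Hypothetical setup: W ~ N(0, 1), Y = 1 + 2 W + noise, so E[Y] = 1.
    Y is observed with probability expit(W), which is assumed known here.
    """
    rng = random.Random(seed)
    cc_sum = 0.0    # sum of observed outcomes
    cc_count = 0    # number of observed outcomes
    ipw_num = 0.0   # sum of observed outcomes weighted by 1 / pi
    ipw_den = 0.0   # sum of the weights 1 / pi (stabilizing denominator)
    for _ in range(n):
        w = rng.gauss(0.0, 1.0)
        y = 1.0 + 2.0 * w + rng.gauss(0.0, 1.0)
        pi = expit(w)
        if rng.random() < pi:  # outcome observed
            cc_sum += y
            cc_count += 1
            ipw_num += y / pi
            ipw_den += 1.0 / pi
    return cc_sum / cc_count, ipw_num / ipw_den


TRUE_MEAN = 1.0
cc_mean, ipw_mean = compare_estimators()
print("complete-case mean:", round(cc_mean, 3))
print("stabilized IPW mean:", round(ipw_mean, 3))
```

Because subjects with large W are both more likely to respond and have larger outcomes, the complete-case mean overshoots the true mean, while the weighted mean corrects for the selection. The weights grow as the response probability approaches zero, which is where positivity problems of the kind discussed in the abstract arise.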


Estimation And Testing In Targeted Group Sequential Covariate-Adjusted Randomized Clinical Trials, Antoine Chambaz, Mark J. Van Der Laan Apr 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

This article is devoted to the construction and asymptotic study of adaptive group sequential covariate-adjusted randomized clinical trials analyzed through the prism of the semiparametric methodology of targeted maximum likelihood estimation (TMLE). We show how to build, as the data accrue group-sequentially, a sampling design which targets a user-supplied optimal design. We also show how to carry out a sound TMLE statistical inference based on such an adaptive sampling scheme (thereby extending some results so far known only in the i.i.d. setting), and how group-sequential testing applies on top of it. The procedure is robust (i.e., consistent even if the …


Threshold Regression Models Adapted To Case-Control Studies, And The Risk Of Lung Cancer Due To Occupational Exposure To Asbestos In France, Antoine Chambaz, Dominique Choudat, Catherine Huber, Jean-Claude Pairon, Mark J. Van Der Laan Mar 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Asbestos has been known for many years as a powerful carcinogen. Our purpose is to quantify the relationship between occupational exposure to asbestos and an increased risk of lung cancer. Furthermore, we wish to tackle the very delicate question of the evaluation, in subjects suffering from lung cancer, of how much the amount of exposure to asbestos explains the occurrence of the cancer. For this purpose, we rely on a recent French case-control study. We build a large collection of threshold regression models, data-adaptively select a better model from it by multi-fold likelihood-based cross-validation, then fit the …


tmle: An R Package For Targeted Maximum Likelihood Estimation, Susan Gruber, Mark J. Van Der Laan Feb 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Targeted maximum likelihood estimation (TMLE) presents an approach for construction of an efficient double-robust semi-parametric substitution estimator of a target feature of the data generating distribution, such as a statistical association measure or a causal effect parameter. tmle is a recently developed R package that implements TMLE for estimation of the effect of a binary treatment at a single point in time on an outcome of interest, controlling for user supplied covariates: the additive treatment effect, the relative risk, the odds ratio. The package allows outcome data with missingness, and experimental units that contribute repeated records of the point-treatment data …
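To make the idea of a targeted substitution estimator of the additive treatment effect concrete, here is a minimal Python sketch. It is not the tmle package and does not reproduce its interface. The simulated data, the deliberately crude initial outcome fit (arm-specific means that ignore the covariate), and the assumption that the treatment mechanism is known are all illustrative choices; in practice both the outcome regression and the treatment mechanism would be estimated.

```python
import math
import random


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def simulate(n, seed=1):
    """Hypothetical data: W confounds binary treatment A and binary outcome Y."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        w = rng.gauss(0.0, 1.0)
        a = 1 if rng.random() < expit(w) else 0
        y = 1 if rng.random() < expit(-0.5 + a + w) else 0
        data.append((w, a, y))
    return data


def tmle_ate(data):
    """Return (unadjusted difference in means, targeted estimate of the ATE)."""
    n = len(data)
    # Initial outcome fit: arm-specific means, misspecified because W is ignored.
    q_init = {}
    for arm in (0, 1):
        ys = [y for (_, a, y) in data if a == arm]
        q_init[arm] = sum(ys) / len(ys)
    naive = q_init[1] - q_init[0]
    # Treatment mechanism P(A = 1 | W), assumed known for this sketch.
    g = [expit(w) for (w, _, _) in data]
    # Clever covariate H(A, W) = A / g(W) - (1 - A) / (1 - g(W)).
    h = [a / gi - (1 - a) / (1.0 - gi) for (_, a, _), gi in zip(data, g)]
    offset = [logit(q_init[a]) for (_, a, _) in data]
    # Targeting step: one-dimensional logistic fluctuation fitted by Newton's method.
    eps = 0.0
    for _ in range(50):
        score = 0.0
        info = 0.0
        for (_, _, y), hi, off in zip(data, h, offset):
            p = expit(off + eps * hi)
            score += hi * (y - p)
            info += hi * hi * p * (1.0 - p)
        step = score / info
        eps += step
        if abs(step) < 1e-10:
            break
    # Substitution (plug-in) estimator based on the updated outcome fit.
    total = 0.0
    for gi in g:
        q1 = expit(logit(q_init[1]) + eps / gi)
        q0 = expit(logit(q_init[0]) - eps / (1.0 - gi))
        total += q1 - q0
    return naive, total / n


data = simulate(10000)
naive_est, tmle_est = tmle_ate(data)
# Sample average of the true conditional effects, used as the benchmark.
truth = sum(expit(0.5 + w) - expit(-0.5 + w) for (w, _, _) in data) / len(data)
print("benchmark:", round(truth, 3))
print("unadjusted:", round(naive_est, 3))
print("targeted:", round(tmle_est, 3))
```

The initial fit is wrong on purpose: with a correct treatment mechanism, the targeting step still moves the plug-in estimate toward the benchmark, which is the double-robustness property the abstract refers to.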