Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Biostatistics

COBRA

2011

Articles 1 - 30 of 36

Full-Text Articles in Physical Sciences and Mathematics

Flexible Distributed Lag Models Using Random Functions With Application To Estimating Mortality Displacement From Heat-Related Deaths, Roger D. Peng Dec 2011

Johns Hopkins University, Dept. of Biostatistics Working Papers

No abstract provided.


Proxy Pattern-Mixture Analysis For A Binary Variable Subject To Nonresponse, Rebecca H. Andridge, Roderick J. Little Nov 2011

The University of Michigan Department of Biostatistics Working Paper Series

We consider assessment of the impact of nonresponse for a binary survey variable Y subject to nonresponse, when there is a set of covariates observed for nonrespondents and respondents. To reduce dimensionality and for simplicity we reduce the covariates to a continuous proxy variable X that has the highest correlation with Y, estimated from a probit regression analysis of respondent data. We extend our previously proposed proxy pattern-mixture analysis (PPMA) for continuous outcomes to the binary outcome using a latent variable approach. The method does not assume data are missing at random, and creates a framework for sensitivity analyses. Maximum …


Estimation Of A Non-Parametric Variable Importance Measure Of A Continuous Exposure, Antoine Chambaz, Pierre Neuvial, Mark J. Van Der Laan Oct 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

We define a new measure of variable importance of an exposure on a continuous outcome, accounting for potential confounders. The exposure features a reference level x0 with positive mass and a continuum of other levels. To estimate this measure, we fully develop the semi-parametric methodology of targeted minimum loss estimation (TMLE) [van der Laan & Rubin, 2006; van der Laan & Rose, 2011]. We cover the whole spectrum of its theoretical study (convergence of the iterative procedure which is at the core of the TMLE methodology; consistency and asymptotic normality of the estimator), practical implementation, simulation …


Bland-Altman Plots For Evaluating Agreement Between Solid Tumor Measurements, Chaya S. Moskowitz, Mithat Gonen Sep 2011

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

Rationale and Objectives. Solid tumor measurements are regularly used in clinical trials of anticancer therapeutic agents and in clinical practice managing patients' care. Consequently, studies evaluating the reproducibility of solid tumor measurements are important, as lack of reproducibility may directly affect patient management. The authors propose utilizing a modified Bland-Altman plot with a difference metric that lends itself naturally to this situation and facilitates interpretation. Materials and Methods. The modification to the Bland-Altman plot involves replacing the difference plotted on the vertical axis with the relative percent change (RC) between the two measurements. This quantity is the same one used …
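
The modification described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the RC formula, so the code assumes RC = 100 * (second - first) / first, and the function names and toy measurements are hypothetical. The horizontal coordinate (mean of the two measurements) and the mean plus or minus 1.96 standard deviations limits of agreement follow the standard Bland-Altman construction.

```python
import statistics


def relative_change(first, second):
    """Relative percent change of `second` versus `first`.

    ASSUMPTION: the abstract does not define RC; this formula is illustrative.
    """
    return 100.0 * (second - first) / first


def bland_altman_rc(first_reads, second_reads):
    """Return plot points, mean RC (bias), and 95% limits of agreement.

    Each point is (mean of the two measurements, RC), i.e. the usual
    Bland-Altman horizontal axis with RC on the vertical axis.
    """
    points = [((a + b) / 2.0, relative_change(a, b))
              for a, b in zip(first_reads, second_reads)]
    rcs = [rc for _, rc in points]
    bias = statistics.mean(rcs)
    sd = statistics.stdev(rcs)  # sample standard deviation
    return points, bias, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical tumor diameters (mm) recorded by two readers.
reader_1 = [20.0, 35.0, 50.0, 12.0]
reader_2 = [22.0, 35.0, 45.0, 12.0]
points, bias, limits = bland_altman_rc(reader_1, reader_2)
```

Plotting `points` with horizontal lines at `bias` and at both values in `limits` would give the modified plot under these assumptions.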


A Regularization Corrected Score Method For Nonlinear Regression Models With Covariate Error, David M. Zucker, Malka Gorfine, Yi Li, Donna Spiegelman Sep 2011

Harvard University Biostatistics Working Paper Series

No abstract provided.


Longitudinal Analysis Of Spatiotemporal Processes: A Case Study Of Dynamic Contrast-Enhanced Magnetic Resonance Imaging In Multiple Sclerosis, Russell T. Shinohara, Ciprian M. Crainiceanu, Brian S. Caffo, Daniel S. Reich Sep 2011

Johns Hopkins University, Dept. of Biostatistics Working Papers

Multiple sclerosis (MS) is an immune-mediated disease in which inflammatory lesions form in the brain. In many active MS lesions, the blood-brain barrier (BBB) is disrupted and blood flows into white matter; this disruption may be related to morbidity and disability. Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) allows quantitative study of blood flow and permeability dynamics throughout the brain. This technique involves a subject being imaged sequentially during a study visit as an intravenously administered contrast agent flows into the brain. In regions where flow is abnormal, such as white matter lesions, this allows the quantification of the BBB damage. …


Movelets: A Dictionary Of Movement, Jiawei Bai, Jeff Goldsmith, Brian Caffo, Thomas A. Glass, Ciprian M. Crainiceanu Aug 2011

Johns Hopkins University, Dept. of Biostatistics Working Papers

Recent technological advances provide researchers a way of gathering real-time information on an individual’s movement through the use of wearable devices that record acceleration. In this paper, we propose a method for identifying activity types, like walking, standing, and resting, from acceleration data. Our approach decomposes movements into short components called “movelets”, and builds a reference for each activity type. Unknown activities are predicted by matching new movelets to the reference. We apply our method to data collected from a single, three-axis accelerometer and focus on activities of interest in studying physical function in elderly populations. An important technical advantage …
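
The matching idea in this abstract can be sketched as follows. This is a simplified illustration, not the authors' method: it uses a single axis rather than three, assumes Euclidean distance and nearest-neighbour matching (the abstract does not say how matching is done), and all function names and data are hypothetical.

```python
import math


def movelets(signal, length):
    """Return every overlapping window of `length` consecutive samples."""
    return [signal[i:i + length] for i in range(len(signal) - length + 1)]


def distance(a, b):
    """Euclidean distance between two equal-length windows (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_dictionary(labelled_signals, length):
    """Map each activity label to the movelets cut from its training signal."""
    return {label: movelets(sig, length)
            for label, sig in labelled_signals.items()}


def predict(window, dictionary):
    """Label a new movelet with the activity of its closest reference movelet."""
    best_label, best_dist = None, math.inf
    for label, refs in dictionary.items():
        for ref in refs:
            d = distance(window, ref)
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label


# Toy single-axis acceleration traces (hypothetical values).
training = {
    "resting": [0.0, 0.1, 0.0, -0.1, 0.0, 0.1],
    "walking": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
}
dictionary = build_dictionary(training, length=3)

new_signal = [0.9, -1.1, 1.0, 0.0, 0.05, -0.05]
labels = [predict(w, dictionary) for w in movelets(new_signal, 3)]
```

Here `labels` holds one predicted activity per window of the new signal; extending the sketch to three axes would amount to concatenating or summing distances over the axes, which is again an assumption rather than something the abstract specifies.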


Some Observations On The Wilcoxon Rank Sum Test, Scott S. Emerson Aug 2011

UW Biostatistics Working Paper Series

This manuscript presents some general comments about the Wilcoxon rank sum test. Even the most casual reader will gather that I am not too impressed with the scientific usefulness of the Wilcoxon test. However, the actual motivation is more to illustrate differences between parametric, semiparametric, and nonparametric (distribution-free) inference, and to use this example to illustrate how many misconceptions have been propagated through a focus on (semi)parametric probability models as the basis for evaluating commonly used statistical analysis models. The document itself arose as a teaching tool for courses aimed at graduate students in biostatistics and statistics, with parts of …


The Importance Of Statistical Theory In Outlier Detection, Sarah C. Emerson, Scott S. Emerson Aug 2011

UW Biostatistics Working Paper Series

We explore the performance of the outlier-sum statistic (Tibshirani and Hastie, Biostatistics 2007 8:2-8), a proposed method for identifying genes for which only a subset of a group of samples or patients exhibits differential expression levels. Our discussion focuses on this method as an example of how inattention to standard statistical theory can lead to approaches that exhibit some serious drawbacks. In contrast to the results presented by those authors, when comparing this method to several variations of the t-test, we find that the proposed method offers little benefit even in the most idealized scenarios, and suffers from a number …


Effectively Selecting A Target Population For A Future Comparative Study, Lihui Zhao, Lu Tian, Tianxi Cai, Brian Claggett, L. J. Wei Aug 2011

Harvard University Biostatistics Working Paper Series

When comparing a new treatment with a control in a randomized clinical study, the treatment effect is generally assessed by evaluating a summary measure over a specific study population. The success of the trial heavily depends on the choice of such a population. In this paper, we show a systematic, effective way to identify a promising population, for which the new treatment is expected to have a desired benefit, using the data from a current study involving similar comparator treatments. Specifically, with the existing data we first create a parametric scoring system using multiple covariates to estimate subject-specific treatment differences. …


Targeted Minimum Loss Based Estimation Of An Intervention Specific Mean Outcome, Mark J. Van Der Laan, Susan Gruber Aug 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Targeted minimum loss based estimation (TMLE) provides a template for the construction of semiparametric locally efficient double robust substitution estimators of the target parameter of the data generating distribution in a semiparametric censored data or causal inference model based on a sample of independent and identically distributed copies from this data generating distribution (van der Laan and Rubin (2006), van der Laan (2008), van der Laan and Rose (2011)). TMLE requires 1) writing the target parameter as a particular mapping from a typically infinite dimensional parameter of the probability distribution of the unit data structure into the parameter space, 2) …


Population Intervention Causal Effects Based On Stochastic Interventions, Ivan Diaz Munoz, Mark J. Van Der Laan Aug 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Estimating the causal effect of an intervention on a population typically involves defining parameters in a nonparametric structural equation model (Pearl, 2000, NPSEM) in which the treatment or exposure is deterministically assigned in a static or dynamic way. We define a new causal parameter that takes into account the fact that intervention policies can result in stochastically assigned exposures. The statistical parameter that identifies the causal parameter of interest is established. Inverse probability of treatment weighting (IPTW), augmented IPTW (A-IPTW), and targeted maximum likelihood estimators (TMLE) are developed. A simulation study is performed to demonstrate the properties of these …


Targeted Maximum Likelihood Estimation Of Natural Direct Effect, Wenjing Zheng, Mark J. Van Der Laan Jul 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

In many causal inference problems, one is interested in the direct causal effect of an exposure on an outcome of interest that is not mediated by certain intermediate variables. Robins and Greenland (1992) and Pearl (2000) formalized the definition of two types of direct effects (natural and controlled) under the counterfactual framework. Since then, identifiability conditions for these effects have been studied extensively. By contrast, considerably fewer efforts have been invested in the estimation problem of the natural direct effect. In this article, we propose a semiparametric efficient, multiply robust estimator for the natural direct effect of a binary treatment …


On The Covariate-Adjusted Estimation For An Overall Treatment Difference With Data From A Randomized Comparative Clinical Trial, Lu Tian, Tianxi Cai, Lihui Zhao, L. J. Wei Jul 2011

Harvard University Biostatistics Working Paper Series

No abstract provided.


Targeted Minimum Loss Based Estimation Based On Directly Solving The Efficient Influence Curve Equation, Paul Chaffee, Mark J. Van Der Laan Jul 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Applying targeted maximum likelihood estimation to longitudinal data can be computationally intensive. As the number of time points and/or number of intermediate factors grows, the computational resources consumed by these algorithms likewise increase. Different TMLE algorithms have different computational speeds and implementation challenges; there may also be efficiency differences of the corresponding estimators. The algorithm we describe here proceeds by solving the empirical efficient influence curve equation directly using numerical computation methods, rather than indirectly (by solving a score equation), which is the usual route. We believe that this estimator is the simplest of the TMLE procedures to implement in …


Targeted Methods For Finding Quantitative Trait Loci, Hui Wang, Sherri Rose, Mark J. Van Der Laan Jul 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Conventional genetic mapping methods typically assume parametric models with Gaussian errors, and obtain parameter estimates through maximum likelihood estimation. We propose a general semiparametric model to map quantitative trait loci (QTL) in experimental crosses. In contrast with widely-used interval mapping (IM) derived methods, our model requires fewer assumptions and also accommodates various machine learning algorithms. Estimation using both targeted maximum likelihood and collaborative targeted maximum likelihood methods is compared to a composite interval mapping (CIM) approach. We demonstrate with simulations and real data analyses that, on average, our semiparametric targeted learning approach produces less biased QTL effect estimates than those …


When Does Combining Markers Improve Classification Performance And What Are Implications For Practice?, Aasthaa Bansal, Margaret Sullivan Pepe Jun 2011

UW Biostatistics Working Paper Series

When an existing standard marker does not have sufficient classification accuracy on its own, new markers are sought with the goal of yielding a combination with better performance. The primary criterion for selecting new markers is that they have good performance on their own and preferably be uncorrelated with the standard. Most often linear combinations are considered. In this paper we investigate the increment in performance that is possible by combining a novel continuous marker with a moderately performing standard continuous marker under a variety of biologically motivated models for their joint distribution. We find that an uncorrelated continuous marker …


Targeted Maximum Likelihood Estimation Of Conditional Relative Risk In A Semi-Parametric Regression Model, Cathy Tuglus, Kristin E. Porter, Mark J. Van Der Laan Jun 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

The conditional relative risk is an important measure in medical and epidemiological studies when the outcome of interest is binary (i.e. disease vs. no disease). When the outcome is common, estimation of conditional relative risk and related parameters can be problematic, especially when the exposure or covariates are continuous. We propose a new estimation procedure based on targeted maximum likelihood methodology that targets the parameters relating to the conditional relative risk for common outcomes under a log-linear, or multiplicative, semi-parametric model. In this paper, we present three possible targeted maximum likelihood estimators for relative risk parameters implied by such a …


Super Learner Based Conditional Density Estimation With Application To Marginal Structural Models, Ivan Diaz Munoz, Mark J. Van Der Laan Jun 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

In this paper we present a histogram-like estimator of a conditional density that uses super learner cross-validation to estimate the histogram probabilities, as well as the optimal number and position of the bins. This estimator is an alternative to kernel density estimators when the dimension of the problem is large. We demonstrate its applicability to estimation of Marginal Structural Model (MSM) parameters in which an initial estimator of the treatment mechanism is needed. MSM estimation based on the proposed density estimator results in less biased estimates, when compared to estimates based on a misspecified parametric model.


Comparing ROC Curves Derived From Regression Models, Venkatraman E. Seshan, Mithat Gonen, Colin B. Begg Jun 2011

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

In constructing predictive models, investigators frequently assess the incremental value of a predictive marker by comparing the ROC curve generated from the predictive model including the new marker with the ROC curve from the model excluding the new marker. Many commentators have noticed empirically that a test of the two ROC areas often produces a non-significant result when a corresponding Wald test from the underlying regression model is significant. A recent article showed using simulations that the widely-used ROC area test [1] produces exceptionally conservative test size and extremely low power [2]. In this article we show why the ROC …


On Causal Mediation Analysis With A Survival Outcome, Eric J. Tchetgen Tchetgen Jun 2011

Harvard University Biostatistics Working Paper Series

Suppose that, having established a marginal total effect of a point exposure on a time-to-event outcome, an investigator wishes to decompose this effect into its direct and indirect pathways, also known as natural direct and indirect effects, mediated by a variable known to occur after the exposure and prior to the outcome. This paper proposes a theory of estimation of natural direct and indirect effects in two important semiparametric models for a failure time outcome. The underlying survival model for the marginal total effect, and thus for the direct and indirect effects, can either be a marginal structural Cox proportional …


Semiparametric Estimation Of Models For Natural Direct And Indirect Effects, Eric J. Tchetgen Tchetgen, Ilya Shpitser Jun 2011

Harvard University Biostatistics Working Paper Series

In recent years, researchers in the health and social sciences have become increasingly interested in mediation analysis. Specifically, upon establishing a non-null total effect of an exposure, investigators routinely wish to make inferences about the direct (indirect) pathway of the effect of the exposure not through (through) a mediator variable that occurs subsequently to the exposure and prior to the outcome. Natural direct and indirect effects are of particular interest as they generally combine to produce the total effect of the exposure and therefore provide insight on the mechanism by which it operates to produce the outcome. A semiparametric theory …


Semiparametric Theory For Causal Mediation Analysis: Efficiency Bounds, Multiple Robustness, And Sensitivity Analysis, Eric J. Tchetgen Tchetgen, Ilya Shpitser Jun 2011

Harvard University Biostatistics Working Paper Series

Whilst estimation of the marginal (total) causal effect of a point exposure on an outcome is arguably the most common objective of experimental and observational studies in the health and social sciences, in recent years, investigators have also become increasingly interested in mediation analysis. Specifically, upon establishing a non-null total effect of the exposure, investigators routinely wish to make inferences about the direct (indirect) pathway of the effect of the exposure not through (through) a mediator variable that occurs subsequently to the exposure and prior to the outcome. Although powerful semiparametric methodologies have been developed to analyze observational studies, that …


A General Implementation Of TMLE For Longitudinal Data Applied To Causal Inference In Survival Analysis, Ori M. Stitelman, Victor De Gruttola, Mark J. Van Der Laan Apr 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

In many randomized controlled trials the outcome of interest is a time to event, and one measures on each subject baseline covariates and time-dependent covariates until the subject either drops out, the time to event is observed, or the end of study is reached. The goal of such a study is to assess the causal effect of the treatment on the survival curve. Standard methods (e.g., Kaplan-Meier estimator, Cox proportional hazards) ignore the available baseline and time-dependent covariates, and are therefore biased if the drop-out is affected by these covariates, and are always inefficient. We present a targeted maximum likelihood estimator of …


Targeted Minimum Loss Based Estimator That Outperforms A Given Estimator, Susan Gruber, Mark J. Van Der Laan Apr 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Targeted minimum loss based estimation (TMLE) provides a template for the construction of semiparametric locally efficient double robust substitution estimators of the target parameter of the data generating distribution in a semiparametric censored data or causal inference model (van der Laan and Rubin (2006), van der Laan (2008), van der Laan and Rose (2011)). In this article we demonstrate how to construct a TMLE that also satisfies the property that it is at least as efficient as a user supplied asymptotically linear estimator. For the sake of illustration we focus on estimation of the additive average causal effect of a point …


The Relative Performance Of Targeted Maximum Likelihood Estimators, Kristin E. Porter, Susan Gruber, Mark J. Van Der Laan, Jasjeet S. Sekhon Apr 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

There is an active debate in the literature on censored data about the relative performance of model based maximum likelihood estimators, IPCW-estimators, and a variety of double robust semiparametric efficient estimators. Kang and Schafer (2007) demonstrate the fragility of double robust and IPCW-estimators in a simulation study with positivity violations. They focus on a simple missing data problem with covariates where one desires to estimate the mean of an outcome that is subject to missingness. Responses by Robins et al. (2007), Tsiatis and Davidian (2007), Tan (2007a) and Ridgeway and McCaffrey (2007) further explore the challenges faced by double robust …


Estimation And Testing In Targeted Group Sequential Covariate-Adjusted Randomized Clinical Trials, Antoine Chambaz, Mark J. Van Der Laan Apr 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

This article is devoted to the construction and asymptotic study of adaptive group sequential covariate-adjusted randomized clinical trials analyzed through the prism of the semiparametric methodology of targeted maximum likelihood estimation (TMLE). We show how to build, as the data accrue group-sequentially, a sampling design which targets a user-supplied optimal design. We also show how to carry out a sound TMLE statistical inference based on such an adaptive sampling scheme (therefore extending some results known so far only in the i.i.d. setting), and how group-sequential testing applies on top of it. The procedure is robust (i.e., consistent even if the …


Subsample Ignorable Likelihood For Accelerated Failure Time Models With Missing Predictors, Nanhua Zhang, Roderick J. Little Apr 2011

The University of Michigan Department of Biostatistics Working Paper Series

No abstract provided.


Threshold Regression Models Adapted To Case-Control Studies, And The Risk Of Lung Cancer Due To Occupational Exposure To Asbestos In France, Antoine Chambaz, Dominique Choudat, Catherine Huber, Jean-Claude Pairon, Mark J. Van Der Laan Mar 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Asbestos has been known for many years as a powerful carcinogen. Our purpose is to quantify the relationship between an occupational exposure to asbestos and an increase in the risk of lung cancer. Furthermore, we wish to tackle the very delicate question of the evaluation, in subjects suffering from lung cancer, of how much the amount of exposure to asbestos explains the occurrence of the cancer. For this purpose, we rely on a recent French case-control study. We build a large collection of threshold regression models, data-adaptively select a better model from it by multi-fold likelihood-based cross-validation, then fit the …


BATE Curve In Assessment Of Clinical Utility Of Predictive Biomarkers, Xiao-Hua Zhou, Yunbei Ma Feb 2011

UW Biostatistics Working Paper Series

In this paper, for time-to-event data, we propose a new statistical framework for causal inference in evaluating clinical utility of predictive biomarkers and in selecting an optimal treatment for a particular patient. This new causal framework is based on a new concept, called Biomarker Adjusted Treatment Effect (BATE) curve, which can be used to represent the clinical utility of a predictive biomarker and select an optimal treatment for one particular patient. We then propose semi-parametric methods for estimating the BATE curves of biomarkers and establish asymptotic results of the proposed estimators for the BATE curves. We also conduct extensive simulation …