Open Access. Powered by Scholars. Published by Universities.®

Biostatistics Commons

COBRA

2013

Articles 1 - 30 of 62

Full-Text Articles in Biostatistics

Issues Related To Combining Multiple Speciated PM2.5 Data Sources In Spatio-Temporal Exposure Models For Epidemiology: The NPACT Case Study, Sun-Young Kim, Lianne Sheppard, Timothy V. Larson, Joel Kaufman, Sverre Vedal Dec 2013

UW Biostatistics Working Paper Series

Background: Regulatory monitoring data have been the most common exposure data resource in studies of the association between long-term PM2.5 components and health. However, data collected for regulatory purposes may not be compatible with epidemiological studies.

Objectives: We aimed to explore three important features of the PM2.5 component monitoring data obtained from multiple sources to combine all available data for developing spatio-temporal prediction models in the National Particle Component and Toxicity (NPACT) study.

Methods: The NPACT monitoring data were collected in an extensive monitoring campaign targeting cohort participants. The regulatory monitoring data were obtained from the Chemical Speciation …


Prediction Of Fine Particulate Matter Chemical Components For The Multi-Ethnic Study Of Atherosclerosis Cohort: A Comparison Of Two Modeling Approaches, Sun-Young Kim, Lianne Sheppard, Silas Bergen, Adam A. Szpiro, Paul D. Sampson, Joel Kaufman, Sverre Vedal Dec 2013

UW Biostatistics Working Paper Series

Recent epidemiological cohort studies of the health effects of PM2.5 have developed exposure estimates from advanced exposure prediction models. Such models represent spatial variability across participant residential locations. However, few cohort studies have developed exposure predictions for PM2.5 components. We used two exposure modeling approaches to obtain long-term average predicted concentrations for four PM2.5 components: sulfur, silicon, and elemental and organic carbon (EC and OC). The models were specifically developed for the Multi-Ethnic Study of Atherosclerosis (MESA) cohort as a part of the National Particle Component and Toxicity (NPACT) study. The spatio-temporal model used 2-week average measurements …


Characterizing Expected Benefits Of Biomarkers In Treatment Selection, Ying Huang, Eric Laber, Holly Janes Nov 2013

UW Biostatistics Working Paper Series

Biomarkers associated with treatment response heterogeneity hold potential for treatment selection. In practice, the decision regarding whether to adopt a treatment selection marker depends on the effect of treatment selection on the rate of targeted disease as well as the additional cost associated with the treatment. We propose an expected benefit measure that incorporates both aspects to quantify a biomarker's treatment selection capacity. This measure extends an existing decision-theoretic framework to account for the fact that optimal treatment absent marker information varies with the cost of treatment. In addition, we establish upper and lower bounds for the performance of a …


A General Instrumental Variable Framework For Regression Analysis With Outcome Missing Not At Random, Eric J. Tchetgen Tchetgen, Kathleen Wirth Nov 2013

Harvard University Biostatistics Working Paper Series

No abstract provided.


Alternative Identification And Inference For The Effect Of Treatment On The Treated With An Instrumental Variable, Eric J. Tchetgen Tchetgen, Stijn Vansteelandt Nov 2013

Harvard University Biostatistics Working Paper Series

No abstract provided.


Identification And Estimation Of Survivor Average Causal Effects, Eric J. Tchetgen Tchetgen Nov 2013

Harvard University Biostatistics Working Paper Series

No abstract provided.


On The Causal Interpretation Of Race In Regressions Adjusting For Confounding And Mediating Variables, Tyler J. VanderWeele, Whitney Robinson Nov 2013

Harvard University Biostatistics Working Paper Series

We consider different possible interpretations of the “effect of race” when regressions are run with race as an exposure variable, controlling also for various confounding and mediating variables. When adjustment is made for socioeconomic status early in a person's life, we discuss under what contexts the regression coefficients for race can be interpreted as corresponding to the extent to which a racial disparity would remain if various socioeconomic distributions early in life across racial groups could be equalized. When adjustment is also made for adult socioeconomic status, we note how the overall disparity can be decomposed into the portion that …


A Unification Of Mediation And Interaction, Tyler J. VanderWeele Nov 2013

Harvard University Biostatistics Working Paper Series

We show that the overall effect of an exposure on an outcome, in the presence of a mediator with which the exposure may interact, can be decomposed into four components: (i) the effect of the exposure in the absence of the mediator, (ii) the interactive effect when the mediator is left to what it would be in the absence of exposure, (iii) a mediated interaction, and (iv) a pure mediated effect. These four components respectively correspond to the portion of the effect that is due to neither mediation nor interaction, to just interaction (but not mediation), to both mediation and …


Challenges In Estimating The Causal Effect Of An Intervention With Pre-Post Data (Part 1): Definition & Identification Of The Causal Parameter, Ann M. Weber, Mark J. van der Laan, Maya L. Petersen Oct 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

There is mixed evidence of the effectiveness of interventions operating on a large scale. Although the lack of consistent results is generally attributed to problems of implementation or governance of the program, the failure to find a statistically significant effect (or the success of finding one) may be due to choices made in the evaluation. To demonstrate the potential limitations and pitfalls of the usual analytic methods used for estimating causal effects, we apply the first half of a roadmap for causal inference to a pre-post evaluation of a community-level, national nutrition program. Selection into the program was non-random and …


Variable Importance And Prediction Methods For Longitudinal Problems With Missing Variables, Ivan Diaz, Alan E. Hubbard, Anna Decker, Mitchell Cohen Oct 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

In this paper we present prediction and variable importance (VIM) methods for longitudinal data sets containing both continuous and binary exposures subject to missingness. We demonstrate the use of these methods for prognosis of medical outcomes of severe trauma patients, a field in which current medical practice involves rules of thumb and scoring methods that only use a few variables and ignore the dynamic and high-dimensional nature of trauma recovery. Well-principled prediction and VIM methods can thus provide a tool to make care decisions informed by the patient’s high-dimensional physiological and clinical history. Our VIM parameters can be causally interpreted …


Targeted Learning Of An Optimal Dynamic Treatment, And Statistical Inference For Its Mean Outcome, Mark J. van der Laan Oct 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Suppose we observe n independent and identically distributed observations of a time-dependent random variable consisting of baseline covariates, initial treatment and censoring indicator, intermediate covariates, subsequent treatment and censoring indicator, and a final outcome. For example, this could be data generated by a sequentially randomized controlled trial, where subjects are sequentially randomized to a first line and second line treatment, possibly assigned in response to an intermediate biomarker, and are subject to right-censoring. In this article we consider estimation of an optimal dynamic multiple time-point treatment rule defined as the rule that maximizes the mean outcome under the dynamic treatment, …


Sparse Median Graphs Estimation In A High Dimensional Semiparametric Model, Fang Han, Han Liu, Brian Caffo Oct 2013

Johns Hopkins University, Dept. of Biostatistics Working Papers

In this manuscript a unified framework for conducting inference on complex aggregated data in high dimensional settings is proposed. The data are assumed to be a collection of multiple non-Gaussian realizations with underlying undirected graphical structures. Utilizing the concept of median graphs in summarizing the commonality across these graphical structures, a novel semiparametric approach to modeling such complex aggregated data is provided along with robust estimation of the median graph, which is assumed to be sparse. The estimator is proved to be consistent in graph recovery and an upper bound on the rate of convergence is given. Experiments on both …


Adapting Data Adaptive Methods For Small, But High Dimensional Omic Data: Applications To GWAS/EWAS And More, Sara Kherad Pajouh, Alan E. Hubbard, Martyn T. Smith Oct 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Exploratory analysis of high dimensional "omics" data has received much attention since the explosion of high-throughput technology, which allows simultaneous screening of tens of thousands of characteristics (genomics, metabolomics, proteomics, adducts, etc.). Part of this trend has been an increase in the dimension of exposure data in studies of environmental exposure and associated biomarkers. Though some of the general approaches, such as GWAS, are transferable, what has received less focus is 1) how to derive estimation of independent associations in the context of many competing causes, without resorting to a misspecified model, and 2) how to derive accurate small-sample inference …


Regression Trees For Longitudinal Data, Madan Gopal Kundu, Jaroslaw Harezlak Sep 2013

COBRA Preprint Series

Often when a longitudinal change is studied in a population of interest, we find that changes over time are heterogeneous (in terms of time and/or covariate effects), and a traditional linear mixed effect model [Laird and Ware, 1982] that assumes a common parametric form for covariates and time may not be applicable to the entire population. This is usually the case in studies with many possible predictors influencing the response trajectory. For example, Raudenbush [2001] used depression as an example to argue that it is incorrect to assume that all the people in a given population …


Net Reclassification Index: A Misleading Measure Of Prediction Improvement, Margaret Sullivan Pepe, Holly Janes, Kathleen F. Kerr, Bruce M. Psaty Sep 2013

UW Biostatistics Working Paper Series

The evaluation of biomarkers to improve risk prediction is a common theme in modern research. Since its introduction in 2008, the net reclassification index (NRI) (Pencina et al. 2008, Pencina et al. 2011) has gained widespread use as a measure of prediction performance with over 1,200 citations as of June 30, 2013. The NRI is considered by some to be more sensitive to clinically important changes in risk than the traditional change in the AUC (Delta AUC) statistic (Hlatky et al. 2009). Recent statistical research has raised questions, however, about the validity of conclusions based on the NRI. (Hilden and …


Normalization Techniques For Statistical Inference From Magnetic Resonance Imaging, Russell T. Shinohara, Elizabeth M. Sweeney, Jeff Goldsmith, Navid Shiee, Farrah J. Mateen, Peter A. Calabresi, Samson Jarso, Dzung L. Pham, Daniel S. Reich, Ciprian M. Crainiceanu Aug 2013

UPenn Biostatistics Working Papers

While computed tomography and other imaging techniques are measured in absolute units with physical meaning, magnetic resonance images are expressed in arbitrary units that are difficult to interpret and differ between study visits and subjects. Much work in the image processing literature on intensity normalization has focused on histogram matching and other histogram mapping techniques, with little emphasis on normalizing images to have biologically interpretable units. Furthermore, there are no formalized principles or goals for the crucial comparability of image intensities within and across subjects. To address this, we propose a set of criteria necessary for the normalization of images. …


Net Reclassification Indices For Evaluating Risk Prediction Instruments: A Critical Review, Kathleen F. Kerr, Zheyu Wang, Holly Janes, Robyn McClelland, Bruce M. Psaty, Margaret S. Pepe Aug 2013

UW Biostatistics Working Paper Series

Background: Net Reclassification Indices (NRI) have recently become popular statistics for measuring the prediction increment of new biomarkers.

Methods: In this review, we examine the various types of NRI statistics and their correct interpretations. We evaluate the advantages and disadvantages of the NRI approach. For pre-defined risk categories, we relate NRI to existing measures of the prediction increment. We also consider statistical methodology for constructing confidence intervals for NRI statistics and evaluate the merits of NRI-based hypothesis testing.

Conclusions: Investigators using NRI statistics should report them separately for events (cases) and nonevents (controls). When there are two risk categories, the …
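
The recommendation to report NRI components separately for events and nonevents can be illustrated with a minimal sketch. It assumes the standard categorical NRI definition (net proportion of events moved to a higher risk category, and net proportion of nonevents moved to a lower one); the function name `nri_components` and the integer coding of risk categories are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence, Tuple


def nri_components(
    old_cat: Sequence[int],
    new_cat: Sequence[int],
    event: Sequence[int],
) -> Tuple[float, float]:
    """Return (event NRI, nonevent NRI) for categorical risk predictions.

    old_cat / new_cat: risk category under the old and new model
                       (higher integer = higher risk; an assumed coding).
    event:             1 for events (cases), 0 for nonevents (controls).
    """
    up_e = down_e = n_e = 0  # counts among events
    up_n = down_n = n_n = 0  # counts among nonevents
    for old, new, y in zip(old_cat, new_cat, event):
        if y:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one nonevent")
    # Events: moving up is an improvement. Nonevents: moving down is.
    nri_event = (up_e - down_e) / n_e
    nri_nonevent = (down_n - up_n) / n_n
    return nri_event, nri_nonevent
```

The overall categorical NRI is the sum of the two returned values; keeping them separate shows whether an apparent improvement comes from cases, from controls, or from both.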


Testing The Relative Performance Of Data Adaptive Prediction Algorithms: A Generalized Test Of Conditional Risk Differences, Benjamin A. Goldstein, Eric Polley, Farren Briggs, Mark J. van der Laan Jul 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

In statistical medicine, comparing the predictability or fit of two models can help to determine whether a set of prognostic variables contains additional information about medical outcomes, or whether one of two different model fits (perhaps based on different algorithms, or different sets of variables) should be preferred for clinical use. Clinical medicine has tended to rely on comparisons of clinical metrics like C-statistics and, more recently, reclassification. Such metrics rely on the outcome being categorical and utilize a specific and often obscure loss function. In classical statistics one can use likelihood ratio tests and information-based criteria if the …


Attributing Effects To Interactions, Tyler J. VanderWeele, Eric J. Tchetgen Tchetgen Jul 2013

Harvard University Biostatistics Working Paper Series

A framework is presented which allows an investigator to estimate the portion of the effect of one exposure that is attributable to an interaction with a second exposure. We show that when the two exposures are independent, the total effect of one exposure can be decomposed into a conditional effect of that exposure and a component due to interaction. The decomposition applies on difference or ratio scales. We discuss how the components can be estimated using standard regression models, and how these components can be used to evaluate the proportion of the total effect of the primary exposure attributable to …
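
As a worked illustration of the decomposition described above, the simplest case can be written out on the difference scale. The notation is assumed for illustration and is not taken from the paper: binary exposures G (primary) and E (secondary), with p_{ge} = E[Y | G = g, E = e], and G independent of E.

```latex
% Assumed notation: p_{ge} = E[Y \mid G = g, E = e]; G and E binary and independent.
\begin{align*}
E[Y \mid G=1] - E[Y \mid G=0]
  &= (p_{10} - p_{00})\,P(E=0) + (p_{11} - p_{01})\,P(E=1) \\
  &= \underbrace{(p_{10} - p_{00})}_{\text{conditional effect of } G \text{ at } E=0}
   + \underbrace{(p_{11} - p_{10} - p_{01} + p_{00})\,P(E=1)}_{\text{component due to interaction}}
\end{align*}
```

The first equality uses independence of G and E, so that P(E = e | G = g) = P(E = e). Reading either term causally additionally requires the usual no-confounding assumptions.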


Sample Size Considerations In The Design Of Cluster Randomized Trials Of Combination HIV Prevention, Rui Wang, Ravi Goyal, Quanhong Lei, M. Essex, Victor Degruttola Jul 2013

Harvard University Biostatistics Working Paper Series

No abstract provided.


Fast Covariance Estimation For High-Dimensional Functional Data, Luo Xiao, David Ruppert, Vadim Zipunnikov, Ciprian Crainiceanu Jun 2013

Johns Hopkins University, Dept. of Biostatistics Working Papers

For smoothing covariance functions, we propose two fast algorithms that scale linearly with the number of observations per function. Most available methods and software cannot smooth covariance matrices of dimension J × J with J > 500; the recently introduced sandwich smoother is an exception, but it is not adapted to smooth covariance matrices of large dimensions such as J ≥ 10,000. Covariance matrices of order J = 10,000, and even J = 100,000, are becoming increasingly common, e.g., in 2- and 3-dimensional medical imaging and high-density wearable sensor data. We introduce two new algorithms that can handle very large covariance matrices: 1) FACE: a …


Soft Null Hypotheses: A Case Study Of Image Enhancement Detection In Brain Lesions, Haochang Shou, Russell T. Shinohara, Han Liu, Daniel Reich, Ciprian Crainiceanu Jun 2013

Johns Hopkins University, Dept. of Biostatistics Working Papers

This work is motivated by a study of a population of multiple sclerosis (MS) patients using dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) to identify active brain lesions. At each visit, a contrast agent is administered intravenously to a subject and a series of images is acquired to reveal the location and activity of MS lesions within the brain. Our goal is to identify and quantify lesion enhancement location at the subject level and lesion enhancement patterns at the population level. With this example, we aim to address the difficult problem of transforming a qualitative scientific null hypothesis, such as "this …


Phylogenetic Linkage Among HIV-Infected Village Residents In Botswana: Estimation Of Clustering Rates In The Presence Of Missing Data, Nicole Bohme Carnegie, Rui Wang, Vladimir Novitsky, Victor G. Degruttola Jun 2013

Harvard University Biostatistics Working Paper Series

No abstract provided.


Statistical Inference For Data Adaptive Target Parameters, Mark J. van der Laan, Alan E. Hubbard, Sara Kherad Pajouh Jun 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Suppose one observes n i.i.d. copies of a random variable with a probability distribution that is known to be an element of a particular statistical model. In order to define our statistical target, we partition the sample into V equal-size subsamples, and use this partitioning to define V splits into an estimation sample (one of the V subsamples) and a corresponding complementary parameter-generating sample that is used to generate a target parameter. For each of the V parameter-generating samples, we apply an algorithm that maps the sample into a target parameter mapping, which represents the statistical target parameter generated by that parameter-generating …


Restricted Likelihood Ratio Tests For Functional Effects In The Functional Linear Model, Bruce J. Swihart, Jeff Goldsmith, Ciprian M. Crainiceanu Jun 2013

Johns Hopkins University, Dept. of Biostatistics Working Papers

The goal of our article is to provide a transparent, robust, and computationally feasible statistical approach for testing in the context of scalar-on-function linear regression models. In particular, we are interested in testing for the necessity of functional effects against standard linear models. Our methods are motivated by and applied to a large longitudinal study involving diffusion tensor imaging of intracranial white matter tracts in a susceptible cohort. In the context of this study, we conduct hypothesis tests that are motivated by anatomical knowledge and which support recent findings regarding the relationship between cognitive impairment and white matter demyelination. R-code …


Augmentation Of Propensity Scores For Medical Records-Based Research, Mikel Aickin Jun 2013

COBRA Preprint Series

Therapeutic research based on electronic medical records suffers from the possibility of various kinds of confounding. Over the past 30 years, propensity scores have increasingly been used to try to reduce this possibility. In this article a gap is identified in the propensity score methodology, and it is proposed to augment traditional treatment-propensity scores with outcome-propensity scores, thereby removing all other aspects of common causes from the analysis of treatment effects.


A Versatile Test For Equality Of Two Survival Functions Based On Weighted Differences Of Kaplan-Meier Curves, Hajime Uno, Lu Tian, Brian Claggett, L. J. Wei May 2013

Harvard University Biostatistics Working Paper Series

With censored event time observations, the logrank test is the most popular tool for testing the equality of two underlying survival distributions. Although this test is asymptotically distribution-free, it may not be powerful when the proportional hazards assumption is violated. Various other novel testing procedures have been proposed, which generally are derived by assuming a class of specific alternative hypotheses with respect to the hazard functions. The test considered by Pepe and Fleming (1989) is based on a linear combination of weighted differences of two Kaplan-Meier curves over time and is a natural tool to assess the difference of two …


Subsemble: An Ensemble Method For Combining Subset-Specific Algorithm Fits, Stephanie Sapp, Mark J. van der Laan, John Canny May 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Ensemble methods using the same underlying algorithm trained on different subsets of observations have recently received increased attention as practical prediction tools for massive datasets. We propose Subsemble: a general subset ensemble prediction method, which can be used for small, moderate, or large datasets. Subsemble partitions the full dataset into subsets of observations, fits a specified underlying algorithm on each subset, and uses a clever form of V-fold cross-validation to output a prediction function that combines the subset-specific fits. We give an oracle result that provides a theoretical performance guarantee for Subsemble. Through simulations, we demonstrate that Subsemble can be …
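
The three steps named in the abstract (partition, subset-specific fits, cross-validated combination) can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the underlying algorithm is assumed to be ordinary least squares, the combination step is assumed to be an unconstrained least-squares fit on cross-validated predictions, and every name (`subsemble_sketch`, `_fit_ols`, `_predict_ols`) is hypothetical.

```python
import numpy as np


def _fit_ols(X, y):
    """Fit ordinary least squares with an intercept (assumed base learner)."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def _predict_ols(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef


def subsemble_sketch(X, y, n_subsets=3, n_folds=5, seed=0):
    """Return a prediction function combining subset-specific OLS fits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    # Randomly assign each observation to one subset and, within that
    # subset, to one of V folds, so folds are balanced across subsets.
    order = rng.permutation(n)
    subset = np.empty(n, dtype=int)
    fold = np.empty(n, dtype=int)
    subset[order] = np.arange(n) % n_subsets
    fold[order] = (np.arange(n) // n_subsets) % n_folds

    # Cross-validated predictions: column j holds predictions from the
    # fit on subset j that excluded the observation's own fold.
    Z = np.empty((n, n_subsets))
    for v in range(n_folds):
        held_out = fold == v
        for j in range(n_subsets):
            train = (subset == j) & ~held_out
            coef = _fit_ols(X[train], y[train])
            Z[held_out, j] = _predict_ols(coef, X[held_out])

    # Combination step (assumed: least squares of y on the CV predictions).
    weights, *_ = np.linalg.lstsq(Z, y, rcond=None)

    # Final subset-specific fits use all observations in each subset.
    final_fits = [_fit_ols(X[subset == j], y[subset == j]) for j in range(n_subsets)]

    def predict(X_new):
        P = np.column_stack([_predict_ols(c, X_new) for c in final_fits])
        return P @ weights

    return predict
```

Because each column of cross-validated predictions comes from fits that never saw the held-out fold, the combination weights are learned on out-of-sample predictions, which is the role cross-validation plays in the description above.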


Targeted Maximum Likelihood Estimation For Dynamic And Static Longitudinal Marginal Structural Working Models, Maya L. Petersen, Joshua Schwab, Susan Gruber, Nello Blaser, Michael Schomaker, Mark J. van der Laan May 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

This paper describes a targeted maximum likelihood estimator (TMLE) for the parameters of longitudinal static and dynamic marginal structural models. We consider a longitudinal data structure consisting of baseline covariates, time-dependent intervention nodes, intermediate time-dependent covariates, and a possibly time-dependent outcome. The intervention nodes at each time point can include a binary treatment as well as a right-censoring indicator. Given a class of dynamic or static interventions, a marginal structural model is used to model the mean of the intervention-specific counterfactual outcome as a function of the intervention, time point, and possibly a subset of baseline covariates. Because …


Varying Index Coefficient Models, Shujie Ma, Peter Xuekun Song May 2013

The University of Michigan Department of Biostatistics Working Paper Series

There is a long history of utilizing interactions in regression analysis to investigate interactive effects of covariates on response variables. In this paper we aim to address two kinds of new challenges resulting from the inclusion of such high-order effects in the regression model for complex data. The first kind arises from a situation where interaction effects of individual covariates are weak but those of combined covariates are strong, and the other kind pertains to the presence of nonlinear interactive effects. Generalizing the single index coefficient regression model (Xia and Li, 1999), we propose a new class of semiparametric …