Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Integrated Multiple Mediation Analysis: A Robustness–Specificity Trade-Off In Causal Structure, An-Shun Tai, Sheng-Hsuan Lin May 2020

Harvard University Biostatistics Working Paper Series

Recent methodological developments in causal mediation analysis have addressed several issues regarding multiple mediators. However, these developed methods differ in their definitions of causal parameters, assumptions for identification, and interpretations of causal effects, making it unclear which method ought to be selected when investigating a given causal effect. Thus, in this study, we construct an integrated framework, which unifies all existing methodologies, as a standard for mediation analysis with multiple mediators. To clarify the relationship between existing methods, we propose four strategies for effect decomposition: two-way, partially forward, partially backward, and complete decompositions. This study reveals how the direct and …


Concentrations Of Criteria Pollutants In The Contiguous U.S., 1979 – 2015: Role Of Model Parsimony In Integrated Empirical Geographic Regression, Sun-Young Kim, Matthew Bechle, Steve Hankey, Elizabeth (Lianne) A. Sheppard, Adam A. Szpiro, Julian D. Marshall Nov 2018

UW Biostatistics Working Paper Series

BACKGROUND: National- or regional-scale prediction models that estimate individual-level air pollution concentrations commonly include hundreds of geographic variables. However, so many variables may not be necessary, and a parsimonious approach that includes a small number of variables may achieve sufficient prediction ability. This parsimonious approach can also be applied to most criteria pollutants. This approach will be powerful when generating publicly available datasets of model predictions that support research in environmental health and other fields. OBJECTIVES: We aim to (1) build annual-average integrated empirical geographic (IEG) regression models for the contiguous U.S. for six criteria pollutants, for all years with regulatory monitoring data …


A Spline-Assisted Semiparametric Approach To Nonparametric Measurement Error Models, Fei Jiang, Yanyuan Ma Mar 2018

COBRA Preprint Series

Nonparametric estimation of the probability density function of a random variable measured with error is considered to be a difficult problem, in the sense that depending on the measurement error property, the estimation rate can be as slow as the logarithm of the sample size. Likewise, nonparametric estimation of the regression function with errors in the covariate suffers the same possibly slow rate. The traditional methods for both problems are based on deconvolution, where the slow convergence rate is caused by the quick convergence to zero of the Fourier transform of the measurement error density, which, unfortunately, appears in …


Technical Considerations In The Use Of The E-Value, Tyler J. VanderWeele, Peng Ding, Maya Mathur Feb 2018

Harvard University Biostatistics Working Paper Series

The E-value is defined as the minimum strength of association on the risk ratio scale that an unmeasured confounder would have to have with both the exposure and the outcome, conditional on the measured covariates, to explain away the observed exposure-outcome association. We have elsewhere proposed that the reporting of E-values for estimates and for the limit of the confidence interval closest to the null become routine whenever causal effects are of interest. A number of questions have arisen about the use of the E-value, including questions concerning the interpretation of the relevant confounding association parameters, the nature of the transformation …
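The definition above can be illustrated with a minimal Python sketch. It assumes the commonly cited closed form for a risk ratio, E = RR + sqrt(RR × (RR − 1)), which is not shown in the truncated abstract; the function names `e_value` and `e_value_for_ci` are hypothetical and introduced only for illustration.

```python
import math


def e_value(rr: float) -> float:
    """E-value for a risk ratio, using the assumed closed form
    E = RR + sqrt(RR * (RR - 1)). Protective estimates (RR < 1)
    are inverted first so the formula is applied on the RR >= 1 side."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_for_ci(rr: float, lower: float, upper: float) -> float:
    """E-value for the confidence limit closest to the null (RR = 1).
    If the interval already contains 1, no confounding is needed to
    move the limit to the null, so the E-value is 1."""
    if lower <= 1.0 <= upper:
        return 1.0
    closest = lower if rr > 1.0 else upper
    return e_value(closest)


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    print(round(e_value(2.0), 3))
    print(round(e_value_for_ci(2.0, 1.5, 3.0), 3))
```

Under this assumed formula, an observed risk ratio of 2 gives an E-value of 2 + sqrt(2), so an unmeasured confounder would need associations of roughly that size with both exposure and outcome to explain the estimate away.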


Evaluation Of Progress Towards The UNAIDS 90-90-90 HIV Care Cascade: A Description Of Statistical Methods Used In An Interim Analysis Of The Intervention Communities In The SEARCH Study, Laura Balzer, Joshua Schwab, Mark J. van der Laan, Maya L. Petersen Feb 2017

U.C. Berkeley Division of Biostatistics Working Paper Series

WHO guidelines call for universal antiretroviral treatment, and UNAIDS has set a global target to virally suppress most HIV-positive individuals. Accurate estimates of population-level coverage at each step of the HIV care cascade (testing, treatment, and viral suppression) are needed to assess the effectiveness of "test and treat" strategies implemented to achieve this goal. The data available to inform such estimates, however, are susceptible to informative missingness: the number of HIV-positive individuals in a population is unknown; individuals tested for HIV may not be representative of those whom a testing intervention fails to reach, and HIV-positive individuals with a viral …


Models For HSV Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret Jan 2016

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …


Nested Partially-Latent, Class Models For Dependent Binary Data, Estimating Disease Etiology, Zhenke Wu, Maria Deloria-Knoll, Scott L. Zeger Nov 2015

Johns Hopkins University, Dept. of Biostatistics Working Papers

The Pneumonia Etiology Research for Child Health (PERCH) study seeks to use modern measurement technology to infer the causes of pneumonia for which gold-standard evidence is unavailable. The paper describes a latent variable model designed to infer from case-control data the etiology distribution for the population of cases, and for an individual case given his or her measurements. We assume each observation is drawn from a mixture model for which each component represents one cause or disease class. The model addresses a major limitation of the traditional latent class approach by taking account of residual dependence among multivariate binary outcome …


A General Framework For Diagnosing Confounding Of Time-Varying And Other Joint Exposures, John W. Jackson May 2015

Harvard University Biostatistics Working Paper Series

No abstract provided.


Enhanced Precision In The Analysis Of Randomized Trials With Ordinal Outcomes, Iván Díaz, Elizabeth Colantuoni, Michael Rosenblum Oct 2014

Johns Hopkins University, Dept. of Biostatistics Working Papers

We present a general method for estimating the effect of a treatment on an ordinal outcome in randomized trials. The method is robust in that it does not rely on the proportional odds assumption. Our estimator leverages information in prognostic baseline variables, and has all of the following properties: (i) it is consistent; (ii) it is locally efficient; (iii) it is guaranteed to match or improve the precision of the standard, unadjusted estimator. To the best of our knowledge, this is the first estimator of the causal relation between a treatment and an ordinal outcome to satisfy these properties. We …


Partially-Latent Class Models (PLCM) For Case-Control Studies Of Childhood Pneumonia Etiology, Zhenke Wu, Maria Deloria-Knoll, Laura L. Hammitt, Scott L. Zeger May 2014

Johns Hopkins University, Dept. of Biostatistics Working Papers

In population studies on the etiology of disease, one goal is the estimation of the fraction of cases attributable to each of several causes. For example, pneumonia is a clinical diagnosis of lung infection that may be caused by viral, bacterial, fungal, or other pathogens. The study of pneumonia etiology is challenging because directly sampling from the lung to identify the etiologic pathogen is not standard clinical practice in most settings. Instead, measurements from multiple peripheral specimens are made. This paper considers the problem of estimating the population etiology distribution and the individual etiology probabilities. We formulate the scientific …


A Unification Of Mediation And Interaction: A Four-Way Decomposition, Tyler J. VanderWeele Mar 2014

Harvard University Biostatistics Working Paper Series

It is shown that the overall effect of an exposure on an outcome, in the presence of a mediator with which the exposure may interact, can be decomposed into four components: (i) the effect of the exposure in the absence of the mediator, (ii) the interactive effect when the mediator is left to what it would be in the absence of exposure, (iii) a mediated interaction, and (iv) a pure mediated effect. These four components, respectively, correspond to the portion of the effect that is due to neither mediation nor interaction, to just interaction (but not mediation), to both mediation …


Computational Model For Survey And Trend Analysis Of Patients With Endometriosis: A Decision Aid Tool For EBM, Salvo Reina, Vito Reina, Franco Ameglio, Mauro Costa, Alessandro Fasciani Feb 2014

COBRA Preprint Series

Endometriosis is attracting increasing worldwide attention due to its medical complexity and social impact. The European community has identified this as a “social disease”. A large amount of information comes from scientists, yet several aspects of this pathology and its staging criteria need to be clearly defined on a suitable number of individuals. In fact, available studies on endometriosis are not easily comparable due to a lack of standardized criteria for collecting patient information and poorly defined symptoms. Currently, only retrospective surgical staging is used to measure pathology intensity, while Evidence Based Medicine (EBM) requires shareable methods and correct …


Adaptive Pair-Matching In The SEARCH Trial And Estimation Of The Intervention Effect, Laura Balzer, Maya L. Petersen, Mark J. van der Laan Jan 2014

U.C. Berkeley Division of Biostatistics Working Paper Series

In randomized trials, pair-matching is an intuitive design strategy to protect study validity and to potentially increase study power. In a common design, candidate units are identified, and their baseline characteristics are used to create the best n/2 matched pairs. Within the resulting pairs, the intervention is randomized, and the outcomes are measured at the end of follow-up. We consider this design to be adaptive, because the construction of the matched pairs depends on the baseline covariates of all candidate units. As a consequence, the observed data cannot be considered as n/2 independent, identically distributed (i.i.d.) pairs of units, as current practice assumes. …


Estimating Population Treatment Effects From A Survey Sub-Sample, Kara E. Rudolph, Ivan Diaz, Michael Rosenblum, Elizabeth A. Stuart Jan 2014

Johns Hopkins University, Dept. of Biostatistics Working Papers

We consider the problem of estimating an average treatment effect for a target population from a survey sub-sample. Our motivating example is generalizing a treatment effect estimated in a sub-sample of the National Comorbidity Survey Replication Adolescent Supplement to the population of U.S. adolescents. To address this problem, we evaluate easy-to-implement methods that account for both non-random treatment assignment and a non-random two-stage selection mechanism. We compare the performance of a Horvitz-Thompson estimator using inverse probability weighting (IPW) and two double robust estimators in a variety of scenarios. We demonstrate that the two double robust estimators generally outperform IPW in …


Net Reclassification Index: A Misleading Measure Of Prediction Improvement, Margaret Sullivan Pepe, Holly Janes, Kathleen F. Kerr, Bruce M. Psaty Sep 2013

UW Biostatistics Working Paper Series

The evaluation of biomarkers to improve risk prediction is a common theme in modern research. Since its introduction in 2008, the net reclassification index (NRI) (Pencina et al. 2008, Pencina et al. 2011) has gained widespread use as a measure of prediction performance with over 1,200 citations as of June 30, 2013. The NRI is considered by some to be more sensitive to clinically important changes in risk than the traditional change in the AUC (Delta AUC) statistic (Hlatky et al. 2009). Recent statistical research has raised questions, however, about the validity of conclusions based on the NRI. (Hilden and …


Net Reclassification Indices For Evaluating Risk Prediction Instruments: A Critical Review, Kathleen F. Kerr, Zheyu Wang, Holly Janes, Robyn McClelland, Bruce M. Psaty, Margaret S. Pepe Aug 2013

UW Biostatistics Working Paper Series

Background: Net Reclassification Indices (NRI) have recently become popular statistics for measuring the prediction increment of new biomarkers.

Methods: In this review, we examine the various types of NRI statistics and their correct interpretations. We evaluate the advantages and disadvantages of the NRI approach. For pre-defined risk categories, we relate NRI to existing measures of the prediction increment. We also consider statistical methodology for constructing confidence intervals for NRI statistics and evaluate the merits of NRI-based hypothesis testing.

Conclusions: Investigators using NRI statistics should report them separately for events (cases) and nonevents (controls). When there are two risk categories, the …
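The recommendation to report NRI components separately can be sketched in Python as follows. This is a minimal illustration, assuming the category-free form of the NRI in which any increase in predicted risk counts as a move "up"; the function name `nri_components` and the toy data are hypothetical and not taken from the paper.

```python
from typing import Sequence, Tuple


def nri_components(
    old_risk: Sequence[float],
    new_risk: Sequence[float],
    outcome: Sequence[int],
) -> Tuple[float, float]:
    """Return (event NRI, nonevent NRI) for the category-free NRI.

    Event NRI    = P(risk up | event)      - P(risk down | event)
    Nonevent NRI = P(risk down | nonevent) - P(risk up | nonevent)
    """
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for old, new, y in zip(old_risk, new_risk, outcome):
        if y == 1:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one nonevent")
    return (up_e - down_e) / n_e, (down_n - up_n) / n_n


if __name__ == "__main__":
    # Toy data for illustration only.
    old = [0.2, 0.4, 0.6, 0.3]
    new = [0.3, 0.3, 0.5, 0.3]
    y = [1, 1, 0, 0]
    event_nri, nonevent_nri = nri_components(old, new, y)
    print(event_nri, nonevent_nri)
```

Keeping the two components apart, rather than summing them into a single number, shows whether an apparent improvement comes from cases, from controls, or from both.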


Attributing Effects To Interactions, Tyler J. VanderWeele, Eric J. Tchetgen Tchetgen Jul 2013

Harvard University Biostatistics Working Paper Series

A framework is presented which allows an investigator to estimate the portion of the effect of one exposure that is attributable to an interaction with a second exposure. We show that when the two exposures are independent, the total effect of one exposure can be decomposed into a conditional effect of that exposure and a component due to interaction. The decomposition applies on difference or ratio scales. We discuss how the components can be estimated using standard regression models, and how these components can be used to evaluate the proportion of the total effect of the primary exposure attributable to …


Estimating Effects On Rare Outcomes: Knowledge Is Power, Laura B. Balzer, Mark J. van der Laan May 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Many of the secondary outcomes in observational studies and randomized trials are rare. Methods for estimating causal effects and associations with rare outcomes, however, are limited, and this represents a missed opportunity for investigation. In this article, we construct a new targeted minimum loss-based estimator (TMLE) for the effect of an exposure or treatment on a rare outcome. We focus on the causal risk difference and statistical models incorporating bounds on the conditional risk of the outcome, given the exposure and covariates. By construction, the proposed estimator constrains the predicted outcomes to respect this model knowledge. Theoretically, this bounding provides …


The Net Reclassification Index (NRI): A Misleading Measure Of Prediction Improvement With Miscalibrated Or Overfit Models, Margaret Pepe, Jin Fang, Ziding Feng, Thomas Gerds, Jorgen Hilden Mar 2013

UW Biostatistics Working Paper Series

The Net Reclassification Index (NRI) is a very popular measure for evaluating the improvement in prediction performance gained by adding a marker to a set of baseline predictors. However, the statistical properties of this novel measure have not been explored in depth. We demonstrate the alarming result that the NRI statistic calculated on a large test dataset using risk models derived from a training set is likely to be positive even when the new marker has no predictive information. A related theoretical example is provided in which a miscalibrated risk model that includes an uninformative marker is proven to erroneously …


A Regionalized National Universal Kriging Model Using Partial Least Squares Regression For Estimating Annual PM2.5 Concentrations In Epidemiology, Paul D. Sampson, Mark Richards, Adam A. Szpiro, Silas Bergen, Lianne Sheppard, Timothy V. Larson, Joel Kaufman Dec 2012

UW Biostatistics Working Paper Series

Many cohort studies in environmental epidemiology require accurate modeling and prediction of fine scale spatial variation in ambient air quality across the U.S. This modeling requires the use of small spatial scale geographic or “land use” regression covariates and some degree of spatial smoothing. Furthermore, the details of the prediction of air quality by land use regression and the spatial variation in ambient air quality not explained by this regression should be allowed to vary across the continent due to the large scale heterogeneity in topography, climate, and sources of air pollution. This paper introduces a regionalized national universal kriging …


Flexible Distributed Lag Models Using Random Functions With Application To Estimating Mortality Displacement From Heat-Related Deaths, Roger D. Peng Dec 2011

Johns Hopkins University, Dept. of Biostatistics Working Papers

No abstract provided.


Assessing Association For Bivariate Survival Data With Interval Sampling: A Copula Model Approach With Application To AIDS Study, Hong Zhu, Mei-Cheng Wang Nov 2011

Johns Hopkins University, Dept. of Biostatistics Working Papers

In disease surveillance systems or registries, bivariate survival data are typically collected under interval sampling. This refers to a situation in which entry into a registry is at the time of the first failure event (e.g., HIV infection) within a calendar time interval, the time of the initiating event (e.g., birth) is retrospectively identified for all the cases in the registry, and subsequently the second failure event (e.g., death) is observed during the follow-up. Sampling bias is induced by the selection process, because the data are collected conditional on the first failure event occurring within a time interval. Consequently, the …


A Regularization Corrected Score Method For Nonlinear Regression Models With Covariate Error, David M. Zucker, Malka Gorfine, Yi Li, Donna Spiegelman Sep 2011

Harvard University Biostatistics Working Paper Series

No abstract provided.


Variable Importance Analysis With The multiPIM R Package, Stephan J. Ritter, Nicholas P. Jewell, Alan E. Hubbard Jul 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

We describe the R package multiPIM, including statistical background, functionality and user options. The package is for variable importance analysis, and is meant primarily for analyzing data from exploratory epidemiological studies, though it could certainly be applied in other areas as well. The approach taken to variable importance comes from the causal inference field, and is different from approaches taken in other R packages. By default, multiPIM uses a double robust targeted maximum likelihood estimator (TMLE) of a parameter akin to the attributable risk. Several regression methods/machine learning algorithms are available for estimating the nuisance parameters of the models, including …


Reduced Bayesian Hierarchical Models: Estimating Health Effects Of Simultaneous Exposure To Multiple Pollutants, Jennifer F. Bobb, Francesca Dominici, Roger D. Peng Jul 2011

Johns Hopkins University, Dept. of Biostatistics Working Papers

Quantifying the health effects associated with simultaneous exposure to many air pollutants is now a research priority of the US EPA. Bayesian hierarchical models (BHM) have been extensively used in multisite time series studies of air pollution and health to estimate health effects of a single pollutant adjusted for potential confounding of other pollutants and other time-varying factors. However, when the scientific goal is to estimate the impacts of many pollutants jointly, a straightforward application of BHM is challenged by the need to specify a random-effect distribution on a high-dimensional vector of nuisance parameters, which often do not have an …


Threshold Regression Models Adapted To Case-Control Studies, And The Risk Of Lung Cancer Due To Occupational Exposure To Asbestos In France, Antoine Chambaz, Dominique Choudat, Catherine Huber, Jean-Claude Pairon, Mark J. van der Laan Mar 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Asbestos has been known for many years as a powerful carcinogen. Our purpose is to quantify the relationship between an occupational exposure to asbestos and an increase of the risk of lung cancer. Furthermore, we wish to tackle the very delicate question of the evaluation, in subjects suffering from a lung cancer, of how much the amount of exposure to asbestos explains the occurrence of the cancer. For this purpose, we rely on a recent French case-control study. We build a large collection of threshold regression models, data-adaptively select a better model in it by multi-fold likelihood-based cross-validation, then fit the …


Minimum Description Length And Empirical Bayes Methods Of Identifying SNPs Associated With Disease, Ye Yang, David R. Bickel Nov 2010

COBRA Preprint Series

The goal of determining which of hundreds of thousands of SNPs are associated with disease poses one of the most challenging multiple testing problems. Using the empirical Bayes approach, the local false discovery rate (LFDR) estimated using popular semiparametric models has enjoyed success in simultaneous inference. However, the estimated LFDR can be biased because the semiparametric approach tends to overestimate the proportion of the non-associated single nucleotide polymorphisms (SNPs). One of the negative consequences is that, like conventional p-values, such LFDR estimates cannot quantify the amount of information in the data that favors the null hypothesis of no disease-association.

We …


Landmark Prediction Of Survival, Layla Parast, Tianxi Cai Sep 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


Improving Statistical Analysis Of Prospective Clinical Trials In Stem Cell Transplantation. An Inventory Of New Approaches In Survival Analysis, Aurelien Latouche Jun 2010

COBRA Preprint Series

The CLINT project is a European Union-funded project, run as a specific support action under the sixth framework programme. It is a 2-year project aimed at supporting the European Group for Blood and Marrow Transplantation (EBMT) to develop its infrastructure for the conduct of trans-European clinical trials in accordance with the EU Clinical Trials Directive, and to facilitate international prospective clinical trials in stem cell transplantation. The initial task is to create an inventory of the existing biostatistical literature on new approaches to survival analyses that are not currently widely utilised. The estimation of survival endpoints is introduced, …


Nonparametric Regression With Missing Outcomes Using Weighted Kernel Estimating Equations, Lu Wang, Andrea Rotnitzky, Xihong Lin Apr 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.