Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 30 of 50
Full-Text Articles in Statistical Models
The Need To Incorporate Communities In Compartmental Models, Michael J. Kane, Owais Gilani
Faculty Journal Articles
Tian et al. provide a framework for assessing population-level interventions of disease outbreaks through the construction of counterfactuals in a large-scale, natural experiment assessing the efficacy of mild, but early interventions compared to delayed interventions. The technique is applied to the recent SARS-CoV-2 outbreak with the population of Shenzhen, China acting as the mild-but-early treatment group and a combination of several US counties resembling Shenzhen but enacting a delayed intervention acting as the control. To help further the development of this framework and identify an avenue for further enhancement, we focus on the use and potential limitations of compartmental …
Sex And Age Differences In Prevalence And Risk Factors For Prediabetes In Mexican-Americans, Kristina Vatcheva, Belinda M. Reininger, Susan P. Fisher-Hoch, Joseph B. McCormick
School of Mathematical and Statistical Sciences Faculty Publications and Presentations
AIMS:
Over 1/3 of Americans have prediabetes, while 9.4% have type 2 diabetes. The aim of our study was to estimate the prevalence of prediabetes in Mexican Americans, a population with a known 28.2% prevalence of type 2 diabetes, by age and sex and to identify critical socio-demographic and clinical factors associated with prediabetes.
METHODS:
Data were collected between 2004 and 2017 from the Cameron County Hispanic Cohort in Texas. Weighted crude and sex- and age-stratified prevalences were calculated. Survey-weighted logistic regression analyses were conducted to identify risk factors for prediabetes.
RESULTS:
The prevalence of prediabetes (32%) was slightly higher than …
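The weighted prevalence mentioned in the methods can be illustrated with a minimal sketch. It assumes each participant carries a single sampling weight; the function name `weighted_prevalence` and the toy numbers are hypothetical and are not taken from the Cameron County Hispanic Cohort.

```python
def weighted_prevalence(outcomes, weights):
    """Weighted proportion of cases: sum(w_i * y_i) / sum(w_i).

    outcomes: 0/1 indicators (1 = has the condition); toy data only.
    weights:  sampling weights, assumed positive (hypothetical values).
    """
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * y for y, w in zip(outcomes, weights)) / total_weight


# Toy sample: two of five participants are cases, but both carry small weights.
toy_outcomes = [1, 0, 0, 1, 0]
toy_weights = [100, 300, 200, 50, 350]

crude = sum(toy_outcomes) / len(toy_outcomes)               # unweighted proportion
weighted = weighted_prevalence(toy_outcomes, toy_weights)   # weight-adjusted proportion
```

In this toy sample the crude and weighted figures differ because the cases represent a smaller share of the target population than of the sample.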
Mixture Models For Undiagnosed Prevalent Disease And Interval-Censored Incident Disease: Applications To A Cohort Assembled From Electronic Health Records, Li C Cheung, Qing Pan, Noorie Hyun, Mark Schiffman, Barbara Fetterman, Philip E Castle, Thomas Lorey, Hormuzd A Katki
Epidemiology Faculty Publications
For cost-effectiveness and efficiency, many large-scale general-purpose cohort studies are being assembled within large health-care providers who use electronic health records. Two key features of such data are that incident disease is interval-censored between irregular visits and there can be pre-existing (prevalent) disease. Because prevalent disease is not always immediately diagnosed, some disease diagnosed at later visits is actually undiagnosed prevalent disease. We consider prevalent disease as a point mass at time zero for clinical applications where there is no interest in time of prevalent disease onset. We demonstrate that the naive Kaplan-Meier cumulative risk estimator underestimates risks at early …
Models For HSV Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret
UW Biostatistics Working Paper Series
We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …
Preparedness Of Hospitals In The Republic Of Ireland For An Influenza Pandemic, An Infection Control Perspective, Mary Reidy, Fiona Ryan, Dervla Hogan, Seán Lacey, Claire Buckley
Department of Mathematics Publications
When an influenza pandemic occurs, most of the population is susceptible and attack rates can range as high as 40–50%. The most important failure in pandemic planning is the lack of standards or guidelines regarding what it means to be ‘prepared’. The aim of this study was to assess the preparedness of acute hospitals in the Republic of Ireland for an influenza pandemic from an infection control perspective.
Attributing Effects To Interactions, Tyler J. VanderWeele, Eric J. Tchetgen Tchetgen
Harvard University Biostatistics Working Paper Series
A framework is presented which allows an investigator to estimate the portion of the effect of one exposure that is attributable to an interaction with a second exposure. We show that when the two exposures are independent, the total effect of one exposure can be decomposed into a conditional effect of that exposure and a component due to interaction. The decomposition applies on difference or ratio scales. We discuss how the components can be estimated using standard regression models, and how these components can be used to evaluate the proportion of the total effect of the primary exposure attributable to …
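The difference-scale decomposition described in this abstract can be checked numerically. The sketch below assumes two binary exposures G and E that are independent, writes p_ge for the mean outcome when G = g and E = e, and uses made-up risks; the notation and the function name `decompose_total_effect` are illustrative assumptions, not the authors' own.

```python
def decompose_total_effect(p00, p10, p01, p11, pi_e):
    """Split the total effect of G (difference scale) into two parts.

    p_ge = E[Y | G=g, E=e]; pi_e = P(E=1), assumed identical in both
    G groups because G and E are taken to be independent.
    Returns (total, conditional, interaction) with
    total = conditional + interaction.
    """
    # Total effect of G, averaging over the distribution of E.
    total = (p10 * (1 - pi_e) + p11 * pi_e) - (p00 * (1 - pi_e) + p01 * pi_e)
    # Effect of G when E is absent.
    conditional = p10 - p00
    # Additive interaction contrast, scaled by how common E is.
    interaction = (p11 - p10 - p01 + p00) * pi_e
    return total, conditional, interaction


# Hypothetical risks chosen only for illustration.
total, conditional, interaction = decompose_total_effect(
    p00=0.02, p10=0.05, p01=0.04, p11=0.15, pi_e=0.3
)
share_due_to_interaction = interaction / total
```

With these invented inputs, a little under half of the total effect of G is attributed to interaction; different risks or a different prevalence of E would change that share.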
Estimating Effects On Rare Outcomes: Knowledge Is Power, Laura B. Balzer, Mark J. van der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
Many of the secondary outcomes in observational studies and randomized trials are rare. Methods for estimating causal effects and associations with rare outcomes, however, are limited, and this represents a missed opportunity for investigation. In this article, we construct a new targeted minimum loss-based estimator (TMLE) for the effect of an exposure or treatment on a rare outcome. We focus on the causal risk difference and statistical models incorporating bounds on the conditional risk of the outcome, given the exposure and covariates. By construction, the proposed estimator constrains the predicted outcomes to respect this model knowledge. Theoretically, this bounding provides …
A Regionalized National Universal Kriging Model Using Partial Least Squares Regression For Estimating Annual PM2.5 Concentrations In Epidemiology, Paul D. Sampson, Mark Richards, Adam A. Szpiro, Silas Bergen, Lianne Sheppard, Timothy V. Larson, Joel Kaufman
UW Biostatistics Working Paper Series
Many cohort studies in environmental epidemiology require accurate modeling and prediction of fine scale spatial variation in ambient air quality across the U.S. This modeling requires the use of small spatial scale geographic or “land use” regression covariates and some degree of spatial smoothing. Furthermore, the details of the prediction of air quality by land use regression and the spatial variation in ambient air quality not explained by this regression should be allowed to vary across the continent due to the large scale heterogeneity in topography, climate, and sources of air pollution. This paper introduces a regionalized national universal kriging …
Flexible Distributed Lag Models Using Random Functions With Application To Estimating Mortality Displacement From Heat-Related Deaths, Roger D. Peng
Johns Hopkins University, Dept. of Biostatistics Working Papers
No abstract provided.
Assessing Association For Bivariate Survival Data With Interval Sampling: A Copula Model Approach With Application To AIDS Study, Hong Zhu, Mei-Cheng Wang
Johns Hopkins University, Dept. of Biostatistics Working Papers
In disease surveillance systems or registries, bivariate survival data are typically collected under interval sampling. Interval sampling refers to a situation in which entry into a registry occurs at the time of the first failure event (e.g., HIV infection) within a calendar time interval, the time of the initiating event (e.g., birth) is retrospectively identified for all the cases in the registry, and subsequently the second failure event (e.g., death) is observed during the follow-up. Sampling bias is induced because the data are collected conditional on the first failure event occurring within a time interval. Consequently, the …
Depicting Estimates Using The Intercept In Meta-Regression Models: The Moving Constant Technique, Blair T. Johnson Dr., Tania B. Huedo-Medina Dr.
CHIP Documents
In any scientific discipline, the ability to portray research patterns graphically often aids greatly in interpreting a phenomenon. In part to depict phenomena, the statistics and capabilities of meta-analytic models have grown increasingly sophisticated. Accordingly, this article details how to move the constant in weighted meta-analysis regression models (viz. “meta-regression”) to illuminate the patterns in such models across a range of complexities. Although it is commonly ignored in practice, the constant (or intercept) in such models can be indispensable when it is not relegated to its usual static role. The moving constant technique makes possible estimates and confidence intervals at …
Reduced Bayesian Hierarchical Models: Estimating Health Effects Of Simultaneous Exposure To Multiple Pollutants, Jennifer F. Bobb, Francesca Dominici, Roger D. Peng
Johns Hopkins University, Dept. of Biostatistics Working Papers
Quantifying the health effects associated with simultaneous exposure to many air pollutants is now a research priority of the US EPA. Bayesian hierarchical models (BHM) have been extensively used in multisite time series studies of air pollution and health to estimate health effects of a single pollutant adjusted for potential confounding of other pollutants and other time-varying factors. However, when the scientific goal is to estimate the impacts of many pollutants jointly, a straightforward application of BHM is challenged by the need to specify a random-effect distribution on a high-dimensional vector of nuisance parameters, which often do not have an …
Threshold Regression Models Adapted To Case-Control Studies, And The Risk Of Lung Cancer Due To Occupational Exposure To Asbestos In France, Antoine Chambaz, Dominique Choudat, Catherine Huber, Jean-Claude Pairon, Mark J. van der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
Asbestos has been known for many years as a powerful carcinogen. Our purpose is to quantify the relationship between an occupational exposure to asbestos and an increase in the risk of lung cancer. Furthermore, we wish to tackle the very delicate question of the evaluation, in subjects suffering from a lung cancer, of how much the amount of exposure to asbestos explains the occurrence of the cancer. For this purpose, we rely on a recent French case-control study. We build a large collection of threshold regression models, data-adaptively select a better model from it by multi-fold likelihood-based cross-validation, then fit the …
Minimum Description Length And Empirical Bayes Methods Of Identifying SNPs Associated With Disease, Ye Yang, David R. Bickel
COBRA Preprint Series
The goal of determining which of hundreds of thousands of SNPs are associated with disease poses one of the most challenging multiple testing problems. Using the empirical Bayes approach, the local false discovery rate (LFDR) estimated using popular semiparametric models has enjoyed success in simultaneous inference. However, the estimated LFDR can be biased because the semiparametric approach tends to overestimate the proportion of the non-associated single nucleotide polymorphisms (SNPs). One of the negative consequences is that, like conventional p-values, such LFDR estimates cannot quantify the amount of information in the data that favors the null hypothesis of no disease-association.
We …
Nonparametric Regression With Missing Outcomes Using Weighted Kernel Estimating Equations, Lu Wang, Andrea Rotnitzky, Xihong Lin
Harvard University Biostatistics Working Paper Series
No abstract provided.
Causal Inference In Epidemiological Studies With Strong Confounding, Kelly L. Moore, Romain S. Neugebauer, Mark J. van der Laan, Ira B. Tager
U.C. Berkeley Division of Biostatistics Working Paper Series
One of the identifiability assumptions of causal effects defined by marginal structural model (MSM) parameters is the experimental treatment assignment (ETA) assumption. Practical violations of this assumption frequently occur in data analysis, when certain exposures are rarely observed within some strata of the population. The inverse probability of treatment weighted (IPTW) estimator is particularly sensitive to violations of this assumption; however, we demonstrate that this is a problem for all estimators of causal effects. This is due to the fact that the ETA assumption is about information (or lack thereof) in the data. A new class of causal models, causal …
A Spatio-Temporal Approach For Estimating Chronic Effects Of Air Pollution, Sonja Greven, Francesca Dominici, Scott L. Zeger
Johns Hopkins University, Dept. of Biostatistics Working Papers
Estimating the health risks associated with air pollution exposure is of great importance in public health. In air pollution epidemiology, two study designs have mainly been used. Time series studies estimate acute risk associated with short-term exposure. They compare day-to-day variation of pollution concentrations and mortality rates, and have been criticized for potential confounding by time-varying covariates. Cohort studies estimate chronic effects associated with long-term exposure. They compare long-term average pollution concentrations and time-to-death across cities, and have been criticized for potential confounding by individual risk factors or city-level characteristics.
We propose a new study design and a statistical model, …
Analysis Of Randomized Comparative Clinical Trial Data For Personalized Treatment Selections, Tianxi Cai, Lu Tian, Peggy H. Wong, L. J. Wei
Harvard University Biostatistics Working Paper Series
No abstract provided.
Spatial Misalignment In Time Series Studies Of Air Pollution And Health Data, Roger D. Peng, Michelle L. Bell
Johns Hopkins University, Dept. of Biostatistics Working Papers
Time series studies of environmental exposures often involve comparing daily changes in a toxicant measured at a point in space with daily changes in an aggregate measure of health. Spatial misalignment of the exposure and response variables can bias the estimation of health risk and the magnitude of this bias depends on the spatial variation of the exposure of interest. In air pollution epidemiology, there is an increasing focus on estimating the health effects of the chemical components of particulate matter. One issue that is raised by this new focus is the spatial misalignment error introduced by the lack of …
Spatio-Temporal Analysis Of Areal Data And Discovery Of Neighborhood Relationships In Conditionally Autoregressive Models, Subharup Guha, Louise Ryan
Harvard University Biostatistics Working Paper Series
No abstract provided.
Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh
Harvard University Biostatistics Working Paper Series
No abstract provided.
Statistical Analysis Of Air Pollution Panel Studies: An Illustration, Holly Janes, Lianne Sheppard, Kristen Shepherd
UW Biostatistics Working Paper Series
The panel study design is commonly used to evaluate the short-term health effects of air pollution. Standard statistical methods for analyzing longitudinal data are available, but the literature reveals that the techniques are not well understood by practitioners. We illustrate these methods using data from the 1999 to 2002 Seattle panel study. Marginal, conditional, and transitional approaches for modeling longitudinal data are reviewed and contrasted with respect to their parameter interpretation and methods for accounting for correlation and dealing with missing data. We also discuss and illustrate techniques for controlling for time-dependent and time-independent confounding, and for exploring and summarizing …
Spatial Cluster Detection For Censored Outcome Data, Andrea J. Cook, Diane Gold, Yi Li
Harvard University Biostatistics Working Paper Series
No abstract provided.
Relative Risk Regression In Medical Research: Models, Contrasts, Estimators, And Algorithms, Thomas Lumley, Richard Kronmal, Shuangge Ma
UW Biostatistics Working Paper Series
The relative risk or prevalence ratio is a natural and familiar summary of association between a binary outcome and an exposure or intervention. For rare events, the relative risk can be approximately estimated by logistic regression. For common events estimation is more difficult. We review proposed estimation algorithms for relative risk regression. Some of these give inconsistent estimates or invalid standard errors. We show that the methods that give correct inference can be viewed as arising from a family of quasilikelihood estimating functions for the same generalized linear model, differing in their efficiency and in their robustness to outlying values …
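The claim that logistic regression approximates the relative risk for rare events rests on the odds ratio being close to the risk ratio when risks are small. A minimal sketch with two invented 2x2 tables shows the contrast; the helper names and counts are hypothetical and not drawn from the paper.

```python
def relative_risk(a, b, c, d):
    """Risk ratio from a 2x2 table.

    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds ratio from the same 2x2 table (the quantity logistic regression targets)."""
    return (a * d) / (b * c)


# Invented counts: a rare outcome (risks 0.2% vs 0.1%)
# and a common one (risks 60% vs 30%).
rare = (20, 9980, 10, 9990)
common = (600, 400, 300, 700)

rr_rare, or_rare = relative_risk(*rare), odds_ratio(*rare)
rr_common, or_common = relative_risk(*common), odds_ratio(*common)
```

Both tables have a risk ratio of 2, but the odds ratio stays near 2 only for the rare outcome; for the common outcome it is 3.5, which is why a separate estimation approach is needed there.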
Causal Comparisons In Randomized Trials Of Two Active Treatments: The Effect Of Supervised Exercise To Promote Smoking Cessation, Jason Roy, Joseph W. Hogan
COBRA Preprint Series
In behavioral medicine trials, such as smoking cessation trials, two or more active treatments are often compared. Noncompliance by some subjects with their assigned treatment poses a challenge to the data analyst. Causal parameters of interest might include those defined by subpopulations based on their potential compliance status under each assignment, using the principal stratification framework (e.g., causal effect of new therapy compared to standard therapy among subjects that would comply with either intervention). Even if subjects in one arm do not have access to the other treatment(s), the causal effect of each treatment typically can only be identified from …
Semiparametric Latent Variable Regression Models For Spatio-Temporal Modeling Of Mobile Source Particles In The Greater Boston Area, Alexandros Gryparis, Brent A. Coull, Joel Schwartz, Helen H. Suh
Harvard University Biostatistics Working Paper Series
Traffic particle concentrations show considerable spatial variability within a metropolitan area. We consider latent variable semiparametric regression models for modeling the spatial and temporal variability of black carbon and elemental carbon concentrations in the greater Boston area. Measurements of these pollutants, which are markers of traffic particles, were obtained from several individual exposure studies conducted at specific household locations as well as 15 ambient monitoring sites in the city. The models allow for both flexible, nonlinear effects of covariates and for unexplained spatial and temporal variability in exposure. In addition, the different individual exposure studies recorded different surrogates of traffic …
Population Intervention Models In Causal Inference, Alan E. Hubbard, Mark J. van der Laan
U.C. Berkeley Division of Biostatistics Working Paper Series
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a treatment variable or risk variable on the distribution of a disease in a population. These models, as originally introduced by Robins (e.g., Robins (2000a), Robins (2000b), van der Laan and Robins (2002)), model the marginal distributions of treatment-specific counterfactual outcomes, possibly conditional on a subset of the baseline covariates, and its dependence on treatment. Marginal structural models are particularly useful in the context of longitudinal data structures, in which each subject's treatment and covariate history are measured over time, and an outcome is recorded at …
Gauss-Seidel Estimation Of Generalized Linear Mixed Models With Application To Poisson Modeling Of Spatially Varying Disease Rates, Subharup Guha, Louise Ryan
Harvard University Biostatistics Working Paper Series
Generalized linear mixed models (GLMMs) provide an elegant framework for the analysis of correlated data. Due to the non-closed form of the likelihood, GLMMs are often fit by computational procedures like penalized quasi-likelihood (PQL). Special cases of these models are generalized linear models (GLMs), which are often fit using algorithms like iterative weighted least squares (IWLS). High computational costs and memory space constraints often make it difficult to apply these iterative procedures to data sets with a very large number of cases.
This paper proposes a computationally efficient strategy based on the Gauss-Seidel algorithm that iteratively fits sub-models of the GLMM …
Computational Techniques For Spatial Logistic Regression With Large Datasets, Christopher J. Paciorek, Louise Ryan
Harvard University Biostatistics Working Paper Series
In epidemiological work, outcomes are frequently non-normal, sample sizes may be large, and effects are often small. To relate health outcomes to geographic risk factors, fast and powerful methods for fitting spatial models, particularly for non-normal data, are required. We focus on binary outcomes, with the risk surface a smooth function of space. We compare penalized likelihood models, including the penalized quasi-likelihood (PQL) approach, and Bayesian models based on fit, speed, and ease of implementation.
A Bayesian model using a spectral basis representation of the spatial surface provides the best tradeoff of sensitivity and specificity in simulations, detecting real spatial …
A Nonstationary Negative Binomial Time Series With Time-Dependent Covariates: Enterococcus Counts In Boston Harbor, E. Andres Houseman, Brent Coull, James P. Shine
Harvard University Biostatistics Working Paper Series
Boston Harbor has had a history of poor water quality, including contamination by enteric pathogens. We conduct a statistical analysis of data collected by the Massachusetts Water Resources Authority (MWRA) between 1996 and 2002 to evaluate the effects of court-mandated improvements in sewage treatment. Motivated by the ineffectiveness of standard Poisson mixture models and their zero-inflated counterparts, we propose a new negative binomial model for time series of Enterococcus counts in Boston Harbor, where nonstationarity and autocorrelation are modeled using a nonparametric smooth function of time in the predictor. Without further restrictions, this function is not identifiable in the presence …