Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Epidemiology

Articles 1 - 30 of 50

Full-Text Articles in Statistical Models

The Need To Incorporate Communities In Compartmental Models, Michael J. Kane, Owais Gilani Jan 2021

Faculty Journal Articles

Tian et al. provide a framework for assessing population-level interventions of disease outbreaks through the construction of counterfactuals in a large-scale, natural experiment assessing the efficacy of mild, but early interventions compared to delayed interventions. The technique is applied to the recent SARS-CoV-2 outbreak with the population of Shenzhen, China acting as the mild-but-early treatment group and a combination of several US counties resembling Shenzhen but enacting a delayed intervention acting as the control. To help further the development of this framework and identify an avenue for further enhancement, we focus on the use and potential limitations of compartmental …


Sex And Age Differences In Prevalence And Risk Factors For Prediabetes In Mexican-Americans, Kristina Vatcheva, Belinda M. Reininger, Susan P. Fisher-Hoch, Joseph B. McCormick Jan 2020

School of Mathematical and Statistical Sciences Faculty Publications and Presentations

AIMS:

Over 1/3 of Americans have prediabetes, while 9.4% have type 2 diabetes. The aim of our study was to estimate the prevalence of prediabetes by age and sex in Mexican Americans, a population with a known 28.2% prevalence of type 2 diabetes, and to identify critical socio-demographic and clinical factors associated with prediabetes.

METHODS:

Data were collected between 2004 and 2017 from the Cameron County Hispanic Cohort in Texas. Weighted crude and sex- and age-stratified prevalences were calculated. Survey-weighted logistic regression analyses were conducted to identify risk factors for prediabetes.

RESULTS:

The prevalence of prediabetes (32%) was slightly higher than …


Mixture Models For Undiagnosed Prevalent Disease And Interval-Censored Incident Disease: Applications To A Cohort Assembled From Electronic Health Records., Li C Cheung, Qing Pan, Noorie Hyun, Mark Schiffman, Barbara Fetterman, Philip E Castle, Thomas Lorey, Hormuzd A Katki Jun 2017

Epidemiology Faculty Publications

For cost-effectiveness and efficiency, many large-scale general-purpose cohort studies are being assembled within large health-care providers who use electronic health records. Two key features of such data are that incident disease is interval-censored between irregular visits and there can be pre-existing (prevalent) disease. Because prevalent disease is not always immediately diagnosed, some disease diagnosed at later visits is actually undiagnosed prevalent disease. We consider prevalent disease as a point mass at time zero for clinical applications where there is no interest in time of prevalent disease onset. We demonstrate that the naive Kaplan-Meier cumulative risk estimator underestimates risks at early …


Models For HSV Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret Jan 2016

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …


Preparedness Of Hospitals In The Republic Of Ireland For An Influenza Pandemic, An Infection Control Perspective, Mary Reidy, Fiona Ryan, Dervla Hogan, Seán Lacey, Claire Buckley Sep 2015

Department of Mathematics Publications

When an influenza pandemic occurs, most of the population is susceptible and attack rates can range as high as 40–50%. The most important failure in pandemic planning is the lack of standards or guidelines regarding what it means to be ‘prepared’. The aim of this study was to assess the preparedness of acute hospitals in the Republic of Ireland for an influenza pandemic from an infection control perspective.


Attributing Effects To Interactions, Tyler J. VanderWeele, Eric J. Tchetgen Tchetgen Jul 2013

Harvard University Biostatistics Working Paper Series

A framework is presented which allows an investigator to estimate the portion of the effect of one exposure that is attributable to an interaction with a second exposure. We show that when the two exposures are independent, the total effect of one exposure can be decomposed into a conditional effect of that exposure and a component due to interaction. The decomposition applies on difference or ratio scales. We discuss how the components can be estimated using standard regression models, and how these components can be used to evaluate the proportion of the total effect of the primary exposure attributable to …
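The difference-scale decomposition mentioned in this abstract can be checked numerically. The following is a minimal sketch, assuming two binary exposures G and E that are independent of each other; the risk table `p`, the value of `prob_e`, and all variable names are hypothetical illustrations, not values or notation from the paper.

```python
# Hypothetical risks P(Y=1 | G=g, E=e); illustrative numbers, not from the paper.
p = {(0, 0): 0.02, (1, 0): 0.05, (0, 1): 0.04, (1, 1): 0.12}
prob_e = 0.3  # assumed P(E=1); E is taken to be independent of G


def marginal_risk(g):
    """Risk under G=g, averaged over E (valid because G and E are independent)."""
    return p[(g, 1)] * prob_e + p[(g, 0)] * (1 - prob_e)


# Total effect of G on the risk-difference scale.
total_effect = marginal_risk(1) - marginal_risk(0)

# Conditional effect of G when the second exposure is absent (E=0).
conditional_effect = p[(1, 0)] - p[(0, 0)]

# Additive interaction contrast, scaled by how common E is.
interaction = p[(1, 1)] - p[(1, 0)] - p[(0, 1)] + p[(0, 0)]
interaction_component = interaction * prob_e

# Share of the total effect of G that is attributable to interaction with E.
proportion_due_to_interaction = interaction_component / total_effect
```

With these illustrative numbers the total effect splits into the conditional effect plus the interaction component, and the proportion attributable to interaction is the ratio of the interaction component to the total effect.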


Estimating Effects On Rare Outcomes: Knowledge Is Power, Laura B. Balzer, Mark J. van der Laan May 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Many of the secondary outcomes in observational studies and randomized trials are rare. Methods for estimating causal effects and associations with rare outcomes, however, are limited, and this represents a missed opportunity for investigation. In this article, we construct a new targeted minimum loss-based estimator (TMLE) for the effect of an exposure or treatment on a rare outcome. We focus on the causal risk difference and statistical models incorporating bounds on the conditional risk of the outcome, given the exposure and covariates. By construction, the proposed estimator constrains the predicted outcomes to respect this model knowledge. Theoretically, this bounding provides …


A Regionalized National Universal Kriging Model Using Partial Least Squares Regression For Estimating Annual PM2.5 Concentrations In Epidemiology, Paul D. Sampson, Mark Richards, Adam A. Szpiro, Silas Bergen, Lianne Sheppard, Timothy V. Larson, Joel Kaufman Dec 2012

UW Biostatistics Working Paper Series

Many cohort studies in environmental epidemiology require accurate modeling and prediction of fine scale spatial variation in ambient air quality across the U.S. This modeling requires the use of small spatial scale geographic or “land use” regression covariates and some degree of spatial smoothing. Furthermore, the details of the prediction of air quality by land use regression and the spatial variation in ambient air quality not explained by this regression should be allowed to vary across the continent due to the large scale heterogeneity in topography, climate, and sources of air pollution. This paper introduces a regionalized national universal kriging …


Flexible Distributed Lag Models Using Random Functions With Application To Estimating Mortality Displacement From Heat-Related Deaths, Roger D. Peng Dec 2011

Johns Hopkins University, Dept. of Biostatistics Working Papers

No abstract provided.


Assessing Association For Bivariate Survival Data With Interval Sampling: A Copula Model Approach With Application To AIDS Study, Hong Zhu, Mei-Cheng Wang Nov 2011

Johns Hopkins University, Dept. of Biostatistics Working Papers

In disease surveillance systems or registries, bivariate survival data are typically collected under interval sampling. This refers to a situation in which entry into a registry is at the time of the first failure event (e.g., HIV infection) within a calendar time interval, the time of the initiating event (e.g., birth) is retrospectively identified for all the cases in the registry, and subsequently the second failure event (e.g., death) is observed during the follow-up. Sampling bias is induced by the selection process, because the data are collected conditional on the first failure event occurring within a time interval. Consequently, the …


Depicting Estimates Using The Intercept In Meta-Regression Models: The Moving Constant Technique, Blair T. Johnson Dr., Tania B. Huedo-Medina Dr. Oct 2011

CHIP Documents

In any scientific discipline, the ability to portray research patterns graphically often aids greatly in interpreting a phenomenon. In part to depict phenomena, the statistics and capabilities of meta-analytic models have grown increasingly sophisticated. Accordingly, this article details how to move the constant in weighted meta-analysis regression models (viz. “meta-regression”) to illuminate the patterns in such models across a range of complexities. Although it is commonly ignored in practice, the constant (or intercept) in such models can be indispensable when it is not relegated to its usual static role. The moving constant technique makes possible estimates and confidence intervals at …


Reduced Bayesian Hierarchical Models: Estimating Health Effects Of Simultaneous Exposure To Multiple Pollutants, Jennifer F. Bobb, Francesca Dominici, Roger D. Peng Jul 2011

Johns Hopkins University, Dept. of Biostatistics Working Papers

Quantifying the health effects associated with simultaneous exposure to many air pollutants is now a research priority of the US EPA. Bayesian hierarchical models (BHM) have been extensively used in multisite time series studies of air pollution and health to estimate health effects of a single pollutant adjusted for potential confounding of other pollutants and other time-varying factors. However, when the scientific goal is to estimate the impacts of many pollutants jointly, a straightforward application of BHM is challenged by the need to specify a random-effect distribution on a high-dimensional vector of nuisance parameters, which often do not have an …


Threshold Regression Models Adapted To Case-Control Studies, And The Risk Of Lung Cancer Due To Occupational Exposure To Asbestos In France, Antoine Chambaz, Dominique Choudat, Catherine Huber, Jean-Claude Pairon, Mark J. van der Laan Mar 2011

U.C. Berkeley Division of Biostatistics Working Paper Series

Asbestos has been known for many years as a powerful carcinogen. Our purpose is to quantify the relationship between an occupational exposure to asbestos and an increase of the risk of lung cancer. Furthermore, we wish to tackle the very delicate question of the evaluation, in subjects suffering from a lung cancer, of how much the amount of exposure to asbestos explains the occurrence of the cancer. For this purpose, we rely on a recent French case-control study. We build a large collection of threshold regression models, data-adaptively select a better model in it by multi-fold likelihood-based cross-validation, then fit the …


Minimum Description Length And Empirical Bayes Methods Of Identifying SNPs Associated With Disease, Ye Yang, David R. Bickel Nov 2010

COBRA Preprint Series

The goal of determining which of hundreds of thousands of SNPs are associated with disease poses one of the most challenging multiple testing problems. Using the empirical Bayes approach, the local false discovery rate (LFDR) estimated using popular semiparametric models has enjoyed success in simultaneous inference. However, the estimated LFDR can be biased because the semiparametric approach tends to overestimate the proportion of the non-associated single nucleotide polymorphisms (SNPs). One of the negative consequences is that, like conventional p-values, such LFDR estimates cannot quantify the amount of information in the data that favors the null hypothesis of no disease-association.

We …


Nonparametric Regression With Missing Outcomes Using Weighted Kernel Estimating Equations, Lu Wang, Andrea Rotnitzky, Xihong Lin Apr 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


Causal Inference In Epidemiological Studies With Strong Confounding, Kelly L. Moore, Romain S. Neugebauer, Mark J. van der Laan, Ira B. Tager Oct 2009

U.C. Berkeley Division of Biostatistics Working Paper Series

One of the identifiability assumptions of causal effects defined by marginal structural model (MSM) parameters is the experimental treatment assignment (ETA) assumption. Practical violations of this assumption frequently occur in data analysis, when certain exposures are rarely observed within some strata of the population. The inverse probability of treatment weighted (IPTW) estimator is particularly sensitive to violations of this assumption; however, we demonstrate that this is a problem for all estimators of causal effects. This is due to the fact that the ETA assumption is about information (or lack thereof) in the data. A new class of causal models, causal …
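To illustrate why the IPTW estimator is sensitive when an exposure is rarely observed within a stratum, here is a minimal sketch of unstabilized inverse probability of treatment weights. The function name `iptw_weight`, the stratum labels, and the propensity values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def iptw_weight(treated, propensity):
    """Unstabilized inverse probability of treatment weight for one subject.

    `propensity` is the assumed probability of being treated given covariates.
    """
    return 1.0 / propensity if treated else 1.0 / (1.0 - propensity)


# Hypothetical strata: probability that a subject in the stratum is treated.
strata = {"well-supported": 0.5, "sparse": 0.05, "near-violation": 0.001}

# Weight carried by a single treated subject in each stratum.
weights = {name: iptw_weight(True, ps) for name, ps in strata.items()}
```

A treated subject in the near-violation stratum carries a weight hundreds of times larger than one in the well-supported stratum, so a handful of such subjects can dominate a weighted estimate.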


A Spatio-Temporal Approach For Estimating Chronic Effects Of Air Pollution, Sonja Greven, Francesca Dominici, Scott L. Zeger Jun 2009

Johns Hopkins University, Dept. of Biostatistics Working Papers

Estimating the health risks associated with air pollution exposure is of great importance in public health. In air pollution epidemiology, two study designs have mainly been used. Time series studies estimate acute risk associated with short-term exposure. They compare day-to-day variation of pollution concentrations and mortality rates, and have been criticized for potential confounding by time-varying covariates. Cohort studies estimate chronic effects associated with long-term exposure. They compare long-term average pollution concentrations and time-to-death across cities, and have been criticized for potential confounding by individual risk factors or city-level characteristics.

We propose a new study design and a statistical model, …


Analysis Of Randomized Comparative Clinical Trial Data For Personalized Treatment Selections, Tianxi Cai, Lu Tian, Peggy H. Wong, L. J. Wei Mar 2009

Harvard University Biostatistics Working Paper Series

No abstract provided.


Spatial Misalignment In Time Series Studies Of Air Pollution And Health Data, Roger D. Peng, Michelle L. Bell Dec 2008

Johns Hopkins University, Dept. of Biostatistics Working Papers

Time series studies of environmental exposures often involve comparing daily changes in a toxicant measured at a point in space with daily changes in an aggregate measure of health. Spatial misalignment of the exposure and response variables can bias the estimation of health risk and the magnitude of this bias depends on the spatial variation of the exposure of interest. In air pollution epidemiology, there is an increasing focus on estimating the health effects of the chemical components of particulate matter. One issue that is raised by this new focus is the spatial misalignment error introduced by the lack of …


Spatio-Temporal Analysis Of Areal Data And Discovery Of Neighborhood Relationships In Conditionally Autoregressive Models, Subharup Guha, Louise Ryan Nov 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Semiparametric Regression Of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines And Linear Mixed Models, Dawei Liu, Xihong Lin, Debashis Ghosh Nov 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Statistical Analysis Of Air Pollution Panel Studies: An Illustration, Holly Janes, Lianne Sheppard, Kristen Shepherd Oct 2006

UW Biostatistics Working Paper Series

The panel study design is commonly used to evaluate the short-term health effects of air pollution. Standard statistical methods for analyzing longitudinal data are available, but the literature reveals that the techniques are not well understood by practitioners. We illustrate these methods using data from the 1999 to 2002 Seattle panel study. Marginal, conditional, and transitional approaches for modeling longitudinal data are reviewed and contrasted with respect to their parameter interpretation and methods for accounting for correlation and dealing with missing data. We also discuss and illustrate techniques for controlling for time-dependent and time-independent confounding, and for exploring and summarizing …


Spatial Cluster Detection For Censored Outcome Data, Andrea J. Cook, Diane Gold, Yi Li Sep 2006

Harvard University Biostatistics Working Paper Series

No abstract provided.


Relative Risk Regression In Medical Research: Models, Contrasts, Estimators, And Algorithms, Thomas Lumley, Richard Kronmal, Shuangge Ma Jul 2006

UW Biostatistics Working Paper Series

The relative risk or prevalence ratio is a natural and familiar summary of association between a binary outcome and an exposure or intervention. For rare events, the relative risk can be approximately estimated by logistic regression. For common events estimation is more difficult. We review proposed estimation algorithms for relative risk regression. Some of these give inconsistent estimates or invalid standard errors. We show that the methods that give correct inference can be viewed as arising from a family of quasilikelihood estimating functions for the same generalized linear model, differing in their efficiency and in their robustness to outlying values …
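The statement that logistic regression approximates the relative risk only for rare events follows from the relationship between odds ratios and risk ratios. The following minimal sketch compares the two for a rare and a common outcome; the risks in `rare` and `common` are hypothetical numbers chosen for illustration, not data from the paper.

```python
def risk_ratio(p1, p0):
    """Relative risk: risk in the exposed divided by risk in the unexposed."""
    return p1 / p0


def odds_ratio(p1, p0):
    """Odds ratio, the quantity a logistic regression coefficient exponentiates to."""
    return (p1 / (1 - p1)) / (p0 / (1 - p0))


# Hypothetical (exposed risk, unexposed risk) pairs; both have a risk ratio of 2.
rare = (0.002, 0.001)
common = (0.6, 0.3)

rare_rr, rare_or = risk_ratio(*rare), odds_ratio(*rare)
common_rr, common_or = risk_ratio(*common), odds_ratio(*common)
```

For the rare outcome the odds ratio is almost identical to the risk ratio, while for the common outcome the odds ratio is substantially larger, which is why common events need dedicated relative risk regression methods.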


Causal Comparisons In Randomized Trials Of Two Active Treatments: The Effect Of Supervised Exercise To Promote Smoking Cessation, Jason Roy, Joseph W. Hogan Jul 2006

COBRA Preprint Series

In behavioral medicine trials, such as smoking cessation trials, two or more active treatments are often compared. Noncompliance by some subjects with their assigned treatment poses a challenge to the data analyst. Causal parameters of interest might include those defined by subpopulations based on their potential compliance status under each assignment, using the principal stratification framework (e.g., causal effect of new therapy compared to standard therapy among subjects that would comply with either intervention). Even if subjects in one arm do not have access to the other treatment(s), the causal effect of each treatment typically can only be identified from …


Semiparametric Latent Variable Regression Models For Spatio-Temporal Modeling Of Mobile Source Particles In The Greater Boston Area, Alexandros Gryparis, Brent A. Coull, Joel Schwartz, Helen H. Suh Apr 2006

Harvard University Biostatistics Working Paper Series

Traffic particle concentrations show considerable spatial variability within a metropolitan area. We consider latent variable semiparametric regression models for modeling the spatial and temporal variability of black carbon and elemental carbon concentrations in the greater Boston area. Measurements of these pollutants, which are markers of traffic particles, were obtained from several individual exposure studies conducted at specific household locations as well as 15 ambient monitoring sites in the city. The models allow for both flexible, nonlinear effects of covariates and for unexplained spatial and temporal variability in exposure. In addition, the different individual exposure studies recorded different surrogates of traffic …


Population Intervention Models In Causal Inference, Alan E. Hubbard, Mark J. van der Laan Oct 2005

U.C. Berkeley Division of Biostatistics Working Paper Series

Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a treatment variable or risk variable on the distribution of a disease in a population. These models, as originally introduced by Robins (e.g., Robins (2000a), Robins (2000b), van der Laan and Robins (2002)), model the marginal distributions of treatment-specific counterfactual outcomes, possibly conditional on a subset of the baseline covariates, and its dependence on treatment. Marginal structural models are particularly useful in the context of longitudinal data structures, in which each subject's treatment and covariate history are measured over time, and an outcome is recorded at …


Gauss-Seidel Estimation Of Generalized Linear Mixed Models With Application To Poisson Modeling Of Spatially Varying Disease Rates, Subharup Guha, Louise Ryan Oct 2005

Harvard University Biostatistics Working Paper Series

Generalized linear mixed models (GLMMs) provide an elegant framework for the analysis of correlated data. Due to the non-closed form of the likelihood, GLMMs are often fit by computational procedures like penalized quasi-likelihood (PQL). Special cases of these models are generalized linear models (GLMs), which are often fit using algorithms like iterative weighted least squares (IWLS). High computational costs and memory space constraints often make it difficult to apply these iterative procedures to data sets with a very large number of cases.

This paper proposes a computationally efficient strategy based on the Gauss-Seidel algorithm that iteratively fits sub-models of the GLMM …


Computational Techniques For Spatial Logistic Regression With Large Datasets, Christopher J. Paciorek, Louise Ryan Oct 2005

Harvard University Biostatistics Working Paper Series

In epidemiological work, outcomes are frequently non-normal, sample sizes may be large, and effects are often small. To relate health outcomes to geographic risk factors, fast and powerful methods for fitting spatial models, particularly for non-normal data, are required. We focus on binary outcomes, with the risk surface a smooth function of space. We compare penalized likelihood models, including the penalized quasi-likelihood (PQL) approach, and Bayesian models based on fit, speed, and ease of implementation.

A Bayesian model using a spectral basis representation of the spatial surface provides the best tradeoff of sensitivity and specificity in simulations, detecting real spatial …


A Nonstationary Negative Binomial Time Series With Time-Dependent Covariates: Enterococcus Counts In Boston Harbor, E. Andres Houseman, Brent Coull, James P. Shine Sep 2005

Harvard University Biostatistics Working Paper Series

Boston Harbor has had a history of poor water quality, including contamination by enteric pathogens. We conduct a statistical analysis of data collected by the Massachusetts Water Resources Authority (MWRA) between 1996 and 2002 to evaluate the effects of court-mandated improvements in sewage treatment. Motivated by the ineffectiveness of standard Poisson mixture models and their zero-inflated counterparts, we propose a new negative binomial model for time series of Enterococcus counts in Boston Harbor, where nonstationarity and autocorrelation are modeled using a nonparametric smooth function of time in the predictor. Without further restrictions, this function is not identifiable in the presence …