Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Integrated Multiple Mediation Analysis: A Robustness–Specificity Trade-Off In Causal Structure, An-Shun Tai, Sheng-Hsuan Lin May 2020

Harvard University Biostatistics Working Paper Series

Recent methodological developments in causal mediation analysis have addressed several issues regarding multiple mediators. However, these developed methods differ in their definitions of causal parameters, assumptions for identification, and interpretations of causal effects, making it unclear which method ought to be selected when investigating a given causal effect. Thus, in this study, we construct an integrated framework, which unifies all existing methodologies, as a standard for mediation analysis with multiple mediators. To clarify the relationship between existing methods, we propose four strategies for effect decomposition: two-way, partially forward, partially backward, and complete decompositions. This study reveals how the direct and …
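For orientation, the simplest (two-way) decomposition in the single-mediator case is the standard split of the total effect into natural direct and indirect effects; in the usual counterfactual notation $Y_{a,m}$ and $M_a$ (a textbook identity, not this paper's contribution):

$$E[Y_{1,M_1} - Y_{0,M_0}] = \underbrace{E[Y_{1,M_0} - Y_{0,M_0}]}_{\text{natural direct effect}} + \underbrace{E[Y_{1,M_1} - Y_{1,M_0}]}_{\text{natural indirect effect}}.$$

The paper's four strategies generalize this kind of split to settings with multiple, possibly causally ordered mediators.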


A Modular Framework For Early-Phase Seamless Oncology Trials, Philip S. Boonstra, Thomas M. Braun, Elizabeth C. Chase Jan 2020

The University of Michigan Department of Biostatistics Working Paper Series

Background: As our understanding of the etiology and mechanisms of cancer becomes more sophisticated and the number of therapeutic options increases, phase I oncology trials today have multiple primary objectives. Many such designs are now 'seamless', meaning that the trial estimates both the maximum tolerated dose and the efficacy at this dose level. Sponsors often proceed with further study only with this additional efficacy evidence. However, with this increasing complexity in trial design, it becomes challenging to articulate fundamental operating characteristics of these trials, such as (i) what is the probability that the design will identify an acceptable, i.e. safe …


Inferring A Consensus Problem List Using Penalized Multistage Models For Ordered Data, Philip S. Boonstra, John C. Krauss Oct 2019

The University of Michigan Department of Biostatistics Working Paper Series

A patient's medical problem list describes his or her current health status and aids in the coordination and transfer of care between providers, among other things. Because a problem list is generated once and then subsequently modified or updated, what is not usually observable is the provider effect. That is, to what extent does a patient's problem list in the electronic medical record actually reflect a consensus communication of that patient's current health status? To that end, we report on and analyze a unique interview-based design in which multiple medical providers independently generate problem lists for each of three patient case abstracts …


Concentrations Of Criteria Pollutants In The Contiguous U.S., 1979 – 2015: Role Of Model Parsimony In Integrated Empirical Geographic Regression, Sun-Young Kim, Matthew Bechle, Steve Hankey, Elizabeth (Lianne) A. Sheppard, Adam A. Szpiro, Julian D. Marshall Nov 2018

UW Biostatistics Working Paper Series

BACKGROUND: National- or regional-scale prediction models that estimate individual-level air pollution concentrations commonly include hundreds of geographic variables. However, so many variables may not be necessary, and a parsimonious approach using a small number of variables may achieve sufficient prediction ability. A parsimonious approach can also be applied to most criteria pollutants, and would be powerful for generating publicly available datasets of model predictions that support research in environmental health and other fields. OBJECTIVES: We aim to (1) build annual-average integrated empirical geographic (IEG) regression models for the contiguous U.S. for six criteria pollutants, for all years with regulatory monitoring data …


A Spline-Assisted Semiparametric Approach To Nonparametric Measurement Error Models, Fei Jiang, Yanyuan Ma Mar 2018

COBRA Preprint Series

Nonparametric estimation of the probability density function of a random variable measured with error is considered a difficult problem, in the sense that, depending on the measurement error property, the estimation rate can be as slow as the logarithm of the sample size. Likewise, nonparametric estimation of the regression function with errors in the covariate suffers from the same possibly slow rate. The traditional methods for both problems are based on deconvolution, where the slow convergence rate is caused by the quick convergence to zero of the Fourier transform of the measurement error density, which, unfortunately, appears in …
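For background, the deconvolution estimator being contrasted here is the textbook one: with contaminated observations $W_j = X_j + U_j$, known error characteristic function $\varphi_U$, kernel $K$ with Fourier transform $\varphi_K$, and bandwidth $h$,

$$\hat f_X(x) = \frac{1}{2\pi}\int e^{-itx}\,\varphi_K(th)\,\frac{\hat\varphi_W(t)}{\varphi_U(t)}\,dt, \qquad \hat\varphi_W(t) = \frac{1}{n}\sum_{j=1}^n e^{itW_j},$$

where the division by a rapidly vanishing $\varphi_U(t)$ (for instance $e^{-t^2/2}$ for standard Gaussian error) is what forces the logarithmic convergence rates the abstract mentions.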


Technical Considerations In The Use Of The E-Value, Tyler J. Vanderweele, Peng Ding, Maya Mathur Feb 2018

Harvard University Biostatistics Working Paper Series

The E-value is defined as the minimum strength of association on the risk ratio scale that an unmeasured confounder would have to have with both the exposure and the outcome, conditional on the measured covariates, to explain away the observed exposure-outcome association. We have elsewhere proposed that the reporting of E-values for estimates and for the limit of the confidence interval closest to the null become routine whenever causal effects are of interest. A number of questions have arisen about the use of the E-value, including questions concerning the interpretation of the relevant confounding association parameters, the nature of the transformation …
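The E-value itself has a closed form (VanderWeele and Ding's published formula); a minimal computational sketch, with a function name of our own choosing:

import math

def e_value(rr: float) -> float:
    """E-value for an observed risk ratio.

    For RR >= 1: E = RR + sqrt(RR * (RR - 1)); protective
    estimates (RR < 1) are first inverted to 1 / RR.
    """
    if rr < 1:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))

# An observed risk ratio of 2.0 yields an E-value of about 3.41:
# an unmeasured confounder would need risk-ratio associations of
# at least 3.41 with both exposure and outcome to explain it away.
print(round(e_value(2.0), 2))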


Evaluation Of Progress Towards The Unaids 90-90-90 Hiv Care Cascade: A Description Of Statistical Methods Used In An Interim Analysis Of The Intervention Communities In The Search Study, Laura Balzer, Joshua Schwab, Mark J. Van Der Laan, Maya L. Petersen Feb 2017

U.C. Berkeley Division of Biostatistics Working Paper Series

WHO guidelines call for universal antiretroviral treatment, and UNAIDS has set a global target to virally suppress most HIV-positive individuals. Accurate estimates of population-level coverage at each step of the HIV care cascade (testing, treatment, and viral suppression) are needed to assess the effectiveness of "test and treat" strategies implemented to achieve this goal. The data available to inform such estimates, however, are susceptible to informative missingness: the number of HIV-positive individuals in a population is unknown; individuals tested for HIV may not be representative of those whom a testing intervention fails to reach, and HIV-positive individuals with a viral …


Estimating The Probability Of Clonal Relatedness Of Pairs Of Tumors In Cancer Patients, Audrey Mauguen, Venkatraman E. Seshan, Irina Ostrovnaya, Colin B. Begg Feb 2017

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

Next generation sequencing panels are being used increasingly in cancer research to study tumor evolution. A specific statistical challenge is to compare the mutational profiles in different tumors from a patient to determine the strength of evidence that the tumors are clonally related, i.e. derived from a single, founder clonal cell. The presence of identical mutations in each tumor provides evidence of clonal relatedness, although the strength of evidence from a match is related to how commonly the mutation is seen in the tumor type under investigation. This evidence must be weighed against the evidence in favor of independent tumors …


Studying The Optimal Scheduling For Controlling Prostate Cancer Under Intermittent Androgen Suppression, Sunil K. Dhar, Hans R. Chaudhry, Bruce G. Bukiet, Zhiming Ji, Nan Gao, Thomas W. Findley Jan 2017

Harvard University Biostatistics Working Paper Series

This retrospective study shows that, for the majority of patients, the correlation between PSA and testosterone during the on-treatment period is at least 0.90. Model-based duration calculations to control PSA levels during the off-treatment period are provided. There are two pairs of models. In one pair, the Generalized Linear Model and the Mixed Model are both used to analyze the variability of PSA at the individual patient level by using the variable “Patient ID” as a repeated measure. In the second pair, Patient ID is not used as a repeated measure, but additional baseline variables are included to analyze the variability of PSA.
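The repeated-measures modeling described above can be illustrated generically; a hypothetical sketch with statsmodels on synthetic stand-in data (the variable names psa, weeks, patient_id are invented for illustration, not taken from the study):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for longitudinal PSA measurements.
rng = np.random.default_rng(0)
n_pat, n_obs = 30, 8
df = pd.DataFrame({
    "patient_id": np.repeat(np.arange(n_pat), n_obs),
    "weeks": np.tile(np.arange(n_obs), n_pat),
})
subj = rng.normal(0.0, 1.0, n_pat)  # patient-level variability
df["psa"] = (4.0 - 0.3 * df["weeks"] + subj[df["patient_id"]]
             + rng.normal(0.0, 0.5, len(df)))

# Mixed model: fixed effect of time, random intercept per patient
# ("Patient ID" as a repeated measure, as in the abstract).
fit = smf.mixedlm("psa ~ weeks", data=df, groups=df["patient_id"]).fit()
print(fit.summary())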


Conditional Screening For Ultra-High Dimensional Covariates With Survival Outcomes, Hyokyoung Grace Hong, Jian Kang, Yi Li Mar 2016

The University of Michigan Department of Biostatistics Working Paper Series

Identifying important biomarkers that are predictive of cancer patients' prognosis is key to gaining better insights into the biological influences on the disease and has become a critical component of precision medicine. The emergence of large-scale biomedical survival studies, which typically involve an excessive number of biomarkers, has created a high demand for efficient screening tools for selecting predictive biomarkers. The vast number of biomarkers defies any existing variable selection method via regularization. The recently developed variable screening methods, though powerful in many practical settings, fail to incorporate prior information on the importance of each biomarker and are less powerful in …


Strengthening Instrumental Variables Through Weighting, Douglas Lehmann, Yun Li, Rajiv Saran, Yi Li Mar 2016

The University of Michigan Department of Biostatistics Working Paper Series

Instrumental variable (IV) methods are widely used to deal with the issue of unmeasured confounding and are becoming popular in health and medical research. IV models are able to obtain consistent estimates in the presence of unmeasured confounding, but rely on assumptions that are hard to verify and often criticized. An instrument is a variable that influences or encourages individuals toward a particular treatment without directly affecting the outcome. Estimates obtained using instruments with a weak influence over the treatment are known to have larger small-sample bias and to be less robust to the critical IV assumption that the instrument …
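The weak-instrument concern is easiest to see in the textbook Wald form of the IV estimand (standard background, not this paper's weighting proposal): for instrument $Z$, treatment $A$, and outcome $Y$,

$$\beta_{IV} = \frac{\mathrm{Cov}(Z, Y)}{\mathrm{Cov}(Z, A)},$$

so an instrument with weak influence over the treatment makes the denominator nearly zero, inflating small-sample bias and sensitivity to violations of the IV assumptions.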


Hpcnmf: A High-Performance Toolbox For Non-Negative Matrix Factorization, Karthik Devarajan, Guoli Wang Feb 2016

COBRA Preprint Series

Non-negative matrix factorization (NMF) is a widely used machine learning algorithm for dimension reduction of large-scale data. It has found successful applications in a variety of fields such as computational biology, neuroscience, natural language processing, information retrieval, image processing and speech recognition. In bioinformatics, for example, it has been used to extract patterns and profiles from genomic and text-mining data as well as in protein sequence and structure analysis. While the scientific performance of NMF is very promising in dealing with high-dimensional data sets and complex data structures, its computational cost is high and can sometimes be critical for …
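The computation at issue can be sketched with the classical Lee-Seung multiplicative updates (a generic NumPy illustration, not the HPCNMF toolbox itself):

import numpy as np

def nmf(V, k, n_iter=200, eps=1e-10):
    """Factor non-negative V (m x n) as W @ H, with W (m x k) and
    H (k x n) non-negative, via multiplicative updates that
    reduce the squared reconstruction error."""
    rng = np.random.default_rng(0)
    W = rng.random((V.shape[0], k))
    H = rng.random((k, V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

V = np.random.default_rng(1).random((100, 40))
W, H = nmf(V, k=5)
print(np.linalg.norm(V - W @ H))  # reconstruction error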


Models For Hsv Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret Jan 2016

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, appropriately considers all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …
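As a toy illustration of what two levels of overdispersion can look like in shedding data (our own simplified simulation, not the paper's validated generative model):

import numpy as np

rng = np.random.default_rng(42)
n_subjects, n_days = 50, 60

# Level 1: between-person overdispersion -- subject-specific
# shedding rates drawn from a beta distribution.
p_subject = rng.beta(0.5, 4.0, size=n_subjects)

# Level 2: within-person overdispersion -- daily detection
# probabilities vary around each subject's rate (beta-binomial).
kappa = 5.0  # smaller kappa => more day-to-day variation
days_positive = np.array([
    rng.binomial(1, rng.beta(kappa * p, kappa * (1 - p), n_days)).sum()
    for p in p_subject
])
print(days_positive[:10])  # positive swab days per subject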


An Efficient Basket Trial Design, Kristen Cunanan, Alexia Iasonos, Ronglai Shen, Colin B. Begg, Mithat Gonen Jan 2016

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

The landscape for early phase cancer clinical trials is changing dramatically due to the advent of targeted therapy. Increasingly, new drugs are designed to work against a target such as the presence of a specific tumor mutation. Since typically only a small proportion of cancer patients will possess the mutational target, but the mutation is present in many different cancers, a new class of basket trials is emerging, whereby the drug is tested simultaneously in different baskets, i.e., sub-groups of different tumor types. Investigators not only desire to test whether the drug works, but also to determine which types of …


Nested Partially-Latent Class Models For Dependent Binary Data, Estimating Disease Etiology, Zhenke Wu, Maria Deloria-Knoll, Scott L. Zeger Nov 2015

Johns Hopkins University, Dept. of Biostatistics Working Papers

The Pneumonia Etiology Research for Child Health (PERCH) study seeks to use modern measurement technology to infer the causes of pneumonia for which gold-standard evidence is unavailable. The paper describes a latent variable model designed to infer from case-control data the etiology distribution for the population of cases, and for an individual case given his or her measurements. We assume each observation is drawn from a mixture model for which each component represents one cause or disease class. The model addresses a major limitation of the traditional latent class approach by taking account of residual dependence among multivariate binary outcome …
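The "traditional latent class approach" referenced above assumes conditional independence of the binary measurements within each disease class (standard LCM form):

$$P(Y_i = y) = \sum_{k=1}^{K} \pi_k \prod_{j=1}^{J} \theta_{kj}^{\,y_j}\,(1 - \theta_{kj})^{1 - y_j},$$

and the residual dependence among measurements that the abstract highlights is exactly what this product form cannot represent; the nested extension relaxes it.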


A General Framework For Diagnosing Confounding Of Time-Varying And Other Joint Exposures, John W. Jackson May 2015

Harvard University Biostatistics Working Paper Series

No abstract provided.


Enhanced Precision In The Analysis Of Randomized Trials With Ordinal Outcomes, Iván Díaz, Elizabeth Colantuoni, Michael Rosenblum Oct 2014

Johns Hopkins University, Dept. of Biostatistics Working Papers

We present a general method for estimating the effect of a treatment on an ordinal outcome in randomized trials. The method is robust in that it does not rely on the proportional odds assumption. Our estimator leverages information in prognostic baseline variables, and has all of the following properties: (i) it is consistent; (ii) it is locally efficient; (iii) it is guaranteed to match or improve the precision of the standard, unadjusted estimator. To the best of our knowledge, this is the first estimator of the causal relation between a treatment and an ordinal outcome to satisfy these properties. We …
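For reference, the proportional odds assumption that the estimator avoids is the standard cumulative-logit restriction: for treatment $A$ and ordinal outcome $Y$ with levels $1, \dots, J$,

$$\mathrm{logit}\, P(Y \le j \mid A) = \alpha_j - \beta A, \qquad j = 1, \dots, J - 1,$$

i.e. a single $\beta$ shared across all cutoffs $j$; the proposed method imposes no such restriction.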


Partially-Latent Class Models (Plcm) For Case-Control Studies Of Childhood Pneumonia Etiology, Zhenke Wu, Maria Deloria-Knoll, Laura L. Hammitt, Scott L. Zeger May 2014

Johns Hopkins University, Dept. of Biostatistics Working Papers

In population studies on the etiology of disease, one goal is the estimation of the fraction of cases attributable to each of several causes. For example, pneumonia is a clinical diagnosis of lung infection that may be caused by viral, bacterial, fungal, or other pathogens. The study of pneumonia etiology is challenging because directly sampling from the lung to identify the etiologic pathogen is not standard clinical practice in most settings. Instead, measurements from multiple peripheral specimens are made. This paper considers the problem of estimating the population etiology distribution and the individual etiology probabilities. We formulate the scientific …


Deductive Derivation And Computerization Of Compatible Semiparametric Efficient Estimation, Constantine E. Frangakis, Tianchen Qian, Zhenke Wu, Ivan Diaz May 2014

U.C. Berkeley Division of Biostatistics Working Paper Series

Researchers often seek robust inference for a parameter through semiparametric estimation. Efficient semiparametric estimation currently requires theoretical derivation of the efficient influence function (EIF), which can be a challenging and time-consuming task. If this task can be computerized, it can save substantial human effort, which can be transferred, for example, to the design of new studies. Although the EIF is, in principle, a derivative, simple numerical differentiation to calculate the EIF by a computer masks the EIF's functional dependence on the parameter of interest. For this reason, the standard approach to obtaining the EIF has been the theoretical construction of …
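The sense in which the EIF "is, in principle, a derivative" is the standard pathwise one: writing $P_\varepsilon = (1 - \varepsilon)P + \varepsilon\,\delta_o$ for the distribution perturbed toward a point mass at observation $o$,

$$EIF(o; P) = \frac{\partial}{\partial \varepsilon}\, \Psi(P_\varepsilon)\Big|_{\varepsilon = 0},$$

and the difficulty noted above is that computing this derivative by naive numerical differentiation obscures how the result depends functionally on the parameter $\Psi$.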


Dose Expansion Cohorts In Phase I Trials, Alexia Iasonos, John O'Quigley May 2014

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

A rapidly increasing number of Phase I dose-finding studies, and in particular those based on the standard 3+3 design, frequently prolong the study and include dose expansion cohorts (DEC), with the goal of better characterizing the toxicity profiles of experimental agents and studying disease-specific cohorts. These trials consist of two phases: the usual dose escalation phase, which aims to establish the maximum tolerated dose (MTD), and the dose expansion phase, which accrues additional patients, often with different eligibility criteria, and collects additional information. Current protocols typically do not specify whether the MTD will be updated …


A Unification Of Mediation And Interaction: A Four-Way Decomposition, Tyler J. Vanderweele Mar 2014

Harvard University Biostatistics Working Paper Series

It is shown that the overall effect of an exposure on an outcome, in the presence of a mediator with which the exposure may interact, can be decomposed into four components: (i) the effect of the exposure in the absence of the mediator, (ii) the interactive effect when the mediator is left to what it would be in the absence of exposure, (iii) a mediated interaction, and (iv) a pure mediated effect. These four components, respectively, correspond to the portion of the effect that is due to neither mediation nor interaction, to just interaction (but not mediation), to both mediation …
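In the usual counterfactual notation ($Y_{am}$ for the outcome under exposure $a$ and mediator $m$; $M_a$ for the mediator under exposure $a$, here binary), the four components correspond to the published decomposition, reconstructed here:

$$E[Y_1 - Y_0] = \underbrace{E[Y_{10} - Y_{00}]}_{\text{(i)}} + \underbrace{E[(Y_{11} - Y_{10} - Y_{01} + Y_{00})\,M_0]}_{\text{(ii)}} + \underbrace{E[(Y_{11} - Y_{10} - Y_{01} + Y_{00})(M_1 - M_0)]}_{\text{(iii)}} + \underbrace{E[(Y_{01} - Y_{00})(M_1 - M_0)]}_{\text{(iv)}},$$

which can be verified by expanding $Y_a = Y_{a1}M_a + Y_{a0}(1 - M_a)$.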


Computational Model For Survey And Trend Analysis Of Patients With Endometriosis : A Decision Aid Tool For Ebm, Salvo Reina, Vito Reina, Franco Ameglio, Mauro Costa, Alessandro Fasciani Feb 2014

COBRA Preprint Series

Endometriosis is increasingly attracting worldwide attention due to its medical complexity and social impact. The European community has identified it as a “social disease”. A large amount of information comes from scientists, yet several aspects of this pathology and its staging criteria need to be clearly defined on a suitable number of individuals. In fact, available studies on endometriosis are not easily comparable due to a lack of standardized criteria for collecting patients' information and imprecise definitions of symptoms. Currently, only retrospective surgical staging is used to measure pathology intensity, while Evidence Based Medicine (EBM) requires shareable methods and correct …


Adaptive Pair-Matching In The Search Trial And Estimation Of The Intervention Effect, Laura Balzer, Maya L. Petersen, Mark J. Van Der Laan Jan 2014

U.C. Berkeley Division of Biostatistics Working Paper Series

In randomized trials, pair-matching is an intuitive design strategy to protect study validity and to potentially increase study power. In a common design, candidate units are identified, and their baseline characteristics are used to create the best n/2 matched pairs. Within the resulting pairs, the intervention is randomized, and the outcomes are measured at the end of follow-up. We consider this design to be adaptive, because the construction of the matched pairs depends on the baseline covariates of all candidate units. As a consequence, the observed data cannot be considered as n/2 independent, identically distributed (i.i.d.) pairs of units, as current practice assumes. …
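A deliberately simplified sketch of the design idea (greedy nearest-neighbor pairing on baseline covariates, then randomization within pairs; the trial's own matching procedure, described in the paper, constructs the best n/2 pairs rather than greedy ones):

import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(20, 3))  # baseline covariates for n = 20 units

# Greedy pairing: repeatedly match the closest remaining units.
unmatched = list(range(len(X)))
pairs = []
while unmatched:
    i = unmatched.pop(0)
    j = min(unmatched, key=lambda u: np.linalg.norm(X[i] - X[u]))
    unmatched.remove(j)
    pairs.append((i, j))

# Randomize the intervention within each matched pair.
arm = {}
for i, j in pairs:
    treated = rng.choice([i, j])
    arm[i], arm[j] = int(treated == i), int(treated == j)
print(pairs[:3])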


Estimating Population Treatment Effects From A Survey Sub-Sample, Kara E. Rudolph, Ivan Diaz, Michael Rosenblum, Elizabeth A. Stuart Jan 2014

Johns Hopkins University, Dept. of Biostatistics Working Papers

We consider the problem of estimating an average treatment effect for a target population from a survey sub-sample. Our motivating example is generalizing a treatment effect estimated in a sub-sample of the National Comorbidity Survey Replication Adolescent Supplement to the population of U.S. adolescents. To address this problem, we evaluate easy-to-implement methods that account for both non-random treatment assignment and a non-random two-stage selection mechanism. We compare the performance of a Horvitz-Thompson estimator using inverse probability weighting (IPW) and two double robust estimators in a variety of scenarios. We demonstrate that the two double robust estimators generally outperform IPW in …
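A minimal sketch of the weighting logic on synthetic stand-in data (our own illustration; the paper's estimators and survey design are more involved): survey weights address the non-random two-stage selection, and inverse propensity weights address non-random treatment assignment.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))                      # covariates
t = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))  # non-random treatment
y = 1.0 * t + X[:, 1] + rng.normal(size=n)       # outcome, true ATE = 1
sw = rng.uniform(0.5, 2.0, size=n)               # known survey weights

# Estimated treatment propensities given covariates.
ps = LogisticRegression().fit(X, t).predict_proba(X)[:, 1]

# Combine selection (survey) weights with inverse propensity weights.
w = sw / np.where(t == 1, ps, 1 - ps)

# Normalized (Hajek-style) IPW contrast of weighted outcome means.
ate = (np.sum(w * t * y) / np.sum(w * t)
       - np.sum(w * (1 - t) * y) / np.sum(w * (1 - t)))
print(f"IPW estimate of the population ATE: {ate:.2f}")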


Net Reclassification Index: A Misleading Measure Of Prediction Improvement, Margaret Sullivan Pepe, Holly Janes, Kathleen F. Kerr, Bruce M. Psaty Sep 2013

UW Biostatistics Working Paper Series

The evaluation of biomarkers to improve risk prediction is a common theme in modern research. Since its introduction in 2008, the net reclassification index (NRI) (Pencina et al. 2008, Pencina et al. 2011) has gained widespread use as a measure of prediction performance with over 1,200 citations as of June 30, 2013. The NRI is considered by some to be more sensitive to clinically important changes in risk than the traditional change in the AUC (Delta AUC) statistic (Hlatky et al. 2009). Recent statistical research has raised questions, however, about the validity of conclusions based on the NRI. (Hilden and …


Net Reclassification Indices For Evaluating Risk Prediction Instruments: A Critical Review, Kathleen F. Kerr, Zheyu Wang, Holly Janes, Robyn Mcclelland, Bruce M. Psaty, Margaret S. Pepe Aug 2013

UW Biostatistics Working Paper Series

Background: Net Reclassification Indices (NRI) have recently become popular statistics for measuring the prediction increment of new biomarkers.

Methods: In this review, we examine the various types of NRI statistics and their correct interpretations. We evaluate the advantages and disadvantages of the NRI approach. For pre-defined risk categories, we relate NRI to existing measures of the prediction increment. We also consider statistical methodology for constructing confidence intervals for NRI statistics and evaluate the merits of NRI-based hypothesis testing.

Conclusions: Investigators using NRI statistics should report them separately for events (cases) and nonevents (controls). When there are two risk categories, the …
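The separate-reporting recommendation follows directly from the NRI's definition; a minimal sketch of the category-free (continuous) version (standard formulas; the function name is ours):

import numpy as np

def nri_components(risk_old, risk_new, event):
    """Event and non-event NRI components for two risk models:
    NRI_events = P(up | event) - P(down | event),
    NRI_nonevents = P(down | nonevent) - P(up | nonevent)."""
    up, down = risk_new > risk_old, risk_new < risk_old
    e, ne = event == 1, event == 0
    return up[e].mean() - down[e].mean(), down[ne].mean() - up[ne].mean()

rng = np.random.default_rng(0)
old = rng.uniform(size=500)
new = np.clip(old + rng.normal(0.0, 0.1, size=500), 0.0, 1.0)
event = rng.binomial(1, old)
print(nri_components(old, new, event))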


Attributing Effects To Interactions, Tyler J. Vanderweele, Eric J. Tchetgen Tchetgen Jul 2013

Harvard University Biostatistics Working Paper Series

A framework is presented which allows an investigator to estimate the portion of the effect of one exposure that is attributable to an interaction with a second exposure. We show that when the two exposures are independent, the total effect of one exposure can be decomposed into a conditional effect of that exposure and a component due to interaction. The decomposition applies on difference or ratio scales. We discuss how the components can be estimated using standard regression models, and how these components can be used to evaluate the proportion of the total effect of the primary exposure attributable to …
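Under independence of the two exposures, the decomposition described can be written explicitly (reconstructed from the stated result; $Y_{e_1 e_2}$ denotes the outcome with both exposures set, and $p = P(E_2 = 1)$):

$$E[Y_1 - Y_0] = E[Y_{10} - Y_{00}] + p\,E[Y_{11} - Y_{10} - Y_{01} + Y_{00}],$$

so the portion of the first exposure's total effect attributable to interaction is the additive interaction contrast scaled by the prevalence of the second exposure.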


Restricted Likelihood Ratio Tests For Functional Effects In The Functional Linear Model, Bruce J. Swihart, Jeff Goldsmith, Ciprian M. Crainiceanu Jun 2013

Johns Hopkins University, Dept. of Biostatistics Working Papers

The goal of our article is to provide a transparent, robust, and computationally feasible statistical approach for testing in the context of scalar-on-function linear regression models. In particular, we are interested in testing for the necessity of functional effects against standard linear models. Our methods are motivated by and applied to a large longitudinal study involving diffusion tensor imaging of intracranial white matter tracts in a susceptible cohort. In the context of this study, we conduct hypothesis tests that are motivated by anatomical knowledge and which support recent findings regarding the relationship between cognitive impairment and white matter demyelination. R-code …
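The scalar-on-function linear model at issue has the standard form

$$Y_i = \alpha + \int_0^1 X_i(t)\,\beta(t)\,dt + \varepsilon_i,$$

and testing for the necessity of functional effects means testing whether the coefficient function $\beta(t)$ genuinely varies with $t$, rather than reducing to the constant (or zero) form implied by a standard linear model.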


Augmentation Of Propensity Scores For Medical Records-Based Research, Mikel Aickin Jun 2013

COBRA Preprint Series

Therapeutic research based on electronic medical records suffers from the possibility of various kinds of confounding. Over the past 30 years, propensity scores have increasingly been used to try to reduce this possibility. In this article a gap is identified in the propensity score methodology, and it is proposed to augment traditional treatment-propensity scores with outcome-propensity scores, thereby removing all other aspects of common causes from the analysis of treatment effects.
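For context, the traditional treatment-propensity score being augmented is simply the modeled probability of treatment given covariates; a minimal sketch on synthetic stand-in data (the outcome-propensity construction itself is the article's contribution and is not reproduced here):

import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for an EMR extract (hypothetical values).
rng = np.random.default_rng(1)
n = 1000
covariates = rng.normal(size=(n, 3))  # e.g. age, BMI, comorbidities
treatment = rng.binomial(1, 1 / (1 + np.exp(-covariates[:, 0])))

# Traditional treatment-propensity score: P(treatment | covariates).
ps_model = LogisticRegression().fit(covariates, treatment)
treatment_ps = ps_model.predict_proba(covariates)[:, 1]
print(treatment_ps[:5].round(3))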


Estimating Effects On Rare Outcomes: Knowledge Is Power, Laura B. Balzer, Mark J. Van Der Laan May 2013

U.C. Berkeley Division of Biostatistics Working Paper Series

Many of the secondary outcomes in observational studies and randomized trials are rare. Methods for estimating causal effects and associations with rare outcomes, however, are limited, and this represents a missed opportunity for investigation. In this article, we construct a new targeted minimum loss-based estimator (TMLE) for the effect of an exposure or treatment on a rare outcome. We focus on the causal risk difference and statistical models incorporating bounds on the conditional risk of the outcome, given the exposure and covariates. By construction, the proposed estimator constrains the predicted outcomes to respect this model knowledge. Theoretically, this bounding provides …
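The causal risk difference targeted here is the usual g-computation parameter: with exposure $A$, rare outcome $Y$, and covariates $W$,

$$\psi_0 = E_W\big[E(Y \mid A = 1, W) - E(Y \mid A = 0, W)\big],$$

and the "knowledge" in the title is a known bound on the conditional risk $E(Y \mid A, W)$, which the proposed TMLE forces its predicted outcomes to respect.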