Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 19 of 19

Full-Text Articles in Physical Sciences and Mathematics

Stacked Generalization: An Introduction To Super Learning, Ashley Naimi, Laura Balzer Dec 2017

Laura B. Balzer

Stacked generalization is an ensemble method that allows researchers to combine several different prediction algorithms into one. Since its introduction in the early 1990s, the method has evolved several times into a host of methods among which is the “Super Learner”. Super Learner uses V-fold cross-validation to build the optimal weighted combination of predictions from a library of candidate algorithms. Optimality is defined by a user-specified objective function, such as minimizing mean squared error or maximizing the area under the receiver operating characteristic curve. Although relatively simple in nature, use of Super Learner by epidemiologists has been hampered by …
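The idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two candidate learners (a mean predictor and a one-covariate least-squares line), the grid search over a single convex weight, and all function names are assumptions chosen to keep the example self-contained. The objective here is cross-validated mean squared error.

```python
def fit_mean(xs, ys):
    """Candidate 1 (hypothetical): always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Candidate 2 (hypothetical): least-squares line in one covariate."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return lambda x: my + slope * (x - mx)


def cross_validated_predictions(xs, ys, learners, v):
    """For each unit, predict from models fit without that unit's fold."""
    n = len(xs)
    cv = [[0.0] * len(learners) for _ in range(n)]
    for fold in range(v):
        train = [i for i in range(n) if i % v != fold]
        valid = [i for i in range(n) if i % v == fold]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        for j, fit in enumerate(learners):
            model = fit(tx, ty)
            for i in valid:
                cv[i][j] = model(xs[i])
    return cv


def super_learner_weight(xs, ys, v=5, grid=101):
    """Return (w, cv_mse): w is the weight on the mean learner and
    1 - w the weight on the line learner, chosen on a grid to minimize
    cross-validated mean squared error."""
    cv = cross_validated_predictions(xs, ys, [fit_mean, fit_line], v)

    def cv_mse(w):
        return sum(
            (y - (w * p[0] + (1 - w) * p[1])) ** 2 for y, p in zip(ys, cv)
        ) / len(ys)

    weights = [k / (grid - 1) for k in range(grid)]
    best = min(weights, key=cv_mse)
    return best, cv_mse(best)
```

With more than two candidates, the grid search would typically be replaced by a constrained optimizer over the weight vector; that step is omitted here.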


Evaluation Of Progress Towards The UNAIDS 90-90-90 HIV Care Cascade: A Description Of Statistical Methods Used In An Interim Analysis Of The Intervention Communities In The SEARCH Study, Laura Balzer, Joshua Schwab, Mark J. Van Der Laan, Maya L. Petersen Feb 2017

Laura B. Balzer

WHO guidelines call for universal antiretroviral treatment, and UNAIDS has set a global target to virally suppress most HIV-positive individuals. Accurate estimates of population-level coverage at each step of the HIV care cascade (testing, treatment, and viral suppression) are needed to assess the effectiveness of "test and treat" strategies implemented to achieve this goal. The data available to inform such estimates, however, are susceptible to informative missingness: the number of HIV-positive individuals in a population is unknown; individuals tested for HIV may not be representative of those whom a testing intervention fails to reach, and HIV-positive individuals with a viral …


Adaptive Pre-Specification In Randomized Trials With And Without Pair-Matching, Laura Balzer, M. Van Der Laan, M. Petersen, The SEARCH Collaboration Nov 2016

Laura B. Balzer

In randomized trials, adjustment for measured covariates during the analysis can reduce variance and increase power. To avoid misleading inference, the analysis plan must be pre-specified. However, it is often unclear a priori which baseline covariates (if any) should be adjusted for in the analysis. Consider, for example, the Sustainable East Africa Research in Community Health (SEARCH) trial for HIV prevention and treatment. There are 16 matched pairs of communities and many potential adjustment variables, including region, HIV prevalence, male circumcision coverage and measures of community-level viral load. In this paper, we propose a rigorous procedure to data-adaptively select the …


Targeted Estimation And Inference For The Sample Average Treatment Effect In Trials With And Without Pair-Matching, Laura Balzer, M. Petersen, M. Van Der Laan, The SEARCH Collaboration Oct 2016

Laura B. Balzer

In cluster randomized trials, the study units usually are not a simple random sample from some clearly defined target population. Instead, the target population tends to be hypothetical or ill-defined, and the selection of study units tends to be systematic, driven by logistical and practical considerations. As a result, the population average treatment effect (PATE) may be neither well-defined nor easily interpretable. In contrast, the sample average treatment effect (SATE) is the mean difference in the counterfactual outcomes for the study units. The sample parameter is easily interpretable and arguably the most relevant when the study units are not sampled …
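The two parameters contrasted in this abstract can be written compactly. The notation below is an assumption, not taken from the listing: Y_i(1) and Y_i(0) denote the counterfactual outcomes of study unit i under intervention and control, and n is the number of study units.

```latex
\[
\text{SATE} = \frac{1}{n}\sum_{i=1}^{n}\bigl[\,Y_i(1) - Y_i(0)\,\bigr],
\qquad
\text{PATE} = \mathbb{E}\bigl[\,Y(1) - Y(0)\,\bigr].
\]
```

The SATE averages over the units actually enrolled, whereas the PATE takes an expectation over a target population, which is why the latter depends on that population being well defined.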


Performance-Constrained Binary Classification Using Ensemble Learning: An Application To Cost-Efficient Targeted PrEP Strategies, Wenjing Zheng, Laura Balzer, Maya L. Petersen, Mark J. Van Der Laan Oct 2016

Laura B. Balzer

Binary classification problems are ubiquitous in health and social science applications. In many cases, one wishes to balance two conflicting criteria for an optimal binary classifier. For instance, in resource-limited settings, an HIV prevention program based on offering Pre-Exposure Prophylaxis (PrEP) to select high-risk individuals must balance the sensitivity of the binary classifier in detecting future seroconverters (and hence offering them PrEP regimens) with the total number of PrEP regimens that is financially and logistically feasible for the program to deliver. In this article, we consider a general class of performance-constrained binary classification problems wherein the objective function and the …


Estimating Effects With Rare Outcomes And High Dimensional Covariates: Knowledge Is Power, Laura Balzer, J. Ahern, S. Galea, M. Van Der Laan Sep 2016

Laura B. Balzer

Many of the secondary outcomes in observational studies and randomized trials are rare. Methods for estimating causal effects and associations with rare outcomes, however, are limited, and this represents a missed opportunity for investigation. In this article, we construct a new targeted minimum loss-based estimator (TMLE) for the effect or association of an exposure on a rare outcome. We focus on the causal risk difference and statistical models incorporating bounds on the conditional mean of the outcome, given the exposure and measured confounders. By construction, the proposed estimator constrains the predicted outcomes to respect this model knowledge. Theoretically, this bounding …


Targeted Estimation Of Marginal Absolute And Relative Associations In Case-Control Data: An Application In Social Epidemiology, M. Pearl, Laura Balzer, J. Ahern Aug 2016

Laura B. Balzer

Background: Case-control studies are useful for rare outcomes, but typical analyses limit investigators to parametric estimation of conditional odds ratios. Several methods exist for obtaining marginal risk differences and risk ratios in a case-control setting, including a recently described semiparametric targeted approach optimized for rare outcomes.
Methods: Using case-control data from a study of neighborhood poverty and very preterm birth, we demonstrate estimation of marginal risk differences and risk ratios and compare a parametric substitution estimator based on maximum likelihood estimation with targeted maximum likelihood estimation (TMLE), and a refinement of TMLE for rare outcomes that incorporates bounds on the …


Introduction To Targeted Learning, Laura Balzer Dec 2015

Laura B. Balzer

No abstract provided.


Adaptive Pre-Specification In Randomized Trials With And Without Pair-Matching, Laura B. Balzer, Mark J. Van Der Laan, Maya L. Petersen May 2015

Laura B. Balzer

In randomized trials, adjustment for measured covariates during the analysis can reduce variance and increase power. To avoid misleading inference, the analysis plan must be pre-specified. However, it is unclear a priori which baseline covariates (if any) should be included in the analysis. Consider, for example, the Sustainable East Africa Research in Community Health (SEARCH) trial for HIV prevention and treatment. There are 16 matched pairs of communities and many potential adjustment variables, including region, HIV prevalence, male circumcision coverage and measures of community-level viral load. In this paper, we propose a rigorous procedure to data-adaptively select the adjustment set …


Targeted Estimation And Inference For The Sample Average Treatment Effect, Laura B. Balzer, Maya L. Petersen, Mark J. Van Der Laan Mar 2015

Laura B. Balzer

While the population average treatment effect has been the subject of extensive methods and applied research, less consideration has been given to the sample average treatment effect: the mean difference in the counterfactual outcomes for the study units. The sample parameter is easily interpretable and is arguably the most relevant when the study units are not representative of a greater population or when the exposure's impact is heterogeneous. Formally, the sample effect is not identifiable from the observed data distribution. Nonetheless, targeted maximum likelihood estimation (TMLE) can provide an asymptotically unbiased and efficient estimate of both the population and sample …


2015_Balzer_Adaptive.Pdf, Laura Balzer Dec 2014

Laura B. Balzer

In randomized trials, pair-matching is an intuitive design strategy to protect study validity and to potentially increase study power. In a common design, candidate units are identified, and their baseline characteristics used to create the best n/2 matched pairs. Within the resulting pairs, the intervention is randomized, and the outcomes measured at the end of follow-up. We consider this design to be adaptive, because the construction of the matched pairs depends on the baseline covariates of all candidate units. As a consequence, the observed data cannot be considered as n/2 independent, identically distributed pairs of units, as common practice assumes. Instead, the observed …


Adaptive Pair-Matching In Randomized Trials With Unbiased And Efficient Effect Estimation, Laura Balzer, M Petersen, M Van Der Laan, The SEARCH Consortium Dec 2014

Laura B. Balzer

In randomized trials, pair-matching is an intuitive design strategy to protect study validity and to potentially increase study power. In a common design, candidate units are identified, and their baseline characteristics used to create the best n/2 matched pairs. Within the resulting pairs, the intervention is randomized, and the outcomes measured at the end of follow-up. We consider this design to be adaptive, because the construction of the matched pairs depends on the baseline covariates of all candidate units. As a consequence, the observed data cannot be considered as n/2 independent, identically distributed pairs of units, as common practice assumes. Instead, the observed …


Adaptive Pair-Matching In The SEARCH Trial And Estimation Of The Intervention Effect, Laura Balzer, Maya L. Petersen, Mark J. Van Der Laan Jan 2014

Laura B. Balzer

In randomized trials, pair-matching is an intuitive design strategy to protect study validity and to potentially increase study power. In a common design, candidate units are identified, and their baseline characteristics used to create the best n/2 matched pairs. Within the resulting pairs, the intervention is randomized, and the outcomes measured at the end of follow-up. We consider this design to be adaptive, because the construction of the matched pairs depends on the baseline covariates of all candidate units. As a consequence, the observed data cannot be considered as n/2 independent, identically distributed (i.i.d.) pairs of units, as current practice assumes. …
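The design described in this abstract (form n/2 pairs from baseline covariates, then randomize within pairs) can be sketched as follows. This is an illustrative assumption, not the matching algorithm used in the SEARCH trial: it uses a single baseline covariate and pairs adjacent units after sorting, and the function names are hypothetical.

```python
import random


def match_pairs(baseline):
    """Pair units with similar baseline values by sorting and pairing
    neighbours. With an odd number of units, the last one is left out."""
    order = sorted(range(len(baseline)), key=lambda i: baseline[i])
    return [(order[k], order[k + 1]) for k in range(0, len(order) - 1, 2)]


def randomize_within_pairs(pairs, rng):
    """Assign the intervention (1) to one randomly chosen unit in each
    pair and control (0) to the other."""
    assignment = {}
    for a, b in pairs:
        treated = rng.choice((a, b))
        control = b if treated == a else a
        assignment[treated] = 1
        assignment[control] = 0
    return assignment


# Usage: six candidate units with one baseline covariate each.
baseline = [0.9, 0.1, 0.5, 0.15, 0.95, 0.55]
pairs = match_pairs(baseline)
assignment = randomize_within_pairs(pairs, random.Random(0))
```

Because `match_pairs` looks at the covariates of every candidate unit before forming any pair, the resulting pairs depend on the whole sample, which is the sense in which the abstract calls the design adaptive.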


Designing The SEARCH Trial: PH250B In Practice, Laura Balzer Sep 2013

Laura B. Balzer

No abstract provided.


Estimating Effects On Rare Outcomes: Knowledge Is Power, Laura B. Balzer, Mark J. Van Der Laan May 2013

Laura B. Balzer

Many of the secondary outcomes in observational studies and randomized trials are rare. Methods for estimating causal effects and associations with rare outcomes, however, are limited, and this represents a missed opportunity for investigation. In this article, we construct a new targeted minimum loss-based estimator (TMLE) for the effect of an exposure or treatment on a rare outcome. We focus on the causal risk difference and statistical models incorporating bounds on the conditional risk of the outcome, given the exposure and covariates. By construction, the proposed estimator constrains the predicted outcomes to respect this model knowledge. Theoretically, this bounding provides …


Adaptive Matching In Randomized Trials And Observational Studies, Mark J. Van Der Laan, Laura Balzer, Maya L. Petersen Nov 2012

Laura B. Balzer

In many randomized and observational studies the allocation of treatment among a sample of n independent and identically distributed units is a function of the covariates of all sampled units. As a result, the treatment labels among the units are possibly dependent, complicating estimation and posing challenges for statistical inference. For example, cluster randomized trials frequently sample communities from some target population, construct matched pairs of communities from those included in the sample based on some metric of similarity in baseline community characteristics, and then randomly allocate a treatment and a control intervention within each matched pair. In this case, …


Estimating The Impact Of Community-Level Interventions: The SEARCH Trial And HIV Prevention In Sub-Saharan Africa, Laura Balzer, Maya Petersen, Joshua Schwab, Mark Van Der Laan May 2012

Laura B. Balzer

Evaluation of community-level interventions to prevent HIV infection presents significant methodological challenges. Even when it is feasible to randomly assign a treatment versus control level of the intervention to each community in a sample, measurement of incident HIV infection remains difficult. In this talk we describe an experimental design developed for the SEARCH Trial, a large community randomized trial that will evaluate the impact of expanded treatment on incident HIV and other outcomes. Regular community-wide testing campaigns are conducted and a random sample of community members who fail to attend a campaign is tracked. The data generated by this …


Why Match In Individually And Cluster Randomized Trials?, Laura B. Balzer, Maya L. Petersen, Mark J. Van Der Laan May 2012

Laura B. Balzer

The decision to match individuals or clusters in randomized trials is motivated by both practical and statistical concerns. Matching protects against chance imbalances in baseline covariate distributions and is thought to improve study credibility. Matching is also implemented to increase study power. This article compares the asymptotic efficiency of the pair-matched design, where units are matched on baseline covariates and the treatment randomized within pairs, to the independent design, where units are randomly paired and the treatment randomized within pairs. We focus on estimating the average treatment effect and use the efficient influence curve to understand the information provided by …


Adaptive Matching In Randomized Trials And Observational Studies, M. Van Der Laan, Laura Balzer, M. Petersen Dec 2011

Laura B. Balzer

In many randomized and observational studies the allocation of treatment among a sample of n independent and identically distributed units is a function of the covariates of all sampled units. As a result, the treatment labels among the units are possibly dependent, complicating estimation and posing challenges for statistical inference. For example, cluster randomized trials frequently sample communities from some target population, construct matched pairs of communities from those included in the sample based on some metric of similarity in baseline community characteristics, and then randomly allocate a treatment and a control intervention within each matched pair. In this case, …