Open Access. Powered by Scholars. Published by Universities.®

Biostatistics Commons

2016

Articles 31 - 60 of 182

Full-Text Articles in Biostatistics

Performance-Constrained Binary Classification Using Ensemble Learning: An Application To Cost-Efficient Targeted PrEP Strategies, Wenjing Zheng, Laura Balzer, Maya L. Petersen, Mark J. Van Der Laan Oct 2016

Laura B. Balzer

Binary classification problems are ubiquitous in health and social science applications. In many cases, one wishes to balance two conflicting criteria for an optimal binary classifier. For instance, in resource-limited settings, an HIV prevention program based on offering Pre-Exposure Prophylaxis (PrEP) to select high-risk individuals must balance the sensitivity of the binary classifier in detecting future seroconverters (and hence offering them PrEP regimens) with the total number of PrEP regimens that is financially and logistically feasible for the program to deliver. In this article, we consider a general class of performance-constrained binary classification problems wherein the objective function and the …


High-Throughput Allele-Specific Expression Across 250 Environmental Conditions, Gregory A. Moyerbrailean, Allison L. Richards, Daniel Kurtz, Cynthia A. Kalita, Gordon O. Davis, Chris T. Harvey, Adnan Alazizi, Donovan Watza, Yoram Sorokin, Nancy J. Hauff, Xiang Zhou, Xiaoquan Wen, Roger Pique-Regi, Francesca Luca Oct 2016

Center for Molecular Medicine and Genetics

Gene-by-environment (GxE) interactions determine common disease risk factors and biomedically relevant complex traits. However, quantifying how the environment modulates genetic effects on human quantitative phenotypes presents unique challenges. Environmental covariates are complex and difficult to measure and control at the organismal level, as found in GWAS and epidemiological studies. An alternative approach focuses on the cellular environment using in vitro treatments as a proxy for the organismal environment. These cellular environments simplify the organism-level environmental exposures to provide a tractable influence on subcellular phenotypes, such as gene expression. Expression quantitative trait loci (eQTL) mapping studies identified GxE interactions in response …


Online Cross-Validation-Based Ensemble Learning, David Benkeser, Samuel D. Lendle, Cheng Ju, Mark J. Van Der Laan Oct 2016

U.C. Berkeley Division of Biostatistics Working Paper Series

Online estimators update a current estimate with a new incoming batch of data without having to revisit past data, thereby providing streaming estimates that are scalable to big data. We develop flexible, ensemble-based online estimators of an infinite-dimensional target parameter, such as a regression function, in the setting where data are generated sequentially by a common conditional data distribution given summary measures of the past. This setting encompasses a wide range of time-series models and, as a special case, models for independent and identically distributed data. Our estimator considers a large library of candidate online estimators and uses online cross-validation to …


Doubly-Robust Nonparametric Inference On The Average Treatment Effect, David Benkeser, Marco Carone, Mark J. Van Der Laan, Peter Gilbert Oct 2016

U.C. Berkeley Division of Biostatistics Working Paper Series

Doubly-robust estimators are widely used to draw inference about the average effect of a treatment. Such estimators are consistent for the effect of interest if either one of two nuisance parameters is consistently estimated. However, if flexible, data-adaptive estimators of these nuisance parameters are used, double-robustness does not readily extend to inference. We present a general theoretical study of the behavior of doubly-robust estimators of an average treatment effect when one of the nuisance parameters is inconsistently estimated. We contrast different approaches for constructing such estimators and investigate the extent to which they may be modified to also allow doubly-robust …
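The double-robustness property described above can be illustrated with the standard augmented inverse probability weighted (AIPW) estimator. This is a minimal sketch of that textbook estimator, not the modified estimators studied in the paper. The function name `aipw_ate`, its arguments, and the toy data are hypothetical, and the nuisance estimates (propensity scores `e` and outcome predictions `m1`, `m0`) are assumed to have been fitted elsewhere.

```python
from typing import Sequence


def aipw_ate(
    a: Sequence[int],      # treatment indicators (1 = treated, 0 = control)
    y: Sequence[float],    # observed outcomes
    e: Sequence[float],    # estimated propensity scores P(A = 1 | W)
    m1: Sequence[float],   # outcome-model predictions under treatment
    m0: Sequence[float],   # outcome-model predictions under control
) -> float:
    """Standard AIPW estimate of the average treatment effect."""
    n = len(a)
    if n == 0 or not all(len(v) == n for v in (y, e, m1, m0)):
        raise ValueError("all inputs must be non-empty and of equal length")
    total = 0.0
    for ai, yi, ei, m1i, m0i in zip(a, y, e, m1, m0):
        # Outcome-model contrast plus inverse-probability-weighted residuals.
        total += (m1i - m0i) + ai * (yi - m1i) / ei - (1 - ai) * (yi - m0i) / (1 - ei)
    return total / n


# Hypothetical toy data: two treated and two control units.
a = [1, 1, 0, 0]
y = [3.0, 5.0, 1.0, 2.0]
e = [0.5, 0.5, 0.5, 0.5]

# With a deliberately wrong outcome model (all zeros) but correct propensity
# scores, the estimate reduces to the inverse-probability-weighted estimate.
ate_wrong_outcome_model = aipw_ate(a, y, e, m1=[0.0] * 4, m0=[0.0] * 4)
```

In this toy example the point estimate is unchanged when the outcome model is replaced by zeros, because the correct propensity scores carry the estimate. The abstract's concern is that this robustness of the point estimate does not automatically carry over to confidence intervals and tests.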


Comparing Performance Of Non-Tree-Based And Tree-Based Association Mapping Methods, Katherine L. Thompson, David W. Fardo Oct 2016

Statistics Faculty Publications

A central goal in the biomedical and biological sciences is to link variation in quantitative traits to locations along the genome (single nucleotide polymorphisms). Sequencing technology has rapidly advanced in recent decades, along with the statistical methodology to analyze genetic data. Two classes of association mapping methods exist: those that account for the evolutionary relatedness among individuals, and those that ignore the evolutionary relationships among individuals. While the former methods more fully use implicit information in the data, the latter methods are more flexible in the types of data they can handle. This study presents a comparison of the 2 …


Causal Effect Estimation In Sequencing Studies: A Bayesian Method To Account For Confounder Adjustment Uncertainty, Chi Wang, Jinpeng Liu, David W. Fardo Oct 2016

Biostatistics Faculty Publications

Estimating the causal effect of a single nucleotide variant (SNV) on clinical phenotypes is of interest in many genetic studies. The effect estimation may be confounded by other SNVs as a result of linkage disequilibrium as well as demographic and clinical characteristics. Because a large number of these other variables, which we call potential confounders, are collected, it is challenging to select and adjust for the variables that truly confound the causal effect. The Bayesian adjustment for confounding (BAC) method has been proposed as a general method to estimate the average causal effect in the presence of a large number …


On Combining Family- And Population- Based Sequencing Data, Yuriko Katsumata, David W. Fardo Oct 2016

Biostatistics Faculty Publications

Several statistical group-based approaches have been proposed to detect effects of variation within a gene for each of the population- and family-based designs. However, unified tests to combine gene-phenotype associations obtained from these 2 study designs are not yet well established. In this study, we investigated the efficient combination of population-based and family-based sequencing data to evaluate best practices using the Genetic Analysis Workshop 19 (GAW19) data set. Because one design employed whole genome sequencing and the other whole exome sequencing, we examined variants overlapping both data sets. We used the family-based sequence kernel association test (famSKAT) to analyze the …


Performance-Constrained Binary Classification Using Ensemble Learning: An Application To Cost-Efficient Targeted PrEP Strategies, Wenjing Zheng, Laura Balzer, Maya L. Petersen, Mark J. Van Der Laan Oct 2016

U.C. Berkeley Division of Biostatistics Working Paper Series

Binary classification problems are ubiquitous in health and social science applications. In many cases, one wishes to balance two conflicting criteria for an optimal binary classifier. For instance, in resource-limited settings, an HIV prevention program based on offering Pre-Exposure Prophylaxis (PrEP) to select high-risk individuals must balance the sensitivity of the binary classifier in detecting future seroconverters (and hence offering them PrEP regimens) with the total number of PrEP regimens that is financially and logistically feasible for the program to deliver. In this article, we consider a general class of performance-constrained binary classification problems wherein the objective function and the …


Matching The Efficiency Gains Of The Logistic Regression Estimator While Avoiding Its Interpretability Problems, In Randomized Trials, Michael Rosenblum, Jon Arni Steingrimsson Oct 2016

Johns Hopkins University, Dept. of Biostatistics Working Papers

Adjusting for prognostic baseline variables can lead to improved power in randomized trials. For binary outcomes, a logistic regression estimator is commonly used for such adjustment. This has resulted in substantial efficiency gains in practice, e.g., gains equivalent to reducing the required sample size by 20-28% were observed in a recent survey of traumatic brain injury trials. Robinson and Jewell (1991) proved that the logistic regression estimator is guaranteed to have equal or better asymptotic efficiency compared to the unadjusted estimator (which ignores baseline variables). Unfortunately, the logistic regression estimator has the following dangerous vulnerabilities: it is only interpretable when …


Using Low-Dose Radiation To Potentiate The Effect Of Induction Chemotherapy In Head And Neck Cancer: Results Of A Prospective Phase 2 Trial, Susanne M. Arnold, Mahesh Kudrimoti, Emily V. Dressler, John F. Gleason, Natalie L. Silver, William F. Regine, Joseph Valentino Oct 2016

Internal Medicine Faculty Publications

Purpose: Low-dose fractionated radiation therapy (LDFRT) induces effective cell killing through hyperradiation sensitivity and potentiates effects of chemotherapy. We report our second investigation of LDFRT as a potentiator of the chemotherapeutic effect of induction carboplatin and paclitaxel in locally advanced squamous cell cancer of the head and neck (SCCHN).

Experimental Design: Two cycles of induction therapy were given every 21 days: paclitaxel (75 mg/m²) on days 1, 8, and 15; carboplatin (area under the curve 6) day 1; and LDFRT 50 cGy fractions (2 each on days 1, 2, 8, and 15). Objectives included primary site complete response …


Pleiotropic Effects Of CSF Levels Of Alzheimer’s Disease Proteins, Olga A. Vsevolozhskaya, Ilai Keren, David W. Fardo, Dmitri V. Zaykin Oct 2016

Biostatistics Presentations

Cerebrospinal fluid (CSF) analytes harbor potential as diagnostic biomarkers for Alzheimer’s Disease (AD). Quantitative measures of CSF proteins comprise a set of often highly correlated endophenotypes that have previously shown promise in genetic analyses (Cruchaga et al., 2013; Kauwe et al., 2014). Pleiotropic impact of genetic variations on this set may provide additional insights into AD pathology at its earliest stages. To determine which specific endophenotypes are pleiotropic, one can employ methods based on the reverse regression of genotype on phenotypes. Recently, we proposed a method based on functional linear models (Vsevolozhskaya et al., 2016) that utilizes reverse regression and simultaneously …


Estimating Effects With Rare Outcomes And High Dimensional Covariates: Knowledge Is Power, Laura Balzer, J. Ahern, S. Galea, M. Van Der Laan Sep 2016

Laura B. Balzer

Many of the secondary outcomes in observational studies and randomized trials are rare. Methods for estimating causal effects and associations with rare outcomes, however, are limited, and this represents a missed opportunity for investigation. In this article, we construct a new targeted minimum loss-based estimator (TMLE) for the effect or association of an exposure on a rare outcome. We focus on the causal risk difference and statistical models incorporating bounds on the conditional mean of the outcome, given the exposure and measured confounders. By construction, the proposed estimator constrains the predicted outcomes to respect this model knowledge. Theoretically, this bounding …


Estimation Of P(X > Y) When X And Y Are Dependent Random Variables Using Different Bivariate Sampling Schemes, Hani M. Samawi, Amal Helu, Haresh Rochani, Jingjing Yin, Daniel Linder Sep 2016

Biostatistics Faculty Publications

Stress-strength models have been intensively investigated in the literature with regard to estimating the reliability θ = P(X > Y) using parametric and nonparametric approaches under different sampling schemes when X and Y are independent random variables. In this paper, we consider the problem of estimating θ when (X, Y) are dependent random variables with a bivariate underlying distribution. The empirical and kernel estimates of θ = P(X > Y), based on bivariate ranked set sampling (BVRSS), are considered when (X, Y) are paired dependent continuous random variables. The estimators obtained are compared to their counterpart, bivariate simple random …
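As a point of reference for the quantity θ = P(X > Y), the following sketch computes the plain empirical estimate from paired observations drawn by simple random sampling, i.e. the proportion of pairs in which x exceeds y. It does not implement the ranked set sampling or kernel estimators discussed in the paper. The function name `empirical_theta` and the sample values are hypothetical.

```python
from typing import Sequence


def empirical_theta(x: Sequence[float], y: Sequence[float]) -> float:
    """Empirical estimate of P(X > Y) from paired observations (x_i, y_i)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    # Count the pairs in which the first component exceeds the second.
    exceed = sum(1 for xi, yi in zip(x, y) if xi > yi)
    return exceed / len(x)


# Hypothetical paired measurements.
x = [2.1, 3.4, 1.0, 5.2]
y = [1.9, 3.9, 0.5, 4.8]
theta_hat = empirical_theta(x, y)  # three of the four pairs have x > y
```

Because each pair is compared only with itself, the estimate remains meaningful when X and Y are dependent, which is the setting the abstract addresses.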


Weighted-SAMGSR: Combining Significance Analysis Of Microarray-Gene Set Reduction Algorithm With Pathway Topology-Based Weights To Select Relevant Genes, Suyan Tian, Howard H. Chang, Chi Wang Sep 2016

Biostatistics Faculty Publications

Background: It has been demonstrated that a pathway-based feature selection method that incorporates biological information within pathways during the process of feature selection usually outperforms a gene-based feature selection algorithm in terms of predictive accuracy and stability. Significance analysis of microarray-gene set reduction algorithm (SAMGSR), an extension to a gene set analysis method with further reduction of the selected pathways to their respective core subsets, can be regarded as a pathway-based feature selection method.

Methods: In SAMGSR, whether a gene is selected is mainly determined by its expression difference between the phenotypes, and partially by the number of pathways to …


FTO Genotype And Weight Loss: Systematic Review And Meta-Analysis Of 9563 Individual Participant Data From Eight Randomised Controlled Trials., Katherine M Livingstone, Carlos Celis-Morales, George D Papandonatos, Bahar Erar, Jose C Florez, Kathleen A Jablonski, Cristina Razquin, Amelia Marti, Yoriko Heianza, Tao Huang, Frank M Sacks, Mathilde Svendstrup, Xuemei Sui, Timothy S Church, Tiina Jääskeläinen, Jaana Lindström, Jaakko Tuomilehto, Matti Uusitupa, Tuomo Rankinen, Wim H M Saris, Torben Hansen, Oluf Pedersen, Arne Astrup, Thorkild I A Sørensen, Lu Qi, George A Bray, Miguel A Martinez-Gonzalez, J Alfredo Martinez, Paul W Franks, Jeanne M McCaffery, Jose Lara, John C Mathers Sep 2016

Epidemiology Faculty Publications

OBJECTIVE: To assess the effect of the FTO genotype on weight loss after dietary, physical activity, or drug based interventions in randomised controlled trials.

DESIGN: Systematic review and random effects meta-analysis of individual participant data from randomised controlled trials.

DATA SOURCES: Ovid Medline, Scopus, Embase, and Cochrane from inception to November 2015.

ELIGIBILITY CRITERIA FOR STUDY SELECTION: Randomised controlled trials in overweight or obese adults reporting reduction in body mass index, body weight, or waist circumference by FTO genotype (rs9939609 or a proxy) after dietary, physical activity, or drug based interventions. Gene by treatment interaction models were fitted to individual …


Model Averaged Double Robust Estimation, Matthew Cefalu, Francesca Dominici, Nils D. Arvold MD, Giovanni Parmigiani Sep 2016

Harvard University Biostatistics Working Paper Series

Existing methods in causal inference do not account for the uncertainty in the selection of confounders. We propose a new class of estimators for the average causal effect, the model averaged double robust estimators, that formally account for model uncertainty in both the propensity score and outcome model through the use of Bayesian model averaging. These estimators build on the desirable double robustness property by only requiring the true propensity score model or the true outcome model be within a specified class of models to maintain consistency. We provide asymptotic results and conduct a large scale simulation study that indicates …


Targeted Estimation Of Marginal Absolute And Relative Associations In Case-Control Data: An Application In Social Epidemiology, M. Pearl, Laura Balzer, J. Ahern Aug 2016

Laura B. Balzer

Background: Case-control studies are useful for rare outcomes, but typical analyses limit investigators to parametric estimation of conditional odds ratios. Several methods exist for obtaining marginal risk differences and risk ratios in a case-control setting, including a recently described semiparametric targeted approach optimized for rare outcomes.

Methods: Using case-control data from a study of neighborhood poverty and very preterm birth, we demonstrate estimation of marginal risk differences and risk ratios and compare a parametric substitution estimator based on maximum likelihood estimation with targeted maximum likelihood estimation (TMLE), and a refinement of TMLE for rare outcomes that incorporates bounds on the …


Addition To PGLR Chap 6, Joseph M. Hilbe Aug 2016

Joseph M Hilbe

Addition to Chapter 6 in Practical Guide to Logistic Regression. Added section on Bayesian logistic regression using Stata.


Distance-Based Analysis Of Variance For Brain Connectivity, Russell T. Shinohara, Haochang Shou, Marco Carone, Robert Schultz, Birkan Tunc, Drew Parker, Ragini Verma Aug 2016

UPenn Biostatistics Working Papers

The field of neuroimaging dedicated to mapping connections in the brain is increasingly being recognized as key for understanding neurodevelopment and pathology. Networks of these connections are quantitatively represented using complex structures including matrices, functions, and graphs, which require specialized statistical techniques for estimation and inference about developmental and disorder-related changes. Unfortunately, classical statistical testing procedures are not well suited to high-dimensional testing problems. In the context of global or regional tests for differences in neuroimaging data, traditional analysis of variance (ANOVA) is not directly applicable without first summarizing the data into univariate or low-dimensional features, a process that may …


Diffuse Optical Measurements Of Head And Neck Tumor Hemodynamics For Early Prediction Of Chemoradiation Therapy Outcomes, Lixin Dong, Mahesh Kudrimoti, Daniel Irwin, Li Chen, Sameera Kumar, Yu Shang, Chong Huang, Ellis L. Johnson, Scott D. Stevens, Brent J. Shelton, Guoqiang Yu Aug 2016

Biomedical Engineering Faculty Publications

This study used a hybrid near-infrared diffuse optical instrument to monitor tumor hemodynamic responses to chemoradiation therapy for early prediction of treatment outcomes in patients with head and neck cancer. Forty-seven patients were measured once per week to evaluate the hemodynamic status of clinically involved cervical lymph nodes as surrogates for the primary tumor response. Patients were classified into two groups: complete response (CR) (n = 29) and incomplete response (IR) (n = 18). Tumor hemodynamic responses were found to be associated with clinical outcomes (CR/IR), wherein the associations differed depending on human papillomavirus (HPV-16) status. In HPV-16 …


The Use Of Permutation Tests For The Analysis Of Parallel And Stepped-Wedge Cluster Randomized Trials, Rui Wang, Victor DeGruttola Aug 2016

Harvard University Biostatistics Working Paper Series

We investigate the use of permutation tests for the analysis of parallel and stepped-wedge cluster randomized trials. Permutation tests for parallel designs with exponential family endpoints have been extensively studied. The optimal permutation tests developed for exponential family alternatives require information on intraclass correlation, a quantity not yet defined for time-to-event endpoints. Therefore, it is unclear how efficient permutation tests can be constructed for cluster-randomized trials with such endpoints. We consider a class of test statistics formed by a weighted average of pair-specific treatment effect estimates and offer practical guidance on the choice of weights to improve efficiency. We apply …
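To make the idea of a permutation test in a parallel cluster randomized trial concrete, the sketch below enumerates every possible assignment of clusters to treatment and compares an unweighted difference in cluster-level means against the observed one. This is a deliberately simple stand-in: it does not use the weighted pair-specific statistics the abstract describes, and it does not cover stepped-wedge designs. The function name `permutation_p_value` and the cluster summaries are hypothetical.

```python
from itertools import combinations
from typing import Iterable, Sequence


def permutation_p_value(cluster_means: Sequence[float], treated: Sequence[int]) -> float:
    """Exact two-sided permutation p-value for a difference in cluster-level means."""
    n = len(cluster_means)
    k = sum(treated)
    if n != len(treated) or not 0 < k < n:
        raise ValueError("need matching lengths and at least one cluster per arm")

    def mean_difference(treated_idx: Iterable[int]) -> float:
        chosen = set(treated_idx)
        t = [cluster_means[i] for i in range(n) if i in chosen]
        c = [cluster_means[i] for i in range(n) if i not in chosen]
        return sum(t) / len(t) - sum(c) / len(c)

    observed = abs(mean_difference(i for i, a in enumerate(treated) if a == 1))
    extreme = 0
    total = 0
    # Re-randomize at the cluster level: every way of choosing k treated clusters.
    for idx in combinations(range(n), k):
        total += 1
        if abs(mean_difference(idx)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical cluster-level outcome summaries: three treated, three control clusters.
means = [5.0, 6.0, 7.0, 1.0, 2.0, 3.0]
arms = [1, 1, 1, 0, 0, 0]
p_value = permutation_p_value(means, arms)
```

Permuting whole clusters rather than individuals respects the unit of randomization, so no estimate of the intraclass correlation is needed for validity. Full enumeration is only practical for small numbers of clusters; with more clusters, randomly sampled assignments would be the usual substitute.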


Improving Precision By Adjusting For Baseline Variables In Randomized Trials With Binary Outcomes, Without Regression Model Assumptions, Jon Arni Steingrimsson, Daniel F. Hanley, Michael Rosenblum Aug 2016

Johns Hopkins University, Dept. of Biostatistics Working Papers

In randomized clinical trials with baseline variables that are prognostic for the primary outcome, there is potential to improve precision and reduce sample size by appropriately adjusting for these variables. A major challenge is that there are multiple statistical methods to adjust for baseline variables, but little guidance on which is best to use in a given context. The choice of method can have important consequences. For example, one commonly used method leads to uninterpretable estimates if there is any treatment effect heterogeneity, which would jeopardize the validity of trial conclusions. We give practical guidance on how to avoid this …


Left And Right Ventricular Dyssynchrony And Strains From Cardiovascular Magnetic Resonance Feature Tracking Do Not Predict Deterioration Of Ventricular Function In Patients With Repaired Tetralogy Of Fallot, Linyuan Jing, Gregory J. Wehner, Jonathan D. Suever, Richard Charnigo, Sudad Alhadad, Evan Stearns, Dimitri Mojsejenko, Christopher M. Haggerty, Kelsey Hickey, Anne Marie Valente, Tal Geva, Andrew J. Powell, Brandon K. Fornwalt Aug 2016

Saha Cardiovascular Research Center Faculty Publications

Background: Patients with repaired tetralogy of Fallot (rTOF) suffer from progressive ventricular dysfunction decades after their surgical repair. We hypothesized that measures of ventricular strain and dyssynchrony would predict deterioration of ventricular function in patients with rTOF.

Methods: A database search identified all patients at a single institution with rTOF who underwent cardiovascular magnetic resonance (CMR) at least twice, > 6 months apart, without intervening surgical or catheter procedures. Seven primary predictors were derived from the first CMR using a custom feature tracking algorithm: left (LV), right (RV) and inter-ventricular dyssynchrony, LV and RV peak global circumferential strains, and LV and …


Mediation Analysis For A Survival Outcome With Time-Varying Exposures, Mediators, And Confounders, Sheng-Hsuan Lin, Jessica G. Young, Roger Logan, Tyler J. VanderWeele Aug 2016

Harvard University Biostatistics Working Paper Series

We propose an approach to conduct mediation analysis for survival data with time-varying exposures, mediators, and confounders. We identify certain interventional direct and indirect effects through a survival mediational g-formula and describe the required assumptions. We also provide a feasible parametric approach along with an algorithm and software to estimate these effects. We apply this method to analyze the Framingham Heart Study data to investigate the causal mechanism of smoking on mortality through coronary artery disease. The risk ratio of smoking 30 cigarettes per day for ten years compared with no smoking on mortality is 2.34 (95 % CI = …


Diabetes Is Associated With Cerebrovascular But Not Alzheimer's Disease Neuropathology, Erin L. Abner, Peter T. Nelson, Richard J. Kryscio, Frederick A. Schmitt, David W. Fardo, Randall L. Woltjer, Nigel J. Cairns, Lei Yu, Hiroko H. Dodge, Chengjie Xiong, Kamal Masaki, Suzanne L. Tyas, David A. Bennett, Julie A. Schneider, Zoe Arvanitakis Aug 2016

Sanders-Brown Center on Aging Faculty Publications

INTRODUCTION: The relationship of diabetes to specific neuropathologic causes of dementia is incompletely understood.

METHODS: We used logistic regression to evaluate the association between diabetes and infarcts, Braak neurofibrillary tangle stage, and neuritic plaque score in 2365 autopsied persons. In a subset of >1300 persons with available cognitive data, we examined the association between diabetes and cognition using Poisson regression.

RESULTS: Diabetes increased odds of brain infarcts (odds ratio [OR] = 1.57, P < .0001), specifically lacunes (OR = 1.71, P < .0001), but not Alzheimer's disease neuropathology. Diabetes plus infarcts was associated with lower cognitive scores at end of life than infarcts or diabetes alone, and diabetes plus high level of Alzheimer's neuropathologic changes was associated with lower mini-mental state examination scores than the pathology alone.

DISCUSSION: This study supports the conclusions that diabetes increases the risk of cerebrovascular but not Alzheimer's disease pathology, and at least some of diabetes' relationship to …


The Impact Of Patient Navigation On The Delivery Of Diagnostic Breast Cancer Care In The National Patient Navigation Research Program: A Prospective Meta-Analysis., Tracy A Battaglia, Julie S Darnell, Naomi Ko, Fred Snyder, Electra D Paskett, Kristen J Wells, Elizabeth M Whitley, Jennifer J Griggs, Anand Karnad, Heather Young, Victoria Warren-Mears, Melissa A Simon, Elizabeth Calhoun Aug 2016

Epidemiology Faculty Publications

Patient navigation is emerging as a standard in breast cancer care delivery, yet multi-site data on the impact of navigation at reducing delays along the continuum of care are lacking. The purpose of this study was to determine the effect of navigation on reaching diagnostic resolution at specific time points after an abnormal breast cancer screening test among a national sample. A prospective meta-analysis estimated the adjusted odds of achieving timely diagnostic resolution at 60, 180, and 365 days. Exploratory analyses were conducted on the pooled sample to identify which groups had the most benefit from navigation. Clinics from six …


Learning From Data: Plant Breeding Applications Of Machine Learning, Alencar Xavier Aug 2016

Open Access Dissertations

Increasingly, new sources of data are being incorporated into plant breeding pipelines. Enormous amounts of data from field phenomics and genotyping technologies place data mining and analysis on a completely different level that is challenging from practical and theoretical standpoints. Intelligent decision-making relies on our capability of extracting from data useful information that may help us to achieve our goals more efficiently. Many plant breeders, agronomists and geneticists perform analyses without knowing relevant underlying assumptions, strengths or pitfalls of the employed methods. The study endeavors to assess statistical learning properties and plant breeding applications of supervised and unsupervised machine learning …


Multilevel Models For Longitudinal Data, Aastha Khatiwada Aug 2016

Electronic Theses and Dissertations

Longitudinal data arise when individuals are measured several times during an observation period and thus the data for each individual are not independent. There are several ways of analyzing longitudinal data when different treatments are compared. Multilevel models are used to analyze data that are clustered in some way. In this work, multilevel models are used to analyze longitudinal data from a case study. Results from other more commonly used methods are compared to multilevel models. Also, the output of two software packages, SAS and R, is compared. Finally a method consisting of fitting individual models for each …


Variable Selection Via Penalized Regression And The Genetic Algorithm Using Information Complexity, With Applications For High-Dimensional -Omics Data, Tyler J. Massaro Aug 2016

Doctoral Dissertations

This dissertation is a collection of examples, algorithms, and techniques for researchers interested in selecting influential variables from statistical regression models. Chapters 1, 2, and 3 provide background information that will be used throughout the remaining chapters, on topics including but not limited to information complexity, model selection, covariance estimation, stepwise variable selection, penalized regression, and especially the genetic algorithm (GA) approach to variable subsetting.

In chapter 4, we fully develop the framework for performing GA subset selection in logistic regression models. We present advantages of this approach against stepwise and elastic net regularized regression in selecting variables from a …


Propensity Score Based Methods For Estimating The Treatment Effects Based On Observational Studies., Younathan Abdia Aug 2016

Electronic Theses and Dissertations

This dissertation consists of two interconnected research projects. The first project was a study of propensity score based statistical methods for estimating the average treatment effect (ATE) and the average treatment effect among the treated (ATT) when there are two treatment groups. The ATE is defined as the mean of the individual causal effects in the whole population, while the ATT is defined as the treatment effect for the treated population. Propensity score based statistical methods, such as matching, regression, stratification, inverse probability weighting (IPW), and doubly robust (DR) methods were used to estimate the ATE and ATT. Simulation studies and case …
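Of the methods listed above, inverse probability weighting is the most direct to write down. The sketch below gives one common form of the IPW estimators for the ATE and the ATT, assuming the propensity scores have already been estimated and lie strictly between 0 and 1. It is an illustration of the general technique rather than the dissertation's implementation; the function names `ipw_ate` and `ipw_att` and the toy data are hypothetical.

```python
from typing import Sequence


def ipw_ate(a: Sequence[int], y: Sequence[float], e: Sequence[float]) -> float:
    """Horvitz-Thompson style IPW estimate of the average treatment effect."""
    n = len(a)
    if n == 0 or len(y) != n or len(e) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    # Treated outcomes are weighted by 1/e, control outcomes by 1/(1 - e).
    return sum(ai * yi / ei - (1 - ai) * yi / (1 - ei) for ai, yi, ei in zip(a, y, e)) / n


def ipw_att(a: Sequence[int], y: Sequence[float], e: Sequence[float]) -> float:
    """Normalized IPW estimate of the average treatment effect among the treated."""
    n = len(a)
    if n == 0 or len(y) != n or len(e) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    n_treated = sum(a)
    # Controls are reweighted by the odds e/(1 - e) to resemble the treated group.
    control_weights = [(1 - ai) * ei / (1 - ei) for ai, ei in zip(a, e)]
    if n_treated == 0 or sum(control_weights) == 0:
        raise ValueError("need at least one treated and one control unit")
    treated_mean = sum(ai * yi for ai, yi in zip(a, y)) / n_treated
    control_mean = sum(w * yi for w, yi in zip(control_weights, y)) / sum(control_weights)
    return treated_mean - control_mean


# Hypothetical toy data with a constant propensity score of 0.5.
a = [1, 1, 0, 0]
y = [3.0, 5.0, 1.0, 2.0]
e = [0.5, 0.5, 0.5, 0.5]
ate_hat = ipw_ate(a, y, e)
att_hat = ipw_att(a, y, e)
```

With a constant propensity score both estimates reduce to the raw difference in group means, which makes the toy example easy to check by hand. The two estimands differ in practice when the propensity score varies with covariates and treatment effects are heterogeneous.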