Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Biostatistics

The University of Michigan Department of Biostatistics Working Paper Series

Articles 1 - 30 of 55

Full-Text Articles in Physical Sciences and Mathematics

Shrinkage Priors For Isotonic Probability Vectors And Binary Data Modeling, Philip S. Boonstra, Daniel R. Owen, Jian Kang Jan 2020

This paper outlines a new class of shrinkage priors for Bayesian isotonic regression modeling a binary outcome against a predictor, where the probability of the outcome is assumed to be monotonically non-decreasing with the predictor. The predictor is categorized into a large number of groups, and the set of differences between outcome probabilities in consecutive categories is equipped with a multivariate prior having support over the set of simplexes. The Dirichlet distribution, which can be derived from a normalized cumulative sum of gamma-distributed random variables, is a natural choice of prior, but using mathematical and simulation-based arguments, we show that …
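
The gamma construction mentioned in the abstract can be sketched as follows. This is a minimal illustration of the Dirichlet baseline only, not the shrinkage prior the paper proposes; the function name `isotonic_probabilities` and the shape values are assumptions made for this example. Independent gamma draws are normalized to give a Dirichlet-distributed vector of increments, and the running sum of those increments gives outcome probabilities that are non-decreasing across categories.

```python
import random


def isotonic_probabilities(shapes, rng):
    """Draw Dirichlet increments and the non-decreasing probabilities they imply.

    `shapes` holds one gamma shape parameter per increment (hypothetical values).
    The last increment is slack that keeps the largest probability at or below 1.
    """
    gammas = [rng.gammavariate(shape, 1.0) for shape in shapes]
    total = sum(gammas)
    # Normalized gamma draws follow a Dirichlet(shapes) distribution on the simplex.
    increments = [g / total for g in gammas]
    # Cumulative sums of the increments (excluding the slack term) are monotone.
    probs = []
    running = 0.0
    for inc in increments[:-1]:
        running += inc
        probs.append(running)
    return increments, probs


rng = random.Random(0)
increments, probs = isotonic_probabilities([0.5] * 6, rng)
print(probs)
```

Small shape values such as 0.5 tend to concentrate most of the mass in a few increments, so the resulting curve is flat with occasional jumps.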


A Modular Framework For Early-Phase Seamless Oncology Trials, Philip S. Boonstra, Thomas M. Braun, Elizabeth C. Chase Jan 2020

Background: As our understanding of the etiology and mechanisms of cancer becomes more sophisticated and the number of therapeutic options increases, phase I oncology trials today have multiple primary objectives. Many such designs are now 'seamless', meaning that the trial estimates both the maximum tolerated dose and the efficacy at this dose level. Sponsors often proceed with further study only with this additional efficacy evidence. However, with this increasing complexity in trial design, it becomes challenging to articulate fundamental operating characteristics of these trials, such as (i) what is the probability that the design will identify an acceptable, i.e. safe …


Inferring A Consensus Problem List Using Penalized Multistage Models For Ordered Data, Philip S. Boonstra, John C. Krauss Oct 2019

A patient's medical problem list describes his or her current health status and aids in the coordination and transfer of care between providers, among other things. Because a problem list is generated once and then subsequently modified or updated, what is not usually observable is the provider-effect. That is, to what extent does a patient's problem in the electronic medical record actually reflect a consensus communication of that patient's current health status? To that end, we report on and analyze a unique interview-based design in which multiple medical providers independently generate problem lists for each of three patient case abstracts …


Incorporating Historical Models With Adaptive Bayesian Updates, Philip S. Boonstra, Ryan P. Barbaro Mar 2018

This paper considers Bayesian approaches for incorporating information from a historical model into a current analysis when the historical model includes only a subset of covariates currently of interest. The statistical challenge is two-fold. First, the parameters in the nested historical model are not generally equal to their counterparts in the larger current model, neither in value nor interpretation. Second, because the historical information will not be equally informative for all parameters in the current analysis, additional regularization may be required beyond that provided by the historical information. We propose several novel extensions of the so-called power prior that adaptively …


Variable Selection For Estimating The Optimal Treatment Regimes In The Presence Of A Large Number Of Covariate, Baqun Zhang, Min Zhang Jul 2016

Most existing methods for optimal treatment regimes, with few exceptions, focus on estimation and are not designed for variable selection with the objective of optimizing treatment decisions. In clinical trials and observational studies, numerous baseline variables are often collected and variable selection is essential for deriving reliable optimal treatment regimes. Although many variable selection methods exist, they mostly focus on selecting variables that are important for prediction (predictive variables) instead of variables that have a qualitative interaction with treatment (prescriptive variables) and hence are important for making treatment decisions. We propose a variable selection method within a general classification …


A Weighted Instrumental Variable Estimator To Control For Instrument-Outcome Confounders, Douglas Lehmann, Yun Li, Rajiv Saran, Yi Li Apr 2016

No abstract provided.


Conditional Screening For Ultra-High Dimensional Covariates With Survival Outcomes, Hyokyoung Grace Hong, Jian Kang, Yi Li Mar 2016

Identifying important biomarkers that are predictive for cancer patients' prognosis is key in gaining better insights into the biological influences on the disease and has become a critical component of precision medicine. The emergence of large-scale biomedical survival studies, which typically involve an excessive number of biomarkers, has brought high demand in designing efficient screening tools for selecting predictive biomarkers. The vast number of biomarkers defies any existing variable selection methods via regularization. The recently developed variable screening methods, though powerful in many practical settings, fail to incorporate prior information on the importance of each biomarker and are less powerful in …


Strengthening Instrumental Variables Through Weighting, Douglas Lehmann, Yun Li, Rajiv Saran, Yi Li Mar 2016

Instrumental variable (IV) methods are widely used to deal with the issue of unmeasured confounding and are becoming popular in health and medical research. IV models are able to obtain consistent estimates in the presence of unmeasured confounding, but rely on assumptions that are hard to verify and often criticized. An instrument is a variable that influences or encourages individuals toward a particular treatment without directly affecting the outcome. Estimates obtained using instruments with a weak influence over the treatment are known to have larger small-sample bias and to be less robust to the critical IV assumption that the instrument …
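
The definition of an instrument given above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the weighting method the paper develops: the data-generating model, the function names, and the true effect of 2.0 are all invented for this example, and the estimator shown is the simple Wald ratio for a binary instrument. The instrument `z` shifts treatment `d` but affects the outcome `y` only through `d`, while the unmeasured confounder `u` drives both `d` and `y`.

```python
import random


def simulate(n, true_effect, rng):
    """Generate (instrument, treatment, outcome) triples with unmeasured confounding."""
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0   # instrument: randomized encouragement
        u = rng.gauss(0.0, 1.0)              # unmeasured confounder
        d = 1 if z + u > 0.5 else 0          # treatment depends on both z and u
        # z enters the outcome only through d; u enters directly.
        y = true_effect * d + 3.0 * u + rng.gauss(0.0, 1.0)
        rows.append((z, d, y))
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_estimate(rows):
    """Difference in mean outcome between treated and untreated (confounded)."""
    treated = mean(y for _, d, y in rows if d == 1)
    untreated = mean(y for _, d, y in rows if d == 0)
    return treated - untreated


def wald_estimate(rows):
    """Ratio of the instrument's effect on outcome to its effect on treatment."""
    dy = mean(y for z, _, y in rows if z == 1) - mean(y for z, _, y in rows if z == 0)
    dd = mean(d for z, d, _ in rows if z == 1) - mean(d for z, d, _ in rows if z == 0)
    return dy / dd


rows = simulate(50_000, 2.0, random.Random(1))
naive = naive_estimate(rows)
wald = wald_estimate(rows)
print(f"naive: {naive:.2f}  wald: {wald:.2f}")
```

In this setup the naive comparison is pulled away from the true effect by `u`, while the Wald ratio should land close to it. The denominator `dd` measures instrument strength; as it shrinks toward zero the ratio becomes unstable, which is the weak-instrument problem the abstract refers to.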


A Pairwise Likelihood Augmented Estimator For The Cox Model Under Left-Truncation, Fan Wu, Sehee Kim, Jing Qin, Rajiv Saran, Yi Li Sep 2015

Survival data collected from prevalent cohorts are subject to left-truncation and the analysis is challenging. Conditional approaches for left-truncated data under the Cox model are inefficient as they typically ignore the information in the marginal likelihood of the truncation times. Length-biased sampling methods can improve the estimation efficiency but only when the stationarity assumption of the disease incidence holds, i.e., the truncation distribution is uniform; otherwise they may generate biased estimates. In this paper, we propose a semi-parametric method for the Cox model under general left-truncation, where the truncation distribution is unspecified. Our approach is to make inference based on …


C-Learning: A New Classification Framework To Estimate Optimal Dynamic Treatment Regimes, Baqun Zhang, Min Zhang Aug 2015

Personalizing treatment to accommodate patient heterogeneity and the evolving nature of a disease over time has received considerable attention lately. A dynamic treatment regime is a set of decision rules, each corresponding to a decision point, that determine the next treatment based on each individual’s own available characteristics and treatment history up to that point. We show that identifying the optimal dynamic treatment regime can be recast as a sequential classification problem and is equivalent to sequentially minimizing a weighted expected misclassification error. This general classification perspective targets the exact goal of optimally individualizing treatments and is new and fundamentally …


Variable Selection With False Discovery Control, Kevin He, Yanming Li, Ji Zhu, Hongliang Liu, Jeffrey E. Lee, Christopher I. Amos, Terry Hyslop, Jiashun Jin, Qinyi Wei, Yi Li Jan 2015

Technological advances that allow routine identification of high-dimensional risk factors have led to high demand for statistical techniques that enable full utilization of these rich sources of information for genome-wide association studies (GWAS). Variable selection for censored outcome data as well as control of false discoveries (i.e. inclusion of irrelevant variables) in the presence of high-dimensional predictors present serious challenges. In the context of survival analysis with high-dimensional covariates, this paper develops a computationally feasible method for building general risk prediction models, while controlling false discoveries. We have proposed a high-dimensional variable selection method by incorporating stability selection to control …


Tests For Gene-Environment Interactions And Joint Effects With Exposure Misclassification, Philip S. Boonstra, Bhramar Mukherjee, Stephen B. Gruber, Jaeil Ahn, Stephanie L. Schmit, Nilanjan Chatterjee Jan 2015

The number of methods for genome-wide testing of gene-environment interactions (GEI) continues to increase with the hope of discovering new genetic risk factors and obtaining insight into the disease-gene-environment relationship. The relative performance of these methods based on family-wise type 1 error rate and power depends on underlying disease-gene-environment associations, estimates of which may be biased in the presence of exposure misclassification. This simulation study expands on a previously published simulation study of methods for detecting GEI by evaluating the impact of exposure misclassification. We consider seven single step and modular screening methods for identifying GEI at a genome-wide level …


PGS: A Tool For Association Study Of High-Dimensional MicroRNA Expression Data With Repeated Measures, Yinan Zheng, Zhe Fei, Wei Zhang, Justin Starren, Lei Liu, Andrea Baccarelli, Yi Li, Lifang Hou Jun 2014

Motivation: MicroRNAs (miRNAs) are short single-stranded non-coding molecules that usually function as negative regulators to silence or suppress gene expression. Due to interest in the dynamic nature of miRNA and reduced microarray and sequencing costs, a growing number of researchers are now measuring high-dimensional miRNA expression data using repeated or multiple measures in which each individual has more than one sample collected and measured over time. However, the commonly used site-by-site multiple testing may impair the value of repeated or multiple measures data by ignoring the inherent dependent structure, which leads to problems including underpowered results after multiple comparison …


A Global Partial Likelihood Estimator Of The Time-Varying Effects For Time-Dependent Treatment, Huazhen Lin, Zhe Fei, Yi Li Mar 2014

The timing of time-dependent treatment - e.g., when to perform kidney transplantation - is an important factor for evaluating treatment efficacy. A naive comparison between the treatment and nontreatment groups, while ignoring the timing of treatment, typically yields results that may be biased in favor of the treatment group, as only patients who survive long enough will get treated. On the other hand, studying the effect of time-dependent treatment is often complex, as it involves modeling treatment history and accounting for the possible time-varying nature of the treatment effect. We propose a varying-coefficient Cox model that investigates the efficacy of time-dependent treatment by …


Clustering Survival Outcomes Using Dirichlet Process Mixture, Lili Zhao, Jingchunzi Shi, Tempie H. Shearon, Yi Li Mar 2014

Motivated by the national evaluation of mortality rates at kidney transplant centers in the United States, we sought to assess transplant center long-term survival outcomes by applying a methodology developed in Bayesian non-parametrics literature. We described a Dirichlet process model and a Dirichlet process mixture model with a Half-Cauchy for the estimation of the risk-adjusted effects of the transplant centers. To improve the model performance and interpretability, we centered the Dirichlet process. We also proposed strategies to increase the model's classification ability. Finally we derived statistical measures and created graphical tools to rate transplant centers and identify outlying centers …


Set-Based Tests For Genetic Association In Longitudinal Studies, Zihuai He, Min Zhang, Seunggeun Lee, Jennifer A. Smith, Xiuqing Guo, Walter Palmas, Sharon L.R. Kardia, Ana V. Diez Roux, Bhramar Mukherjee Jan 2014

Genetic association studies with longitudinal markers of chronic diseases (e.g., blood pressure, body mass index) provide a valuable opportunity to explore how genetic variants affect traits over time by utilizing the full trajectory of longitudinal outcomes. Since these traits are likely influenced by the joint effect of multiple variants in a gene, a joint analysis of these variants considering linkage disequilibrium (LD) may help to explain additional phenotypic variation. In this article, we propose a longitudinal genetic random field model (LGRF), to test the association between a phenotype measured repeatedly during the course of an observational study and a set …


Varying Index Coefficient Models, Shujie Ma, Peter Xuekun Song May 2013

There is a long history of utilizing interactions in regression analysis to investigate interactive effects of covariates on response variables. In this paper we aim to address two kinds of new challenges resulting from the inclusion of such high-order effects in the regression model for complex data. The first kind arises from a situation where interaction effects of individual covariates are weak but those of combined covariates are strong, and the other kind pertains to the presence of nonlinear interactive effects. Generalizing the single index coefficient regression model (Xia and Li, 1999), we propose a new class of semiparametric …


Penalized Smoothed Partial Rank Estimator For The Nonparametric Transformation Survival Model With High-Dimensional Covariates, Wei Dai, Yi Li May 2013

Microarray technology has the potential to lead to a better understanding of biological processes and diseases such as cancer. When failure time outcomes are also available, one might be interested in relating gene expression profiles to the survival outcome such as time to cancer recurrence or time to death. This is statistically challenging because the number of covariates greatly exceeds the number of observations. While the majority of work has focused on the regularized Cox regression model and the accelerated failure time model, they may be restrictive in practice. We relax the model assumption and consider a nonparametric transformation model that …


Surrogacy Assessment Using Principal Stratification When Surrogate And Outcome Measures Are Multivariate Normal, Anna Conlon, Jeremy M.G. Taylor, Michael R. Elliott Feb 2013

No abstract provided.


Missing At Random And Ignorability For Inferences About Subsets Of Parameters With Missing Data, Roderick J. Little, Sahar Zanganeh Feb 2013

For likelihood-based inferences from data with missing values, Rubin (1976) showed that the missing data mechanism can be ignored when (a) the missing data are missing at random (MAR), in the sense that missingness does not depend on the missing values after conditioning on the observed data, and (b) the parameters of the data model and the missing-data mechanism are distinct; that is, there are no a priori ties, via parameter space restrictions or prior distributions, between the parameters of the data model and the parameters of the model for the mechanism. Rubin described (a) and (b) as the "weakest …
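
Condition (a) can be stated compactly. In the notation below, which is assumed here rather than taken from the abstract, $M$ is the missingness indicator, $Y_{\mathrm{obs}}$ and $Y_{\mathrm{mis}}$ are the observed and missing parts of the data, $\theta$ parameterizes the data model, and $\phi$ parameterizes the missing-data mechanism. The second line shows why the mechanism drops out of likelihood-based inference about $\theta$ when (a) holds and, per (b), $\theta$ and $\phi$ are distinct.

```latex
% (a) Missing at random: missingness does not depend on the missing values
f(M \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi)
  = f(M \mid Y_{\mathrm{obs}}, \phi)
  \quad \text{for all } Y_{\mathrm{mis}}.

% Under (a), the observed-data likelihood factorizes
f(Y_{\mathrm{obs}}, M \mid \theta, \phi)
  = \int f(Y_{\mathrm{obs}}, Y_{\mathrm{mis}} \mid \theta)\,
         f(M \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \phi)\, dY_{\mathrm{mis}}
  = f(M \mid Y_{\mathrm{obs}}, \phi)\, f(Y_{\mathrm{obs}} \mid \theta).
```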


In Praise Of Simplicity Not Mathematistry! Ten Simple Powerful Ideas For The Statistical Scientist, Roderick J. Little Jan 2013

Ronald Fisher was by all accounts a first-rate mathematician, but he saw himself as a scientist, not a mathematician, and he railed against what George Box called (in his Fisher lecture) "mathematistry". Mathematics is the indispensable foundation for statistics, but our subject is constantly under assault by people who want to turn statistics into a branch of mathematics, making the subject as impenetrable to non-mathematicians as possible. Valuing simplicity, I describe ten simple and powerful ideas that have influenced my thinking about statistics, in my areas of research interest: missing data, causal inference, survey sampling, and statistical modeling in general. …


Selection Of Latent Variables For Multiple Mixed-Outcome Models, Ling Zhou, Huazhen Lin, Xin-Yuan Song, Yi Li Jan 2013

Latent variable models have been widely used for modeling the dependence structure of multiple outcomes data. As the formulation of a latent variable model is often unknown a priori, misspecification could distort the dependence structure and lead to unreliable model inference. Moreover, the multiple outcomes are often of varying types (e.g., continuous and ordinal), which presents analytical challenges. In this article, we present a class of general latent variable models that can accommodate mixed types of outcomes, and further propose a novel selection approach that simultaneously selects latent variables and estimates model parameters. We show that the proposed estimators …


A Latent Variable Transformation Model Approach For Exploring Dysphagia, Anna Snavely, David P. Harrington, Yi Li Jan 2013

No abstract provided.


A Frailty Approach For Survival Analysis With Error-Prone Covariate, Sehee Kim, Yi Li, Donna Spiegelman Jan 2013

This paper discovers an inherent relationship between the survival model with covariate measurement error and the frailty model. The discovery motivates our using a frailty-based estimating equation to draw inference for the proportional hazards model with error-prone covariates. Our established framework accommodates general distributional structures for the error-prone covariates, not restricted to a linear additive measurement error model or Gaussian measurement error. When the conditional distribution of the frailty given the surrogate is unknown, it is estimated through a semiparametric copula function. The proposed copula-based approach enables us to fit flexible measurement error models without the curse of dimensionality as …


Ultrahigh Dimensional Time Course Feature Selection, Peirong Xu, Lixing Zhu, Yi Li Jan 2013

Statistical challenges arise from modern biomedical studies that produce time course genomic data with ultrahigh dimensions. In a renal cancer study that motivated this paper, the pharmacokinetic measures of a tumor suppressor (CCI-779) and expression levels of 12625 genes were measured for each of 33 patients at 8 and 16 weeks after the start of treatments, with the goal of identifying predictive gene transcripts and the interactions with time in peripheral blood mononuclear cells for pharmacokinetics over the time course. The resulting dataset defies analysis even with regularized regression. Although some remedies have been proposed for both linear and generalized …


Covariance-Enhanced Discriminant Analysis, Peirong Xu, Ji Zhu, Lixing Zhu, Yi Li Jan 2013

Linear discriminant analysis (LDA), a classical method in pattern recognition and machine learning, has been widely used to characterize or separate multiple classes via linear combinations of features. However, the high-dimensionality of the high-throughput features obtained from modern biological experiments, for example, microarray or proteomics, defies traditional discriminant analysis techniques. The possible inter-feature correlations present additional challenges and are often under-utilized in modeling. In this paper, by incorporating the possible inter-feature correlations, we propose a Covariance-Enhanced Discriminant Analysis (CEDA) method that simultaneously and consistently selects informative features and identifies the corresponding discriminable classes. We show that, under mild regularity conditions, …


Semiparametric Latent Variable Transformation Models For Multiple Mixed Outcomes, Huazhen Lin, Ling Zhou, Robert Elashoff, Yi Li Jan 2013

No abstract provided.


Semiparametric Transformation Models For Semicompeting Survival Data, Huazhen Lin, Ling Zhou, Chunhong Li, Yi Li Jan 2013

Semicompeting risk outcome data, e.g. time to disease progression and time to death, are commonly collected in clinical trials, but complicated analytical tools hamper the analysis and the interpretation of the results. We propose a novel semiparametric transformation model for such data. Compared with the existing models, our model is advantageous in the following distinctive ways. First, it allows us to provide direct estimators of the regression analysis and the association parameter. Second, the measure of surrogacy, for example, the proportion of treatment effect and relative effect, can also be directly obtained. We propose a two-stage estimation procedure for inference …


Score Test Variable Screening, Sihai Dave Zhao, Yi Li Jan 2013

Variable screening has emerged as a crucial first step in the analysis of high-throughput data, but existing procedures can be computationally cumbersome, difficult to justify theoretically, or inapplicable to certain types of analyses. Motivated by a high-dimensional censored quantile regression problem in multiple myeloma genomics, this paper makes three contributions. First, we establish a score test-based screening framework, which is widely applicable, extremely computationally efficient, and relatively simple to justify. Secondly, we propose a resampling-based procedure for selecting the number of variables to retain after screening according to the principle of reproducibility. Finally, we propose a new iterative score test …


A Phase I Bayesian Adaptive Design To Simultaneously Optimize Dose And Schedule Assignments Both Among And Within Patients, Thomas M. Braun, Jin Zhang Aug 2012

In traditional schedule or dose-schedule finding designs, patients are assumed to receive their assigned dose-schedule combination throughout the trial even though the combination may be found to have an undesirable toxicity profile, which contradicts actual clinical practice. Since no systematic approach exists to optimize intra-patient dose-schedule assignment, we propose a Phase I clinical trial design that extends existing approaches that optimize dose and schedule solely among patients by incorporating adaptive variations to dose-schedule assignments within patients as the study proceeds. Our design is based on a Bayesian non-mixture cure rate model that incorporates multiple administrations each patient receives with …