Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

The University of Michigan Department of Biostatistics Working Paper Series

2004

Articles 1 - 28 of 28

Full-Text Articles in Physical Sciences and Mathematics

A Hybrid Newton-Type Method For The Linear Regression In Case-Cohort Studies, Menggang Yu, Bin Nan Dec 2004

Case-cohort designs are increasingly used in large epidemiological cohort studies. Nan, Yu, and Kalbfleisch (2004) provided the asymptotic results for censored linear regression models in case-cohort studies. In this article, we consider computational aspects of their proposed rank-based estimating methods. We show that the rank-based discontinuous estimating functions for case-cohort studies are monotone, a property established for cohort data in the literature, when generalized Gehan-type weights are used. Though the estimation problem can be formulated as a linear programming problem, as for cohort data, due to its easily uncontrollable large scale even for a …


A Bayesian Mixture Model Relating Dose To Critical Organs And Functional Complication In 3d Conformal Radiation Therapy, Tim Johnson, Jeremy Taylor, Randall K. Ten Haken, Avraham Eisbruch Nov 2004

A goal of radiation therapy is to deliver maximum dose to the target tumor while minimizing complications due to irradiation of critical organs. Technological advances in 3D conformal radiation therapy have allowed great strides in realizing this goal; however, complications may still arise. Critical organs may be adjacent to tumors or in the path of the radiation beam. Several mathematical models have been proposed that describe a relationship between dose and observed functional complication; however, only a few published studies have successfully fit these models to data using modern statistical methods that make efficient use of the data. One complication …


Survival Analysis Using Auxiliary Variables Via Nonparametric Multiple Imputation, Chiu-Hsieh Hsu, Jeremy Taylor, Susan Murray, Daniel Commenges Nov 2004

We develop an approach, based on multiple imputation, that estimates the marginal survival distribution in survival analysis using auxiliary variables to recover information for censored observations. To conduct the imputation, we use two working survival models to define the nearest neighbor imputing risk set. One model is for the event times and the other for the censoring times. Based on the imputing risk set, two nonparametric multiple imputation methods are considered: risk set imputation and Kaplan-Meier estimator. For both methods a future event or censoring time is imputed for each censored observation. With a categorical auxiliary variable, we show that …


Semiparametric Binary Regression Under Monotonicity Constraints, Moulinath Banerjee, Pinaki Biswas, Debashis Ghosh Nov 2004

Summary: We study a binary regression model where the response variable $\Delta$ is the indicator of an event of interest (for example, the incidence of cancer) and the set of covariates can be partitioned as $(X,Z)$ where $Z$ (real valued) is the covariate of primary interest and $X$ (vector valued) denotes a set of control variables. For any fixed $X$, the conditional probability of the event of interest is assumed to be a monotonic function of $Z$. The effect of the control variables is captured by a regression parameter $\beta$. We show that the baseline conditional probability function (corresponding to …


A Bayesian Method For Finding Interactions In Genomic Studies, Wei Chen, Debashis Ghosh, Trivellore E. Raghuanthan, Sharon Kardia Nov 2004

An important step in building a multiple regression model is the selection of predictors. In genomic and epidemiologic studies, datasets with a small sample size and a large number of predictors are common. In such settings, most standard methods for identifying a good subset of predictors are unstable. Furthermore, there is an increasing emphasis on the identification of interactions, which has not been studied much in the statistical literature. We propose a method, called BSI (Bayesian Selection of Interactions), for selecting predictors in a regression setting when the number of predictors is considerably larger than the sample size with a focus …


Finding Cancer Subtypes In Microarray Data Using Random Projections, Debashis Ghosh Oct 2004

One of the benefits of profiling cancer samples using microarrays is the generation of molecular fingerprints that will define subtypes of disease. Such subgroups have typically been found in microarray data using hierarchical clustering. A major problem in interpretation of the output is determining the number of clusters. We approach the problem of determining disease subtypes using mixture models. A novel estimation procedure for the parameters in the mixture model is developed based on a combination of random projections and the expectation-maximization algorithm. Because the approach is probabilistic, it provides a measure for the number of true clusters …


Semiparametric Methods For The Binormal Model With Multiple Biomarkers, Debashis Ghosh Oct 2004

Abstract: In diagnostic medicine, there is great interest in developing strategies for combining biomarkers in order to optimize classification accuracy. A popular model that has been used when one biomarker is available is the binormal model. Extension of the model to accommodate multiple biomarkers has not been considered in this literature. Here, we consider a multivariate binormal framework for combining biomarkers using copula functions that leads to a natural multivariate extension of the binormal model. Estimation in this model will be done using rank-based procedures. We also discuss adjustment for covariates in this class of models and provide a simple …


Censored Linear Regression For Case-Cohort Studies, Bin Nan, Menggang Yu, Jack Kalbfleisch Oct 2004

Right censored data from a classical case-cohort design and a stratified case-cohort design are considered. In the classical case-cohort design, the subcohort is obtained as a simple random sample of the entire cohort, whereas in the stratified design, the subcohort is selected by independent Bernoulli sampling with arbitrary selection probabilities. For each design and under a linear regression model, methods for estimating the regression parameters are proposed and analyzed. These methods are derived by modifying the linear ranks tests and estimating equations that arise from full-cohort data using methods that are similar to the "pseudo-likelihood" estimating equation that has been …


New Estimating Methods For Surrogate Outcome Data, Bin Nan Jun 2004

Surrogate outcome data arise frequently in medical research. The true outcomes of interest are expensive or hard to ascertain, but measurements of surrogate outcomes (or more generally speaking, the correlates of the true outcomes) are usually available. In this paper we assume that the conditional expectation of the true outcome given covariates is known up to a finite dimensional parameter. When the true outcome is missing at random, the efficient score function for the parameter in the conditional mean model has a simple form, which is similar to the generalized estimating functions. There is no integral equation involved as in …


Asymptotic Results For Simultaneous Group Sequential Analysis Of Rank-Based And Weighted Kaplan-Meier Tests With Paired Survival Data In The Presence Of Censoring. Technical Report, Adin-Cristian Andrei, Susan Murray Jun 2004

This research sequentially monitors paired survival differences using a new class of non-parametric tests based on functionals of standardized paired weighted log-rank (PWLR) and standardized paired weighted Kaplan-Meier (PWKM) tests. During a trial these tests may alternately assume the role of the more extreme statistic. By monitoring PEMAX, the maximum of the absolute values of the standardized PWLR and PWKM, one combines advantages of rank-based and non-rank-based paired testing paradigms. Simulations show that monitoring treatment differences using PEMAX maintains type I error and is nearly as powerful as using the more advantageous of the two tests, in proportional hazards …


Nonparametric Methods For Analyzing Replication Origins In Genomewide Data, Debashis Ghosh Jun 2004

Due to the advent of high-throughput genomic technology, it has become possible to globally monitor cellular activities on a genomewide basis. With these new methods, scientists can begin to address important biological questions. One such question involves the identification of replication origins, which are regions in chromosomes where DNA replication is initiated. In addition, one hypothesis regarding replication origins is that their locations are non-random throughout the genome. In this article, we develop methods for identification of and cluster inference regarding replication origins involving genomewide expression data. We compare several nonparametric regression methods for the identification of replication origin locations. …


Semiparametric Methods For Identification Of Tumor Progression Genes From Microarray Data, Debashis Ghosh, Arul Chinnaiyan Jun 2004

The use of microarray data has become quite commonplace in medical and scientific experiments. We focus here on microarray data generated from cancer studies. It is potentially important for the discovery of biomarkers to identify genes whose expression levels correlate with tumor progression. In this article, we develop statistical procedures for the identification of such genes, which we term tumor progression genes. Two methods are considered in this paper. The first is use of a proportional odds procedure, combined with false discovery rate estimation techniques to adjust for the multiple testing problem. The second method is based on order-restricted estimation …


The False Discovery Rate: A Variable Selection Perspective, Debashis Ghosh, Wei Chen, Trivellore E. Raghuanthan Jun 2004

In many scientific and medical settings, large-scale experiments are generating large quantities of data that lead to inferential problems involving multiple hypotheses. This has led to tremendous recent interest in statistical methods regarding the false discovery rate (FDR). Several authors have studied the properties of FDR in a univariate mixture model setting. In this article, we turn the problem on its side: we show that FDR is a by-product of Bayesian analysis of the variable selection problem for a hierarchical linear regression model. This equivalence gives many Bayesian insights as to why FDR is a natural quantity to …


Semiparametic Models And Estimation Procedures For Binormal Roc Curves With Multiple Biomarkers, Debashis Ghosh May 2004

In diagnostic medicine, there is great interest in developing strategies for combining biomarkers in order to optimize classification accuracy. A popular model that has been used for receiver operating characteristic (ROC) curve modelling when one biomarker is available is the binormal model. Extension of the model to accommodate multiple biomarkers has not been considered in this literature. Here, we consider a multivariate binormal framework for combining biomarkers using copula functions that leads to a natural multivariate extension of the binormal model. Estimation in this model will be done using rank-based procedures. We show that the Van der Waerden rank score …


Nonparametric And Semiparametric Inference For Models Of Tumor Size And Metastasis, Debashis Ghosh May 2004

There has been some recent work in the statistical literature for modelling the relationship between the size of primary cancers and the occurrences of metastases. While nonparametric methods have been proposed for estimation of the tumor size distribution at which metastatic transition occurs, their asymptotic properties have not been studied. In addition, no testing or regression methods are available so that potential confounders and prognostic factors can be adjusted for. We develop a unified approach to nonparametric and semiparametric analysis of modelling tumor size-metastasis data in this article. An equivalence between the models considered by previous authors with survival data …


Model Checking Techniques For Regression Models In Cancer Screening, Debashis Ghosh May 2004

There has been much work on developing statistical procedures for associating tumor size with the probability of detecting a metastasis. Recently, Ghosh (2004) developed a unified statistical framework in which equivalences with censored data structures and models for tumor size and metastasis were examined. Based on this framework, we consider model checking techniques for semiparametric regression models in this paper. The procedures are for checking the additive hazards model. Goodness of fit methods are described for assessing functional form of covariates as well as the additive hazards assumption. The finite-sample properties of the methods are assessed using simulation studies.


Binary Isotonic Regression Procedures, With Application To Cancer Biomarkers, Debashis Ghosh, Moulinath Banerjee, Pinaki Biswas May 2004

There is a lot of interest in the development and characterization of new biomarkers for screening large populations for disease. In much of the literature on diagnostic testing, increased levels of a biomarker correlate with increased disease risk. However, parametric forms are typically used to associate these quantities. In this article, we specify a monotonic relationship between biomarker levels and disease risk. This leads to consideration of a nonparametric regression model for a single biomarker. Estimation results using isotonic regression-type estimators and asymptotic results are given. We also discuss confidence set estimation in this setting and propose three procedures for …


Does Weighting For Nonresponse Increase The Variance Of Survey Means?, Rod Little, Sonya L. Vartivarian Apr 2004

Nonresponse weighting is a common method for handling unit nonresponse in surveys. A widespread view is that the weighting method is aimed at reducing nonresponse bias, at the expense of an increase in variance. Hence, the efficacy of weighting adjustments becomes a bias-variance trade-off. This note suggests that this view is an oversimplification -- nonresponse weighting can in fact lead to a reduction in variance as well as bias. A covariate for a weighting adjustment must have two characteristics to reduce nonresponse bias - it needs to be related to the probability of response, and it needs to be related …


Resampling Methods For Estimating Functions With U-Statistic Structure, Wenyu Jiang, Jack Kalbfleisch Apr 2004

Suppose that inference about parameters of interest is to be based on an unbiased estimating function that is a U-statistic of degree 1 or 2. We define suitable studentized versions of such estimating functions and consider asymptotic approximations as well as an estimating function bootstrap (EFB) method based on resampling the estimated terms in the estimating functions. These methods are justified asymptotically and lead to confidence intervals produced directly from the studentized estimating functions. Particular examples in this class of estimating functions arise in $L_1$ estimation as well as Wilcoxon rank regression and other related estimation problems. The proposed methods are …


Covariate Adjustment In The Analysis Of Microarray Data From Clinical Studies, Debashis Ghosh, Arul Chinnaiyan Apr 2004

There is tremendous scientific interest in the analysis of gene expression data in clinical settings, such as oncology. In this paper, we describe the importance of adjusting for confounders and other prognostic factors in order to select for differentially expressed genes for followup validation studies. We develop two approaches to the analysis of microarray data in nonrandomized clinical settings. The first is an extension of the current significance analysis of microarray procedures, where other covariates are taken into account. The second is a novel covariate-adjusted regression modelling based on the receiver operating characteristic curve for the analysis of gene expression …


Causal Inference In Hybrid Intervention Trials Involving Treatment Choice, Qi Long, Rod Little, Xihong Lin Mar 2004

Randomized allocation of treatments is a cornerstone of experimental design, but has drawbacks when only a limited set of individuals are willing to be randomized, or when the act of randomization undermines the success of the treatment. Choice-based experimental designs allow a subset of the participants to choose their treatments. We discuss here causal inferences for experimental designs where some participants are randomly allocated to treatments and others receive their treatment preference. This paper was motivated by the “Women Take Pride” (WTP) study (Janevic et al., 2001), a doubly randomized preference trial (DRPT) to assess behavioral interventions for women with heart disease. …


A Bayesian Hierarchical Approach To Multirater Correlated Roc Analysis, Tim Johnson, Valen Johnson Mar 2004

In a common ROC study design, several readers are asked to rate diagnostics of the same cases processed under different modalities. We describe a Bayesian hierarchical model that facilitates the analysis of this study design by explicitly modeling the three sources of variation inherent to it. In so doing, we achieve substantial reductions in the posterior uncertainty associated with estimates of the differences in areas under the estimated ROC curves and corresponding reductions in the mean squared error (MSE) of these estimates. Based on simulation studies, both the widths of confidence intervals and MSE of estimates of differences in the …


A Bayesian Chi-Squared Test For Goodness Of Fit, Valen Johnson Feb 2004

This article describes an extension of classical $\chi^2$ goodness-of-fit tests to Bayesian model assessment. The extension, which essentially involves evaluating Pearson's goodness-of-fit statistic at a parameter value drawn from its posterior distribution, has the important property that it is asymptotically distributed as a $\chi^2$ random variable on $K-1$ degrees of freedom, independently of the dimension of the underlying parameter vector. By averaging over the posterior distribution of this statistic, a global goodness-of-fit diagnostic is obtained. Advantages of this diagnostic, which may be interpreted as the area under an ROC curve, include ease of interpretation, computational convenience, and favorable power properties. The proposed …
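
As a rough illustration of the idea in this abstract, the sketch below evaluates Pearson's statistic at a single posterior draw. It is not the paper's implementation: the normal model with known standard deviation, the conjugate normal prior on the mean, the use of equal-probability bins, and the function name `bayesian_chi2_stat` are all assumptions made for the example.

```python
import math
import random


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def bayesian_chi2_stat(data, sigma, prior_mean, prior_sd, n_bins, rng):
    """Pearson statistic evaluated at one posterior draw of the mean.

    Assumed model (illustrative only): y_i ~ N(mu, sigma^2) with sigma known
    and a conjugate prior mu ~ N(prior_mean, prior_sd^2).
    """
    n = len(data)
    # Conjugate posterior for mu: precision-weighted combination of prior and data.
    post_prec = 1.0 / prior_sd ** 2 + n / sigma ** 2
    post_mean = (prior_mean / prior_sd ** 2 + sum(data) / sigma ** 2) / post_prec
    post_sd = math.sqrt(1.0 / post_prec)
    mu_draw = rng.gauss(post_mean, post_sd)

    # Bin observations by their CDF value under the drawn parameter,
    # using n_bins bins of equal probability 1 / n_bins (an assumption).
    counts = [0] * n_bins
    for y in data:
        u = normal_cdf(y, mu_draw, sigma)
        k = min(int(u * n_bins), n_bins - 1)
        counts[k] += 1

    # Pearson's statistic: sum of (observed - expected)^2 / expected.
    expected = n / n_bins
    return sum((c - expected) ** 2 / expected for c in counts)
```

Under the result stated in the abstract, repeated draws of this statistic would be compared with a $\chi^2$ reference distribution on $K-1$ degrees of freedom, where $K$ corresponds to `n_bins` here.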


Multiple Imputation For Interval Censored Data With Auxiliary Variables, Chiu-Hsieh Hsu, Jeremy Taylor, Susan Murray Feb 2004

We propose a nonparametric multiple imputation scheme, NPMLE imputation, for the analysis of interval censored survival data. Features of the method are that it converts interval-censored data problems to complete data or right censored data problems to which many standard approaches can be applied, and that measures of uncertainty are easily obtained. In addition to the event time of primary interest, there are frequently other auxiliary variables that are associated with the event time. For the goal of estimating the marginal survival distribution, these auxiliary variables may provide some additional information about the event time for the interval censored observations. …


Individualized Predictions Of Disease Progression Following Radiation Therapy For Prostate Cancer., Jeremy Taylor, Menggang Yu, Howard M. Sandler Feb 2004

Background: Following treatment for localized prostate cancer, men are monitored with serial PSA measurements. Refining the predictive value of post-treatment PSA determinations may add to clinical management and we have developed a model that predicts for an individual patient future PSA values and estimates the time to future clinical recurrence.

Methods: Data from 934 patients treated for prostate cancer between 1987 and 2000 were used to develop a comprehensive statistical model to fit the clinical recurrence events and pattern of PSA data. A logistic regression model was used for the probability of cure, non-linear hierarchical mixed models were used for …


Piecewise Constant Cross-Ratio Estimation For Association In Bivariate Survival Data With Application To Studying Markers Of Menopausal Transition, Bin Nan, Xihong Lin, Lynda D. Lisabet, Sioban Harlow Feb 2004

A question of significant interest in female reproductive aging is to identify bleeding criteria for the menopausal transition. Although various bleeding criteria, or markers, have been proposed for the menopausal transition, their validity has not been adequately examined. The Tremin Trust data are collected from a long-term cohort study that followed a group of women throughout their whole reproductive life, and provide a unique opportunity for assessing the association between age at onset of a bleeding marker and age at onset of menopause. Formal statistical analysis of this dependence is challenging given the fact that both the marker event and menopause …


Individual Prediction In Prostate Cancer Studies Using A Joint Longitudinal-Survival-Cure Model, Menggang Yu, Jeremy Taylor, Howard M. Sandler Feb 2004

For monitoring patients treated for prostate cancer, Prostate Specific Antigen (PSA) is measured periodically after they receive treatment. Increases in PSA are suggestive of recurrence of the cancer and are used in making decisions about possible new treatments. The data from studies of such patients typically consist of longitudinal PSA measurements, censored event times and baseline covariates. Methods for the combined analysis of both longitudinal and survival data have been developed in recent years, with the main emphasis being on modeling and estimation. We analyze data from a prostate cancer study that has been extended by adding a mixture structure …


Mixture Models For Assessing Differential Expression In Complex Tissues Using Microarray Data, Debashis Ghosh Feb 2004

The use of DNA microarrays has become quite popular in many scientific and medical disciplines, such as in cancer research. One common goal of these studies is to determine which genes are differentially expressed between cancer and healthy tissue, or more generally, between two experimental conditions. A major complication in the molecular profiling of tumors using gene expression data is that the data represent a combination of tumor and normal cells. Much of the methodology developed for assessing differential expression with microarray data has assumed that tissue samples are homogeneous. In this article, we outline a general framework for determining …