Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Articles 1 - 19 of 19

Full-Text Articles in Statistical Models

A Hybrid Newton-Type Method For The Linear Regression In Case-Cohort Studies, Menggang Yu, Bin Nan Dec 2004

The University of Michigan Department of Biostatistics Working Paper Series

Case-cohort designs are increasingly used in large epidemiological cohort studies. Nan, Yu, and Kalbfleisch (2004) provided the asymptotic results for censored linear regression models in case-cohort studies. In this article, we consider computational aspects of their proposed rank-based estimating methods. We show that the rank-based discontinuous estimating functions for case-cohort studies are monotone when generalized Gehan-type weights are used, a property previously established for cohort data in the literature. Though the estimating problem can be formulated as a linear programming problem, as for cohort data, due to its easily uncontrollable large scale even for a …
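The monotonicity property mentioned in this abstract can be illustrated with a minimal sketch. It assumes full-cohort data, a single covariate, and the plain Gehan-weighted rank estimating function; the case-cohort sampling weights discussed in the paper are not shown, and the function name `gehan_score` and the toy data are hypothetical. With this sign convention the function is a nondecreasing step function of the coefficient, because it is the derivative of a convex piecewise-linear loss.

```python
def gehan_score(beta, log_time, delta, x):
    """Gehan-weighted rank estimating function for a scalar covariate.

    U(beta) = sum_i sum_j delta_i * (x_i - x_j) * 1{e_j(beta) >= e_i(beta)},
    where e_i(beta) = log_time_i - beta * x_i is the residual and
    delta_i = 1 marks an observed (uncensored) event.
    """
    n = len(x)
    resid = [log_time[i] - beta * x[i] for i in range(n)]
    total = 0.0
    for i in range(n):
        if not delta[i]:
            continue  # censored subjects contribute only as comparators j
        for j in range(n):
            if resid[j] >= resid[i]:
                total += x[i] - x[j]
    return total


# Hypothetical toy data: log event times, event indicators, covariate values.
log_time = [0.2, 1.1, 0.7, 1.9, 1.4]
delta = [1, 0, 1, 1, 0]
x = [0.0, 1.0, 2.0, 3.0, 1.0]

grid = [-3.0 + 0.25 * k for k in range(25)]
scores = [gehan_score(b, log_time, delta, x) for b in grid]
print(list(zip(grid, scores)))
```

Because the function is a monotone step function, a root (or sign change) can be bracketed by bisection, which is one reason monotonicity matters computationally.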


A Bayesian Mixture Model Relating Dose To Critical Organs And Functional Complication In 3d Conformal Radiation Therapy, Tim Johnson, Jeremy Taylor, Randall K. Ten Haken, Avraham Eisbruch Nov 2004

The University of Michigan Department of Biostatistics Working Paper Series

A goal of radiation therapy is to deliver maximum dose to the target tumor while minimizing complications due to irradiation of critical organs. Technological advances in 3D conformal radiation therapy have allowed great strides in realizing this goal; however, complications may still arise. Critical organs may be adjacent to tumors or in the path of the radiation beam. Several mathematical models have been proposed that describe a relationship between dose and observed functional complication; however, only a few published studies have successfully fit these models to data using modern statistical methods that make efficient use of the data. One complication …


Semiparametric Binary Regression Under Monotonicity Constraints, Moulinath Banerjee, Pinaki Biswas, Debashis Ghosh Nov 2004

The University of Michigan Department of Biostatistics Working Paper Series

Summary: We study a binary regression model where the response variable $\Delta$ is the indicator of an event of interest (for example, the incidence of cancer) and the set of covariates can be partitioned as $(X,Z)$ where $Z$ (real valued) is the covariate of primary interest and $X$ (vector valued) denotes a set of control variables. For any fixed $X$, the conditional probability of the event of interest is assumed to be a monotonic function of $Z$. The effect of the control variables is captured by a regression parameter $\beta$. We show that the baseline conditional probability function (corresponding to …


A Bayesian Method For Finding Interactions In Genomic Studies, Wei Chen, Debashis Ghosh, Trivellore E. Raghunathan, Sharon Kardia Nov 2004

The University of Michigan Department of Biostatistics Working Paper Series

An important step in building a multiple regression model is the selection of predictors. In genomic and epidemiologic studies, datasets with a small sample size and a large number of predictors are common. In such settings, most standard methods for identifying a good subset of predictors are unstable. Furthermore, there is an increasing emphasis on the identification of interactions, which has not been studied much in the statistical literature. We propose a method, called BSI (Bayesian Selection of Interactions), for selecting predictors in a regression setting when the number of predictors is considerably larger than the sample size with a focus …


Finding Cancer Subtypes In Microarray Data Using Random Projections, Debashis Ghosh Oct 2004

The University of Michigan Department of Biostatistics Working Paper Series

One of the benefits of profiling cancer samples using microarrays is the generation of molecular fingerprints that will define subtypes of disease. Such subgroups have typically been found in microarray data using hierarchical clustering. A major problem in interpretation of the output is determining the number of clusters. We approach the problem of determining disease subtypes using mixture models. A novel estimation procedure of the parameters in the mixture model is developed based on a combination of random projections and the expectation-maximization algorithm. Because the approach is probabilistic, it provides a measure for the number of true clusters …


Semiparametric Methods For The Binormal Model With Multiple Biomarkers, Debashis Ghosh Oct 2004

The University of Michigan Department of Biostatistics Working Paper Series

Abstract: In diagnostic medicine, there is great interest in developing strategies for combining biomarkers in order to optimize classification accuracy. A popular model that has been used when one biomarker is available is the binormal model. Extension of the model to accommodate multiple biomarkers has not been considered in this literature. Here, we consider a multivariate binormal framework for combining biomarkers using copula functions that leads to a natural multivariate extension of the binormal model. Estimation in this model will be done using rank-based procedures. We also discuss adjustment for covariates in this class of models and provide a simple …


Censored Linear Regression For Case-Cohort Studies, Bin Nan, Menggang Yu, Jack Kalbfleisch Oct 2004

The University of Michigan Department of Biostatistics Working Paper Series

Right-censored data from a classical case-cohort design and a stratified case-cohort design are considered. In the classical case-cohort design, the subcohort is obtained as a simple random sample of the entire cohort, whereas in the stratified design, the subcohort is selected by independent Bernoulli sampling with arbitrary selection probabilities. For each design and under a linear regression model, methods for estimating the regression parameters are proposed and analyzed. These methods are derived by modifying the linear rank tests and estimating equations that arise from full-cohort data using methods that are similar to the "pseudo-likelihood" estimating equation that has been …


Nonparametric Methods For Analyzing Replication Origins In Genomewide Data, Debashis Ghosh Jun 2004

The University of Michigan Department of Biostatistics Working Paper Series

Due to the advent of high-throughput genomic technology, it has become possible to globally monitor cellular activities on a genomewide basis. With these new methods, scientists can begin to address important biological questions. One such question involves the identification of replication origins, which are regions in chromosomes where DNA replication is initiated. In addition, one hypothesis regarding replication origins is that their locations are non-random throughout the genome. In this article, we develop methods for identification of and cluster inference regarding replication origins involving genomewide expression data. We compare several nonparametric regression methods for the identification of replication origin locations. …


Semiparametric Methods For Identification Of Tumor Progression Genes From Microarray Data, Debashis Ghosh, Arul Chinnaiyan Jun 2004

The University of Michigan Department of Biostatistics Working Paper Series

The use of microarray data has become quite commonplace in medical and scientific experiments. We focus here on microarray data generated from cancer studies. It is potentially important for the discovery of biomarkers to identify genes whose expression levels correlate with tumor progression. In this article, we develop statistical procedures for the identification of such genes, which we term tumor progression genes. Two methods are considered in this paper. The first uses a proportional odds procedure, combined with false discovery rate estimation techniques to adjust for the multiple testing problem. The second method is based on order-restricted estimation …


The False Discovery Rate: A Variable Selection Perspective, Debashis Ghosh, Wei Chen, Trivellore E. Raghunathan Jun 2004

The University of Michigan Department of Biostatistics Working Paper Series

In many scientific and medical settings, large-scale experiments are generating large quantities of data that lead to inferential problems involving multiple hypotheses. This has led to tremendous recent interest in statistical methods regarding the false discovery rate (FDR). Several authors have studied the properties of FDR in a univariate mixture model setting. In this article, we turn the problem on its side: we show that FDR is a by-product of a Bayesian analysis of a variable selection problem for a hierarchical linear regression model. This equivalence gives many Bayesian insights as to why FDR is a natural quantity to …


Semiparametric Models And Estimation Procedures For Binormal Roc Curves With Multiple Biomarkers, Debashis Ghosh May 2004

The University of Michigan Department of Biostatistics Working Paper Series

In diagnostic medicine, there is great interest in developing strategies for combining biomarkers in order to optimize classification accuracy. A popular model that has been used for receiver operating characteristic (ROC) curve modelling when one biomarker is available is the binormal model. Extension of the model to accommodate multiple biomarkers has not been considered in this literature. Here, we consider a multivariate binormal framework for combining biomarkers using copula functions that leads to a natural multivariate extension of the binormal model. Estimation in this model will be done using rank-based procedures. We show that the Van der Waerden rank score …


Nonparametric And Semiparametric Inference For Models Of Tumor Size And Metastasis, Debashis Ghosh May 2004

The University of Michigan Department of Biostatistics Working Paper Series

There has been some recent work in the statistical literature on modelling the relationship between the size of primary cancers and the occurrences of metastases. While nonparametric methods have been proposed for estimation of the tumor size distribution at which metastatic transition occurs, their asymptotic properties have not been studied. In addition, no testing or regression methods are available so that potential confounders and prognostic factors can be adjusted for. We develop a unified approach to nonparametric and semiparametric analysis of tumor size-metastasis data in this article. An equivalence between the models considered by previous authors with survival data …


Model Checking Techniques For Regression Models In Cancer Screening, Debashis Ghosh May 2004

The University of Michigan Department of Biostatistics Working Paper Series

There has been much work on developing statistical procedures for associating tumor size with the probability of detecting a metastasis. Recently, Ghosh (2004) developed a unified statistical framework in which equivalences with censored data structures and models for tumor size and metastasis were examined. Based on this framework, we consider model checking techniques for semiparametric regression models in this paper. The procedures are for checking the additive hazards model. Goodness of fit methods are described for assessing functional form of covariates as well as the additive hazards assumption. The finite-sample properties of the methods are assessed using simulation studies.


Binary Isotonic Regression Procedures, With Application To Cancer Biomarkers, Debashis Ghosh, Moulinath Banerjee, Pinaki Biswas May 2004

The University of Michigan Department of Biostatistics Working Paper Series

There is considerable interest in the development and characterization of new biomarkers for screening large populations for disease. In much of the literature on diagnostic testing, increased levels of a biomarker correlate with increased disease risk. However, parametric forms are typically used to associate these quantities. In this article, we specify a monotonic relationship between biomarker levels and disease risk. This leads to consideration of a nonparametric regression model for a single biomarker. Estimation results using isotonic regression-type estimators and asymptotic results are given. We also discuss confidence set estimation in this setting and propose three procedures for …
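As a rough illustration of what an isotonic regression-type estimator does with a binary outcome, the following sketch applies the standard pool-adjacent-violators algorithm to disease indicators ordered by biomarker level. It is a generic textbook procedure rather than the authors' estimator; the function names `pava` and `isotonic_risk` and the example values are hypothetical, and tied biomarker values are not given any special treatment.

```python
def pava(y, w=None):
    """Pool-adjacent-violators: weighted least-squares nondecreasing fit to y."""
    if w is None:
        w = [1.0] * len(y)
    blocks = []  # each block holds [mean, total weight, number of points]
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


def isotonic_risk(biomarker, disease):
    """Monotone estimate of P(disease | biomarker) at the observed levels."""
    order = sorted(range(len(biomarker)), key=lambda i: biomarker[i])
    fitted = pava([disease[i] for i in order])
    return [(biomarker[i], p) for i, p in zip(order, fitted)]


# Hypothetical data: five subjects already ordered by biomarker level.
print(pava([0, 1, 0, 1, 1]))
```

The fitted values are block averages of the disease indicators, so the estimated risk is a nondecreasing step function of the biomarker without any parametric form being imposed.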


Covariate Adjustment In The Analysis Of Microarray Data From Clinical Studies, Debashis Ghosh, Arul Chinnaiyan Apr 2004

The University of Michigan Department of Biostatistics Working Paper Series

There is tremendous scientific interest in the analysis of gene expression data in clinical settings, such as oncology. In this paper, we describe the importance of adjusting for confounders and other prognostic factors in order to select differentially expressed genes for follow-up validation studies. We develop two approaches to the analysis of microarray data in nonrandomized clinical settings. The first is an extension of the current significance analysis of microarray procedures, where other covariates are taken into account. The second is a novel covariate-adjusted regression modelling based on the receiver operating characteristic curve for the analysis of gene expression …


A Bayesian Hierarchical Approach To Multirater Correlated Roc Analysis, Tim Johnson, Valen Johnson Mar 2004

The University of Michigan Department of Biostatistics Working Paper Series

In a common ROC study design, several readers are asked to rate diagnostics of the same cases processed under different modalities. We describe a Bayesian hierarchical model that facilitates the analysis of this study design by explicitly modeling the three sources of variation inherent to it. In so doing, we achieve substantial reductions in the posterior uncertainty associated with estimates of the differences in areas under the estimated ROC curves and corresponding reductions in the mean squared error (MSE) of these estimates. Based on simulation studies, both the widths of confidence intervals and MSE of estimates of differences in the …


A Bayesian Chi-Squared Test For Goodness Of Fit, Valen Johnson Feb 2004

The University of Michigan Department of Biostatistics Working Paper Series

This article describes an extension of classical $\chi^2$ goodness-of-fit tests to Bayesian model assessment. The extension, which essentially involves evaluating Pearson's goodness-of-fit statistic at a parameter value drawn from its posterior distribution, has the important property that it is asymptotically distributed as a $\chi^2$ random variable on $K-1$ degrees of freedom, independently of the dimension of the underlying parameter vector. By averaging over the posterior distribution of this statistic, a global goodness-of-fit diagnostic is obtained. Advantages of this diagnostic (which may be interpreted as the area under an ROC curve) include ease of interpretation, computational convenience, and favorable power properties. The proposed …
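The idea of evaluating Pearson's statistic at a posterior draw can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the model is normal with known standard deviation, the prior on the mean is flat (so the posterior is normal around the sample mean), and the $K$ cells are taken to be equal-probability bins of the fitted distribution function. The function names and the simulated data are hypothetical.

```python
import random
from statistics import NormalDist


def pearson_at_draw(y, theta, sigma, K):
    """Pearson statistic with K equal-probability cells under N(theta, sigma)."""
    n = len(y)
    model = NormalDist(theta, sigma)
    counts = [0] * K
    for yi in y:
        u = model.cdf(yi)                # probability integral transform
        counts[min(int(u * K), K - 1)] += 1
    expected = n / K                     # each cell has probability 1/K
    return sum((c - expected) ** 2 / expected for c in counts)


def posterior_draw_mean(y, sigma, rng):
    """Draw the mean from its posterior N(ybar, sigma^2 / n) (flat prior assumed)."""
    n = len(y)
    ybar = sum(y) / n
    return rng.gauss(ybar, sigma / n ** 0.5)


rng = random.Random(0)
data = [rng.gauss(1.0, 2.0) for _ in range(200)]   # simulated, hypothetical data
draws = [pearson_at_draw(data, posterior_draw_mean(data, 2.0, rng), 2.0, K=5)
         for _ in range(500)]
print(sum(draws) / len(draws))   # compare informally with K - 1 = 4
```

Each value in `draws` would be compared with a $\chi^2$ reference distribution on $K-1$ degrees of freedom; summarizing the draws over the posterior corresponds loosely to the averaging step the abstract describes.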


Individual Prediction In Prostate Cancer Studies Using A Joint Longitudinal-Survival-Cure Model, Menggang Yu, Jeremy Taylor, Howard M. Sandler Feb 2004

The University of Michigan Department of Biostatistics Working Paper Series

For monitoring patients treated for prostate cancer, Prostate Specific Antigen (PSA) is measured periodically after they receive treatment. Increases in PSA are suggestive of recurrence of the cancer and are used in making decisions about possible new treatments. The data from studies of such patients typically consist of longitudinal PSA measurements, censored event times and baseline covariates. Methods for the combined analysis of both longitudinal and survival data have been developed in recent years, with the main emphasis being on modeling and estimation. We analyze data from a prostate cancer study that has been extended by adding a mixture structure …


Mixture Models For Assessing Differential Expression In Complex Tissues Using Microarray Data, Debashis Ghosh Feb 2004

The University of Michigan Department of Biostatistics Working Paper Series

The use of DNA microarrays has become quite popular in many scientific and medical disciplines, such as in cancer research. One common goal of these studies is to determine which genes are differentially expressed between cancer and healthy tissue, or more generally, between two experimental conditions. A major complication in the molecular profiling of tumors using gene expression data is that the data represent a combination of tumor and normal cells. Much of the methodology developed for assessing differential expression with microarray data has assumed that tissue samples are homogeneous. In this article, we outline a general framework for determining …