Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Articles 1 - 21 of 21

Full-Text Articles in Statistical Models

A Hybrid Newton-Type Method For The Linear Regression In Case-Cohort Studies, Menggang Yu, Bin Nan Dec 2004

The University of Michigan Department of Biostatistics Working Paper Series

Case-cohort designs are increasingly used in large epidemiological cohort studies. Nan, Yu, and Kalbfleisch (2004) provided the asymptotic results for censored linear regression models in case-cohort studies. In this article, we consider computational aspects of their proposed rank-based estimating methods. We show that the rank-based discontinuous estimating functions for case-cohort studies are monotone, a property established for cohort data in the literature, when generalized Gehan-type weights are used. Though the estimating problem can be formulated as a linear programming problem, as for cohort data, due to its easily uncontrollable large scale even for a …


Ranking USRDS Provider-Specific SMRs From 1998-2001, Rongheng Lin, Thomas A. Louis, Susan M. Paddock, Greg Ridgeway Dec 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

Provider profiling (ranking, "league tables") is prevalent in health services research. Similarly, comparing educational institutions and identifying differentially expressed genes depend on ranking. Effective ranking procedures must be structured by a hierarchical (Bayesian) model and guided by a ranking-specific loss function; however, even optimal methods can perform poorly, and estimates must be accompanied by uncertainty assessments. We use the 1998-2001 Standardized Mortality Ratio (SMR) data from the United States Renal Data System (USRDS) as a platform to identify issues and approaches. Our analyses extend Liu et al. (2004) by combining evidence over multiple years via an AR(1) model; by considering estimates …


Semi-Parametric Single-Index Two-Part Regression Models, Xiao-Hua Zhou, Hua Liang Dec 2004

UW Biostatistics Working Paper Series

In this paper, we proposed a semi-parametric single-index two-part regression model to weaken assumptions in parametric regression methods that were frequently used in the analysis of skewed data with additional zero values. The estimation procedure for the parameters of interest in the model was easily implemented. The proposed estimators were shown to be consistent and asymptotically normal. Through a simulation study, we showed that the proposed estimators have reasonable finite-sample performance. We illustrated the application of the proposed method in one real study on the analysis of health care costs.


Bayesian Hierarchical Distributed Lag Models For Summer Ozone Exposure And Cardio-Respiratory Mortality, Yi Huang, Francesca Dominici, Michelle L. Bell Oct 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

In this paper, we develop Bayesian hierarchical distributed lag models for estimating associations between daily variations in summer ozone levels and daily variations in cardiovascular and respiratory (CVDRESP) mortality counts for 19 large U.S. cities included in the National Morbidity Mortality Air Pollution Study (NMMAPS) for the period 1987-1994.

At the first stage, we define a semi-parametric distributed lag Poisson regression model to estimate city-specific relative rates of CVDRESP associated with short-term exposure to summer ozone. At the second stage, we specify a class of distributions for the true city-specific relative rates to estimate an overall effect by …


Data Adaptive Estimation Of The Treatment Specific Mean, Yue Wang, Oliver Bembom, Mark J. Van Der Laan Oct 2004

U.C. Berkeley Division of Biostatistics Working Paper Series

An important problem in epidemiology and medical research is the estimation of the causal effect of a treatment action at a single point in time on the mean of an outcome, possibly within strata of the target population defined by a subset of the baseline covariates. Current approaches to this problem are based on marginal structural models, i.e., parametric models for the marginal distribution of counterfactual outcomes as a function of treatment and effect modifiers. The various estimators developed in this context furthermore each depend on a high-dimensional nuisance parameter whose estimation currently also relies on parametric models. Since misspecification …


Semiparametric Methods For The Binormal Model With Multiple Biomarkers, Debashis Ghosh Oct 2004

The University of Michigan Department of Biostatistics Working Paper Series

In diagnostic medicine, there is great interest in developing strategies for combining biomarkers in order to optimize classification accuracy. A popular model that has been used when one biomarker is available is the binormal model. Extension of the model to accommodate multiple biomarkers has not been considered in this literature. Here, we consider a multivariate binormal framework for combining biomarkers using copula functions that leads to a natural multivariate extension of the binormal model. Estimation in this model will be done using rank-based procedures. We also discuss adjustment for covariates in this class of models and provide a simple …


Estimating The Retransformed Mean In A Heteroscedastic Two-Part Model, Alan H. Welsh, Xiao-Hua Zhou Sep 2004

UW Biostatistics Working Paper Series

Two distribution-free estimators are proposed to estimate the mean of a dependent variable after fitting a semiparametric two-part heteroscedastic regression model to a transformation of the dependent variable. We show that the proposed estimators are consistent and have asymptotic normal distributions. We also compare their finite-sample performance in a simulation study. Finally, we illustrate the proposed methods in a real-world example of predicting in-patient health care costs.


History-Adjusted Marginal Structural Models And Statically-Optimal Dynamic Treatment Regimes, Mark J. Van Der Laan, Maya L. Petersen Sep 2004

U.C. Berkeley Division of Biostatistics Working Paper Series

Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a treatment. These models, introduced by Robins, model the marginal distributions of treatment-specific counterfactual outcomes, possibly conditional on a subset of the baseline covariates. Marginal structural models are particularly useful in the context of longitudinal data structures, in which each subject's treatment and covariate history are measured over time, and an outcome is recorded at a final time point. However, the utility of these models for some applications has been limited by their inability to incorporate modification of the causal effect of treatment by time-varying covariates. …


A Marginal Model Approach For Analysis Of Multi-Reader Multi-Test Receiver Operating Characteristic (ROC) Data, Xiao Song, Xiao-Hua Zhou Sep 2004

UW Biostatistics Working Paper Series

The receiver operating characteristic (ROC) curve is a popular tool to characterize the capabilities of diagnostic tests with continuous or ordinal responses. One common design for assessing the accuracy of diagnostic tests is to have each patient examined by multiple readers with multiple tests; this design is most commonly used in a radiology setting, where the results of diagnostic tests depend on a radiologist's subjective interpretation. The most widely used approach for analyzing data from such a study is the Dorfman-Berbaum-Metz (DBM) method (Dorfman, Berbaum and Metz, 1992) which utilizes a standard analysis of variance (ANOVA) model for the jackknife …


A Hierarchical Multivariate Two-Part Model For Profiling Providers' Effects On Healthcare Charges, John W. Robinson, Scott L. Zeger, Christopher B. Forrest Aug 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

Procedures for analyzing and comparing healthcare providers' effects on health services delivery and outcomes have been referred to as provider profiling. In a typical profiling procedure, patient-level responses are measured for clusters of patients treated by providers that, in turn, can be regarded as statistically exchangeable. Thus, a hierarchical model naturally represents the structure of the data. When provider effects on multiple responses are profiled, a multivariate model, rather than a series of univariate models, can capture associations among responses at both the provider and patient levels. When responses are in the form of charges for healthcare services and sampled …


Combining Predictors For Classification Using The Area Under The ROC Curve, Margaret S. Pepe, Tianxi Cai, Zheng Zhang Jun 2004

UW Biostatistics Working Paper Series

We compare simple logistic regression with an alternative robust procedure for constructing linear predictors to be used for the two-state classification task. Theoretical advantages of the robust procedure over logistic regression are: (i) although it assumes a generalized linear model for the dichotomous outcome variable, it does not require specification of the link function; (ii) it accommodates case-control designs even when the model is not logistic; and (iii) it yields sensible results even when the generalized linear model assumption fails to hold. Surprisingly, we find that the linear predictor derived from the logistic regression likelihood is very robust in …
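
As a rough illustration of what choosing a linear predictor by the area under the ROC curve can look like, the sketch below grid-searches the slope b in the score x1 + b * x2 to maximize the empirical (Mann-Whitney) AUC. Reading the robust procedure as AUC maximization is an assumption based on the paper's title, since the abstract is truncated; the function names, the grid, and the two-marker data are hypothetical and not taken from the paper.

```python
def empirical_auc(case_scores, control_scores):
    """Mann-Whitney estimate of P(case score > control score); ties count 1/2."""
    total = 0.0
    for s1 in case_scores:
        for s0 in control_scores:
            total += 1.0 if s1 > s0 else 0.5 if s1 == s0 else 0.0
    return total / (len(case_scores) * len(control_scores))


def best_slope(cases, controls, grid):
    """Grid-search b in score = x1 + b * x2 to maximize the empirical AUC."""
    def auc_for(b):
        return empirical_auc([x1 + b * x2 for x1, x2 in cases],
                             [x1 + b * x2 for x1, x2 in controls])
    return max(grid, key=auc_for)


# Hypothetical two-marker data (x1, x2); not from the paper.
cases = [(2.0, 1.0), (1.5, 2.5), (3.0, 0.5), (0.5, 3.0)]
controls = [(1.0, 0.0), (0.0, 1.0), (1.2, 0.8), (2.5, -1.5)]
grid = [i / 10 for i in range(-20, 21)]   # candidate slopes from -2.0 to 2.0

b_hat = best_slope(cases, controls, grid)
auc_hat = empirical_auc([x1 + b_hat * x2 for x1, x2 in cases],
                        [x1 + b_hat * x2 for x1, x2 in controls])
```

Fixing the coefficient of x1 at 1 reflects that the AUC depends only on the ordering of the scores, so the linear predictor is identified only up to a positive scale factor.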


Seasonal Analyses Of Air Pollution And Mortality In 100 U.S. Cities, Roger D. Peng, Francesca Dominici, Roberto Pastor-Barriuso, Scott L. Zeger, Jonathan M. Samet May 2004

Johns Hopkins University, Dept. of Biostatistics Working Papers

Time series models relating short-term changes in air pollution levels to daily mortality counts typically assume that the effects of air pollution on the log relative rate of mortality do not vary with time. However, these short-term effects might plausibly vary by season. Changes in the sources of air pollution and meteorology can result in changes in characteristics of the air pollution mixture across seasons. The authors develop Bayesian semi-parametric hierarchical models for estimating time-varying effects of pollution on mortality in multi-site time series studies. The methods are applied to the updated National Morbidity and Mortality Air Pollution Study database …


Semiparametric Models And Estimation Procedures For Binormal ROC Curves With Multiple Biomarkers, Debashis Ghosh May 2004

The University of Michigan Department of Biostatistics Working Paper Series

In diagnostic medicine, there is great interest in developing strategies for combining biomarkers in order to optimize classification accuracy. A popular model that has been used for receiver operating characteristic (ROC) curve modelling when one biomarker is available is the binormal model. Extension of the model to accommodate multiple biomarkers has not been considered in this literature. Here, we consider a multivariate binormal framework for combining biomarkers using copula functions that leads to a natural multivariate extension of the binormal model. Estimation in this model will be done using rank-based procedures. We show that the Van der Waerden rank score …


Nonparametric And Semiparametric Inference For Models Of Tumor Size And Metastasis, Debashis Ghosh May 2004

The University of Michigan Department of Biostatistics Working Paper Series

There has been some recent work in the statistical literature on modelling the relationship between the size of primary cancers and the occurrences of metastases. While nonparametric methods have been proposed for estimation of the tumor size distribution at which metastatic transition occurs, their asymptotic properties have not been studied. In addition, no testing or regression methods are available so that potential confounders and prognostic factors can be adjusted for. We develop a unified approach to nonparametric and semiparametric analysis of tumor size-metastasis data in this article. An equivalence between the models considered by previous authors with survival data …


Model Checking Techniques For Regression Models In Cancer Screening, Debashis Ghosh May 2004

The University of Michigan Department of Biostatistics Working Paper Series

There has been much work on developing statistical procedures for associating tumor size with the probability of detecting a metastasis. Recently, Ghosh (2004) developed a unified statistical framework in which equivalences with censored data structures and models for tumor size and metastasis were examined. Based on this framework, we consider model checking techniques for semiparametric regression models in this paper. The procedures are for checking the additive hazards model. Goodness of fit methods are described for assessing functional form of covariates as well as the additive hazards assumption. The finite-sample properties of the methods are assessed using simulation studies.


Binary Isotonic Regression Procedures, With Application To Cancer Biomarkers, Debashis Ghosh, Moulinath Banerjee, Pinaki Biswas May 2004

The University of Michigan Department of Biostatistics Working Paper Series

There is a lot of interest in the development and characterization of new biomarkers for screening large populations for disease. In much of the literature on diagnostic testing, increased levels of a biomarker correlate with increased disease risk. However, parametric forms are typically used to associate these quantities. In this article, we specify a monotonic relationship between biomarker levels and disease risk. This leads to consideration of a nonparametric regression model for a single biomarker. Estimation results using isotonic regression-type estimators and asymptotic results are given. We also discuss confidence set estimation in this setting and propose three procedures for …
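
To make the monotonicity constraint concrete, here is a minimal sketch of the standard pool-adjacent-violators algorithm (PAVA) applied to binary disease indicators sorted by biomarker level. It illustrates isotonic regression in general and is not necessarily the estimator studied in the paper; the function name `pava` and the data are hypothetical.

```python
def pava(y, w=None):
    """Pool-adjacent-violators: nondecreasing weighted least-squares fit to y."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while the monotonicity constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    # Expand the blocks back to one fitted value per observation.
    fit = []
    for m, _, n in blocks:
        fit.extend([m] * n)
    return fit


# Hypothetical data: disease indicator (0/1), sorted by increasing biomarker level.
disease = [0, 0, 1, 0, 1, 1, 0, 1]
risk = pava(disease)   # estimated disease risk, nondecreasing in the biomarker
```

Each fitted value is the average disease indicator within a pooled block, so the result is a step function that can be read as an estimated risk at each biomarker level.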


On Corrected Score Approach For Proportional Hazards Model With Covariate Measurement Error, Xiao Song, Yijian Huang May 2004

UW Biostatistics Working Paper Series

In the presence of covariate measurement error with the proportional hazards model, several functional modeling methods have been proposed. These include the conditional score estimator (Tsiatis and Davidian, 2001), the parametric correction estimator (Nakamura, 1992) and the nonparametric correction estimator (Huang and Wang, 2000, 2003) in the order of weaker assumptions on the error. Although they are all consistent, each suffers from potential difficulties with small samples and substantial measurement error. In this article, upon noting that the conditional score and parametric correction estimators are asymptotically equivalent in the case of normal error, we investigate their relative finite sample performance …


A Bayesian Hierarchical Approach To Multirater Correlated ROC Analysis, Tim Johnson, Valen Johnson Mar 2004

The University of Michigan Department of Biostatistics Working Paper Series

In a common ROC study design, several readers are asked to rate diagnostics of the same cases processed under different modalities. We describe a Bayesian hierarchical model that facilitates the analysis of this study design by explicitly modeling the three sources of variation inherent to it. In so doing, we achieve substantial reductions in the posterior uncertainty associated with estimates of the differences in areas under the estimated ROC curves and corresponding reductions in the mean squared error (MSE) of these estimates. Based on simulation studies, both the widths of confidence intervals and MSE of estimates of differences in the …


A Bayesian Chi-Squared Test For Goodness Of Fit, Valen Johnson Feb 2004

The University of Michigan Department of Biostatistics Working Paper Series

This article describes an extension of classical χ² goodness-of-fit tests to Bayesian model assessment. The extension, which essentially involves evaluating Pearson's goodness-of-fit statistic at a parameter value drawn from its posterior distribution, has the important property that it is asymptotically distributed as a χ² random variable on K-1 degrees of freedom, independently of the dimension of the underlying parameter vector. By averaging over the posterior distribution of this statistic, a global goodness-of-fit diagnostic is obtained. Advantages of this diagnostic, which may be interpreted as the area under an ROC curve, include ease of interpretation, computational convenience, and favorable power properties. The proposed …
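
The idea of evaluating Pearson's statistic at a posterior draw can be sketched for a toy normal model. Everything specific here is an assumption made for illustration and is not stated in the abstract: the known standard deviation, the flat prior on the mean, the K equal-probability bins built from the model CDF at the sampled parameter, and the function name `bayes_chi2_stat`.

```python
import random
from statistics import NormalDist, fmean


def bayes_chi2_stat(y, sigma, K, rng):
    """Pearson statistic evaluated at one posterior draw of the mean."""
    n = len(y)
    # Assumption: known sigma and a flat prior, so mu | y ~ N(ybar, sigma^2 / n).
    mu_draw = rng.gauss(fmean(y), sigma / n ** 0.5)
    model = NormalDist(mu_draw, sigma)
    counts = [0] * K
    for yi in y:
        u = model.cdf(yi)                      # probability integral transform
        counts[min(int(u * K), K - 1)] += 1    # guard against u == 1.0
    expected = n / K                           # equal-probability bins
    return sum((c - expected) ** 2 / expected for c in counts)


rng = random.Random(0)
y = [rng.gauss(1.0, 2.0) for _ in range(200)]    # hypothetical data
stats = [bayes_chi2_stat(y, 2.0, 5, rng) for _ in range(500)]

# 9.488 is approximately the 0.95 quantile of a chi-squared on K - 1 = 4 df.
frac_extreme = sum(s > 9.488 for s in stats) / len(stats)
```

Here `frac_extreme` is simply the share of posterior draws whose statistic exceeds the reference quantile; it is offered as one simple summary of the posterior distribution of the statistic, not as the paper's global diagnostic.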


Individual Prediction In Prostate Cancer Studies Using A Joint Longitudinal-Survival-Cure Model, Menggang Yu, Jeremy Taylor, Howard M. Sandler Feb 2004

The University of Michigan Department of Biostatistics Working Paper Series

For monitoring patients treated for prostate cancer, Prostate Specific Antigen (PSA) is measured periodically after they receive treatment. Increases in PSA are suggestive of recurrence of the cancer and are used in making decisions about possible new treatments. The data from studies of such patients typically consist of longitudinal PSA measurements, censored event times and baseline covariates. Methods for the combined analysis of both longitudinal and survival data have been developed in recent years, with the main emphasis being on modeling and estimation. We analyze data from a prostate cancer study that has been extended by adding a mixture structure …


Overlap Bias In The Case-Crossover Design, With Application To Air Pollution Exposures, Holly Janes, Lianne Sheppard, Thomas Lumley Jan 2004

UW Biostatistics Working Paper Series

The case-crossover design uses cases only, and compares exposures just prior to the event times to exposures at comparable control, or “referent” times, in order to assess the effect of short-term exposure on the risk of a rare event. It has commonly been used to study the effect of air pollution on the risk of various adverse health events. Proper selection of referents is crucial, especially with air pollution exposures, which are shared, highly seasonal, and often have a long-term time trend. Hence, careful referent selection is important to control for time-varying confounders, and in order to ensure that …