Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Model-Robust Bayesian Regression And The Sandwich Estimator, Adam A. Szpiro, Kenneth M. Rice, Thomas Lumley Dec 2007

UW Biostatistics Working Paper Series

PLEASE NOTE THAT AN UPDATED VERSION OF THIS RESEARCH IS AVAILABLE AS WORKING PAPER 338 IN THE UNIVERSITY OF WASHINGTON BIOSTATISTICS WORKING PAPER SERIES (http://www.bepress.com/uwbiostat/paper338).

In applied regression problems there is often sufficient data for accurate estimation, but standard parametric models do not accurately describe the source of the data, so associated uncertainty estimates are not reliable. We describe a simple Bayesian approach to inference in linear regression that recovers least-squares point estimates while providing correct uncertainty bounds by explicitly recognizing that standard modeling assumptions need not be valid. Our model-robust development parallels frequentist estimating equations and leads to intervals …
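
As a point of reference for the frequentist side of this comparison, the sketch below computes the classical (HC0) sandwich variance for the slope in a simple linear regression and sets it next to the usual model-based variance. It is not the authors' Bayesian procedure; the function name `ols_slope_with_variances` and the toy data are assumptions made purely for illustration.

```python
def ols_slope_with_variances(x, y):
    """Least-squares slope with model-based and sandwich (HC0) variances.

    Illustrative helper (hypothetical name), simple regression with intercept.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]          # centered covariate
    sxx = sum(d * d for d in dx)

    # Least-squares point estimates.
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Model-based variance: assumes one common error variance.
    model_var = sum(e * e for e in resid) / (n - 2) / sxx

    # Sandwich (HC0) variance: each squared residual is weighted by its
    # own squared leverage term, so no common variance is assumed.
    sandwich_var = sum((d * e) ** 2 for d, e in zip(dx, resid)) / sxx ** 2

    return slope, model_var, sandwich_var


# Toy data (assumed) whose spread grows with x.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_demo = [1.1, 1.9, 3.3, 3.6, 5.9, 4.8]
slope_demo, model_var_demo, sandwich_var_demo = ols_slope_with_variances(x_demo, y_demo)
```

Both variances share the same point estimate; they differ only in how residual variability is attributed to each observation, which is why they can disagree when the constant-variance assumption fails.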


Estimating Sensitivity And Specificity From A Phase 2 Biomarker Study That Allows For Early Termination, Margaret S. Pepe PhD Dec 2007

UW Biostatistics Working Paper Series

Development of a disease screening biomarker involves several phases. In phase 2 its sensitivity and specificity are compared with established thresholds for minimally acceptable performance. Since we anticipate that most candidate markers will not prove to be useful and availability of specimens and funding is limited, early termination of a study is appropriate if accumulating data indicate that the marker is inadequate. Yet, for markers that complete phase 2, we seek estimates of sensitivity and specificity to proceed with the design of subsequent phase 3 studies.

We suggest early stopping criteria and estimation procedures that adjust for bias caused by …


Longitudinal Data With Follow-Up Truncated By Death: Finding A Match Between Analysis Method And Research Aims, Brenda Kurland, Laura Lee Johnson, Paula Diehr Nov 2007

UW Biostatistics Working Paper Series

Diverse analysis approaches have been proposed to distinguish data missing due to death from nonresponse, and to summarize trajectories of longitudinal data truncated by death. We demonstrate how these analysis approaches arise from factorizations of the distribution of longitudinal data and survival information. Models are illustrated using hypothetical data examples (cognitive functioning in older adults, and quality of life under hospice care) and up to 10 annual assessments of longitudinal cognitive functioning data for 3814 participants in an observational study. For unconditional models, deaths do not occur, deaths are independent of the longitudinal response, or the unconditional longitudinal response averages …


A Parametric ROC Model Based Approach For Evaluating The Predictiveness Of Continuous Markers In Case-Control Studies, Ying Huang, Margaret Pepe Nov 2007

UW Biostatistics Working Paper Series

The predictiveness curve shows the population distribution of risk endowed by a marker or risk prediction model. It provides a means for assessing the model's capacity for risk stratification. Methods for making inference about the predictiveness curve have been developed using cross-sectional or cohort data. Here we consider inference based on case-control studies and prior knowledge about prevalence or incidence of the outcome. We exploit the relationship between the ROC curve and the predictiveness curve given disease prevalence. Methods are developed for deriving the predictiveness curve from a parametric ROC model. Estimation of the whole range and of a portion …


Identifiability And Estimation Of Causal Effects In Randomized Trials With Noncompliance And Completely Non-Ignorable Missing-Data, Hua Chen, Zhi Geng, Xiao-Hua Zhou Nov 2007

UW Biostatistics Working Paper Series

In this paper we first studied parameter identifiability in randomized clinical trials with noncompliance and missing outcomes. We showed that under certain conditions the parameters of interest were identifiable even under different types of completely non-ignorable missing data, that is, when the missingness mechanism depends on the outcome. We then derived their maximum likelihood (ML) and moment estimators and evaluated their finite-sample properties in simulation studies in terms of bias, efficiency and robustness. Our sensitivity analysis showed the assumed non-ignorable missing-data model had an important impact on the estimated complier average causal effect (CACE) parameter. Our new method provides some new …


Nonparametric And Semiparametric Group Sequential Methods For Comparing Accuracy Of Diagnostic Tests, Liansheng Tang, Scott S. Emerson, Xiao-Hua Zhou Oct 2007

UW Biostatistics Working Paper Series

Comparison of the accuracy of two diagnostic tests using their receiver operating characteristic (ROC) curves has typically been conducted using fixed sample designs. On the other hand, the human experimentation inherent in a comparison of diagnostic modalities argues for periodic monitoring of the accruing data to address many issues related to the ethics and efficiency of the medical study. To date, very little research has been done on the use of sequential sampling plans for comparative ROC studies, even when these studies may use expensive and unsafe diagnostic procedures. In this paper, we propose a nonparametric …


ROC Surfaces In The Presence Of Verification Bias, Yueh-Yun Chi, Xiao-Hua (Andrew) Zhou Sep 2007

UW Biostatistics Working Paper Series

In diagnostic medicine, the Receiver Operating Characteristic (ROC) surface is one of the established tools for assessing the accuracy of a diagnostic test in discriminating three disease states, and the volume under the ROC surface has served as a summary index for diagnostic accuracy. In practice, the selection for definitive disease examination may be based on initial test measurements, and induces verification bias in the assessment. We propose here a nonparametric likelihood-based approach to construct the empirical ROC surface in the presence of differential verification, and to estimate the volume under the ROC surface. Estimators of the standard deviation are …


A Censored Multinomial Regression Model For Perinatal Mother To Child Transmission Of HIV, Charlotte C. Gard, Elizabeth R. Brown Jul 2007

UW Biostatistics Working Paper Series

In studies designed to estimate rates of perinatal mother to child transmission of HIV, HIV assays are scheduled at multiple points in time. Still, infection status for some infants at some time points is often unknown, particularly when interim analyses are conducted. Logistic regression and Cox proportional hazards regression are commonly used to estimate covariate-adjusted transmission rates, but their methods for handling missing data may be inadequate. Here, we propose using censored multinomial regression models to estimate cumulative and conditional rates of HIV transmission. Through simulation, we show that the proposed methods perform better than standard logistic models in terms …


Reporting And Interpretation In Genome-Wide Association Studies, Jon Wakefield Jul 2007

UW Biostatistics Working Paper Series

In the context of genome-wide association studies we critique a number of methods that have been suggested for flagging associations for further investigation. The p-value is by far the most commonly used measure, but requires careful calibration when the a priori probability of an association is small, and discards information by not considering the power associated with each test. The q-value is a frequentist method by which the false discovery rate (FDR) may be controlled. We advocate the use of the Bayes factor as a summary of the information in the data with respect to the comparison of the null …
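
To make the idea of a Bayes factor concrete, the sketch below uses a textbook normal approximation: an effect estimate `theta_hat` with sampling variance `v`, a point null (effect equal to zero), and a normal prior with variance `w` on the effect under the alternative. This specific form, the function names, and the example numbers are assumptions for illustration; the abstract does not state the form the paper uses.

```python
import math


def normal_pdf(x, var):
    """Density at x of a zero-mean normal distribution with variance var."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def bayes_factor_null_vs_alt(theta_hat, v, w):
    """Bayes factor for the null against the alternative (assumed setup).

    Under the null, theta_hat ~ N(0, v).
    Under the alternative, the effect has prior N(0, w), so marginally
    theta_hat ~ N(0, v + w).
    """
    return normal_pdf(theta_hat, v) / normal_pdf(theta_hat, v + w)


def posterior_prob_null(bayes_factor, prior_null):
    """Combine a Bayes factor with the prior probability of the null."""
    prior_odds = prior_null / (1.0 - prior_null)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical numbers: a log odds ratio estimate of 0.25 with standard
# error 0.05, prior standard deviation 0.2, prior null probability 0.999.
bf_demo = bayes_factor_null_vs_alt(0.25, 0.05 ** 2, 0.2 ** 2)
post_demo = posterior_prob_null(bf_demo, 0.999)
```

Unlike a p-value, the ratio depends on `v`, so two loci with the same test statistic but different precision yield different Bayes factors, and the prior probability of the null enters explicitly in the second step.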


Evaluating The ROC Performance Of Markers For Future Events, Margaret Pepe, Yingye Zheng, Yuying Jin May 2007

UW Biostatistics Working Paper Series

Receiver operating characteristic (ROC) curves play a central role in the evaluation of biomarkers and tests for disease diagnosis. Predictors for event time outcomes can also be evaluated with ROC curves, but the time lag between marker measurement and event time must be acknowledged. We discuss different definitions of time-dependent ROC curves in the context of real applications. Several approaches have been proposed for estimation. We contrast retrospective versus prospective methods with regard to assumptions and flexibility, including their capacities to incorporate censored data, competing risks and different sampling schemes. Applications to two datasets are presented.


Adjusting For Covariates In Studies Of Diagnostic, Screening, Or Prognostic Markers: An Old Concept In A New Setting, Holly Janes, Margaret Pepe May 2007

UW Biostatistics Working Paper Series

The concept of covariate adjustment is well established in therapeutic and etiologic studies. However, it has received little attention in the growing area of medical research devoted to the development of markers for disease diagnosis, screening, or prognosis, where classification accuracy, rather than association, is of primary interest. In this paper, we demonstrate the need for covariate adjustment in studies of classification accuracy, discuss methods for adjusting for covariates, and distinguish covariate adjustment from several other related but fundamentally different uses for covariates. We draw analogies and contrasts throughout with studies of association.


Ecologic Studies Revisited, Jon Wakefield May 2007

UW Biostatistics Working Paper Series

Ecologic studies use data aggregated over groups, rather than data on individuals. Such studies are popular since they may make use of existing databases, and can offer large exposure variation if based on broad geographical areas. Unfortunately, the aggregation of data that defines ecologic studies results in a loss of information that can lead to ecologic bias. Specifically, ecologic bias arises from the inability of ecologic data to characterize within-area variability in exposures and confounders. We describe in detail particular forms of ecologic bias so that their potential impact on any particular study may be assessed. The only way …


Gamma Generalized Linear Models For Pharmacokinetic Data, Ruth Salway, Jon Wakefield May 2007

UW Biostatistics Working Paper Series

This paper considers the modeling of single dose pharmacokinetic data. Traditionally, so-called compartmental models have been used to analyze such data. Unfortunately, the mean functions of such models are sums of exponentials for which inference and computation may not be straightforward. We present an alternative to these models based on generalized linear models, for which desirable statistical properties exist, with a logarithmic link and gamma distribution. The latter has a constant coefficient of variation which is often appropriate for pharmacokinetic data. Inference is convenient from either a likelihood or a Bayesian perspective. We consider models for both single …


Evaluating A Group Sequential Design In The Setting Of Nonproportional Hazards, Daniel L. Gillen, Scott S. Emerson May 2007

UW Biostatistics Working Paper Series

Group sequential methods have been widely described and implemented in a clinical trial setting where parametric and semiparametric models are deemed suitable. In these situations, the evaluation of the operating characteristics of a group sequential stopping rule remains relatively straightforward. However, in the presence of nonproportional hazards survival data, nonparametric methods are often used, and the evaluation of stopping rules is no longer a trivial task. Specifically, nonparametric test statistics do not necessarily correspond to a parameter of clinical interest, thus making it difficult to characterize alternatives at which operating characteristics are to be computed. We describe an approach for …


Biomarker Evaluation Using The Controls As A Reference Population, Ying Huang, Margaret Pepe Apr 2007

UW Biostatistics Working Paper Series

The classification accuracy of a continuous marker is typically evaluated with the Receiver Operating Characteristic (ROC) curve. In this paper, we study an alternative conceptual framework, the "percentile value". In particular, the controls serve only to provide a reference distribution for standardizing the marker. The analysis then proceeds using the standardized marker in cases only. The approach is shown to be equivalent to ROC analysis. Advantages are that it provides a framework more familiar to biostatisticians and it opens up avenues for new statistical techniques in biomarker evaluation. We develop several new procedures based on this framework for comparing biomarkers and for comparing …
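
The standardization step can be sketched as follows: each case's marker value is replaced by the proportion of controls falling below it, and the mean of those percentile values is compared with the usual pairwise (Mann-Whitney) estimate of the area under the ROC curve. The function names, the half-weight convention for ties, and the toy data are assumptions for illustration, not the authors' implementation.

```python
def percentile_values(cases, controls):
    """Standardize each case value against the control distribution.

    Returns, for each case, the fraction of controls below it
    (ties counted as one half, an assumed convention).
    """
    n = len(controls)
    values = []
    for y in cases:
        below = sum(1 for c in controls if c < y)
        ties = sum(1 for c in controls if c == y)
        values.append((below + 0.5 * ties) / n)
    return values


def auc_from_percentile_values(values):
    """Mean percentile value among cases."""
    return sum(values) / len(values)


def mann_whitney_auc(cases, controls):
    """Pairwise AUC estimate: P(case > control) + 0.5 * P(case == control)."""
    total = 0.0
    for y in cases:
        for c in controls:
            if y > c:
                total += 1.0
            elif y == c:
                total += 0.5
    return total / (len(cases) * len(controls))


# Toy marker values (assumed).
cases_demo = [3.0, 5.0, 7.0]
controls_demo = [1.0, 2.0, 4.0, 6.0]
pv_demo = percentile_values(cases_demo, controls_demo)
auc_pv_demo = auc_from_percentile_values(pv_demo)
auc_mw_demo = mann_whitney_auc(cases_demo, controls_demo)
```

Because the two AUC computations agree, any analysis of the case percentile values (for example, regressing them on covariates) is an analysis of ROC performance expressed on a more familiar scale.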


What Is The Best Reference RNA? And Other Questions Regarding The Design And Analysis Of Two-Color Microarray Experiments, Kathleen F. Kerr, Kyle A. Serikawa, Caimiao Wei, Mette A. Peters, Roger E. Bumgarner Apr 2007

UW Biostatistics Working Paper Series

The reference design is a practical and popular choice for microarray studies using two-color platforms. In the reference design, the reference RNA uses half of all array resources, leading investigators to ask: What is the best reference RNA? We propose a novel method for evaluating reference RNAs and present the results of an experiment that was specially designed to evaluate three common choices of reference RNA. We found no compelling evidence in favor of any particular reference. In particular, a commercial reference showed no advantage in our data. Our experimental design also enabled a new way to test the effectiveness …


Power Boosting In Genome-Wide Studies Via Methods For Multivariate Outcomes, Mary J. Emond Feb 2007

UW Biostatistics Working Paper Series

Whole-genome studies are becoming a mainstay of biomedical research. Examples include expression array experiments, comparative genomic hybridization analyses and large case-control studies for detecting polymorphism/disease associations. The tactic of applying a regression model to every locus to obtain test statistics is useful in such studies. However, this approach ignores potential correlation structure in the data that could be used to gain power, particularly when a Bonferroni correction is applied to adjust for multiple testing. In this article, we propose using regression techniques for misspecified multivariate outcomes to increase statistical power over independence-based modeling at each locus. Even when the outcome …