Open Access. Powered by Scholars. Published by Universities.®

Digital Commons Network

Articles 1 - 30 of 53

Full-Text Articles in Entire DC Network

Concentrations Of Criteria Pollutants In The Contiguous U.S., 1979 – 2015: Role Of Model Parsimony In Integrated Empirical Geographic Regression, Sun-Young Kim, Matthew Bechle, Steve Hankey, Elizabeth (Lianne) A. Sheppard, Adam A. Szpiro, Julian D. Marshall Nov 2018

UW Biostatistics Working Paper Series

BACKGROUND: National- or regional-scale prediction models that estimate individual-level air pollution concentrations commonly include hundreds of geographic variables. However, so many variables may not be necessary, and a parsimonious approach that includes a small number of variables may achieve sufficient prediction ability. This parsimonious approach can also be applied to most criteria pollutants. This approach will be powerful when generating publicly available datasets of model predictions that support research in environmental health and other fields. OBJECTIVES: We aim to (1) build annual-average integrated empirical geographic (IEG) regression models for the contiguous U.S. for six criteria pollutants, for all years with regulatory monitoring data …


Estimation Of Long-Term Area-Average PM2.5 Concentrations For Area-Level Health Analyses, Sun-Young Kim, Casey Olives, Neal Fann, Joel Kaufman, Sverre Vedal, Lianne Sheppard Jul 2016

UW Biostatistics Working Paper Series

Introduction: There is increasing evidence of an association between individual long-term PM2.5 exposure and human health. Mortality and morbidity data collected at the area-level are valuable resources for investigating corresponding population-level health effects. However, PM2.5 monitoring data are available for limited periods of time and locations, and are not adequate for estimating area-level concentrations. We developed a general approach to estimate county-average concentrations representative of population exposures for 1980-2010 in the continental U.S.

Methods: We predicted annual average PM2.5 concentrations at about 70,000 census tract centroids, using a point prediction model previously developed for estimating annual average …


Models For HSV Shedding Must Account For Two Levels Of Overdispersion, Amalia Magaret Jan 2016

UW Biostatistics Working Paper Series

We have frequently implemented crossover studies to evaluate new therapeutic interventions for genital herpes simplex virus infection. The outcome measured to assess the efficacy of interventions on herpes disease severity is the viral shedding rate, defined as the frequency of detection of HSV on the genital skin and mucosa. We performed a simulation study to ascertain whether our standard model, which we have used previously, was appropriately considering all the necessary features of the shedding data to provide correct inference. We simulated shedding data under our standard, validated assumptions and assessed the ability of 5 different models to reproduce the …


Net Reclassification Index: A Misleading Measure Of Prediction Improvement, Margaret Sullivan Pepe, Holly Janes, Kathleen F. Kerr, Bruce M. Psaty Sep 2013

UW Biostatistics Working Paper Series

The evaluation of biomarkers to improve risk prediction is a common theme in modern research. Since its introduction in 2008, the net reclassification index (NRI) (Pencina et al. 2008, Pencina et al. 2011) has gained widespread use as a measure of prediction performance with over 1,200 citations as of June 30, 2013. The NRI is considered by some to be more sensitive to clinically important changes in risk than the traditional change in the AUC (Delta AUC) statistic (Hlatky et al. 2009). Recent statistical research has raised questions, however, about the validity of conclusions based on the NRI. (Hilden and …


Net Reclassification Indices For Evaluating Risk Prediction Instruments: A Critical Review, Kathleen F. Kerr, Zheyu Wang, Holly Janes, Robyn McClelland, Bruce M. Psaty, Margaret S. Pepe Aug 2013

UW Biostatistics Working Paper Series

Background Net Reclassification Indices (NRI) have recently become popular statistics for measuring the prediction increment of new biomarkers.

Methods In this review, we examine the various types of NRI statistics and their correct interpretations. We evaluate the advantages and disadvantages of the NRI approach. For pre-defined risk categories, we relate NRI to existing measures of the prediction increment. We also consider statistical methodology for constructing confidence intervals for NRI statistics and evaluate the merits of NRI-based hypothesis testing.

Conclusions Investigators using NRI statistics should report them separately for events (cases) and nonevents (controls). When there are two risk categories, the …
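The recommendation to report NRI separately for events and nonevents can be illustrated with a minimal sketch. It assumes the standard two-category definition (event NRI is the proportion of events moved up minus the proportion moved down; nonevent NRI is the proportion of nonevents moved down minus the proportion moved up). The function name, the threshold, and the toy risks are hypothetical and are not taken from the paper.

```python
def category_nri(old_risk, new_risk, outcome, threshold):
    """Return (event NRI, nonevent NRI) for one risk threshold.

    A subject is 'high risk' when the predicted risk is >= threshold.
    Assumes at least one event (1) and one nonevent (0) in `outcome`.
    """
    def net_up(pairs):
        # Count subjects crossing the threshold in each direction.
        up = sum(1 for old, new in pairs if old < threshold <= new)
        down = sum(1 for old, new in pairs if new < threshold <= old)
        return (up - down) / len(pairs)

    rows = list(zip(old_risk, new_risk, outcome))
    events = [(o, n) for o, n, y in rows if y == 1]
    nonevents = [(o, n) for o, n, y in rows if y == 0]
    # Moving up is an improvement for events; moving down is an
    # improvement for nonevents, hence the sign flip.
    return net_up(events), -net_up(nonevents)


# Hypothetical risks from a baseline model and an expanded model.
old = [0.10, 0.15, 0.30, 0.10, 0.30, 0.25]
new = [0.30, 0.25, 0.35, 0.05, 0.10, 0.30]
y = [1, 1, 1, 0, 0, 0]
nri_event, nri_nonevent = category_nri(old, new, y, threshold=0.2)
```

Keeping the two components apart, rather than summing them, shows which group the reclassification actually helps.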


The Net Reclassification Index (NRI): A Misleading Measure Of Prediction Improvement With Miscalibrated Or Overfit Models, Margaret Pepe, Jin Fang, Ziding Feng, Thomas Gerds, Jorgen Hilden Mar 2013

UW Biostatistics Working Paper Series

The Net Reclassification Index (NRI) is a very popular measure for evaluating the improvement in prediction performance gained by adding a marker to a set of baseline predictors. However, the statistical properties of this novel measure have not been explored in depth. We demonstrate the alarming result that the NRI statistic calculated on a large test dataset using risk models derived from a training set is likely to be positive even when the new marker has no predictive information. A related theoretical example is provided in which a miscalibrated risk model that includes an uninformative marker is proven to erroneously …


A Regionalized National Universal Kriging Model Using Partial Least Squares Regression For Estimating Annual PM2.5 Concentrations In Epidemiology, Paul D. Sampson, Mark Richards, Adam A. Szpiro, Silas Bergen, Lianne Sheppard, Timothy V. Larson, Joel Kaufman Dec 2012

UW Biostatistics Working Paper Series

Many cohort studies in environmental epidemiology require accurate modeling and prediction of fine scale spatial variation in ambient air quality across the U.S. This modeling requires the use of small spatial scale geographic or “land use” regression covariates and some degree of spatial smoothing. Furthermore, the details of the prediction of air quality by land use regression and the spatial variation in ambient air quality not explained by this regression should be allowed to vary across the continent due to the large scale heterogeneity in topography, climate, and sources of air pollution. This paper introduces a regionalized national universal kriging …


Using The Stages Of Change Model To Choose An Optimal Health Marketing Target, Paula Diehr, Peggy A. Hannon, Barbara Pizacani, Mark Forehand, Jeffrey Harris, Hendrika Meischke, Susan J. Curry, Diane P. Martin, Marcia R. Weaver Mar 2010

UW Biostatistics Working Paper Series

Background: In the transtheoretical model of behavior change, “stages of change” are defined as Precontemplation (not even thinking about changing), Contemplation, Preparation, Action, and Maintenance (maintaining the behavior change). Marketing principles suggest that efforts should be targeted at persons most likely to “buy the product.”

Objectives: To examine the effect of intervening at different stages in populations of smokers, with various numbers of people in each “stage of change.” One type of intervention would increase by 10% the probability of a person moving to the next higher stage of change, such as from Precontemplation to Contemplation. The second type would …


Influence Of Prediction Approaches For Spatially-Dependent Air Pollution Exposure On Health Effect Estimation, Sun-Young Kim, Lianne Sheppard, Ho Kim Jun 2008

UW Biostatistics Working Paper Series

Background: Air pollution studies increasingly estimate individual-level exposures from area-based measurements by using exposure prediction methods such as nearest monitor and kriging predictions. However, little is known about the properties of these methods for health effects estimation. This simulation study explores how two common prediction approaches for fine particulate matter (PM2.5) affect relative risk estimates for cardiovascular events in a single geographic area.

Methods: We estimated two sets of parameters to define correlation structures from 2002 PM2.5 data in the Los Angeles (LA) area and selected additional parameters to evaluate different correlation features. For each structure, annual average PM2.5 was …


Adjusting For Covariates In Studies Of Diagnostic, Screening, Or Prognostic Markers: An Old Concept In A New Setting, Holly Janes, Margaret Pepe May 2007

UW Biostatistics Working Paper Series

The concept of covariate adjustment is well established in therapeutic and etiologic studies. However, it has received little attention in the growing area of medical research devoted to the development of markers for disease diagnosis, screening, or prognosis, where classification accuracy, rather than association, is of primary interest. In this paper, we demonstrate the need for covariate adjustment in studies of classification accuracy, discuss methods for adjusting for covariates, and distinguish covariate adjustment from several other related but fundamentally different uses for covariates. We draw analogies and contrasts throughout with studies of association.


Power Boosting In Genome-Wide Studies Via Methods For Multivariate Outcomes, Mary J. Emond Feb 2007

UW Biostatistics Working Paper Series

Whole-genome studies are becoming a mainstay of biomedical research. Examples include expression array experiments, comparative genomic hybridization analyses and large case-control studies for detecting polymorphism/disease associations. The tactic of applying a regression model to every locus to obtain test statistics is useful in such studies. However, this approach ignores potential correlation structure in the data that could be used to gain power, particularly when a Bonferroni correction is applied to adjust for multiple testing. In this article, we propose using regression techniques for misspecified multivariate outcomes to increase statistical power over independence-based modeling at each locus. Even when the outcome …


Statistical Analysis Of Air Pollution Panel Studies: An Illustration, Holly Janes, Lianne Sheppard, Kristen Shepherd Oct 2006

UW Biostatistics Working Paper Series

The panel study design is commonly used to evaluate the short-term health effects of air pollution. Standard statistical methods for analyzing longitudinal data are available, but the literature reveals that the techniques are not well understood by practitioners. We illustrate these methods using data from the 1999 to 2002 Seattle panel study. Marginal, conditional, and transitional approaches for modeling longitudinal data are reviewed and contrasted with respect to their parameter interpretation and methods for accounting for correlation and dealing with missing data. We also discuss and illustrate techniques for controlling for time-dependent and time-independent confounding, and for exploring and summarizing …


Assessing The Adequacy Of Variance Function In Heteroscedastic Regression Models, Lan Wang, Xiao-Hua Andrew Zhou Sep 2006

UW Biostatistics Working Paper Series

Heteroscedastic data arise in many applications. In a heteroscedastic regression model, the variance is often taken as a parametric function of the covariate or the regression mean. This paper presents a kernel-smoothing-based nonparametric test for checking the adequacy of such a postulated variance structure. The test does not need to specify a parametric distribution for the random errors. It has an asymptotically normal distribution under the null hypothesis and is powerful against a large class of alternatives. Numerical simulations and an illustrative example are provided.


Relative Risk Regression In Medical Research: Models, Contrasts, Estimators, And Algorithms, Thomas Lumley, Richard Kronmal, Shuangge Ma Jul 2006

UW Biostatistics Working Paper Series

The relative risk or prevalence ratio is a natural and familiar summary of association between a binary outcome and an exposure or intervention. For rare events, the relative risk can be approximately estimated by logistic regression. For common events estimation is more difficult. We review proposed estimation algorithms for relative risk regression. Some of these give inconsistent estimates or invalid standard errors. We show that the methods that give correct inference can be viewed as arising from a family of quasilikelihood estimating functions for the same generalized linear model, differing in their efficiency and in their robustness to outlying values …


Hierarchical Models For Combining Ecological And Case-Control Data, Sebastien Haneuse, Jon Wakefield May 2006

UW Biostatistics Working Paper Series

The ecological study design suffers from a broad range of biases that result from the loss of information regarding the joint distribution of individual-level outcomes, exposures and confounders. The consequent non-identifiability of individual-level models cannot be overcome without additional information; we combine ecological data with a sample of individual-level case-control data. The focus of this paper is hierarchical models to account for between-group heterogeneity. Estimation and inference pose serious computational challenges. We present a Bayesian implementation, based on a data augmentation scheme where the unobserved data are treated as auxiliary variables. The methods are illustrated with a dataset of …


Disease Mapping And Spatial Regression With Count Data, Jon Wakefield May 2006

UW Biostatistics Working Paper Series

In this paper we provide critical reviews of methods suggested for the analysis of aggregate count data in the context of disease mapping and spatial regression. We introduce a new method for picking prior distributions, and propose a number of refinements of previously-used models. We also consider ecological bias, mutual standardization, and choice of both spatial model and prior specification. We analyze male lip cancer incidence data collected in Scotland over the period 1975–1980, and outline a number of problems with previous analyses of these data. A number of recommendations are provided. In disease mapping studies, hierarchical models can provide …


Reliability, Effect Size, And Responsiveness And Intraclass Correlation Of Health Status Measures Used In Randomized And Cluster-Randomized Trials, Paula Diehr, Lu Chen, Donald L. Patrick, Ziding Feng, Yutaka Yasui Mar 2006

UW Biostatistics Working Paper Series

Background: New health status instruments are described by psychometric properties, such as Reliability, Effect Size, and Responsiveness. For cluster-randomized trials, another important statistic is the Intraclass Correlation for the instrument within clusters. Studies using better instruments can be performed with smaller sample sizes, but better instruments may be more expensive in terms of dollars, lost opportunities, or poorer data quality due to the response burden of longer instruments. Investigators often need to estimate the psychometric properties of a new instrument, or of an established instrument in a new setting. Optimal sample sizes for estimating these properties have not been studied …


Different Public Health Interventions Have Varying Effects, Paula Diehr, Anne B. Newman, Liming Cai, Ann Derleth Feb 2006

UW Biostatistics Working Paper Series

Objective: To compare the performance of one-time health interventions to those that change the probability of transitioning from one health state to another. Study Design and Setting: We used multi-state life table methods to estimate the impact of eight types of interventions on several outcomes. Results: In a cohort beginning at age 65, curing all the sick persons at baseline would increase life expectancy by 0.23 years and increase years of healthy life by 0.54 years. An equal amount of improvement could be obtained with a 12% decrease in the probability of getting sick, a 16% increase in the probability of …


A Hybrid Model For Reducing Ecological Bias, Ruth Salway, Jon Wakefield Dec 2005

UW Biostatistics Working Paper Series

A major drawback of epidemiological ecological studies, in which the association between area-level summaries of risk and exposure is used to make inference about individual risk, is the difficulty in characterising within-area variability in exposure and confounder variables. To avoid ecological bias, samples of individual exposure/confounder data within each area are required. Unfortunately these may be difficult or expensive to obtain, particularly if large samples are required. In this paper we propose a new approach suitable for use with small samples. We combine a Bayesian non-parametric Dirichlet process prior with an estimating functions approach, and show that this model gives …


Health-Exposure Modelling And The Ecological Fallacy, Jon Wakefield, Gavin Shaddick Dec 2005

UW Biostatistics Working Paper Series

Recently there has been increased interest in modelling the association between aggregate disease counts and environmental exposures measured, for example via air pollution monitors, at point locations. This paper has two aims: first we develop a model for such data in order to avoid ecological bias; second we illustrate that modelling the exposure surface and estimating exposures may lead to bias in estimation of health effects. Design issues are also briefly considered, in particular the loss of information in moving from individual to ecological data, and the at-risk populations to consider in relation to the pollution monitor locations. The approach …


Is The Number Of Sick Persons In A Cohort Constant Over Time?, Paula Diehr, Ann Derleth, Anne Newman, Liming Cai Oct 2005

UW Biostatistics Working Paper Series

Objectives: To estimate the number of persons in a cohort who are sick, over time.

Methods: We calculated the number of sick persons in the Cardiovascular Health Study (CHS), a cohort study of older adults followed up to 14 years, using eight definitions of “healthy” and “sick”. We projected the number in each health state over time for a birth cohort.

Results: The number of sick persons in CHS was approximately constant for 14 years, for all definitions of “sick”. The estimated number of sick persons in the birth cohort was approximately constant from ages 55-75, after which it decreased. …


Attributable Risk Function In The Proportional Hazards Model, Ying Qing Chen, Chengcheng Hu, Yan Wang May 2005

UW Biostatistics Working Paper Series

As an epidemiological parameter, the population attributable fraction is an important measure to quantify the public health attributable risk of an exposure to morbidity and mortality. In this article, we extend this parameter to the attributable fraction function in survival analysis of time-to-event outcomes, and further establish its estimation and inference procedures based on the widely used proportional hazards models. Numerical examples and simulation studies are presented to validate and demonstrate the proposed methods.


Insights Into Latent Class Analysis, Margaret S. Pepe, Holly Janes Jan 2005

UW Biostatistics Working Paper Series

Latent class analysis is a popular statistical technique for estimating disease prevalence and test sensitivity and specificity. It is used when a gold standard assessment of disease is not available but results of multiple imperfect tests are. We derive analytic expressions for the parameter estimates in terms of the raw data, under the conditional independence assumption. These expressions indicate explicitly how observed two- and three-way associations between test results are used to infer disease prevalence and test operating characteristics. Although reasonable if the conditional independence model holds, the estimators have no basis when it fails. We therefore caution against using …


Standardizing Markers To Evaluate And Compare Their Performances, Margaret S. Pepe, Gary M. Longton Jan 2005

UW Biostatistics Working Paper Series

Introduction: Markers that purport to distinguish subjects with a condition from those without a condition must be evaluated rigorously for their classification accuracy. A single approach to statistically evaluating and comparing markers is not yet established.

Methods: We suggest a standardization that uses the marker distribution in unaffected subjects as a reference. For an affected subject with marker value Y, the standardized placement value is the proportion of unaffected subjects with marker values that exceed Y.

Results: We apply the standardization to two illustrative datasets. In patients with pancreatic cancer, placement values calculated for the CA 19-9 marker are smaller …
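The placement value defined in the Methods paragraph of this entry can be sketched in a few lines. This is only an illustration of that one sentence (the proportion of unaffected subjects whose marker values exceed Y); the function name and the toy marker values are hypothetical, and the handling of ties (strict inequality) is an assumption.

```python
def placement_values(affected, unaffected):
    """Standardize each affected subject's marker value Y as the
    proportion of unaffected subjects whose marker value exceeds Y."""
    n = len(unaffected)
    return [sum(1 for x in unaffected if x > y) / n for y in affected]


# Hypothetical marker values, for illustration only.
reference = [1.0, 2.0, 3.0, 4.0]   # unaffected subjects
cases = [2.5, 5.0, 0.0]            # affected subjects
pv = placement_values(cases, reference)
```

Under this definition, small placement values mean the affected subject's marker sits above most of the unaffected reference distribution.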


Combining Predictors For Classification Using The Area Under The ROC Curve, Margaret S. Pepe, Tianxi Cai, Zheng Zhang, Gary M. Longton Jan 2005

UW Biostatistics Working Paper Series

No single biomarker for cancer is considered adequately sensitive and specific for cancer screening. It is expected that the results of multiple markers will need to be combined in order to yield adequately accurate classification. Typically the objective function that is optimized for combining markers is the likelihood function. In this paper we consider an alternative objective function -- the area under the empirical receiver operating characteristic curve (AUC). We note that it yields consistent estimates of parameters in a generalized linear model for the risk score but does not require specifying the link function. Like logistic regression it yields …
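The idea of using the empirical AUC as the objective function for combining markers can be illustrated with a rough sketch. The empirical AUC below is the usual proportion of case-control pairs in which the case scores higher (ties counted as one half). The two-marker score `y1 + beta * y2` and the grid search over `beta` are assumptions made for illustration, not the authors' algorithm, and all names and data are hypothetical.

```python
def empirical_auc(case_scores, control_scores):
    """Proportion of case-control pairs where the case scores higher;
    tied pairs count as one half."""
    wins = 0.0
    for c in case_scores:
        for d in control_scores:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def best_combination(cases, controls, betas):
    """Grid-search the beta that maximizes the empirical AUC of the
    combined score y1 + beta * y2 (an illustrative search strategy)."""
    def scores(rows, beta):
        return [y1 + beta * y2 for y1, y2 in rows]

    return max(
        betas,
        key=lambda b: empirical_auc(scores(cases, b), scores(controls, b)),
    )


# Hypothetical two-marker data: only the second marker is informative.
cases = [(0.0, 1.0), (0.0, 2.0)]
controls = [(0.0, 0.0), (0.0, -1.0)]
beta_hat = best_combination(cases, controls, betas=[-1.0, 0.0, 1.0])
```

Because the AUC depends only on the ranking of scores, the search never needs a link function, which mirrors the point made in the abstract.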


Referent Selection Strategies In Case-Crossover Analyses Of Air Pollution Exposure Data: Implications For Bias, Holly Janes, Lianne Sheppard, Thomas Lumley Dec 2004

UW Biostatistics Working Paper Series

The case-crossover design has been widely used to study the association between short-term air pollution exposure and the risk of an acute adverse health event. The design uses cases only, and, for each individual, compares exposure just prior to the event with exposure at other control, or “referent,” times. By making within-subject comparisons, time-invariant confounders are controlled by design. Even more important in the air pollution setting is that, by matching referents to the index time, time-varying confounders can also be controlled by design. Yet, the referent selection strategy is important for reasons other than control of …


Semi-Parametric Single-Index Two-Part Regression Models, Xiao-Hua Zhou, Hua Liang Dec 2004

UW Biostatistics Working Paper Series

In this paper, we proposed a semi-parametric single-index two-part regression model to weaken assumptions in parametric regression methods that were frequently used in the analysis of skewed data with additional zero values. The estimation procedure for the parameters of interest in the model was easily implemented. The proposed estimators were shown to be consistent and asymptotically normal. Through a simulation study, we showed that the proposed estimators have reasonable finite-sample performance. We illustrated the application of the proposed method in one real study on the analysis of health care costs.


Estimating The Retransformed Mean In A Heteroscedastic Two-Part Model, Alan H. Welsh, Xiao-Hua Zhou Sep 2004

UW Biostatistics Working Paper Series

Two distribution free estimators are proposed to estimate the mean of a dependent variable after fitting a semiparametric two-part heteroscedastic regression model to a transformation of the dependent variable. We show that the proposed estimators are consistent and have asymptotic normal distributions. We also compare their finite-sample performance in a simulation study. Finally, we illustrate the proposed methods in a real-world example of predicting in-patient health care costs.


A Marginal Model Approach For Analysis Of Multi-Reader Multi-Test Receiver Operating Characteristic (ROC) Data, Xiao Song, Xiao-Hua Zhou Sep 2004

UW Biostatistics Working Paper Series

The receiver operating characteristic (ROC) curve is a popular tool to characterize the capabilities of diagnostic tests with continuous or ordinal responses. One common design for assessing the accuracy of diagnostic tests is to have each patient examined by multiple readers with multiple tests; this design is most commonly used in a radiology setting, where the results of diagnostic tests depend on a radiologist's subjective interpretation. The most widely used approach for analyzing data from such a study is the Dorfman-Berbaum-Metz (DBM) method (Dorfman, Berbaum and Metz, 1992) which utilizes a standard analysis of variance (ANOVA) model for the jackknife …


Nonparametric Confidence Intervals For The One- And Two-Sample Problems, Xiao-Hua Zhou, Phillip Dinh Sep 2004

UW Biostatistics Working Paper Series

Confidence intervals for the mean of one sample and the difference in means of two independent samples based on the ordinary-t statistic suffer from deficiencies when samples come from skewed distributions. In this article, we evaluate several existing techniques and propose new methods to improve coverage accuracy. The methods examined include the ordinary-t, the bootstrap-t, the bias-corrected and accelerated (BCa) bootstrap, and three new intervals based on transformation of the t-statistic. Our study shows that our new transformation intervals and the bootstrap-t intervals give the best coverage accuracy for a variety of skewed distributions; and that our new transformation intervals have shorter interval …
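Of the methods listed in this abstract, the one-sample bootstrap-t interval is easy to sketch. The version below is a generic textbook form, not the authors' implementation: the function name, the number of resamples, the simple index-based quantiles, and the toy data are all assumptions. The paper's new transformation intervals are not shown.

```python
import math
import random
import statistics


def bootstrap_t_interval(sample, alpha=0.05, n_boot=2000, seed=0):
    """Bootstrap-t confidence interval for the mean of one sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)

    t_stats = []
    while len(t_stats) < n_boot:
        resample = rng.choices(sample, k=n)
        s = statistics.stdev(resample)
        if s == 0:
            continue  # skip degenerate resamples with no variation
        # Studentize each resample mean around the original sample mean.
        t_stats.append((statistics.fmean(resample) - mean) / (s / math.sqrt(n)))

    t_stats.sort()
    t_lo = t_stats[int((alpha / 2) * n_boot)]
    t_hi = t_stats[int((1 - alpha / 2) * n_boot) - 1]
    # The upper t quantile gives the lower limit, and vice versa.
    return mean - t_hi * se, mean - t_lo * se


# Hypothetical right-skewed sample, for illustration only.
data = [1, 1, 2, 2, 3, 3, 4, 5, 8, 13, 21, 2, 1, 3, 2, 4, 1, 6, 2, 3]
lower, upper = bootstrap_t_interval(data)
```

For right-skewed data the resulting interval is typically asymmetric around the sample mean, unlike the ordinary-t interval.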