Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

2003

Statistics and Probability

Articles 1 - 30 of 248

Full-Text Articles in Physical Sciences and Mathematics

Robust Likelihood-Based Analysis Of Multivariate Data With Missing Values, Rod Little, An Hyonggin Dec 2003

The University of Michigan Department of Biostatistics Working Paper Series

The model-based approach to inference from multivariate data with missing values is reviewed. Regression prediction is most useful when the covariates are predictive of the missing values and the probability of being missing, and in these circumstances predictions are particularly sensitive to model misspecification. The use of penalized splines of the propensity score is proposed to yield robust model-based inference under the missing at random (MAR) assumption, assuming monotone missing data. Simulation comparisons with other methods suggest that the method works well in a wide range of populations, with little loss of efficiency relative to parametric models when the latter …


Uncertainty And The Value Of Diagnostic Information With Application To Axillary Lymph Node Dissection In Breast Cancer, Giovanni Parmigiani Dec 2003

Johns Hopkins University, Dept. of Biostatistics Working Papers

In clinical decision making, it is common to ask whether, and how much, a diagnostic procedure is contributing to subsequent treatment decisions. Statistically, quantification of the value of the information provided by a diagnostic procedure can be carried out using decision trees with multiple decision points, representing both the diagnostic test and the subsequent treatments that may depend on the test's results. This article investigates probabilistic sensitivity analysis approaches for exploring and communicating parameter uncertainty in such decision trees. Complexities arise because uncertainty about a model's inputs determines uncertainty about optimal decisions at all decision nodes of a tree. We …


Marginalized Transition Models For Longitudinal Binary Data With Ignorable And Nonignorable Dropout, Brenda F. Kurland, Patrick J. Heagerty Dec 2003

UW Biostatistics Working Paper Series

We extend the marginalized transition model of Heagerty (2002) to accommodate nonignorable monotone dropout. Using a selection model, weakly identified dropout parameters are held constant and their effects evaluated through sensitivity analysis. For data missing at random (MAR), efficiency of inverse probability of censoring weighted generalized estimating equations (IPCW-GEE) is as low as 40% compared to a likelihood-based marginalized transition model (MTM) with comparable modeling burden. MTM and IPCW-GEE regression parameters both display misspecification bias for MAR and nonignorable missing data, and both reduce bias noticeably by improving model fit.


Marginal Modeling Of Multilevel Binary Data With Time-Varying Covariates, Diana Miglioretti, Patrick Heagerty Dec 2003

UW Biostatistics Working Paper Series

We propose and compare two approaches for regression analysis of multilevel binary data when clusters are not necessarily nested: a GEE method that relies on a working independence assumption coupled with a three-step method for obtaining empirical standard errors; and a likelihood-based method implemented using Bayesian computational techniques. Implications of time-varying endogenous covariates are addressed. The methods are illustrated using data from the Breast Cancer Surveillance Consortium to estimate mammography accuracy from a repeatedly screened population.


Survival Model Predictive Accuracy And ROC Curves, Patrick Heagerty, Yingye Zheng Dec 2003

UW Biostatistics Working Paper Series

The predictive accuracy of a survival model can be summarized using extensions of the proportion of variation explained by the model, or R^2, commonly used for continuous response models, or using extensions of sensitivity and specificity which are commonly used for binary response models.

In this manuscript we propose new time-dependent accuracy summaries based on time-specific versions of sensitivity and specificity calculated over risk sets. We connect the accuracy summaries to a previously proposed global concordance measure which is a variant of Kendall's tau. In addition, we show how standard Cox regression output can be used to obtain estimates of …


Partly Conditional Survival Models For Longitudinal Data, Yingye Zheng, Patrick Heagerty Dec 2003

UW Biostatistics Working Paper Series

It is common in longitudinal studies to collect information on the time until a key clinical event, such as death, and to measure markers of patient health at multiple follow-up times. One approach to the joint analysis of survival and repeated measures data adopts a time-varying covariate regression model for the event time hazard. Using this standard approach the instantaneous risk of death at time t is specified as a possibly semi-parametric function of covariate information that has accrued through time t. In this manuscript we decouple the time scale for modeling the hazard from the time scale for accrual …


Semiparametric Estimation Of Time-Dependent ROC Curves For Longitudinal Marker Data, Yingye Zheng, Patrick Heagerty Dec 2003

UW Biostatistics Working Paper Series

One approach to evaluating the strength of association between a longitudinal marker process and a key clinical event time is through predictive regression methods such as a time-dependent covariate hazard model. For example, a time-varying covariate Cox model specifies the instantaneous risk of the event as a function of the time-varying marker and additional covariates. In this manuscript we explore a second complementary approach which characterizes the distribution of the marker as a function of both the measurement time and the ultimate event time. Our goal is to flexibly extend the standard diagnostic accuracy concepts of sensitivity and specificity to …


Comparison Of The Inverse Probability Of Treatment Weighted (IPTW) Estimator With A Naïve Estimator In The Analysis Of Longitudinal Data With Time-Dependent Confounding: A Simulation Study, Thaddeus Haight, Romain Neugebauer, Ira B. Tager, Mark J. Van Der Laan Dec 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

A simulation study was conducted to compare estimates from a naïve estimator, using standard conditional regression, and an IPTW (Inverse Probability of Treatment Weighted) estimator, to true causal parameters for a given MSM (Marginal Structural Model). The study was extracted from a larger epidemiological study (Longitudinal Study of Effects of Physical Activity and Body Composition on Functional Limitation in the Elderly, by Tager et al. [accepted, Epidemiology, September 2003]), which examined the causal effects of physical activity and body composition on functional limitation. The simulation emulated the larger study in terms of the exposure and outcome variables of interest-- physical …


Multiple Testing. Part II. Step-Down Procedures For Control Of The Family-Wise Error Rate, Mark J. Van Der Laan, Sandrine Dudoit, Katherine S. Pollard Dec 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

The present article proposes two step-down multiple testing procedures for asymptotic control of the family-wise error rate (FWER): the first procedure is based on maxima of test statistics (step-down maxT), while the second relies on minima of unadjusted p-values (step-down minP). A key feature of our approach is the test statistics null distribution (rather than data generating null distribution) used to derive cut-offs (i.e., rejection regions) for these test statistics and the resulting adjusted p-values. For general null hypotheses, corresponding to submodels for the data generating distribution, we identify an asymptotic domination condition for a null distribution under which the …


Multiple Testing. Part I. Single-Step Procedures For Control Of General Type I Error Rates, Sandrine Dudoit, Mark J. Van Der Laan, Katherine S. Pollard Dec 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

The present article proposes general single-step multiple testing procedures for controlling Type I error rates defined as arbitrary parameters of the distribution of the number of Type I errors, such as the generalized family-wise error rate. A key feature of our approach is the test statistics null distribution (rather than data generating null distribution) used to derive cut-offs (i.e., rejection regions) for these test statistics and the resulting adjusted p-values. For general null hypotheses, corresponding to submodels for the data generating distribution, we identify an asymptotic domination condition for a null distribution under which single-step common-quantile and common-cut-off procedures asymptotically …


Loss-Based Estimation With Cross-Validation: Applications To Microarray Data Analysis And Motif Finding, Sandrine Dudoit, Mark J. Van Der Laan, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, Siew Leng Teng Dec 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions, with typically unknown and intricate correlation patterns among variables. Addressing these inference questions satisfactorily requires: (i) an intensive and thorough search of the parameter space to generate good candidate estimators, (ii) an approach for selecting an optimal estimator among these candidates, and (iii) a method for reliably assessing the performance of the resulting estimator. We propose a unified loss-based methodology for estimator construction, selection, and performance assessment with cross-validation. In this approach, the parameter of interest is defined as the risk minimizer for a suitable …


Kernel Estimation Of Rate Function For Recurrent Event Data, Chin-Tsang Chiang, Mei-Cheng Wang, Chiung-Yu Huang Dec 2003

Johns Hopkins University, Dept. of Biostatistics Working Papers

Recurrent event data are largely characterized by the rate function, but smoothing techniques for estimating the rate function have never been rigorously developed or studied in the statistical literature. This paper considers the moment and least squares methods for estimating the rate function from recurrent event data. With an independent censoring assumption on the recurrent event process, we study statistical properties of the proposed estimators and propose bootstrap procedures for the bandwidth selection and for the approximation of confidence intervals in the estimation of the occurrence rate function. It is identified that the moment method without resmoothing via a smaller bandwidth …


The Relation Of Dietary Patterns To Future Survival, Health, And Cardiovascular Events In Older Adults, Paula Diehr Dec 2003

Paula Diehr

BACKGROUND: There have been few long-term follow-up studies of older adults who follow different dietary patterns. METHODS: We cluster-analyzed data on dietary fat, fiber, protein, carbohydrate, and calorie consumption from the U.S. Cardiovascular Health Study (mean age=73), and examined the relationship of the dietary clusters to outcomes 10 years later. RESULTS: The five clusters were named "Healthy diet" (relatively high in fiber and carbohydrate and low in fat), "Unhealthy diet" (relatively high in protein and fat, relatively low in carbohydrates and fiber), "High Calorie," "Low Calorie," and "Low 4," which was distinguished by higher alcohol consumption. The clusters were strongly …


Optimization Of Breast Cancer Screening Modalities, Yu Shen, Giovanni Parmigiani Dec 2003

Johns Hopkins University, Dept. of Biostatistics Working Papers

Mathematical models and decision analyses based on microsimulations have been shown to be useful in evaluating relative merits of various screening strategies in terms of cost and mortality reduction. Most investigations regarding the balance between mortality reduction and costs have focused on a single modality, mammography. A systematic evaluation of the relative expenses and projected benefit of combining clinical breast examination and mammography is not at present available. The purpose of this report is to provide methodologic details including assumptions and data used in the process of modeling for complex decision analyses, when searching for optimal breast cancer screening strategies …


Modeling The Incubation Period Of Anthrax, Ron Brookmeyer, Elizabeth Johnson, Sarah Barry Dec 2003

Johns Hopkins University, Dept. of Biostatistics Working Papers

Models of the incubation period of anthrax are important to public health planners because they can be used to predict the delay before outbreaks are detected, the size of an outbreak and the duration of time that persons should remain on antibiotics to prevent disease. The difficulty is that there is little direct data about the incubation period in humans. The objective of this paper is to develop and apply models for the incubation period of anthrax. Mechanistic models that account for the biology of spore clearance and germination are developed based on a competing risks formulation. The models predict …


Global Solutions To The Lake Equations With Isolated Vortex Regions, Chaocheng Huang Dec 2003

Mathematics and Statistics Faculty Publications

The vorticity formulation for the lake equations in R^2 is studied.


Determination Of Spatial Strata For Environmental Regulatory Purposes, John Edward Daniels Dec 2003

Dissertations

This dissertation introduces spatial strata modelling, a methodology that combines spatial statistics, cluster analysis, and geographic information system theories to analyze the background level of naturally occurring contaminants of concern (COCs). The objective of spatial strata modelling is to divide a geographic area of interest into mutually exclusive geographic zones (spatial strata), with each stratum representing a different level of COC concentration. An estimate of each stratum's COC concentration level, representing an upper regulatory limit, will also be provided. Data provided by the Michigan Department of Environmental Quality describing the spatial location and arsenic concentrations of 211 Michigan sites (arsenic …


A Robust Two-Sample Procedure To Estimate A Shift Parameter, Feridun Tasdan Dec 2003

Dissertations

This study estimates the location shift parameter in the two-sample problem. The classical method, least squares (LS), obtains the shift parameter estimate under the normality assumption. A departure from the normality assumption makes the estimate inefficient and unreliable. One alternative to the least squares estimate is the Hodges-Lehmann (HL) estimate, which uses Wilcoxon ranks to estimate the shift parameter. This estimate is robust against contamination and large outliers. The proposed method in this study combines two samples and uses a convolution technique to find a density function for the combined sample. This new density function is later used in the construction of the log …
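
As an illustration of the two baseline estimators named in this abstract (not the dissertation's proposed convolution-based method), the sketch below contrasts the LS shift estimate, taken here as the difference in sample means, with the HL estimate, taken as the median of all pairwise differences between the two samples. The function names and the toy data are assumptions made for this example only.

```python
from statistics import median


def ls_shift(x, y):
    """Least squares shift estimate: difference of the two sample means."""
    return sum(y) / len(y) - sum(x) / len(x)


def hodges_lehmann_shift(x, y):
    """Hodges-Lehmann shift estimate: median of all pairwise differences y_j - x_i."""
    diffs = [yj - xi for xi in x for yj in y]
    return median(diffs)


# Toy data (hypothetical): y is x shifted by 2, with one gross outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 4.0, 5.0, 6.0, 107.0]

ls_estimate = ls_shift(x, y)              # 22.0, pulled far from 2 by the outlier
hl_estimate = hodges_lehmann_shift(x, y)  # 2.0, unaffected by the single outlier
```

On these toy samples a single contaminated observation moves the LS estimate from 2 to 22, while the HL estimate stays at 2, which is the robustness property the abstract refers to.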


Comparison Of Bracket Bond Strength By Total, Self-Etch And Laser Treatment, Kyo Sung Shawn Kim Dec 2003

Loma Linda University Electronic Theses, Dissertations & Projects

Laser etching of enamel surfaces alters the physical and chemical characteristics of the enamel. These changes in characteristics enhance the bonding to enamel. The purpose of this study was to compare the shear and tensile bond strength of the Er,Cr:YSGG hydrokinetic laser system (Biolase) with 37% phosphoric acid and self-etching primer 30 minutes and 72 hours after bonding. Four different laser power output settings were observed: 1.5W, 2.0W, 2.5W, and 3.0W. Two hundred forty bovine teeth free of defect, caries, and dentin exposure were mounted in acrylic resin and divided into 24 groups of 10 teeth. Sixteen groups of 10 …


Clinical Effectiveness Of A Subperiosteal Anchorage Device, Monica Anne Witte Dec 2003

Loma Linda University Electronic Theses, Dissertations & Projects

The purpose of this pilot study was to evaluate the clinical efficacy of a subperiosteal anchorage device, the palatal OnPlant™, during orthodontic retraction of protruding anterior teeth in cases requiring maxillary premolar extraction. Seven subjects (5 female, 2 male), ages 13 to 55, were selected for the study. The OnPlant was surgically placed in the mid-palatal region through a well-defined subperiosteal tunnel. Following the manufacturer recommended osseointegration period of four months, the OnPlant was uncovered and attached to the first molars by means of a transpalatal bar. Standard orthodontic treatment then commenced to retract the anterior teeth after the first …


A Bond Strength Comparison Of LED And Halogen Light Curing Units, John Richard Kavanagh Dec 2003

Loma Linda University Electronic Theses, Dissertations & Projects

The purpose of this research was to compare the tensile bond strength of orthodontic flat based buttons bonded to bovine teeth with four commercial light-emitting diode (LED) curing lights and one conventional quartz tungsten halogen (QTH) curing light.

The dental market has recently been introduced to a number of commercially available light-emitting diode (LED) curing lights. Tensile bond strength was evaluated for LED curing lights (Rembrandt® Allegro™, Den-Mat Corp, Santa Maria, CA), (LEDemetron, Kerr/Demetron Corp, Danbury, Conn), (Ortholux LED, 3M™ ESPE™, St. Paul, MN), and (FLASH-lite 1001, Discus Dental, Culver City, CA) are compared with one …


Unified Cross-Validation Methodology For Selection Among Estimators And A General Cross-Validated Adaptive Epsilon-Net Estimator: Finite Sample Oracle Inequalities And Examples, Mark J. Van Der Laan, Sandrine Dudoit Nov 2003

U.C. Berkeley Division of Biostatistics Working Paper Series

In Part I of this article we propose a general cross-validation criterion for selecting among a collection of estimators of a particular parameter of interest based on n i.i.d. observations. It is assumed that the parameter of interest minimizes the expectation (w.r.t. the distribution of the observed data structure) of a particular loss function of a candidate parameter value and the observed data structure, possibly indexed by a nuisance parameter. The proposed cross-validation criterion is defined as the empirical mean over the validation sample of the loss function at the parameter estimate based on the training sample, averaged over …
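
A minimal sketch of a criterion of this kind is given below, under assumptions the abstract does not state: the splits are taken to be V-fold, the loss is squared error, and the parameter is a population mean. For each split, a candidate estimator is fit on the training sample, its loss is averaged over the validation sample, and these validation means are averaged over splits; the candidate with the smallest cross-validated risk is selected. All names and the toy data are hypothetical.

```python
from statistics import mean, median


def cv_risk(data, estimator, loss, n_folds=5):
    """Average over folds of the mean validation loss of the training-sample fit."""
    folds = [data[v::n_folds] for v in range(n_folds)]
    fold_risks = []
    for v, validation in enumerate(folds):
        training = [obs for k, fold in enumerate(folds) if k != v for obs in fold]
        theta = estimator(training)  # estimate from the training sample only
        fold_risks.append(mean(loss(theta, obs) for obs in validation))
    return mean(fold_risks)


def select_by_cv(data, candidates, loss, n_folds=5):
    """Return the name of the candidate with the smallest cross-validated risk."""
    risks = {name: cv_risk(data, est, loss, n_folds) for name, est in candidates.items()}
    return min(risks, key=risks.get), risks


def squared_error(theta, obs):
    return (theta - obs) ** 2


# Toy data (hypothetical): values spread symmetrically around 10.
data = [10 + ((i * 7) % 11 - 5) for i in range(55)]
candidates = {
    "sample mean": mean,
    "sample median": median,
    "constant zero": lambda training: 0.0,  # deliberately poor candidate
}
best, risks = select_by_cv(data, candidates, squared_error)
```

The deliberately poor "constant zero" candidate receives a much larger cross-validated risk than the data-driven candidates, so it is never the one selected here.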


Weighting Adjustments For Unit Nonresponse With Multiple Outcome Variables, Sonya L. Vartivarian, Rod Little Nov 2003

The University of Michigan Department of Biostatistics Working Paper Series

Weighting is a common form of unit nonresponse adjustment in sample surveys where entire questionnaires are missing due to noncontact or refusal to participate. Weights are inversely proportional to the probability of selection and response. A common approach computes the response weight adjustment cells based on covariate information. When the number of cells thus created is too large, a coarsening method such as response propensity stratification can be applied to reduce the number of adjustment cells. Simulations in Vartivarian and Little (2002) indicate improved efficiency and robustness of weighting adjustments based on the joint classification of the sample by two …
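
The adjustment-cell weighting described here can be sketched as follows. This is an illustrative assumption rather than the authors' procedure: the response probability in each cell is estimated by the design-weighted response rate, and each respondent's weight is the inverse of the selection probability times the estimated response probability. The field names `cell`, `p_select`, and `responded` are hypothetical.

```python
from collections import defaultdict


def nonresponse_weights(units):
    """Respondent weights 1 / (p_select * estimated response rate in the unit's cell)."""
    sampled_total = defaultdict(float)    # sum of base weights, all sampled units
    responded_total = defaultdict(float)  # sum of base weights, respondents only
    for u in units:
        base = 1.0 / u["p_select"]
        sampled_total[u["cell"]] += base
        if u["responded"]:
            responded_total[u["cell"]] += base

    weights = []
    for u in units:
        if not u["responded"]:
            weights.append(0.0)  # nonrespondents carry no weight
            continue
        rate = responded_total[u["cell"]] / sampled_total[u["cell"]]
        weights.append((1.0 / u["p_select"]) / rate)
    return weights


# Toy sample (hypothetical): two adjustment cells with 50% response in each.
units = [
    {"cell": "A", "p_select": 0.5, "responded": True},
    {"cell": "A", "p_select": 0.5, "responded": True},
    {"cell": "A", "p_select": 0.5, "responded": False},
    {"cell": "A", "p_select": 0.5, "responded": False},
    {"cell": "B", "p_select": 0.25, "responded": True},
    {"cell": "B", "p_select": 0.25, "responded": False},
]
weights = nonresponse_weights(units)
```

Within each cell the respondents' adjusted weights sum to the total base weight of all sampled units in that cell, so respondents stand in for the nonrespondents who share their covariate profile.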


Underestimation Of Standard Errors In Multi-Site Time Series Studies, Michael Daniels, Francesca Dominici, Scott L. Zeger Nov 2003

Johns Hopkins University, Dept. of Biostatistics Working Papers

Multi-site time series studies of air pollution and mortality and morbidity have figured prominently in the literature as comprehensive approaches for estimating acute effects of air pollution on health. Hierarchical models are generally used to combine site-specific information and estimate pooled air pollution effects, taking into account both within-site statistical uncertainty and across-site heterogeneity.

Within a site, characteristics of time series data of air pollution and health (small pollution effects, missing data, highly correlated predictors, nonlinear confounding, etc.) make modelling all sources of uncertainty challenging. One potential consequence is underestimation of the statistical variance of the site-specific effects to …


Estimating Predictors For Long- Or Short-Term Survivors, Lu Tian, Wei Wang, L. J. Wei Nov 2003

Harvard University Biostatistics Working Paper Series

No abstract provided.


Time-Series Studies Of Particulate Matter, Michelle L. Bell, Jonathan M. Samet, Francesca Dominici Nov 2003

Johns Hopkins University, Dept. of Biostatistics Working Papers

Studies of air pollution and human health have evolved from descriptive studies of the early phenomena of large increases in adverse health effects following extreme air pollution episodes, to time-series analyses and the development of sophisticated regression models. In fact, advanced statistical methods are necessary to address the many challenges inherent in the detection of a small pollution risk in the presence of many confounders. This paper reviews the history, methods, and findings of the time-series studies estimating health risks associated with short-term exposure to particulate matter, though much of the discussion is applicable to epidemiological studies of air pollution …


Smooth Quantile Ratio Estimation With Regression: Estimating Medical Expenditures For Smoking Attributable Diseases, Francesca Dominici, Scott L. Zeger Nov 2003

Johns Hopkins University, Dept. of Biostatistics Working Papers

In this paper we introduce a semi-parametric regression model for estimating the difference in the expected value of two positive and highly skewed random variables as a function of covariates. Our method extends Smooth Quantile Ratio Estimation (SQUARE), a novel estimator of the mean difference of two positive random variables, to a regression model.

The methodological development of this paper is motivated by a common problem in econometrics where we are interested in estimating the difference in the average expenditures between two populations, say with and without a disease, taking covariates into account. Let Y1 and Y2 be two positive …


A Corrected Pseudo-Score Approach For Additive Hazards Model With Longitudinal Covariates Measured With Error, Xiao Song, Yijian Huang Nov 2003

UW Biostatistics Working Paper Series

In medical studies, it is often of interest to characterize the relationship between a time-to-event and covariates, not only time-independent but also time-dependent. Time-dependent covariates are generally measured intermittently and with error. Recent interest has focused on the proportional hazards framework, with longitudinal data jointly modeled through a mixed effects model. However, approaches under this framework depend on the normality assumption of the error, and might encounter intractable numerical difficulties in practice. This motivates us to consider an alternative framework, that is, the additive hazards model, under which little has been done when time-dependent covariates are measured with error. We propose …


A Nonparametric Comparison Of Conditional Distributions With Nonnegligible Cure Fractions, Yi Li, Jin Feng Nov 2003

Harvard University Biostatistics Working Paper Series

No abstract provided.


Survival Analysis With Heterogeneous Covariate Measurement Error, Yi Li, Louise Ryan Nov 2003

Harvard University Biostatistics Working Paper Series

No abstract provided.