Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistical Theory

COBRA

2008


Articles 1 - 30 of 32

Full-Text Articles in Physical Sciences and Mathematics

A Small Sample Correction For Estimating Attributable Risk In Case-Control Studies, Daniel B. Rubin Dec 2008


U.C. Berkeley Division of Biostatistics Working Paper Series

The attributable risk, often called the population attributable risk, is in many epidemiological contexts a more relevant measure of exposure-disease association than the excess risk, relative risk, or odds ratio. For estimating attributable risk with case-control data and a rare disease, we present a simple correction to the standard approach that makes it essentially unbiased and also less noisy. As with the analogous corrections given in Jewell (1986) for other measures of association, the adjustment often won't make a substantial difference unless the sample size is very small or point estimates are desired within fine strata, but we discuss the possible utility …
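The standard (uncorrected) estimator that the correction above adjusts can be illustrated with Miettinen's formula, where the odds ratio stands in for the relative risk under the rare-disease assumption. This is a minimal sketch with invented 2×2 counts, not data from the paper:

```python
# Population attributable risk from case-control data, using the
# standard (uncorrected) estimator: AR = p1 * (OR - 1) / OR, where
# p1 is the exposure prevalence among cases and OR approximates the
# relative risk for a rare disease. Counts below are illustrative.

def attributable_risk(a, b, c, d):
    """a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    odds_ratio = (a * d) / (b * c)
    p1 = a / (a + b)  # exposure prevalence among cases
    return p1 * (odds_ratio - 1.0) / odds_ratio

ar = attributable_risk(a=40, b=60, c=20, d=80)
print(round(ar, 3))  # -> 0.25
```

The small-sample correction discussed in the paper would modify this plug-in estimate; the sketch shows only the standard approach it starts from.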


The Highest Confidence Density Region And Its Usage For Inferences About The Survival Function With Censored Data, Lu Tian, Rui Wang, Tianxi Cai, L. J. Wei Nov 2008


Harvard University Biostatistics Working Paper Series

No abstract provided.


Change-Point Problem And Regression: An Annotated Bibliography, Ahmad Khodadadi, Masoud Asgharian Nov 2008


COBRA Preprint Series

The problems of identifying changes at unknown times and of estimating the location of changes in stochastic processes are referred to as "the change-point problem" or, in the Eastern literature, as "disorder".

The change-point problem, first introduced in the quality control context, has since developed into a fundamental problem in the areas of statistical control theory, stationarity of a stochastic process, estimation of the current position of a time series, testing and estimation of change in the patterns of a regression model, and most recently in the comparison and matching of DNA sequences in microarray data analysis.

Numerous methodological approaches …


The Strength Of Statistical Evidence For Composite Hypotheses With An Application To Multiple Comparisons, David R. Bickel Nov 2008


COBRA Preprint Series

The strength of the statistical evidence in a sample of data that favors one composite hypothesis over another may be quantified by the likelihood ratio using the parameter value consistent with each hypothesis that maximizes the likelihood function. Unlike the p-value and the Bayes factor, this measure of evidence is coherent in the sense that it cannot support a hypothesis over any hypothesis that it entails. Further, when comparing the hypothesis that the parameter lies outside a non-trivial interval to the hypotheses that it lies within the interval, the proposed measure of evidence almost always asymptotically favors the correct hypothesis …
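The measure of evidence described above can be sketched with a toy binomial example: the likelihood is maximized separately within each composite hypothesis, and their ratio quantifies the evidence. The data and hypothesis split below are illustrative, not from the paper:

```python
# Evidence for one composite hypothesis over another: the ratio of
# likelihoods maximized within each hypothesis (toy binomial case).
from math import comb

def binom_lik(p, x, n):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def max_lik(x, n, lo, hi):
    # The binomial likelihood is unimodal with mode at x/n, so the
    # constrained maximizer is x/n clipped to [lo, hi].
    p_hat = min(max(x / n, lo), hi)
    return binom_lik(p_hat, x, n)

# 14 successes in 20 trials; compare H2: p > 0.5 against H1: p <= 0.5.
x, n = 14, 20
evidence = max_lik(x, n, 0.5, 1.0) / max_lik(x, n, 0.0, 0.5)
print(round(evidence, 2))
```

An evidence value above 1 favors the hypothesis in the numerator; unlike a p-value, this ratio cannot favor a hypothesis over one that it entails.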


Calibrating Parametric Subject-Specific Risk Estimation, Tianxi Cai, Lu Tian, Hajime Uno, Scott D. Solomon, L. J. Wei Oct 2008


Harvard University Biostatistics Working Paper Series

No abstract provided.


Evaluating Subject-Level Incremental Values Of New Markers For Risk Classification Rule, Tianxi Cai, Lu Tian, Donald M. Lloyd-Jones, L. J. Wei Oct 2008


Harvard University Biostatistics Working Paper Series

No abstract provided.


Generalized Multilevel Functional Regression, Ciprian M. Crainiceanu, Ana-Maria Staicu, Chongzhi Di Sep 2008


Johns Hopkins University, Dept. of Biostatistics Working Papers

We introduce Generalized Multilevel Functional Linear Models (GMFLM), a novel statistical framework motivated by and applied to the Sleep Heart Health Study (SHHS), the largest community cohort study of sleep. The primary goal of SHHS is to study the association between sleep-disordered breathing (SDB) and adverse health effects. An exposure of primary interest is the sleep electroencephalogram (EEG), which was observed for thousands of individuals at two visits, roughly 5 years apart. This unique study design led to the development of models where the outcome, e.g. hypertension, is in an exponential family and the exposure, e.g. sleep EEG, is …


Measurement Error Caused By Spatial Misalignment In Environmental Epidemiology, Alexandros Gryparis, Christopher J. Paciorek, Ariana Zeka, Joel Schwartz, Brent A. Coull Sep 2008


Harvard University Biostatistics Working Paper Series

No abstract provided.


Practical Large-Scale Spatio-Temporal Modeling Of Particulate Matter Concentrations, Christopher J. Paciorek, Jeff D. Yanosky, Robin C. Puett, Francine Laden, Helen H. Suh Sep 2008


Harvard University Biostatistics Working Paper Series

The last two decades have seen intense scientific and regulatory interest in the health effects of particulate matter (PM). Influential epidemiological studies that characterize chronic exposure of individuals rely on monitoring data that are sparse in space and time, so they often assign the same exposure to participants in large geographic areas and across time. We estimate monthly PM during 1988-2002 in a large spatial domain for use in studying health effects in the Nurses' Health Study. We develop a conceptually simple spatio-temporal model that uses a rich set of covariates. The model is used to estimate concentrations of PM10 …


Confidence Intervals For Negative Binomial Random Variables Of High Dispersion, David Shilane, Alan E. Hubbard, S N. Evans Aug 2008


U.C. Berkeley Division of Biostatistics Working Paper Series

This paper considers the problem of constructing confidence intervals for the mean of a Negative Binomial random variable based upon sampled data. When the sample size is large, we traditionally rely upon a Normal distribution approximation to construct these intervals. However, we demonstrate that the sample mean of highly dispersed Negative Binomials exhibits a slow convergence to the Normal in distribution as a function of the sample size. As a result, standard techniques (such as the Normal approximation and bootstrap) that construct confidence intervals for the mean will typically be too narrow and significantly undercover in the case of high …
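The undercoverage described above is easy to demonstrate by simulation. The sketch below (illustrative parameter choices, not from the paper) draws small samples from a highly dispersed Negative Binomial and checks how often a nominal 95% Wald interval actually covers the true mean:

```python
# Coverage simulation: Wald (Normal-approximation) intervals for the
# mean of a highly dispersed Negative Binomial undercover the nominal
# 95% level at moderate sample sizes. Parameters are illustrative.
import math
import random

def neg_binom(r, p, rng):
    # Failures before the r-th success: sum of r geometric draws,
    # each generated by inversion from a uniform variate.
    return sum(int(math.log(1.0 - rng.random()) / math.log(1.0 - p))
               for _ in range(r))

def wald_covers(r, p, n, rng):
    true_mean = r * (1 - p) / p
    xs = [neg_binom(r, p, rng) for _ in range(n)]
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / (n - 1)
    half = 1.96 * math.sqrt(s2 / n)
    return m - half <= true_mean <= m + half

rng = random.Random(0)
sims = 2000
coverage = sum(wald_covers(r=1, p=0.05, n=10, rng=rng)
               for _ in range(sims)) / sims
print(round(coverage, 3))  # noticeably below the nominal 0.95
```

With r = 1 and p = 0.05 the variance is far larger than the mean, so the sample mean converges to Normality slowly and the Wald interval is too narrow.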


A New Method For Constructing Exact Tests Without Making Any Assumptions, Karl H. Schlag Aug 2008


COBRA Preprint Series

We present a new method for constructing exact distribution-free tests (and confidence intervals) for variables that can generate more than two possible outcomes. This method separates the search for an exact test from the goal of creating a nonrandomized test. Randomization is used to extend any exact test relating to means of variables with finitely many outcomes to variables with outcomes belonging to a given bounded set. Tests in terms of variance and covariance are reduced to tests relating to means. Randomness is then eliminated in a separate step. This method is used to create confidence intervals for the …


Trading Bias For Precision: Decision Theory For Intervals And Sets, Kenneth M. Rice, Thomas Lumley, Adam A. Szpiro Aug 2008


UW Biostatistics Working Paper Series

Interval- and set-valued decisions are an essential part of statistical inference. Despite this, the justification behind them is often unclear, leading in practice to a great deal of confusion about exactly what is being presented. In this paper we review and attempt to unify several competing methods of interval-construction, within a formal decision-theoretic framework. The result is a new emphasis on interval-estimation as a distinct goal, and not as an afterthought to point estimation. We also see that representing intervals as trade-offs between measures of precision and bias unifies many existing approaches -- as well as suggesting interpretable criteria to …


Fdr Controlling Procedure For Multi-Stage Analyses, Catherine Tuglus, Mark J. Van Der Laan Jul 2008


U.C. Berkeley Division of Biostatistics Working Paper Series

Multiple testing has become an integral component in genomic analyses involving microarray experiments, where large numbers of hypotheses are tested simultaneously. However, before applying more computationally intensive methods, it is often desirable to complete an initial truncation of the variable set using a simpler and faster supervised method such as univariate regression. Once such a truncation is completed, multiple testing methods applied to any subsequent analysis no longer control the appropriate Type I error rates. Here we propose a modified marginal Benjamini & Hochberg step-up FDR controlling procedure for multi-stage analyses (FDR-MSA), which correctly controls Type I error in terms …
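The base procedure that FDR-MSA modifies is the standard Benjamini-Hochberg step-up rule. A minimal sketch with illustrative p-values:

```python
# Standard Benjamini-Hochberg step-up procedure: find the largest k
# with p_(k) <= k*q/m and reject the k smallest p-values.
def benjamini_hochberg(pvals, q=0.05):
    """Return indices of hypotheses rejected at FDR level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / m:
            k = rank
    return sorted(order[:k])

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.9]
print(benjamini_hochberg(pvals, q=0.05))  # -> [0, 1]
```

The point of the multi-stage modification described above is that applying this rule only to hypotheses surviving an earlier screening step no longer controls the FDR at the stated level without adjustment.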


Bringing Game Theory To Hypothesis Testing: Establishing Finite Sample Bounds On Inference, Karl H. Schlag Jun 2008


COBRA Preprint Series

Small sample properties are of fundamental interest when only limited data is available. Exact inference is limited both by the constraints imposed by specific nonrandomized tests and, of course, by the lack of more data. These effects can be separated: we propose to evaluate a test by comparing its type II error to the minimal type II error among all tests for the given sample. Game theory is used to establish this minimal type II error, and the associated randomized test is characterized as part of a Nash equilibrium of a fictitious game against nature. We use this method to investigate sequential …


Estimation And Testing For The Effect Of A Genetic Pathway On A Disease Outcome Using Logistic Kernel Machine Regression Via Logistic Mixed Models, Dawei Liu, Debashis Ghosh, Xihong Lin Jun 2008


Harvard University Biostatistics Working Paper Series

No abstract provided.


A Powerful And Flexible Multilocus Association Test For Quantitative Traits, Lydia Coulter Kwee, Dawei Liu, Xihong Lin, Debashis Ghosh, Michael P. Epstein Jun 2008


Harvard University Biostatistics Working Paper Series

No abstract provided.


Nonparametric Regression Using Local Kernel Estimating Equations For Correlated Failure Time Data, Zhangsheng Yu, Xihong Lin Jun 2008


Harvard University Biostatistics Working Paper Series

No abstract provided.


A Comparison Of Methods For Estimating The Causal Effect Of A Treatment In Randomized Clinical Trials Subject To Noncompliance, Rod Little, Qi Long, Xihong Lin Jun 2008


Harvard University Biostatistics Working Paper Series

No abstract provided.


Semiparametric Maximum Likelihood Estimation In Normal Transformation Models For Bivariate Survival Data, Yi Li, Ross L. Prentice, Xihong Lin Jun 2008


Harvard University Biostatistics Working Paper Series

No abstract provided.


Accounting For Errors From Predicting Exposures In Environmental Epidemiology And Environmental Statistics, Adam A. Szpiro, Lianne Sheppard, Thomas Lumley Jun 2008


UW Biostatistics Working Paper Series

PLEASE NOTE THAT AN UPDATED VERSION OF THIS RESEARCH IS AVAILABLE AS WORKING PAPER 350 IN THE UNIVERSITY OF WASHINGTON BIOSTATISTICS WORKING PAPER SERIES (http://www.bepress.com/uwbiostat/paper350).

In environmental epidemiology and related problems in environmental statistics, it is typically not practical to directly measure the exposure for each subject. Environmental monitoring is employed with a statistical model to assign exposures to individuals. The result is a form of exposure misspecification that can result in complicated errors in the health effect estimates if the exposure is naively treated as known. The exposure error is neither “classical” nor “Berkson”, so standard regression calibration methods …


Supervised Distance Matrices: Theory And Applications To Genomics, Katherine S. Pollard, Mark J. Van Der Laan Jun 2008


U.C. Berkeley Division of Biostatistics Working Paper Series

We propose a new approach to studying the relationship between a very high dimensional random variable and an outcome. Our method is based on a novel concept, the supervised distance matrix, which quantifies pairwise similarity between variables based on their association with the outcome. A supervised distance matrix is derived in two stages. The first stage involves a transformation based on a particular model for association. In particular, one might regress the outcome on each variable and then use the residuals or the influence curve from each regression as a data transformation. In the second stage, a choice of distance …
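The two-stage construction described above can be sketched as follows. The concrete choices here are illustrative (simple linear regression for the stage-one transformation, Euclidean distance between residual vectors for stage two); the paper allows other association models and distances:

```python
# Sketch of a supervised distance matrix: stage 1 transforms each
# variable via its regression against the outcome (residuals kept),
# stage 2 computes pairwise distances between transformed variables.
def residuals(y, x):
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    beta = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    alpha = my - beta * mx
    return [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]

def supervised_distance_matrix(X, y):
    """X: list of variables, each a list of n observations."""
    R = [residuals(y, x) for x in X]
    p = len(X)
    return [[sum((a - b) ** 2 for a, b in zip(R[j], R[k])) ** 0.5
             for k in range(p)] for j in range(p)]

y = [1.0, 2.0, 3.0, 4.0]
X = [[1.0, 2.0, 3.0, 4.0],   # perfectly associated with y
     [4.0, 3.0, 2.0, 1.0],   # perfectly (negatively) associated
     [1.0, 1.0, 2.0, 2.0]]   # partially associated
D = supervised_distance_matrix(X, y)
print(round(D[0][1], 3))  # -> 0.0: identical association with y
```

Note that variables 0 and 1 get distance zero despite being anticorrelated as raw measurements: the distance reflects similarity of association with the outcome, which is the defining feature of the construction.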


Confidence Intervals For The Population Mean Tailored To Small Sample Sizes, With Applications To Survey Sampling, Michael Rosenblum, Mark J. Van Der Laan Jun 2008


U.C. Berkeley Division of Biostatistics Working Paper Series

The validity of standard confidence intervals constructed in survey sampling is based on the central limit theorem. For small sample sizes, the central limit theorem may give a poor approximation, resulting in confidence intervals that are misleading. We discuss this issue and propose methods for constructing confidence intervals for the population mean tailored to small sample sizes.

We present a simple approach for constructing confidence intervals for the population mean based on tail bounds for the sample mean that are correct for all sample sizes. Bernstein's inequality provides one such tail bound. The resulting confidence intervals have guaranteed coverage probability …
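The tail-bound approach above can be sketched with Hoeffding's inequality, used here in place of Bernstein's for simplicity (it needs only a known range for the data, not a variance term). The data values are illustrative:

```python
# Confidence interval for a population mean from a finite-sample tail
# bound: Hoeffding's inequality gives, for [0,1]-valued data,
#   P(|mean - mu| >= t) <= 2*exp(-2*n*t^2),
# so t = sqrt(log(2/alpha) / (2n)) yields guaranteed coverage at
# every sample size, unlike CLT-based intervals.
import math

def hoeffding_ci(xs, alpha=0.05):
    """CI for the mean of [0,1]-valued data, valid for all n."""
    n = len(xs)
    m = sum(xs) / n
    half = math.sqrt(math.log(2 / alpha) / (2 * n))
    return max(0.0, m - half), min(1.0, m + half)

xs = [0.2, 0.9, 0.4, 0.5, 0.7, 0.1, 0.6, 0.8]
lo, hi = hoeffding_ci(xs)
print(round(lo, 3), round(hi, 3))
```

The price of guaranteed coverage is width: at n = 8 the interval spans most of [0, 1]. Bernstein's inequality, as the abstract notes, can tighten this when the variance is small relative to the range.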


Doubly Robust Ecological Inference, Daniel B. Rubin, Mark J. Van Der Laan May 2008


U.C. Berkeley Division of Biostatistics Working Paper Series

The ecological inference problem is a famous longstanding puzzle that arises in many disciplines. The usual formulation in epidemiology is that we would like to quantify an exposure-disease association by obtaining disease rates among the exposed and unexposed, but only have access to exposure rates and disease rates for several regions. The problem is generally intractable, but can be attacked under the assumptions of King's (1997) extended technique if we can correctly specify a model for a certain conditional distribution. We introduce a procedure that is a valid approach if either this original model is correct or if we …


Properties Of Monotonic Effects On Directed Acyclic Graphs, Tyler J. Vanderweele, James M. Robins Apr 2008


COBRA Preprint Series

Various relationships are shown to hold among monotonic effects, weak monotonic effects, and the monotonicity of certain conditional expectations. Counterexamples are provided to show that the results do not hold under less restrictive conditions. Monotonic effects are furthermore used to relate signed edges on a causal directed acyclic graph to qualitative effect modification. The theory is applied to an example concerning the direct effect of smoking on cardiovascular disease controlling for hypercholesterolemia. Monotonicity assumptions are used to construct a test for whether there is a variable that confounds the relationship between the mediator, hypercholesterolemia, and the outcome, cardiovascular disease.


The Construction And Analysis Of Adaptive Group Sequential Designs, Mark J. Van Der Laan Mar 2008


U.C. Berkeley Division of Biostatistics Working Paper Series

In order to answer scientific questions of interest one often carries out an ordered sequence of experiments generating the appropriate data over time. The design of each experiment involves making various decisions, such as 1) what variables to measure on the randomly sampled experimental unit, 2) how regularly to monitor the unit and for how long, and 3) how to randomly assign a treatment or drug-dose to the unit, among others. That is, the design of each experiment involves selecting a so-called treatment mechanism/monitoring mechanism/missingness/censoring mechanism, where these mechanisms represent a formally defined conditional distribution of one of these …


Empirical Null And False Discovery Rate Inference For Exponential Families, Armin Schwartzman Feb 2008


Harvard University Biostatistics Working Paper Series

No abstract provided.


Marginal Structural Models For Partial Exposure Regimes, Stijn Vansteelandt, Karl Mertens, Carl Suetens, Els Goetghebeur Feb 2008


Harvard University Biostatistics Working Paper Series

Intensive care unit (ICU) patients are well known to be highly susceptible to nosocomial (i.e. hospital-acquired) infections due to their poor health and many invasive therapeutic treatments. The effects of acquiring such infections in the ICU on mortality are, however, ill understood. Our goal is to quantify these effects using data from the National Surveillance Study of Nosocomial Infections in Intensive Care Units (Belgium). This is a challenging problem because of the presence of time-dependent confounders (such as exposure to mechanical ventilation) which lie on the causal path from infection to mortality. Standard statistical analyses may be severely misleading in such settings …


Covariate Adjustment For The Intention-To-Treat Parameter With Empirical Efficiency Maximization, Daniel B. Rubin, Mark J. Van Der Laan Feb 2008


U.C. Berkeley Division of Biostatistics Working Paper Series

In randomized experiments, the intention-to-treat parameter is defined as the difference in expected outcomes between groups assigned to treatment and control arms. There is a large literature focusing on how (possibly misspecified) working models can sometimes exploit baseline covariate measurements to gain precision, although covariate adjustment is not strictly necessary. In Rubin and van der Laan (2008), we proposed the technique of empirical efficiency maximization for improving estimation by forming nonstandard fits of such working models. Considering a more realistic randomization scheme than in our original article, we suggest a new class of working models for utilizing covariate information, show …


A Bayesian Approach To Effect Estimation Accounting For Adjustment Uncertainty, Chi Wang, Giovanni Parmigiani, Ciprian Crainiceanu, Francesca Dominici Jan 2008


Johns Hopkins University, Dept. of Biostatistics Working Papers

Adjustment for confounding factors is a common goal in the analysis of both observational and controlled studies. The choice of which confounding factors should be included in the model used to estimate an effect of interest is both critical and uncertain. For this reason it is important to develop methods that estimate an effect, while accounting not only for confounders, but also for the uncertainty about which confounders should be included. In a recent article, Crainiceanu et al. (2008) have identified limitations and potential biases of Bayesian Model Averaging (BMA) (Raftery et al., 1997; Hoeting et al., 1999) when applied to …


Estimation Of Controlled Direct Effects, Sylvie Goetgeluk, Stijn Vansteelandt, Els Goetghebeur Jan 2008


Harvard University Biostatistics Working Paper Series

No abstract provided.