Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Statistical Theory

COBRA

2010

Full-Text Articles in Physical Sciences and Mathematics

Oracle And Multiple Robustness Properties Of Survey Calibration Estimator In Missing Response Problem, Kwun Chuen Gary Chan Dec 2010

Oracle And Multiple Robustness Properties Of Survey Calibration Estimator In Missing Response Problem, Kwun Chuen Gary Chan

UW Biostatistics Working Paper Series

In the presence of missing response, reweighting the complete case subsample by the inverse of the nonmissing probability is both intuitive and easy to implement. However, inverse probability weighting is not efficient in general and is not robust against misspecification of the missing probability model. Calibration was developed by survey statisticians to improve the efficiency of inverse probability weighting estimators when population totals of auxiliary variables are known and when the inclusion probability is known by design. In the missing data problem, we can calibrate auxiliary variables in the complete case subsample to the full sample. However, the inclusion probability is unknown in general …
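
As a rough illustration of the two ideas in this abstract, the sketch below (Python, simulated data, hypothetical variable names) computes a plain inverse-probability-weighted mean and a calibrated version whose weights reproduce the full-sample totals of the auxiliary variable; it is not the estimator analyzed in the paper.

```python
# Sketch: inverse probability weighting vs. calibration for a missing response.
# Simulated data and a logistic working model for the nonmissing probability;
# calibration is done by exponential tilting (raking) of the IPW weights.
import numpy as np
import statsmodels.api as sm
from scipy.optimize import root
from scipy.special import expit

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)                      # auxiliary variable, always observed
y = 1.0 + 0.8 * x + rng.normal(size=n)      # response, used only when r = 1
r = rng.binomial(1, expit(0.5 + 0.7 * x))   # nonmissing indicator

# Working (possibly misspecified) missingness model.
miss_fit = sm.GLM(r, sm.add_constant(x), family=sm.families.Binomial()).fit()
pi_hat = miss_fit.predict(sm.add_constant(x))

# Plain inverse probability weighting estimator of E[Y].
mu_ipw = np.sum(r * y / pi_hat) / np.sum(r / pi_hat)

# Calibration: rescale the weights w_i = r_i / pi_hat_i by exp(lambda' z_i) so
# that the calibrated weights reproduce the full-sample totals of z = (1, x).
z = np.column_stack([np.ones(n), x])
base_w = r / pi_hat

def calib_eq(lam):
    return z.T @ (base_w * np.exp(z @ lam)) - z.sum(axis=0)

lam = root(calib_eq, np.zeros(2)).x
w_cal = base_w * np.exp(z @ lam)
mu_cal = np.sum(w_cal * y) / np.sum(w_cal)
print(f"true mean = 1.0, IPW = {mu_ipw:.3f}, calibrated = {mu_cal:.3f}")
```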


Modification And Improvement Of Empirical Likelihood For Missing Response Problem, Kwun Chuen Gary Chan Dec 2010

Modification And Improvement Of Empirical Likelihood For Missing Response Problem, Kwun Chuen Gary Chan

UW Biostatistics Working Paper Series

An empirical likelihood (EL) estimator was proposed by Qin and Zhang (2007) for a missing response problem under a missing at random assumption. They showed by simulation studies that the finite sample performance of the EL estimator is better than that of some existing estimators. However, the empirical likelihood estimator does not have a uniformly smaller asymptotic variance than other estimators in general. We consider several modifications to the empirical likelihood estimator and show that the proposed estimator dominates the empirical likelihood estimator and several other existing estimators in terms of asymptotic efficiency. The proposed estimator also attains the minimum asymptotic variance among …
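
To make the empirical likelihood idea concrete, the hedged sketch below (Python, toy simulated data) computes EL-type weights for the complete cases subject to the constraint that the weighted mean of the auxiliary variable matches its full-sample mean; it is neither the Qin-Zhang estimator nor the modification proposed here.

```python
# Sketch: empirical-likelihood-type weights for the complete cases, constrained
# so the weighted mean of the auxiliary variable x matches its full-sample mean.
# Toy simulation; not the Qin-Zhang estimator or the modification studied here.
import numpy as np
from scipy.optimize import brentq
from scipy.special import expit

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
y = 2.0 + x + rng.normal(size=n)
r = rng.binomial(1, expit(0.3 + 0.8 * x))     # missing-at-random indicator

xc, yc = x[r == 1], y[r == 1]                 # complete cases
g = xc - x.mean()                             # constraint: sum_i w_i * g_i = 0
m = len(g)

# EL weights take the form w_i = 1 / (m * (1 + lam * g_i)); solve for lam.
def score(lam):
    return np.sum(g / (1.0 + lam * g))

eps = 1e-6
lam = brentq(score, -1.0 / g.max() + eps, -1.0 / g.min() - eps)
w = 1.0 / (m * (1.0 + lam * g))               # weights sum to one at the solution

mu_el = np.sum(w * yc)
print(f"complete-case mean = {yc.mean():.3f}, EL-weighted mean = {mu_el:.3f}, truth = 2.0")
```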


Minimum Description Length Measures Of Evidence For Enrichment, Zhenyu Yang, David R. Bickel Dec 2010

Minimum Description Length Measures Of Evidence For Enrichment, Zhenyu Yang, David R. Bickel

COBRA Preprint Series

In order to functionally interpret differentially expressed genes or other discovered features, researchers seek to detect enrichment in the form of overrepresentation of discovered features associated with a biological process. Most enrichment methods treat the p-value as the measure of evidence using a statistical test such as the binomial test, Fisher's exact test or the hypergeometric test. However, the p-value is not interpretable as a measure of evidence apart from adjustments in light of the sample size. As a measure of evidence supporting one hypothesis over the other, the Bayes factor (BF) overcomes this drawback of the p-value but lacks …
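
For context, the small Python sketch below contrasts the hypergeometric (Fisher-type) enrichment p-value with a simple Bayes factor for the same toy gene-set counts; the counts are made up, and the minimum description length measure developed in the paper is not implemented.

```python
# Sketch: enrichment of a gene set among "discovered" genes.  Contrast a
# hypergeometric p-value with a simple binomial Bayes factor.  Toy counts;
# not the minimum-description-length measure proposed in the paper.
import numpy as np
from scipy.stats import hypergeom, binom
from scipy.integrate import quad

N = 20000        # genes on the array
K = 400          # genes annotated to the biological process
n = 250          # differentially expressed (discovered) genes
k = 15           # discovered genes that are annotated

# One-sided hypergeometric (Fisher-type) enrichment p-value: P(X >= k).
p_value = hypergeom.sf(k - 1, N, K, n)

# Simple Bayes factor: H0 fixes the annotation rate at the background K/N,
# H1 puts a uniform prior on the rate; compare marginal likelihoods of k out of n.
p0 = K / N
m0 = binom.pmf(k, n, p0)
m1, _ = quad(lambda p: binom.pmf(k, n, p), 0.0, 1.0)   # uniform Beta(1,1) prior
bf_10 = m1 / m0

print(f"hypergeometric p-value = {p_value:.3g}")
print(f"Bayes factor (enrichment vs background) = {bf_10:.2f}")
```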


Efficient Measurement Error Correction With Spatially Misaligned Data, Adam A. Szpiro, Lianne Sheppard, Thomas Lumley Dec 2010

Efficient Measurement Error Correction With Spatially Misaligned Data, Adam A. Szpiro, Lianne Sheppard, Thomas Lumley

UW Biostatistics Working Paper Series

Association studies in environmental statistics often involve exposure and outcome data that are misaligned in space. A common strategy is to employ a spatial model such as universal kriging to predict exposures at locations with outcome data and then estimate a regression parameter of interest using the predicted exposures. This results in measurement error because the predicted exposures do not correspond exactly to the true values. We characterize the measurement error by decomposing it into Berkson-like and classical-like components. One correction approach is the parametric bootstrap, which is effective but computationally intensive since it requires solving a nonlinear optimization problem …
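
The "common strategy" described in the first half of this abstract can be sketched as follows (Python, simulated exposure surface, a Gaussian process standing in for universal kriging); the measurement-error decomposition and corrections studied in the paper are not implemented.

```python
# Sketch of the plug-in strategy described above: predict exposure at outcome
# locations with a spatial model, then regress the health outcome on the
# predictions.  A Gaussian process stands in for universal kriging; the
# Berkson/classical decomposition and corrections are not implemented here.
import numpy as np
import statsmodels.api as sm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(2)

def exposure_surface(coords):
    return 5.0 + np.sin(coords[:, 0]) + 0.5 * np.cos(2.0 * coords[:, 1])

# Monitor locations (exposure measured) and subject locations (outcome measured).
monitors = rng.uniform(0, 10, size=(60, 2))
subjects = rng.uniform(0, 10, size=(300, 2))
x_monitor = exposure_surface(monitors) + rng.normal(scale=0.2, size=60)
x_true = exposure_surface(subjects)
y = 1.0 + 0.6 * x_true + rng.normal(scale=1.0, size=300)   # health outcome

# Stage 1: spatial prediction of exposure at the subject locations.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=2.0) + WhiteKernel(0.1),
                              normalize_y=True)
gp.fit(monitors, x_monitor)
x_pred = gp.predict(subjects)

# Stage 2: plug-in regression of the outcome on the predicted exposure.
ols = sm.OLS(y, sm.add_constant(x_pred)).fit()
print(ols.params)          # slope targets 0.6 but carries measurement error
```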


Minimum Description Length And Empirical Bayes Methods Of Identifying SNPs Associated With Disease, Ye Yang, David R. Bickel Nov 2010

Minimum Description Length And Empirical Bayes Methods Of Identifying SNPs Associated With Disease, Ye Yang, David R. Bickel

COBRA Preprint Series

The goal of determining which of hundreds of thousands of SNPs are associated with disease poses one of the most challenging multiple testing problems. Using the empirical Bayes approach, the local false discovery rate (LFDR) estimated using popular semiparametric models has enjoyed success in simultaneous inference. However, the estimated LFDR can be biased because the semiparametric approach tends to overestimate the proportion of the non-associated single nucleotide polymorphisms (SNPs). One of the negative consequences is that, like conventional p-values, such LFDR estimates cannot quantify the amount of information in the data that favors the null hypothesis of no disease-association.

We …
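
As background for the estimand discussed above, a minimal local false discovery rate sketch under the standard two-group model is shown below (Python, simulated z-scores, theoretical null, kernel-estimated marginal); the semiparametric and MDL refinements of the paper are not implemented.

```python
# Sketch: local false discovery rate under the standard two-group model,
# lfdr(z) = pi0 * f0(z) / f(z), with a theoretical N(0,1) null, a kernel
# estimate of the marginal density, and a crude plug-in estimate of pi0.
# Simulated z-scores; not the MDL-based refinement proposed in the paper.
import numpy as np
from scipy.stats import norm, gaussian_kde

rng = np.random.default_rng(3)
m, pi0_true = 20000, 0.95
null = rng.standard_normal(int(m * pi0_true))            # non-associated SNPs
alt = rng.normal(loc=2.5, size=m - int(m * pi0_true))    # associated SNPs
z = np.concatenate([null, alt])

f_hat = gaussian_kde(z)                                   # marginal density f(z)
pi0_hat = min(1.0, np.mean(np.abs(z) < 1.0) / (norm.cdf(1) - norm.cdf(-1)))

lfdr = np.clip(pi0_hat * norm.pdf(z) / f_hat(z), 0.0, 1.0)
print(f"pi0_hat = {pi0_hat:.3f}; SNPs with lfdr < 0.2: {(lfdr < 0.2).sum()}")
```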


Improving The Power Of Chronic Disease Surveillance By Incorporating Residential History, Justin Manjourides, Marcello Pagano Nov 2010

Improving The Power Of Chronic Disease Surveillance By Incorporating Residential History, Justin Manjourides, Marcello Pagano

Harvard University Biostatistics Working Paper Series

No abstract provided.


Multilevel Functional Principal Component Analysis For High-Dimensional Data, Vadim Zipunnikov, Brian Caffo, Ciprian Crainiceanu, David M. Yousem, Christos Davatzikos, Brian S. Schwartz Oct 2010

Multilevel Functional Principal Component Analysis For High-Dimensional Data, Vadim Zipunnikov, Brian Caffo, Ciprian Crainiceanu, David M. Yousem, Christos Davatzikos, Brian S. Schwartz

Johns Hopkins University, Dept. of Biostatistics Working Papers

We propose fast and scalable statistical methods for the analysis of hundreds or thousands of high dimensional vectors observed at multiple visits. The proposed inferential methods avoid the difficult task of loading the entire data set at once in the computer memory and use sequential access to data. This allows deployment of our methodology on low-resource computers where computations can be done in minutes on extremely large data sets. Our methods are motivated by and applied to a study where hundreds of subjects were scanned using Magnetic Resonance Imaging (MRI) at two visits roughly five years apart. The original data …
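
The computational trick described in this abstract, never holding the full subject-by-voxel matrix in memory, can be sketched as follows (Python, synthetic memory-mapped data); the multilevel, visit-level decomposition itself is not reproduced.

```python
# Sketch of the computational idea described above: for an n-by-p matrix with p
# very large, accumulate the small n-by-n matrix G = X X' by streaming blocks of
# columns from disk, then obtain principal components from its eigendecomposition.
# Synthetic memory-mapped data; the multilevel (visit-level) decomposition of the
# paper is not implemented here.
import numpy as np

n, p, block = 100, 100_000, 10_000
rng = np.random.default_rng(4)

# Write a synthetic data set to disk as a memory-mapped array (stand-in for scans).
X = np.memmap("scans.dat", dtype="float32", mode="w+", shape=(n, p))
for start in range(0, p, block):
    X[:, start:start + block] = rng.standard_normal((n, block)).astype("float32")
X.flush()

# Single sequential pass: accumulate G = X X' one column block at a time.
G = np.zeros((n, n))
for start in range(0, p, block):
    B = np.asarray(X[:, start:start + block], dtype="float64")
    G += B @ B.T

# Eigendecomposition of the small matrix gives the subject-space components;
# voxel-space eigenvectors can then be recovered blockwise in a second pass.
evals, U = np.linalg.eigh(G)
order = np.argsort(evals)[::-1]
print("top 5 singular values:", np.sqrt(np.clip(evals[order][:5], 0, None)))
```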


Landmark Prediction Of Survival, Layla Parast, Tianxi Cai Sep 2010

Landmark Prediction Of Survival, Layla Parast, Tianxi Cai

Harvard University Biostatistics Working Paper Series

No abstract provided.


Longitudinal Penalized Functional Regression, Jeff Goldsmith, Ciprian M. Crainiceanu, Brian Caffo, Daniel Reich Sep 2010

Longitudinal Penalized Functional Regression, Jeff Goldsmith, Ciprian M. Crainiceanu, Brian Caffo, Daniel Reich

Johns Hopkins University, Dept. of Biostatistics Working Papers

We propose a new regression model and inferential tools for the case when both the outcome and the functional exposures are observed at multiple visits. This data structure is new but increasingly present in applications where functions or images are recorded at multiple times. This raises new inferential challenges that cannot be addressed with current methods and software. Our proposed model generalizes the Generalized Linear Mixed Effects Model (GLMM) by adding functional predictors. Smoothness of the functional coefficients is ensured using roughness penalties estimated by Restricted Maximum Likelihood (REML) in a corresponding mixed effects model. This method is computationally feasible …
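
A simplified sketch of the data structure and model is given below (Python, simulated curves): each visit's functional predictor is reduced to a few principal component scores that enter a mixed model with a random subject intercept. This is not the penalized-spline and REML machinery of the paper, only an illustration of the setup.

```python
# Simplified sketch of the data structure above: represent each visit's
# functional predictor by a few principal component scores and include them as
# fixed effects in a mixed model with a random subject intercept.  Simulated
# curves; not the penalized-spline / REML estimation of the paper.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n_subj, n_visit = 80, 2
grid = np.linspace(0, 1, 100)
dt = grid[1] - grid[0]
beta_fun = np.sin(2 * np.pi * grid)                    # true coefficient function
basis = np.vstack([np.ones_like(grid),
                   np.sin(2 * np.pi * grid),
                   np.cos(2 * np.pi * grid)])

rows, subj_eff = [], rng.normal(scale=0.5, size=n_subj)
for i in range(n_subj):
    for v in range(n_visit):
        curve = rng.normal(size=3) @ basis             # functional predictor
        y = subj_eff[i] + np.sum(curve * beta_fun) * dt + rng.normal(scale=0.3)
        rows.append({"subject": i, "y": y, "curve": curve})
df = pd.DataFrame(rows)

# Functional principal component scores of the observed curves.
C = np.vstack(list(df["curve"]))
C = C - C.mean(axis=0)
_, _, Vt = np.linalg.svd(C, full_matrices=False)
for k in range(3):
    df[f"s{k}"] = C @ Vt[k]

fit = smf.mixedlm("y ~ s0 + s1 + s2", df, groups=df["subject"]).fit()
print(fit.params)
```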


On Two-Stage Hypothesis Testing Procedures Via Asymptotically Independent Statistics, James Dai, Charles Kooperberg, Michael L. Leblanc, Ross Prentice Sep 2010

On Two-Stage Hypothesis Testing Procedures Via Asymptotically Independent Statistics, James Dai, Charles Kooperberg, Michael L. Leblanc, Ross Prentice

UW Biostatistics Working Paper Series

Kooperberg and LeBlanc (2008) proposed a two-stage testing procedure to screen for significant interactions in genome-wide association (GWA) studies by a soft threshold on marginal associations (MA), though its theoretical properties and generalization have not been elaborated. In this article, we discuss conditions that are required to achieve strong control of the Family-Wise Error Rate (FWER) by such procedures for low or high-dimensional hypothesis testing. We provide proof of asymptotic independence of marginal association statistics and interaction statistics in linear regression, logistic regression, and Cox proportional hazard models in a randomized clinical trial (RCT) with a rare event. In case-control …
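
A toy version of the two-stage logic, screen on marginal association and then test interactions only for markers passing the screen, is sketched below (Python, simulated randomized-trial data); it does not reproduce the paper's theoretical analysis of asymptotic independence.

```python
# Toy sketch of the two-stage procedure described above: screen markers on
# marginal association with the outcome, then test treatment-by-marker
# interactions only for markers passing the screen, Bonferroni-corrected over
# the screened set.  Simulated data; not the paper's theoretical development.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n, m = 1500, 200                     # subjects, candidate SNPs
alpha1, alpha = 0.05, 0.05           # soft screening threshold, overall level

trt = rng.binomial(1, 0.5, size=n)
snps = rng.binomial(2, 0.3, size=(n, m)).astype(float)
# SNP 0 modifies the treatment effect; the outcome is binary.
lin = -1.0 + 0.3 * trt + 0.4 * trt * snps[:, 0]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-lin)))

def pvalue(term, design):
    fit = sm.Logit(y, sm.add_constant(design)).fit(disp=0)
    return fit.pvalues[term]

# Stage 1: marginal association of each SNP with the outcome.
p_marg = np.array([pvalue(1, snps[:, [j]]) for j in range(m)])
screened = np.flatnonzero(p_marg < alpha1)

# Stage 2: interaction tests only for screened SNPs, Bonferroni over the screen.
hits = []
for j in screened:
    design = np.column_stack([trt, snps[:, j], trt * snps[:, j]])
    if pvalue(3, design) < alpha / max(len(screened), 1):
        hits.append(j)
print(f"screened {len(screened)} SNPs, interaction hits: {hits}")
```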


Stratifying Subjects For Treatment Selection With Censored Event Time Data From A Comparative Study, Lihui Zhao, Tianxi Cai, Lu Tian, Hajime Uno, Scott D. Solomon, L. J. Wei Sep 2010

Stratifying Subjects For Treatment Selection With Censored Event Time Data From A Comparative Study, Lihui Zhao, Tianxi Cai, Lu Tian, Hajime Uno, Scott D. Solomon, L. J. Wei

Harvard University Biostatistics Working Paper Series

No abstract provided.


On Two-Stage Hypothesis Testing Procedures Via Asymptotically Independent Statistics, James Y. Dai, Charles Kooperberg, Michael Leblanc, Ross L. Prentice Aug 2010

On Two-Stage Hypothesis Testing Procedures Via Asymptotically Independent Statistics, James Y. Dai, Charles Kooperberg, Michael Leblanc, Ross L. Prentice

UW Biostatistics Working Paper Series

Kooperberg and LeBlanc (2008) proposed a two-stage testing procedure to screen for significant interactions in genome-wide association (GWA) studies by a soft threshold on marginal associations (MA), though its theoretical properties and generalization have not been elaborated. In this article, we discuss conditions that are required to achieve strong control of the Family-Wise Error Rate (FWER) by such procedures for low or high-dimensional hypothesis testing. We provide proof of asymptotic independence of marginal association statistics and interaction statistics in linear regression, logistic regression, and Cox proportional hazard models in a randomized clinical trial (RCT) with a rare event. In case-control studies nested within …


A Perturbation Method For Inference On Regularized Regression Estimates, Jessica Minnier, Lu Tian, Tianxi Cai Aug 2010

A Perturbation Method For Inference On Regularized Regression Estimates, Jessica Minnier, Lu Tian, Tianxi Cai

Harvard University Biostatistics Working Paper Series

No abstract provided.


Principled Sure Independence Screening For Cox Models With Ultra-High-Dimensional Covariates, Sihai Dave Zhao, Yi Li Jul 2010

Principled Sure Independence Screening For Cox Models With Ultra-High-Dimensional Covariates, Sihai Dave Zhao, Yi Li

Harvard University Biostatistics Working Paper Series

No abstract provided.


Optimizing Randomized Trial Designs To Distinguish Which Subpopulations Benefit From Treatment, Michael Rosenblum, Mark J. Van Der Laan Jun 2010

Optimizing Randomized Trial Designs To Distinguish Which Subpopulations Benefit From Treatment, Michael Rosenblum, Mark J. Van Der Laan

U.C. Berkeley Division of Biostatistics Working Paper Series

It is a challenge to evaluate experimental treatments where it is suspected that the treatment effect may only be strong for certain subpopulations, such as those having a high initial severity of disease, or those having a particular gene variant. Standard randomized controlled trials can have low power in such situations. They also are not optimized to distinguish which subpopulations benefit from a treatment. With the goal of overcoming these limitations, we consider randomized trial designs in which the criteria for patient enrollment may be changed, in a preplanned manner, based on interim analyses. Since such designs allow data-dependent changes …


The Strength Of Statistical Evidence For Composite Hypotheses: Inference To The Best Explanation, David R. Bickel Jun 2010

The Strength Of Statistical Evidence For Composite Hypotheses: Inference To The Best Explanation, David R. Bickel

COBRA Preprint Series

A general function to quantify the weight of evidence in a sample of data for one hypothesis over another is derived from the law of likelihood and from a statistical formalization of inference to the best explanation. For a fixed parameter of interest, the resulting weight of evidence that favors one composite hypothesis over another is the likelihood ratio using the parameter value consistent with each hypothesis that maximizes the likelihood function over the parameter of interest. Since the weight of evidence is generally only known up to a nuisance parameter, it is approximated by replacing the likelihood function with …
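
A tiny numerical illustration of the quantity described, the likelihood ratio evaluated at the maximizers within each composite hypothesis, is given below for a binomial example with no nuisance parameters (Python); the profiling over nuisance parameters discussed in the abstract is not shown.

```python
# Tiny illustration of the weight of evidence described above: the likelihood
# ratio evaluated at the parameter values, consistent with each composite
# hypothesis, that maximize the likelihood.  Binomial example, no nuisance
# parameters, so no profiling is needed.
import numpy as np
from scipy.stats import binom
from scipy.optimize import minimize_scalar

x, n = 14, 20                                     # observed successes out of n

def max_loglik(lo, hi):
    res = minimize_scalar(lambda p: -binom.logpmf(x, n, p),
                          bounds=(lo, hi), method="bounded")
    return -res.fun

# H1: p > 0.5  versus  H0: p <= 0.5
log_w = max_loglik(0.5, 1.0 - 1e-9) - max_loglik(1e-9, 0.5)
print(f"weight of evidence for H1 over H0: {np.exp(log_w):.2f}")
```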


Model-Robust Regression And A Bayesian `Sandwich' Estimator, Adam A. Szpiro, Kenneth M. Rice, Thomas Lumley May 2010

Model-Robust Regression And A Bayesian `Sandwich' Estimator, Adam A. Szpiro, Kenneth M. Rice, Thomas Lumley

UW Biostatistics Working Paper Series

The published version of this paper in Annals of Applied Statistics (Vol. 4, No. 4 (2010), 2099–2113) is available from the journal web site at http://dx.doi.org/10.1214/10-AOAS362.

We present a new Bayesian approach to model-robust linear regression that leads to uncertainty estimates with the same robustness properties as the Huber-White sandwich estimator. The sandwich estimator is known to provide asymptotically correct frequentist inference, even when standard modeling assumptions such as linearity and homoscedasticity in the data-generating mechanism are violated. Our derivation provides a compelling Bayesian justification for using this simple and popular tool, and it also clarifies what is being estimated …
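
For reference, the frequentist Huber-White sandwich variance that this Bayesian construction is designed to match can be computed in a few lines (Python, simulated heteroscedastic data); the Bayesian estimator itself is not implemented here.

```python
# Minimal sketch of the classical Huber-White sandwich variance for OLS, the
# frequentist estimator whose robustness the Bayesian construction above is
# designed to match.  The Bayesian version itself is not implemented here.
import numpy as np

rng = np.random.default_rng(7)
n = 500
x = rng.uniform(0, 2, size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5 + x, size=n)   # heteroscedastic errors

X = np.column_stack([np.ones(n), x])
beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta

bread = np.linalg.inv(X.T @ X)
meat = X.T @ (X * resid[:, None] ** 2)                   # sum_i e_i^2 x_i x_i'
sandwich = bread @ meat @ bread                          # robust covariance
naive = bread * (resid @ resid) / (n - 2)                # model-based covariance

print("robust SEs:", np.sqrt(np.diag(sandwich)))
print("naive  SEs:", np.sqrt(np.diag(naive)))
```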


Asymptotic Properties Of The Sequential Empirical ROC And PPV Curves, Joseph S. Koopmeiners, Ziding Feng May 2010

Asymptotic Properties Of The Sequential Empirical ROC And PPV Curves, Joseph S. Koopmeiners, Ziding Feng

UW Biostatistics Working Paper Series

The receiver operating characteristic (ROC) curve, the positive predictive value (PPV) curve and the negative predictive value (NPV) curve are three common measures of performance for a diagnostic biomarker. The independent increments covariance structure assumption is common in the group sequential study design literature. Showing that summary measures of the ROC, PPV and NPV curves have an independent increments covariance structure will provide the theoretical foundation for designing group sequential diagnostic biomarker studies. The ROC, PPV and NPV curves are often estimated empirically to avoid assumptions about the distributional form of the biomarkers. In this paper we derive asymptotic theory …
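
A small sketch of the empirical ROC and PPV curves for a single (non-sequential) analysis is given below (Python, simulated biomarker, assumed prevalence); the group-sequential covariance results of the paper are not implemented.

```python
# Small sketch of the empirical ROC and PPV curves for one analysis; the
# group-sequential covariance results discussed above are not implemented.
import numpy as np

rng = np.random.default_rng(8)
cases = rng.normal(loc=1.0, size=300)            # biomarker in diseased subjects
controls = rng.normal(loc=0.0, size=700)
prevalence = 0.1                                  # assumed population prevalence

thresholds = np.sort(np.concatenate([cases, controls]))[::-1]
tpr = np.array([(cases >= c).mean() for c in thresholds])     # sensitivity
fpr = np.array([(controls >= c).mean() for c in thresholds])  # 1 - specificity

# Empirical PPV as a function of the fraction of the population flagged positive.
ppv = prevalence * tpr / (prevalence * tpr + (1 - prevalence) * fpr + 1e-12)
flagged = prevalence * tpr + (1 - prevalence) * fpr

auc = (cases[:, None] > controls[None, :]).mean()             # empirical AUC
print(f"empirical AUC = {auc:.3f}")
print(f"PPV when 20% of subjects are flagged ~ {np.interp(0.2, flagged, ppv):.3f}")
```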


Estimating Causal Effects In Trials Involving Multi-Treatment Arms Subject To Non-Compliance: A Bayesian Framework, Qi Long, Roderick J. Little, Xihong Lin May 2010

Estimating Causal Effects In Trials Involving Multi-Treatment Arms Subject To Non-Compliance: A Bayesian Framework, Qi Long, Roderick J. Little, Xihong Lin

Harvard University Biostatistics Working Paper Series

No abstract provided.


Super Learner In Prediction, Eric C. Polley, Mark J. Van Der Laan May 2010

Super Learner In Prediction, Eric C. Polley, Mark J. Van Der Laan

U.C. Berkeley Division of Biostatistics Working Paper Series

Super learning is a general loss-based learning method that has been proposed and analyzed theoretically in van der Laan et al. (2007). In this article we consider super learning for prediction. The super learner is a prediction method designed to find the optimal combination of a collection of prediction algorithms. The super learner algorithm finds the combination of algorithms minimizing the cross-validated risk. The super learner framework is built on the theory of cross-validation and allows for a general class of prediction algorithms to be considered for the ensemble. Due to the previously established oracle results for the cross-validation …
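
The cross-validated ensembling idea can be sketched compactly with standard tools (Python/scikit-learn, non-negative least squares for the convex weights); this is an informal stand-in, not the SuperLearner software described in the paper.

```python
# Compact sketch of the cross-validated ensembling idea described above:
# obtain cross-validated predictions from each candidate learner, find the
# convex combination minimizing cross-validated squared-error risk, and refit
# the candidates on the full data.  Not the SuperLearner package itself.
import numpy as np
from scipy.optimize import nnls
from sklearn.datasets import make_friedman1
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_predict

X, y = make_friedman1(n_samples=600, noise=1.0, random_state=0)
learners = [LinearRegression(), Ridge(alpha=1.0),
            RandomForestRegressor(n_estimators=200, random_state=0)]

# Cross-validated predictions (one column per candidate learner).
Z = np.column_stack([cross_val_predict(est, X, y, cv=10) for est in learners])

# Non-negative least squares, then normalize to get convex combination weights.
w, _ = nnls(Z, y)
w = w / w.sum()

# Refit each learner on all data; the super learner prediction is the weighted sum.
full_preds = np.column_stack([est.fit(X, y).predict(X) for est in learners])
sl_pred = full_preds @ w
print("weights:", np.round(w, 3))
```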


Assessing Noninferiority In A Three-Arm Trial Using The Bayesian Approach, Pulak Ghosh, Farouk S. Nathoo, Mithat Gonen, Ram C. Tiwari May 2010

Assessing Noninferiority In A Three-Arm Trial Using The Bayesian Approach, Pulak Ghosh, Farouk S. Nathoo, Mithat Gonen, Ram C. Tiwari

Memorial Sloan-Kettering Cancer Center, Dept. of Epidemiology & Biostatistics Working Paper Series

Non-inferiority trials, which aim to demonstrate that a test product is not worse than a competitor by more than a pre-specified small amount, are of great importance to the pharmaceutical community. As a result, methodology for designing and analyzing such trials is required, and developing new methods for such analysis is an important area of statistical research. The three-arm clinical trial is usually recommended for non-inferiority trials by the Food and Drug Administration (FDA). The three-arm trial consists of a placebo, a reference, and an experimental treatment, and simultaneously tests the superiority of the reference over the placebo along with …
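
A toy posterior-simulation sketch of the two questions involved, whether the reference beats placebo and whether the experimental arm retains a prespecified fraction of the reference effect, is shown below (Python, normal outcomes, vague priors); it is not the Bayesian model developed in the paper.

```python
# Toy posterior-simulation sketch of the two questions described above for a
# three-arm trial: (1) is the reference superior to placebo, and (2) does the
# experimental arm retain at least a fraction f of the reference effect?
# Normal outcomes with vague priors; this is not the authors' Bayesian model.
import numpy as np

rng = np.random.default_rng(9)
n = 120
placebo = rng.normal(0.0, 1.0, n)
reference = rng.normal(0.8, 1.0, n)
experimental = rng.normal(0.7, 1.0, n)
f = 0.5                                       # fraction of effect to retain

def posterior_mean_draws(x, size=100_000):
    # Normal model with vague priors: approximate posterior for the mean.
    return rng.normal(x.mean(), x.std(ddof=1) / np.sqrt(len(x)), size)

mu_p, mu_r, mu_e = (posterior_mean_draws(a) for a in (placebo, reference, experimental))

prob_assay_sensitivity = np.mean(mu_r > mu_p)
prob_noninferior = np.mean(mu_e - mu_p > f * (mu_r - mu_p))
print(f"P(reference > placebo)           = {prob_assay_sensitivity:.3f}")
print(f"P(experimental retains {f:.0%})   = {prob_noninferior:.3f}")
```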


Nonparametric Regression With Missing Outcomes Using Weighted Kernel Estimating Equations, Lu Wang, Andrea Rotnitzky, Xihong Lin Apr 2010

Nonparametric Regression With Missing Outcomes Using Weighted Kernel Estimating Equations, Lu Wang, Andrea Rotnitzky, Xihong Lin

Harvard University Biostatistics Working Paper Series

No abstract provided.


Simple Examples Of Estimating Causal Effects Using Targeted Maximum Likelihood Estimation, Michael Rosenblum, Mark J. Van Der Laan Mar 2010

Simple Examples Of Estimating Causal Effects Using Targeted Maximum Likelihood Estimation, Michael Rosenblum, Mark J. Van Der Laan

U.C. Berkeley Division of Biostatistics Working Paper Series

We present a brief overview of targeted maximum likelihood for estimating the causal effect of a single time point treatment and of a two time point treatment. We focus on simple examples demonstrating how to apply the methodology developed in (van der Laan and Rubin, 2006; Moore and van der Laan, 2007; van der Laan, 2010a,b). We include R code for the single time point case.
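
A bare-bones Python sketch of single time-point TMLE for the average treatment effect with a binary outcome is given below (simulated data, logistic working models, two-epsilon fluctuation); the worked examples and R code in the paper are the authoritative reference.

```python
# Bare-bones sketch of single time-point targeted maximum likelihood for the
# average treatment effect with a binary outcome: initial outcome regression,
# treatment mechanism, then a logistic fluctuation using the "clever covariates".
# Simulated data; the paper's worked examples and R code are the reference.
import numpy as np
import statsmodels.api as sm
from scipy.special import expit, logit
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(10)
n = 3000
w = rng.normal(size=(n, 2))                              # baseline covariates
a = rng.binomial(1, expit(0.4 * w[:, 0] - 0.3 * w[:, 1]))    # treatment
y = rng.binomial(1, expit(-0.5 + a + 0.8 * w[:, 0]))     # binary outcome

# Initial estimates: outcome regression Q(a, w) and propensity score g(w).
Q_fit = LogisticRegression(C=1e6).fit(np.column_stack([a, w]), y)
g_fit = LogisticRegression(C=1e6).fit(w, a)
Q1 = Q_fit.predict_proba(np.column_stack([np.ones(n), w]))[:, 1]
Q0 = Q_fit.predict_proba(np.column_stack([np.zeros(n), w]))[:, 1]
QA = np.where(a == 1, Q1, Q0)
g1 = g_fit.predict_proba(w)[:, 1]

# Targeting step: logistic fluctuation with clever covariates and offset logit(Q).
H1 = a / g1
H0 = (1 - a) / (1 - g1)
flu = sm.GLM(y, np.column_stack([H1, H0]), family=sm.families.Binomial(),
             offset=logit(np.clip(QA, 1e-6, 1 - 1e-6))).fit()
eps1, eps0 = flu.params

Q1_star = expit(logit(np.clip(Q1, 1e-6, 1 - 1e-6)) + eps1 / g1)
Q0_star = expit(logit(np.clip(Q0, 1e-6, 1 - 1e-6)) + eps0 / (1 - g1))
ate_tmle = np.mean(Q1_star - Q0_star)
print(f"TMLE estimate of the average treatment effect: {ate_tmle:.3f}")
```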


Likelihood Ratio Testing For Admixture Models With Application To Genetic Linkage Analysis, Chong-Zhi Di, Kung-Yee Liang Mar 2010

Likelihood Ratio Testing For Admixture Models With Application To Genetic Linkage Analysis, Chong-Zhi Di, Kung-Yee Liang

Johns Hopkins University, Dept. of Biostatistics Working Papers

We consider likelihood ratio tests (LRT) and their modifications for homogeneity in admixture models. The admixture model is a special case of a two-component mixture model, where one component is indexed by an unknown parameter while the parameter value for the other component is known. It has been widely used in genetic linkage analysis under heterogeneity, in which the kernel distribution is binomial. For such models, it has long been recognized that testing for homogeneity is nonstandard and the LRT statistic does not converge to a conventional χ² distribution. In this paper, we investigate the asymptotic behavior of the LRT for …
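
A numerical sketch of the statistic in question is given below (Python, toy family counts, grid search over the mixing proportion and the unknown binomial parameter); the nonstandard limiting distribution, which is the subject of the paper, is not derived here.

```python
# Numerical sketch of the likelihood ratio statistic for homogeneity in an
# admixture model with a binomial kernel: a fraction lambda of families follow
# Binomial(n, theta) with theta unknown, the rest follow the known null value
# theta0 = 0.5.  Toy data and a grid search; the nonstandard limiting
# distribution of this statistic is the subject of the paper, not shown here.
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(11)
n_trials, theta0 = 10, 0.5
counts = np.concatenate([rng.binomial(n_trials, 0.5, 80),    # unlinked families
                         rng.binomial(n_trials, 0.2, 20)])   # linked families

def loglik(lam, theta):
    mix = (lam * binom.pmf(counts, n_trials, theta)
           + (1 - lam) * binom.pmf(counts, n_trials, theta0))
    return np.sum(np.log(mix))

lams = np.linspace(0.0, 1.0, 101)
thetas = np.linspace(0.01, 0.99, 99)
ll_alt = max(loglik(l, t) for l in lams for t in thetas)
ll_null = loglik(0.0, theta0)                # homogeneity: lambda = 0
lrt = 2.0 * (ll_alt - ll_null)
print(f"LRT statistic for homogeneity: {lrt:.2f}")
```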


Graphical Procedures For Evaluating Overall And Subject-Specific Incremental Values From New Predictors With Censored Event Time Data, Hajime Uno, Tianxi Cai, Lu Tian, L. J. Wei Mar 2010

Graphical Procedures For Evaluating Overall And Subject-Specific Incremental Values From New Predictors With Censored Event Time Data, Hajime Uno, Tianxi Cai, Lu Tian, L. J. Wei

Harvard University Biostatistics Working Paper Series

No abstract provided.


A New Class Of Dantzig Selectors For Censored Linear Regression Models, Yi Li, Lee Dicker, Sihai Dave Zhao Mar 2010

A New Class Of Dantzig Selectors For Censored Linear Regression Models, Yi Li, Lee Dicker, Sihai Dave Zhao

Harvard University Biostatistics Working Paper Series

No abstract provided.


Penalized Functional Regression, Jeff Goldsmith, Jennifer Feder, Ciprian M. Crainiceanu, Brian Caffo, Daniel Reich Jan 2010

Penalized Functional Regression, Jeff Goldsmith, Jennifer Feder, Ciprian M. Crainiceanu, Brian Caffo, Daniel Reich

Johns Hopkins University, Dept. of Biostatistics Working Papers

We develop fast fitting methods for generalized functional linear models. An undersmooth of the functional predictor is obtained by projecting on a large number of smooth eigenvectors and the coefficient function is estimated using penalized spline regression. Our method can be applied to many functional data designs including functions measured with and without error, sparsely or densely sampled. The methods also extend to the case of multiple functional predictors or functional predictors with a natural multilevel structure. Our approach can be implemented using standard mixed effects software and is computationally fast. Our methodology is motivated by a diffusion tensor imaging …
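
The recipe described, project the functional predictor onto leading eigenvectors and fit a penalized regression of the outcome on the scores, is sketched below (Python, simulated curves, a ridge penalty standing in for the penalized-spline and mixed-model implementation of the paper).

```python
# Short sketch of the recipe described above: project each functional predictor
# onto leading eigenvectors of the observed curves, regress the scalar outcome
# on the resulting scores with a quadratic (ridge) penalty, then map the fitted
# coefficients back to a coefficient function.  The penalized-spline /
# mixed-model implementation of the paper is not reproduced.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(12)
n, grid = 250, np.linspace(0, 1, 150)
dt = grid[1] - grid[0]
beta_fun = np.sin(2 * np.pi * grid)                      # true coefficient function

# Smooth random curves and scalar outcomes y_i = integral of x_i(t) beta(t) dt + noise.
basis = np.vstack([np.sin((k + 1) * np.pi * grid) for k in range(6)])
X = rng.normal(size=(n, 6)) @ basis
y = X @ beta_fun * dt + rng.normal(scale=0.2, size=n)

# FPCA: leading eigenvectors of the centered curves, then scores.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
K = 6
scores = Xc @ Vt[:K].T * dt                              # projections onto eigenfunctions

fit = Ridge(alpha=1.0).fit(scores, y)
beta_hat = Vt[:K].T @ fit.coef_                          # estimated coefficient function
print(f"correlation with true beta(t): {np.corrcoef(beta_hat, beta_fun)[0, 1]:.3f}")
```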


Regression Adjustment And Stratification By Propensity Score In Treatment Effect Estimation, Jessica A. Myers, Thomas A. Louis Jan 2010

Regression Adjustment And Stratification By Propensity Score In Treatment Effect Estimation, Jessica A. Myers, Thomas A. Louis

Johns Hopkins University, Dept. of Biostatistics Working Papers

Propensity score adjustment of effect estimates in observational studies of treatment is a common technique used to control for bias in treatment assignment. In situations where matching on the propensity score is not possible or desirable, regression adjustment and stratification are two options. Regression adjustment is used most often and can be highly efficient, but it can lead to biased results when model assumptions are violated. Validity of the stratification approach depends on fewer model assumptions, but it is less efficient than regression adjustment when the regression assumptions hold. To investigate these issues, we use simulation to compare stratification and regression adjustment. We …
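
A small simulation comparing the two approaches discussed, regression adjustment for the estimated propensity score and stratification on its quintiles, is sketched below (Python); it is not the simulation design used in the paper.

```python
# Small simulation comparing the two approaches discussed above: regression
# adjustment for the estimated propensity score versus stratification on its
# quintiles.  Illustration only; not the simulation design used in the paper.
import numpy as np
import statsmodels.api as sm
from scipy.special import expit

rng = np.random.default_rng(13)
n = 5000
x = rng.normal(size=(n, 3))                                 # confounders
a = rng.binomial(1, expit(x @ np.array([0.6, -0.4, 0.3])))  # treatment
y = 2.0 + 1.0 * a + x @ np.array([0.5, 0.5, -0.5]) + rng.normal(size=n)

# Estimated propensity score.
ps_fit = sm.GLM(a, sm.add_constant(x), family=sm.families.Binomial()).fit()
ps = ps_fit.predict(sm.add_constant(x))

# (1) Regression adjustment: outcome regressed on treatment and the propensity score.
reg = sm.OLS(y, sm.add_constant(np.column_stack([a, ps]))).fit()
est_reg = reg.params[1]

# (2) Stratification: propensity score quintiles, within-stratum mean differences,
#     combined with stratum-size weights.
strata = np.digitize(ps, np.quantile(ps, [0.2, 0.4, 0.6, 0.8]))
diffs = [y[(strata == s) & (a == 1)].mean() - y[(strata == s) & (a == 0)].mean()
         for s in range(5)]
weights = [(strata == s).mean() for s in range(5)]
est_strat = np.average(diffs, weights=weights)
print(f"true effect = 1.0, regression adjustment = {est_reg:.3f}, "
      f"stratification = {est_strat:.3f}")
```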