Open Access. Powered by Scholars. Published by Universities.®

Statistical Models Commons

Articles 1 - 10 of 10

Full-Text Articles in Statistical Models

A Bayesian Shared Component Model For Genetic Association Studies, Juan J. Abellan, Carlos Abellan, Juan R. Gonzalez Nov 2010

COBRA Preprint Series

We present a novel approach to genome association studies between single nucleotide polymorphisms (SNPs) and disease. We propose a Bayesian shared component model to separate the genotype information that is common to cases and controls from the information that is specific to cases only. This makes it possible to detect the SNPs that show the strongest association with the disease. The model can be applied to case-control studies with more than one disease. In fact, we illustrate the use of this model with a dataset of 23,418 SNPs from a case-control study by The Wellcome Trust Case Control Consortium (2007) …


Minimum Description Length And Empirical Bayes Methods Of Identifying SNPs Associated With Disease, Ye Yang, David R. Bickel Nov 2010

COBRA Preprint Series

The goal of determining which of hundreds of thousands of single nucleotide polymorphisms (SNPs) are associated with disease poses one of the most challenging multiple testing problems. Within the empirical Bayes approach, the local false discovery rate (LFDR) estimated using popular semiparametric models has enjoyed success in simultaneous inference. However, the estimated LFDR can be biased because the semiparametric approach tends to overestimate the proportion of non-associated SNPs. One of the negative consequences is that, like conventional p-values, such LFDR estimates cannot quantify the amount of information in the data that favors the null hypothesis of no disease association.

We …


Stratifying Subjects For Treatment Selection With Censored Event Time Data From A Comparative Study, Lihui Zhao, Tianxi Cai, Lu Tian, Hajime Uno, Scott D. Solomon, L. J. Wei Sep 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


Mixed Effect Poisson Log-Linear Models For Clinical And Epidemiological Sleep Hypnogram Data, Bruce J. Swihart, Brian S. Caffo PhD, Ciprian Crainiceanu PhD, Naresh M. Punjabi PhD, MD Aug 2010

Johns Hopkins University, Dept. of Biostatistics Working Papers

Bayesian Poisson log-linear multilevel models scalable to epidemiological studies are proposed to investigate population variability in sleep state transition rates. Hierarchical random effects are used to account for pairings of individuals and repeated measures within those individuals, since comparing diseased to non-diseased subjects while minimizing bias is important. Essentially, non-parametric piecewise constant hazards are estimated and smoothed, allowing for time-varying covariates and segment-of-the-night comparisons. The Bayesian Poisson regression is justified through a re-derivation of a classical algebraic likelihood equivalence of Poisson regression with a log(time) offset and survival regression assuming exponentially distributed survival times. Such re-derivation …
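
The classical likelihood equivalence mentioned in this abstract can be sketched as follows. This is the standard textbook argument, not the authors' re-derivation, and the notation is an assumption introduced here for illustration: subject i has observed time t_i, event indicator δ_i in {0, 1}, covariates x_i, and constant hazard λ_i = exp(x_i'β).

```latex
% Exponential survival log-likelihood with right censoring:
\ell_{\mathrm{exp}}(\beta)
  = \sum_i \bigl[\, \delta_i \log \lambda_i - \lambda_i t_i \,\bigr],
  \qquad \lambda_i = \exp(x_i^\top \beta).

% Poisson log-likelihood treating \delta_i as a count with mean
% \mu_i = t_i \lambda_i, i.e. \log \mu_i = \log t_i + x_i^\top \beta
% (a log(time) offset):
\ell_{\mathrm{Pois}}(\beta)
  = \sum_i \bigl[\, \delta_i \log \mu_i - \mu_i - \log \delta_i! \,\bigr]
  = \sum_i \bigl[\, \delta_i \log \lambda_i - \lambda_i t_i \,\bigr]
    + \sum_i \delta_i \log t_i .

% Since \delta_i \in \{0,1\}, \log \delta_i! = 0, and the last sum
% does not involve \beta, so
\ell_{\mathrm{Pois}}(\beta) = \ell_{\mathrm{exp}}(\beta) + \text{const}.
```

Because the two log-likelihoods differ only by a term free of β, they yield the same maximum likelihood estimates and the same likelihood-based inference for β.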


Principled Sure Independence Screening For Cox Models With Ultra-High-Dimensional Covariates, Sihai Dave Zhao, Yi Li Jul 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


A Unified Approach To Modeling Multivariate Binary Data Using Copulas Over Partitions, Bruce J. Swihart, Brian Caffo, Ciprian Crainiceanu Jul 2010

Johns Hopkins University, Dept. of Biostatistics Working Papers

Many seemingly disparate approaches for marginal modeling have been developed in recent years. We demonstrate that many current approaches for marginal modeling of correlated binary outcomes produce likelihoods that are equivalent to the copula-based models proposed herein. These general copula models of underlying latent threshold random variables yield likelihood-based models for marginal fixed effects estimation and interpretation in the analysis of correlated binary data. Moreover, we propose a nomenclature and set of model relationships that substantially elucidates the complex area of marginalized models for binary data. A diverse collection of didactic mathematical and numerical examples is given to illustrate …


Nonparametric Regression With Missing Outcomes Using Weighted Kernel Estimating Equations, Lu Wang, Andrea Rotnitzky, Xihong Lin Apr 2010

Harvard University Biostatistics Working Paper Series

No abstract provided.


Nonparametric And Semiparametric Analysis Of Current Status Data Subject To Outcome Misclassification, Victor G. Sal Y Rosas, James P. Hughes Apr 2010

UW Biostatistics Working Paper Series

In this article, we present nonparametric and semiparametric methods to analyze current status data subject to outcome misclassification. Our methods use nonparametric maximum likelihood estimation (NPMLE) to estimate the distribution function of the failure time when sensitivity and specificity may vary among subgroups. A nonparametric test is proposed for two-sample hypothesis testing. In regression analysis, we apply the Cox proportional hazards model and propose likelihood-ratio-based confidence intervals for the regression coefficients. Our methods are motivated and demonstrated by data collected from an infectious disease study in Seattle, WA.


An Analysis Of Nonignorable Nonresponse In A Survey With A Rotating Panel Design, Caterina Giusti, Roderick J. Little Mar 2010

The University of Michigan Department of Biostatistics Working Paper Series

Missing values for income questions are common in survey data. When the probabilities of nonresponse are assumed to depend on the observed information and not on the underlying unobserved amounts, the missing income values are missing at random (MAR), and methods such as sequential multiple imputation can be applied. However, the MAR assumption is often considered questionable in this context, since missingness of income is thought to be related to the value of income itself, after conditioning on available covariates. In this article we describe a sensitivity analysis based on a pattern-mixture model for deviations from MAR, in the context …


Robustness Of Approaches To ROC Curve Modeling Under Misspecification Of The Underlying Probability Model, Sean Devlin, Elizabeth Thomas, Scott S. Emerson Jan 2010

UW Biostatistics Working Paper Series

The receiver operating characteristic (ROC) curve is a tool of particular use in disease status classification with a continuous medical test (marker). A variety of statistical regression models have been proposed for the comparison of ROC curves for different markers across covariate groups. Fully parametric modeling of the marker distribution has generally been found to rely too heavily on strong parametric assumptions. Pepe (2003) has instead developed parametric models for the ROC curve that induce a semi-parametric model for the marker distributions. The estimating equations proposed for use in these ROC-GLM models may differ from commonly used estimating …
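
For readers unfamiliar with the object this abstract discusses, the following is a minimal sketch of the plain empirical ROC curve for a continuous marker. It is not the ROC-GLM approach described in the abstract. The function names and the convention that larger marker values indicate disease are assumptions made here for illustration.

```python
def empirical_roc(diseased, healthy):
    """Return (FPR, TPR) points of the empirical ROC curve.

    Assumes larger marker values indicate disease; a subject is
    classified positive when its marker is >= the threshold.
    """
    # Every observed marker value is a candidate threshold, highest first.
    thresholds = sorted(set(diseased) | set(healthy), reverse=True)
    points = [(0.0, 0.0)]  # threshold above all values: nobody is positive
    for c in thresholds:
        tpr = sum(y >= c for y in diseased) / len(diseased)
        fpr = sum(y >= c for y in healthy) / len(healthy)
        points.append((fpr, tpr))
    return points


def auc_mann_whitney(diseased, healthy):
    """Area under the empirical ROC curve via the Mann-Whitney statistic.

    Counts the fraction of (diseased, healthy) pairs in which the
    diseased marker is larger, with ties counted as one half.
    """
    wins = sum((d > h) + 0.5 * (d == h) for d in diseased for h in healthy)
    return wins / (len(diseased) * len(healthy))
```

The empirical curve makes no distributional assumption about the marker, which is why it is a common baseline when the robustness of parametric and semi-parametric ROC models is examined.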