COBRA Preprint Series

2010

Articles 1 - 8 of 8

Full-Text Articles in Physical Sciences and Mathematics

Minimum Description Length Measures Of Evidence For Enrichment, Zhenyu Yang, David R. Bickel Dec 2010

In order to functionally interpret differentially expressed genes or other discovered features, researchers seek to detect enrichment in the form of overrepresentation of discovered features associated with a biological process. Most enrichment methods treat the p-value from a statistical test such as the binomial test, Fisher's exact test, or the hypergeometric test as the measure of evidence. However, the p-value is not interpretable as a measure of evidence without adjustments in light of the sample size. As a measure of evidence supporting one hypothesis over another, the Bayes factor (BF) overcomes this drawback of the p-value but lacks …
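To make the contrast concrete, here is a minimal Python sketch of the two quantities for a single binomial enrichment test. It is not the authors' MDL measure; the counts, the uniform prior on the enrichment proportion, and the one-sided alternative are illustrative assumptions.

    from math import exp, lgamma, log
    from scipy.stats import binomtest

    # Hypothetical counts: k of n discovered features fall in a category
    # that covers a proportion p0 of the background feature list.
    k, n, p0 = 30, 100, 0.15

    # p-value from the one-sided binomial enrichment test.
    pval = binomtest(k, n, p0, alternative="greater").pvalue

    # Simple Bayes factor for H1: p ~ Uniform(0, 1) against H0: p = p0.
    # The marginal likelihood under H1 is the Beta function B(k+1, n-k+1);
    # the binomial coefficient cancels in the ratio.
    log_m1 = lgamma(k + 1) + lgamma(n - k + 1) - lgamma(n + 2)
    log_m0 = k * log(p0) + (n - k) * log(1 - p0)
    bf10 = exp(log_m1 - log_m0)
    print(f"p-value = {pval:.3g}, BF(H1:H0) = {bf10:.3g}")

Unlike the p-value, BF(H1:H0) compares the two hypotheses directly, which is the property the abstract contrasts with the sample-size dependence of p-value interpretation.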


A Bayesian Shared Component Model For Genetic Association Studies, Juan J. Abellan, Carlos Abellan, Juan R. Gonzalez Nov 2010

We present a novel approach to address genome association studies between single nucleotide polymorphisms (SNPs) and disease. We propose a Bayesian shared component model to tease out the genotype information that is common to cases and controls from the one that is specific to cases only. This allows us to detect the SNPs that show the strongest association with the disease. The model can be applied to case-control studies with more than one disease. In fact, we illustrate the use of this model with a dataset of 23,418 SNPs from a case-control study by The Wellcome Trust Case Control Consortium (2007) …
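The excerpt does not give the model equations, but a generic shared component decomposition along these lines, stated here as an illustrative assumption rather than the authors' exact specification, would model the genotype log-odds of SNP i in each arm as

    \eta_i^{control} = \lambda_i, \qquad \eta_i^{case} = \lambda_i + \varphi_i,

where \lambda_i is the component shared by cases and controls and \varphi_i is the case-specific component; SNPs with clearly non-zero \varphi_i are the candidates for disease association.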


Minimum Description Length And Empirical Bayes Methods Of Identifying Snps Associated With Disease, Ye Yang, David R. Bickel Nov 2010

The goal of determining which of hundreds of thousands of SNPs are associated with disease poses one of the most challenging multiple testing problems. Within the empirical Bayes approach, the local false discovery rate (LFDR) estimated with popular semiparametric models has enjoyed success in simultaneous inference. However, the estimated LFDR can be biased because the semiparametric approach tends to overestimate the proportion of non-associated single nucleotide polymorphisms (SNPs). One of the negative consequences is that, like conventional p-values, such LFDR estimates cannot quantify the amount of information in the data that favors the null hypothesis of no association with disease.

We …
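For reference, the standard empirical Bayes definition behind the excerpt: with \pi_0 the proportion of non-associated SNPs, f_0 the density of the test statistic under no association, and f the marginal density,

    LFDR(z_i) = \frac{\pi_0 f_0(z_i)}{f(z_i)}, \qquad f(z) = \pi_0 f_0(z) + (1 - \pi_0) f_1(z).

Overestimating \pi_0 inflates the numerator and hence the estimated LFDR, which is the bias described above.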


Improving Statistical Analysis Of Prospective Clinical Trials In Stem Cell Transplantation. An Inventory Of New Approaches In Survival Analysis, Aurelien Latouche Jun 2010

The CLINT project is a European Union-funded project, run as a specific support action under the Sixth Framework Programme. It is a 2-year project aimed at supporting the European Group for Blood and Marrow Transplantation (EBMT) to develop its infrastructure for the conduct of trans-European clinical trials in accordance with the EU Clinical Trials Directive, and to facilitate international prospective clinical trials in stem cell transplantation. The initial task is to create an inventory of the existing biostatistical literature on new approaches to survival analyses that are not currently widely utilised. The estimation of survival endpoints is introduced, …


The Strength Of Statistical Evidence For Composite Hypotheses: Inference To The Best Explanation, David R. Bickel Jun 2010

A general function to quantify the weight of evidence in a sample of data for one hypothesis over another is derived from the law of likelihood and from a statistical formalization of inference to the best explanation. For a fixed parameter of interest, the resulting weight of evidence that favors one composite hypothesis over another is the ratio of the likelihood function maximized over the parameter values consistent with each hypothesis. Since the weight of evidence is generally only known up to a nuisance parameter, it is approximated by replacing the likelihood function with …
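In symbols, with likelihood function L and composite hypotheses H_0: \theta \in \Theta_0 and H_1: \theta \in \Theta_1, the weight of evidence described above is

    W(H_1 : H_0) = \frac{\sup_{\theta \in \Theta_1} L(\theta)}{\sup_{\theta \in \Theta_0} L(\theta)}.

The excerpt is truncated before naming the substitute likelihood; a common choice in this setting, noted here only as an assumption, is the profile likelihood \sup_{\gamma} L(\theta, \gamma) over the nuisance parameter \gamma.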


The Linkset Model For 2^N Contingency Tables, Mikel Aickin May 2010

The linkset model is defined for parametrizing the general 2^n contingency table. The linkset parameters are designed to represent latent influences that promote the co-occurrence of binary events beyond that explained by chance. Linkages involving 2 through n binary variables are included in this parametrization. The intent of this process is to elucidate the patterns of linkage, no matter how complex they might be, rather than to fit simplifying models. The relationship between the linkset parameters and the natural parameters for a 2^n table is derived, and large-sample inference methods are provided. Examples are given from medical diagnostics, survival …
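The linkset parameters themselves are not defined in this excerpt, but the natural parameters they are related to are the standard log-linear ones: writing a cell of the 2^n table as x = (x_1, \dots, x_n) with each x_j \in \{0, 1\},

    \log P(x) = \sum_{\emptyset \neq A \subseteq \{1,\dots,n\}} \theta_A \prod_{j \in A} x_j - \psi(\theta),

where \psi(\theta) is the normalizing constant and the \theta_A with |A| \geq 2 capture interactions among the variables indexed by A.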


Recovery Of The Baseline Incidence Density In Censored Time-To-Event Analysis, Mikel Aickin Apr 2010

Time-to-event analyses are often concerned with the effects of explanatory factors on the underlying incidence density, but since there is no intrinsic interest in the form of the incidence density itself, a proportional hazards model is used. When part of the purpose of the analysis is to use actual cumulative incidence for simulation, or for providing informative visual displays of the results, an estimate of the baseline incidence density is required. The usual method for estimating the baseline hazards in Cox’s proportional hazards analysis yields values that are of little use, and furthermore no standard deviations of the estimates …
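The usual method referred to is presumably the Breslow-type estimator of the cumulative baseline hazard, an assumption here since the excerpt does not name it: with distinct event times t_i, event counts d_i, risk sets R(t_i), and fitted coefficients \hat{\beta},

    \hat{\Lambda}_0(t) = \sum_{t_i \le t} \frac{d_i}{\sum_{j \in R(t_i)} \exp(x_j^\top \hat{\beta})}.

Because this is a step function with a jump at every event time, its increments are too irregular to serve directly as an estimate of the incidence density, which may be the difficulty alluded to above.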


Efficient Design And Inference For Multi-Stage Randomized Trials Of Individualized Treatment Policies, Ree Dawson, Philip W. Lavori Apr 2010

Increased clinical interest in individualized ‘adaptive’ treatment policies has shifted the methodological focus for their development from the analysis of naturalistically observed strategies to experimental evaluation of a pre-selected set of strategies via multi-stage designs. Because multi-stage studies often avoid the ‘curse of dimensionality’ inherent in uncontrolled studies, and hence the need to parametrically smooth trial data, it is not surprising in this context to find direct connections among different methodological approaches. We show by asymptotic and algebraic proof that the maximum likelihood (ML) and optimal semi-parametric estimators of the mean of a treatment policy and its standard error are …
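For concreteness, the semiparametric estimator of a policy mean usually considered in this literature is an inverse-probability-weighted mean, shown here in a standard form that is an assumption rather than necessarily the authors' exact estimator: for a policy d with stage-k decision rules d_k, known randomization probabilities \pi_k, histories H_{ik}, assigned treatments A_{ik}, and outcome Y_i,

    \hat{\mu}_d = \frac{\sum_i W_i Y_i}{\sum_i W_i}, \qquad W_i = \prod_k \frac{1\{A_{ik} = d_k(H_{ik})\}}{\pi_k(A_{ik} \mid H_{ik})},

where the indicator in W_i selects subjects whose assigned treatment sequence is consistent with the policy.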