Open Access. Powered by Scholars. Published by Universities.®

Multiple testing

Full-Text Articles in Physical Sciences and Mathematics

A Hidden Markov Model Approach To Testing Multiple Hypotheses On A Gene Ontology Graph, Kun Liang, Dan Nettleton Jun 2019

Gene category testing problems involve testing hundreds of null hypotheses that correspond to nodes in a directed acyclic graph. The logical relationships among the nodes in the graph imply that only some configurations of true and false null hypotheses are possible and that a test for a given node should depend on data from neighboring nodes. We developed a method based on a hidden Markov model that takes the whole graph into account and provides coherent decisions in this structured multiple hypothesis testing problem. The method is illustrated by testing Gene Ontology terms for evidence of differential expression.
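
The coherence constraint the abstract relies on can be illustrated with a small sketch: because a child GO term's gene set is contained in its ancestors' gene sets, rejecting a child's null hypothesis logically forces rejection of every ancestor's. The helper below is hypothetical (the paper's HMM enforces coherence through a joint model over the whole graph, not by post-hoc propagation); it simply propagates rejections up a toy DAG.

```python
from collections import defaultdict

def make_coherent(edges, rejected):
    """Propagate rejections from child terms to all ancestors.

    edges: (child, parent) pairs in the GO DAG.  A child term's gene set
    is contained in its parent's, so a false null at a child logically
    forces a false null at every ancestor; a coherent decision rule must
    therefore reject all ancestors of any rejected node.
    (Hypothetical helper; the paper's HMM handles this via a joint model.)
    """
    parents = defaultdict(list)
    for child, parent in edges:
        parents[child].append(parent)
    coherent, stack = set(rejected), list(rejected)
    while stack:
        for p in parents[stack.pop()]:
            if p not in coherent:
                coherent.add(p)
                stack.append(p)
    return coherent

# Toy chain of GO terms, child -> parent
edges = [("response_to_heat", "response_to_stress"),
         ("response_to_stress", "biological_process")]
print(sorted(make_coherent(edges, {"response_to_heat"})))
# ['biological_process', 'response_to_heat', 'response_to_stress']
```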


Paradoxical Results Of Adaptive False Discovery Rate Procedures In Neuroimaging Studies, Philip T. Reiss, Armin Schwartzman, Feihan Lu, Lei Huang, Erika Proal Nov 2012

Adaptive false discovery rate (FDR) procedures, which offer greater power than the original FDR procedure of Benjamini and Hochberg, are often applied to statistical maps of the brain. When a large proportion of the null hypotheses are false, as in the case of widespread effects such as cortical thinning throughout much of the brain, adaptive FDR methods can surprisingly reject more null hypotheses than not accounting for multiple testing at all—i.e., using uncorrected p-values. A straightforward mathematical argument is presented to explain why this can occur with the q-value method of Storey and colleagues, and a simulation study shows that …
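
The paradox can be reproduced with a minimal sketch of a Storey-style adaptive B-H procedure (an illustrative implementation, not the authors' exact q-value code): when nearly all p-values are small, the estimated null proportion pi0_hat is tiny, the step-up thresholds i * alpha / (m * pi0_hat) inflate past alpha, and hypotheses with p-values well above alpha end up rejected.

```python
import numpy as np

def storey_pi0(pvals, lam=0.5):
    # Storey-style estimate of the null proportion: #{p > lam} / ((1 - lam) * m),
    # floored at 1/m to avoid dividing by zero downstream.
    p = np.asarray(pvals)
    return max(np.mean(p > lam) / (1.0 - lam), 1.0 / len(p))

def adaptive_bh(pvals, alpha=0.05):
    # Step-up rule with adaptive thresholds i * alpha / (m * pi0_hat);
    # returns the number of rejections.
    p = np.sort(np.asarray(pvals))
    m = len(p)
    pi0 = storey_pi0(p)
    below = p <= np.arange(1, m + 1) * alpha / (m * pi0)
    return int(np.max(np.nonzero(below)[0]) + 1) if below.any() else 0

# Widespread effects: 95 tiny p-values, 3 moderate, 2 clearly null.
pvals = [0.001] * 95 + [0.2] * 3 + [0.6] * 2
print(adaptive_bh(pvals))              # 100: everything rejected, even p = 0.6
print(sum(p <= 0.05 for p in pvals))   # 95: uncorrected testing rejects fewer
```

Here pi0_hat = 0.04, so the largest threshold is 100 * 0.05 / (100 * 0.04) = 1.25, and even p = 0.6 clears it; the adaptive procedure rejects more hypotheses than uncorrected thresholding at 0.05.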


Generalized Benjamini-Hochberg Procedures Using Spacings, Debashis Ghosh Jan 2011

For the problem of multiple testing, the Benjamini-Hochberg (B-H) procedure has become a very popular method in applications. We show how the B-H procedure can be interpreted as a test based on the spacings corresponding to the p-value distributions. Using this equivalence, we develop a class of generalized B-H procedures that maintain control of the false discovery rate in finite samples. We also consider the effect of correlation on the procedure; simulation studies are used to illustrate the methodology.
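
For reference, the classical B-H step-up procedure that the paper generalizes can be sketched as follows (the spacings interpretation recasts the step-up comparison in terms of gaps between ordered p-values; only the classical rule is shown here):

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    # Step-up rule: with sorted p-values p_(1) <= ... <= p_(m), find
    # k = max{i : p_(i) <= i * alpha / m} and reject the k smallest.
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= np.arange(1, m + 1) * alpha / m
    k = int(np.max(np.nonzero(below)[0]) + 1) if below.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.74, 0.9]
print(benjamini_hochberg(pvals).sum())  # 2 rejections at FDR level 0.05
```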


Identifying Important Explanatory Variables For Time-Varying Outcomes., Oliver Bembom, Maya L. Petersen, Mark J. Van Der Laan Dec 2006

This chapter describes a systematic and targeted approach for estimating the impact of each of a large number of baseline covariates on an outcome that is measured repeatedly over time. These variable importance estimates can be adjusted for a user-specified set of confounders and lend themselves in a straightforward way to obtaining confidence intervals and p-values. Hence, they can in particular be used to identify a subset of baseline covariates that are the most important explanatory variables for the time-varying outcome of interest. We illustrate the methodology in a data analysis aimed at finding mutations of the human immunodeficiency virus …
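
The chapter's targeted estimator is not reproduced here, but the general workflow it describes (an adjusted effect estimate per covariate, with a confidence interval and p-value) can be sketched with ordinary least squares as a hypothetical stand-in:

```python
import numpy as np
from math import erf, sqrt

def importance(y, X, j, adjust):
    """OLS-based stand-in for an adjusted variable-importance estimate:
    regress y on covariate j plus a user-specified set of confounders,
    returning the estimate, a 95% CI, and a large-sample p-value.
    (Illustrative only; the chapter's targeted estimator differs.)"""
    Z = np.column_stack([np.ones(len(y)), X[:, [j] + list(adjust)]])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    sigma2 = resid @ resid / (len(y) - Z.shape[1])
    cov = sigma2 * np.linalg.inv(Z.T @ Z)
    est, se = beta[1], sqrt(cov[1, 1])
    pval = 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(est / se) / sqrt(2.0))))
    return est, (est - 1.96 * se, est + 1.96 * se), pval

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + rng.normal(size=200)  # only covariate 0 matters
est, ci, p = importance(y, X, j=0, adjust=[1, 2])
print(p < 0.05)  # True: covariate 0 is flagged as important
```

Applied to each of many baseline covariates, the resulting p-values would then be screened with a multiple testing procedure such as B-H to select the most important explanatory variables.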

