Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

Assessing The Probability That A Finding Is Genuine For Large-Scale Genetic Association Studies, Chia-Ling Kuo, Olga A. Vsevolozhskaya, Dmitri V. Zaykin May 2015

Olga A. Vsevolozhskaya

Genetic association studies routinely involve massive numbers of statistical tests accompanied by P-values. Whole genome sequencing technologies have increased the potential number of tested variants to tens of millions. The more tests that are performed, the smaller the P-value required for a finding to be deemed significant. However, a small P-value is not equivalent to a small chance of a spurious finding, and significance thresholds may fail to serve as efficient filters against false results. While the Bayesian approach can provide a direct assessment of the probability that a finding is spurious, its adoption in association studies has been slow, due in part to the ubiquity …
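
The distinction between a small P-value and a small chance of a spurious finding can be illustrated with a direct application of Bayes' rule. The sketch below is not the authors' method; it is a textbook calculation, and the function name, the prior probability of a true association, and the power value are all hypothetical choices made for illustration.

```python
def false_positive_probability(alpha, power, prior):
    """Probability that a 'significant' result is spurious, by Bayes' rule.

    alpha : significance threshold (chance a null variant is declared significant)
    power : chance a truly associated variant is declared significant
    prior : assumed prior probability that a tested variant is truly associated
    """
    false_pos = alpha * (1.0 - prior)   # null variants that pass the threshold
    true_pos = power * prior            # real associations that pass the threshold
    return false_pos / (false_pos + true_pos)


# Hypothetical inputs: one in a million variants is truly associated, power 0.5.
fpp_strict = false_positive_probability(alpha=5e-8, power=0.5, prior=1e-6)
fpp_loose = false_positive_probability(alpha=0.05, power=0.5, prior=1e-6)
print(fpp_strict, fpp_loose)
```

Under these assumed inputs, the probability that a significant finding is spurious depends on the prior and the power as well as on the threshold, which is why a threshold alone may not filter false results efficiently.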


Functional Analysis Of Variance For Association Studies, Olga A. Vsevolozhskaya, Dmitri V. Zaykin, Mark C. Greenwood, Changshuai Wei, Qing Lu Sep 2014

Olga A. Vsevolozhskaya

While progress has been made in identifying common genetic variants associated with human diseases, for most common complex diseases the identified genetic variants account for only a small proportion of heritability. Challenges remain in finding additional unknown genetic variants predisposing to complex diseases. With advances in next-generation sequencing technologies, sequencing studies have become commonplace in genetic research. The ongoing exome-sequencing and whole-genome-sequencing studies generate a massive number of sequence variants and allow researchers to comprehensively investigate their role in human diseases. The discovery of new disease-associated variants can be enhanced by utilizing powerful and computationally efficient statistical methods. …


Sample Size Calculations For ROC Studies: Parametric Robustness And Bayesian Nonparametrics, Dunlei Cheng, Adam J. Branscum, Wesley O. Johnson Jan 2012

Dunlei Cheng

Methods for sample size calculations in ROC studies often assume independent normal distributions for test scores among the diseased and non-diseased populations. We consider sample size requirements under the default two-group normal model when the data distribution for the diseased population is either skewed or multimodal. For these two common scenarios we investigate the potential for robustness of calculated sample sizes under the mis-specified normal model and we compare to sample sizes calculated under a more flexible nonparametric Dirichlet process mixture model. We also highlight the utility of flexible models for ROC data analysis and their importance to study design. …
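
For readers unfamiliar with the two-group normal model mentioned above, the following sketch computes the area under the ROC curve when test scores in each population are independent normals, using the standard closed form AUC = Φ((μ_D − μ_N) / sqrt(σ_D² + σ_N²)). It is an illustration only, not the authors' sample size procedure; the function names and the example means and standard deviations are assumptions.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binormal_auc(mu_nondiseased, sd_nondiseased, mu_diseased, sd_diseased):
    """AUC when scores in both groups are independent normal variables.

    Equals P(diseased score > non-diseased score) under the binormal model.
    """
    spread = math.sqrt(sd_nondiseased ** 2 + sd_diseased ** 2)
    return normal_cdf((mu_diseased - mu_nondiseased) / spread)


# Hypothetical scores: diseased mean one standard deviation above non-diseased.
auc_example = binormal_auc(0.0, 1.0, 1.0, 1.0)
print(round(auc_example, 4))
```

When the diseased population is skewed or multimodal, this closed form no longer describes the data, which is the kind of mis-specification the abstract investigates.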


Accounting For Response Misclassification And Covariate Measurement Error Improves Powers And Reduces Bias In Epidemiologic Studies, Dunlei Cheng, Adam J. Branscum, James D. Stamey Jan 2010

Dunlei Cheng

Purpose: To quantify the impact of ignoring misclassification of a response variable and measurement error in a covariate on statistical power, and to develop software for sample size and power analysis that accounts for these flaws in epidemiologic data. Methods: A Monte Carlo simulation-based procedure is developed to illustrate the differences in design requirements and inferences between analytic methods that properly account for misclassification and measurement error and those that do not in regression models for cross-sectional and cohort data. Results: We found that failure to account for these flaws in epidemiologic data can lead to a substantial reduction in …
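
A Monte Carlo power estimate of the general kind described above can be sketched in a few lines. This is a deliberately simplified stand-in, not the authors' procedure or software: it uses a two-group comparison with a naive pooled two-proportion z-test rather than a regression model, it ignores covariate measurement error, and every name and number (group size, event probabilities, sensitivity, specificity) is hypothetical.

```python
import math
import random


def observe(true_prob, sensitivity, specificity, rng):
    """Draw a true binary outcome, then return the recorded (possibly wrong) value."""
    truth = rng.random() < true_prob
    if truth:
        return rng.random() < sensitivity    # true events are sometimes missed
    return rng.random() >= specificity       # non-events are sometimes recorded as events


def rejects_null(events_0, events_1, n):
    """Naive pooled two-proportion z-test at the two-sided 5% level."""
    pooled = (events_0 + events_1) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0.0:
        return False
    return abs(events_1 / n - events_0 / n) / se > 1.959964


def simulated_power(n, p0, p1, sensitivity, specificity, reps=500, seed=1):
    """Fraction of simulated studies in which the naive test rejects the null."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        e0 = sum(observe(p0, sensitivity, specificity, rng) for _ in range(n))
        e1 = sum(observe(p1, sensitivity, specificity, rng) for _ in range(n))
        hits += rejects_null(e0, e1, n)
    return hits / reps


# Hypothetical design: 200 subjects per group, event risks 0.20 versus 0.35.
power_clean = simulated_power(200, 0.20, 0.35, sensitivity=1.0, specificity=1.0)
power_noisy = simulated_power(200, 0.20, 0.35, sensitivity=0.8, specificity=0.9)
print(power_clean, power_noisy)
```

Comparing the two estimates shows how an analysis that ignores response misclassification behaves under these assumed settings; a design tool would repeat the simulation over a grid of sample sizes.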


A Bayesian Approach To Sample Size Determination For Studies Designed To Evaluate Continuous Medical Tests, Dunlei Cheng, Adam J. Branscum, James D. Stamey Jan 2010

Dunlei Cheng

We develop a Bayesian approach to sample size and power calculations for cross-sectional studies that are designed to evaluate and compare continuous medical tests. For studies that involve one test or two conditionally independent or dependent tests, we present methods that are applicable when the true disease status of sampled individuals will be available and when it will not. Within a hypothesis testing framework, we consider the goal of demonstrating that a medical test has area under the receiver operating characteristic (ROC) curve that exceeds a minimum acceptable level or another relevant threshold, and the goals of establishing the superiority …


Bayesian Approach To Average Power Calculations For Binary Regression Models With Misclassified Outcomes, Dunlei Cheng, James D. Stamey, Adam J. Branscum Dec 2008

Dunlei Cheng

We develop a simulation-based procedure for determining the required sample size in binomial regression risk assessment studies when response data are subject to misclassification. A Bayesian average power criterion is used to determine a sample size that provides high probability, averaged over the distribution of potential future data sets, of correctly establishing the direction of association between predictor variables and the probability of event occurrence. The method is broadly applicable to any parametric binomial regression model including, but not limited to, the popular logistic, probit, and complementary log-log models. We detail a common medical scenario wherein ascertainment of true disease …