Open Access. Powered by Scholars. Published by Universities.®

Statistics and Probability Commons

Full-Text Articles in Statistics and Probability

Dot: Gene-Set Analysis By Combining Decorrelated Association Statistics, Olga A. Vsevolozhskaya, Min Shi, Fengjiao Hu, Dmitri V. Zaykin Apr 2020

Biostatistics Faculty Publications

Historically, the majority of statistical association methods have been designed assuming availability of SNP-level information. However, modern genetic and sequencing data present new challenges to access and sharing of genotype-phenotype datasets, including cost of management, difficulties in consolidation of records across research groups, etc. These issues make methods based on SNP-level summary statistics particularly appealing. The most common form of combining statistics is a sum of SNP-level squared scores, possibly weighted, as in burden tests for rare variants. The overall significance of the resulting statistic is evaluated using its distribution under the null hypothesis. Here, we demonstrate that this basic …
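
As a rough illustration of the sum-of-squared-scores statistic described in this abstract, the sketch below combines SNP-level Z-scores and evaluates the sum against a chi-square null distribution. It assumes the scores are unweighted and mutually uncorrelated (no linkage disequilibrium), which is a simplification made here and not something the abstract states; the function names are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses Q(a+1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a+1) with h = x/2,
    starting from Q(1, h) = exp(-h) or Q(1/2, h) = erfc(sqrt(h)).
    """
    if df < 1 or df != int(df):
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q

def sum_of_squares_test(z_scores):
    """Return (statistic, p_value) for the sum of squared Z-scores.

    ASSUMPTION: the Z-scores are independent standard normals under the
    null, so the sum follows a chi-square with len(z_scores) degrees of
    freedom. Correlated scores would need a different null distribution.
    """
    stat = sum(z * z for z in z_scores)
    return stat, chi2_sf(stat, len(z_scores))
```

With correlated SNPs the chi-square reference distribution above no longer applies, so this is only a baseline for the independent case.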


Assessing The Probability That A Finding Is Genuine For Large-Scale Genetic Association Studies, Chia-Ling Kuo, Olga A. Vsevolozhskaya, Dmitri V. Zaykin May 2015

Olga A. Vsevolozhskaya

Genetic association studies routinely involve massive numbers of statistical tests accompanied by P-values. Whole genome sequencing technologies have increased the potential number of tested variants to tens of millions. The more tests are performed, the smaller the P-value required for a result to be deemed significant. However, a small P-value is not equivalent to a small chance of a spurious finding, and significance thresholds may fail to serve as efficient filters against false results. While the Bayesian approach can provide a direct assessment of the probability that a finding is spurious, its adoption in association studies has been slow, due in part to the ubiquity …
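
To make the distinction between a significance threshold and the probability of a spurious finding concrete, here is a minimal sketch of the textbook Bayes calculation for a result that passes a threshold. It is not the method proposed in the article, whose details are cut off above; the function name and the parameters `alpha`, `power`, and `prior_true` are assumptions introduced for illustration.

```python
def prob_spurious(alpha, power, prior_true):
    """Probability that a result significant at level alpha is a false positive.

    ASSUMPTIONS: each tested variant is truly associated with prior
    probability prior_true, true associations reach significance with
    probability `power`, and null variants do so with probability alpha.
    This conditions on "P <= alpha", not on the exact observed P-value.
    """
    if not 0.0 < prior_true < 1.0:
        raise ValueError("prior_true must lie strictly between 0 and 1")
    false_pos = alpha * (1.0 - prior_true)   # significant and null
    true_pos = power * prior_true            # significant and genuine
    return false_pos / (false_pos + true_pos)
```

For a fixed threshold and power, lowering `prior_true` raises the returned probability, which is why the same threshold can filter well in one study and poorly in another.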


Functional Analysis Of Variance For Association Studies, Olga A. Vsevolozhskaya, Dmitri V. Zaykin, Mark C. Greenwood, Changshuai Wei, Qing Lu Sep 2014

Olga A. Vsevolozhskaya

While progress has been made in identifying common genetic variants associated with human diseases, for most common complex diseases the identified genetic variants account for only a small proportion of heritability. Challenges remain in finding additional unknown genetic variants predisposing to complex diseases. With advances in next-generation sequencing technologies, sequencing studies have become commonplace in genetic research. The ongoing exome-sequencing and whole-genome-sequencing studies generate a massive number of sequence variants and allow researchers to comprehensively investigate their role in human diseases. The discovery of new disease-associated variants can be enhanced by utilizing powerful and computationally efficient statistical methods. …


Regionalization Of Flood Data Using Probability Distributions And Their Parameters, Nageshwar Rao Bhaskar, Carol Alf O'Connor, Harold Andrew Myers, William Paul Puckett Dec 1989

KWRRI Research Reports

The U.S. Geological Survey recently used the method of residuals to delineate seven flood regions for the State of Kentucky. As an alternative approach, this study uses the FASTCLUS clustering procedure of the Statistical Analysis System (SAS) to delineate five to six cluster regions from statistical properties of the AMF series: the coefficient of variation estimated by the method of L-moments (LCV), the parameters of the EV1 and GEV flood frequency distributions, and the specific mean annual flood (QSP). For both cluster and USGS flood regions, regionalized flood frequency growth curves are developed and …
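
The L-moment coefficient of variation mentioned in this abstract can be sketched as follows. This uses the standard unbiased probability-weighted-moment estimators for the first two sample L-moments; it is an illustration only, not the report's SAS code, and the function name is hypothetical.

```python
def l_cv(sample):
    """Sample L-CV: second L-moment divided by the first (the mean).

    Uses the unbiased probability-weighted moments
        b0 = mean(x)
        b1 = (1/n) * sum((i - 1) / (n - 1) * x_(i)),  i = 1..n, x sorted ascending
    so that l1 = b0 and l2 = 2*b1 - b0.
    """
    x = sorted(sample)
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    b0 = sum(x) / n
    if b0 == 0:
        raise ValueError("L-CV is undefined when the sample mean is zero")
    # enumerate() is zero-based, so i here equals (rank - 1)
    b1 = sum(i / (n - 1) * xi for i, xi in enumerate(x)) / n
    return (2.0 * b1 - b0) / b0

# Made-up illustrative values, not data from the report.
example_lcv = l_cv([1.0, 2.0, 3.0, 4.0, 5.0])
```

One value per gauging site, computed this way, could serve as one of the clustering variables the abstract lists, alongside the fitted distribution parameters.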