Open Access. Powered by Scholars. Published by Universities.®

Genetics and Genomics Commons

Full-Text Articles in Genetics and Genomics

Incorporating Pathway Information Into Feature Selection Towards Better Performed Gene Signatures, Suyan Tian, Chi Wang, Bing Wang Apr 2019

Biostatistics Faculty Publications

Feature selection is critical for analyzing gene expression data with sophisticated grouping structures and for extracting hidden patterns from such data. It is well known that genes do not function in isolation but rather work together within various metabolic, regulatory, and signaling pathways. A method that takes the biological knowledge contained within these pathways into account is a pathway-based algorithm. Studies have demonstrated that a pathway-based method usually outperforms its gene-based counterpart, in which no biological knowledge is considered. In this article, pathway-based feature selection methods are first divided into three major categories, namely, pathway-level selection, …


Bayesian Prediction Intervals For Assessing P-Value Variability In Prospective Replication Studies, Olga A. Vsevolozhskaya, Gabriel Ruiz, Dmitri Zaykin Dec 2017

Biostatistics Faculty Publications

Increased availability of data and accessibility of computational tools in recent years have created an unprecedented upsurge of scientific studies driven by statistical analysis. Limitations inherent to statistics impose constraints on the reliability of conclusions drawn from data, so misuse of statistical methods is a growing concern. Hypothesis and significance testing, and the accompanying P-values, are being scrutinized as the most widely applied and abused practices. One line of critique is that P-values are inherently unfit to fulfill their ostensible role as measures of credibility for scientific hypotheses. It has also been suggested that while P-values …


Identification Of Prognostic Genes And Gene Sets For Early-Stage Non-Small Cell Lung Cancer Using Bi-Level Selection Methods, Suyan Tian, Chi Wang, Howard H. Chang, Jianguo Sun Apr 2017

Biostatistics Faculty Publications

In contrast to feature selection and gene set analysis, bi-level selection is a process of selecting not only important gene sets but also important genes within those gene sets. Depending on the order of selection, a bi-level selection method falls into one of three categories: forward selection, which first selects relevant gene sets and then selects relevant individual genes within them; backward selection, which takes the reverse order; and simultaneous selection, which performs the two tasks at once, usually with the aid of a penalized regression model. To test the existence of subtype-specific prognostic genes for non-small cell lung cancer …
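The forward ordering described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the function name `forward_bilevel_select`, the use of a mean score to rank gene sets, the fixed cutoff for genes, and the toy scores are all assumptions made for illustration.

```python
from statistics import mean


def forward_bilevel_select(gene_scores, gene_sets, n_sets=1, gene_cutoff=1.0):
    """Hypothetical forward bi-level selection: gene sets first, then genes.

    gene_scores: dict mapping gene name -> relevance score (assumed given).
    gene_sets:   dict mapping gene-set name -> list of member genes.
    """
    # Step 1 (set level): score each gene set by the mean score of its
    # measured genes and keep the n_sets highest-scoring sets.
    set_scores = {}
    for name, genes in gene_sets.items():
        measured = [g for g in genes if g in gene_scores]
        if measured:
            set_scores[name] = mean(gene_scores[g] for g in measured)
    top_sets = sorted(set_scores, key=set_scores.get, reverse=True)[:n_sets]

    # Step 2 (gene level): within each retained set, keep only the genes
    # whose score reaches the cutoff.
    return {
        name: sorted(
            g for g in gene_sets[name] if gene_scores.get(g, 0.0) >= gene_cutoff
        )
        for name in top_sets
    }


# Toy data (invented for illustration only).
gene_scores = {"G1": 2.5, "G2": 0.3, "G3": 1.8, "G4": 0.2, "G5": 0.1}
gene_sets = {"set_A": ["G1", "G2", "G3"], "set_B": ["G4", "G5"]}

selected = forward_bilevel_select(gene_scores, gene_sets)
print(selected)
```

A backward variant would swap the two steps, filtering genes first and then ranking sets by their surviving members; a simultaneous variant would instead fit a single penalized model over both levels.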


Weighted-Samgsr: Combining Significance Analysis Of Microarray-Gene Set Reduction Algorithm With Pathway Topology-Based Weights To Select Relevant Genes, Suyan Tian, Howard H. Chang, Chi Wang Sep 2016

Biostatistics Faculty Publications

Background: It has been demonstrated that a pathway-based feature selection method, which incorporates biological information within pathways during feature selection, usually outperforms a gene-based feature selection algorithm in terms of predictive accuracy and stability. The significance analysis of microarray-gene set reduction algorithm (SAMGSR), an extension of a gene set analysis method that further reduces the selected pathways to their respective core subsets, can be regarded as a pathway-based feature selection method.

Methods: In SAMGSR, whether a gene is selected is mainly determined by its expression difference between the phenotypes, and partially by the number of pathways to …