Full-Text Articles in Genetics and Genomics
Incorporating Pathway Information Into Feature Selection Towards Better Performed Gene Signatures, Suyan Tian, Chi Wang, Bing Wang
Biostatistics Faculty Publications
To analyze gene expression data with sophisticated grouping structures and to extract hidden patterns from such data, feature selection is of critical importance. It is well known that genes do not function in isolation but rather work together within various metabolic, regulatory, and signaling pathways. If the biological knowledge contained within these pathways is taken into account, the resulting method is a pathway-based algorithm. Studies have demonstrated that a pathway-based method usually outperforms its gene-based counterpart, in which no biological knowledge is considered. In this article, pathway-based feature selection is first divided into three major categories, namely, pathway-level selection, …
Feature Selection For Longitudinal Data By Using Sign Averages To Summarize Gene Expression Values Over Time, Suyan Tian, Chi Wang
Biostatistics Faculty Publications
With the rapid evolution of high-throughput technologies, time series/longitudinal high-throughput experiments have become possible and affordable. However, the development of statistical methods dealing with gene expression profiles across time points has not kept up with the explosion of such data. The feature selection process is of critical importance for longitudinal microarray data. In this study, we proposed aggregating a gene’s expression values across time into a single value using the sign average method, thereby reducing a longitudinal feature selection process to a classic one. Regularized logistic regression models with pseudogenes (i.e., the sign average of genes across time as predictors) …
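As a rough illustration of collapsing a gene's values across time into one number per sample, the sketch below uses one plausible reading of a "sign average": each time point gets a sign (+1, -1, or 0) taken from the difference in class means at that time point, and the signed values are then averaged over time. This definition, the function names, and the toy data are assumptions made for illustration; the truncated abstract does not specify the exact formula, so this should not be read as the authors' implementation.

```python
from statistics import mean


def sign(x):
    """Return +1, -1, or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def time_point_signs(expr, labels):
    """Assumed rule: the sign at each time point is the sign of
    (mean of class 1) - (mean of class 0) for this gene.

    expr   : list of samples; each sample is a list of one gene's
             expression values across T time points
    labels : list of 0/1 class labels, one per sample
    """
    n_times = len(expr[0])
    signs = []
    for t in range(n_times):
        class1 = [s[t] for s, y in zip(expr, labels) if y == 1]
        class0 = [s[t] for s, y in zip(expr, labels) if y == 0]
        signs.append(sign(mean(class1) - mean(class0)))
    return signs


def sign_average(expr, labels):
    """Collapse each sample's time series into a single summary value."""
    signs = time_point_signs(expr, labels)
    return [mean([s * v for s, v in zip(signs, sample)]) for sample in expr]


# Hypothetical toy data: 4 samples, 2 time points, one gene.
toy_expr = [[1.0, 5.0], [2.0, 4.0], [3.0, 1.0], [4.0, 2.0]]
toy_labels = [0, 0, 1, 1]
summary = sign_average(toy_expr, toy_labels)
```

The resulting one-value-per-sample summary could then be passed to any standard (non-longitudinal) feature selection or regularized regression routine.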
Unified Methods For Feature Selection In Large-Scale Genomic Studies With Censored Survival Outcomes, Lauren Spirko-Burns, Karthik Devarajan
COBRA Preprint Series
One of the major goals in large-scale genomic studies is to identify genes with a prognostic impact on time-to-event outcomes, which provide insight into the disease process. With rapid developments in high-throughput genomic technologies in the past two decades, the scientific community is able to monitor the expression levels of tens of thousands of genes and proteins, resulting in enormous data sets where the number of genomic features is far greater than the number of subjects. Methods based on univariate Cox regression are often used to select genomic features related to survival outcome; however, the Cox model assumes proportional hazards …
A Longitudinal Feature Selection Method Identifies Relevant Genes To Distinguish Complicated Injury And Uncomplicated Injury Over Time, Suyan Tian, Chi Wang, Howard H. Chang
Biostatistics Faculty Publications
Background: Feature selection and gene set analysis are of increasing interest in the field of bioinformatics. While these two approaches have been developed for different purposes, we describe how some gene set analysis methods can be utilized to conduct feature selection.
Methods: We adopted a gene set analysis method, the significance analysis of microarray gene set reduction (SAMGSR) algorithm, to carry out feature selection for longitudinal gene expression data.
Results: Using a real-world application and simulated data, it is demonstrated that the proposed SAMGSR extension outperforms other relevant methods. In this study, we illustrate that a gene’s expression profiles over …
Global Analysis Of Gene Expression And Projection Target Correlations In The Mouse Brain, Ahmed Fakhry, Tao Zeng, Hanchuan Peng, Shuiwang Ji
Computer Science Faculty Publications
Recent studies have shown that projection targets in the mouse neocortex are correlated with their gene expression patterns. However, a brain-wide quantitative analysis of the relationship between voxel genetic composition and their projection targets is lacking to date. Here we extended those studies to perform a global, integrative analysis of gene expression and projection target correlations in the mouse brain. By using the Allen Brain Atlas data, we analyzed the relationship between gene expression and projection targets. We first visualized and clustered the two data sets separately and showed that they both exhibit strong spatial autocorrelation. Building upon this initial …
A Comparative Study Of Different Machine Learning Methods On Microarray Gene Expression Data, Mehdi Pirooznia, Jack Y. Yang, Mary Qu Yang, Youping Deng
Faculty Publications
Background
Several classification and feature selection methods have been studied for the identification of differentially expressed genes in microarray data. Classification methods such as SVM, RBF Neural Nets, MLP Neural Nets, Bayesian, Decision Tree and Random Forest methods have been used in recent studies. The accuracy of these methods has been calculated with validation methods such as v-fold validation. However, there is a lack of comparison between these methods to find a better framework for classification, clustering and analysis of microarray gene expression results.
Results
In this study, we compared the efficiency of the classification methods including SVM, RBF Neural Nets, …
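The v-fold validation mentioned in the Background above can be sketched generically as follows. This is a minimal, standard-library illustration of the technique, not the procedure used in the study: the function names, the `fit`/`predict` callback interface, and the default of 5 folds are all assumptions chosen for the example.

```python
import random


def v_fold_indices(n_samples, v, seed=0):
    """Yield (train, test) index lists for v-fold cross-validation.

    Samples are shuffled once, then dealt into v folds; each fold
    serves as the test set exactly once.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::v] for i in range(v)]
    for k in range(v):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


def cross_validated_accuracy(X, y, fit, predict, v=5, seed=0):
    """Estimate accuracy by pooling predictions over all v test folds.

    fit(X_train, y_train) -> model     (hypothetical callback)
    predict(model, x)     -> label     (hypothetical callback)
    """
    correct = 0
    for train, test in v_fold_indices(len(y), v, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in test)
    return correct / len(y)
```

Any of the classifiers listed above could be plugged in through the two callbacks, which keeps the comparison between methods on identical train/test splits as long as the same `seed` is used.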
Improving Prediction Accuracy Of Tumor Classification By Reusing Genes Discarded During Gene Selection, Jack Y. Yang, Guo-Zheng Li, Hao-Hua Meng, Mary Qu Yang, Youping Deng
Faculty Publications
Background
Since the high dimensionality of gene expression microarray data sets degrades the generalization performance of classifiers, feature selection, which selects relevant features and discards irrelevant and redundant features, has been widely used in the bioinformatics field. Multi-task learning is a novel technique to improve prediction accuracy of tumor classification by using information contained in such discarded redundant features, but which features should be discarded or used as input or output remains an open issue.
Results
We demonstrate a framework for automatically selecting features to be input, output, and discarded by using a genetic algorithm, and propose two algorithms: GA-MTL …