Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 2 of 2
Full-Text Articles in Physical Sciences and Mathematics
Optimal Feature Selection For Nearest Centroid Classifiers, With Applications To Gene Expression Microarrays, Alan R. Dabney, John D. Storey
UW Biostatistics Working Paper Series
Nearest centroid classifiers have recently been successfully employed in high-dimensional applications. A necessary step when building a classifier for high-dimensional data is feature selection. Feature selection is typically carried out by computing univariate statistics for each feature individually, without consideration for how a subset of features performs as a whole. For subsets of a given size, we characterize the optimal choice of features, corresponding to those yielding the smallest misclassification rate. Furthermore, we propose an algorithm for estimating this optimal subset in practice. Finally, we investigate the applicability of shrinkage ideas to nearest centroid classifiers. We use gene-expression microarrays for …
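As a rough illustration of the conventional approach this abstract contrasts itself with, the sketch below ranks features one at a time by a univariate statistic and then classifies with a nearest centroid rule. It is not the authors' optimal-subset algorithm, which the listing does not describe. The function names (`top_features`, `centroids`, `predict`), the two-class restriction, the particular score (squared mean difference over summed class variances), and the toy data are all assumptions made for the example.

```python
from statistics import mean, pvariance


def top_features(X, y, k):
    """Rank features individually (two classes only) and keep the k best.

    Score = squared difference of class means / sum of class variances.
    This is an assumed, illustrative univariate statistic.
    """
    a, b = sorted(set(y))  # raises ValueError unless exactly two classes
    scores = []
    for j in range(len(X[0])):
        col_a = [row[j] for row, lab in zip(X, y) if lab == a]
        col_b = [row[j] for row, lab in zip(X, y) if lab == b]
        spread = pvariance(col_a) + pvariance(col_b) + 1e-12  # avoid divide-by-zero
        scores.append(((mean(col_a) - mean(col_b)) ** 2 / spread, j))
    best = sorted(scores, reverse=True)[:k]
    return sorted(j for _, j in best)


def centroids(X, y, features):
    """Per-class mean of each selected feature."""
    result = {}
    for label in sorted(set(y)):
        rows = [row for row, lab in zip(X, y) if lab == label]
        result[label] = [mean(r[j] for r in rows) for j in features]
    return result


def predict(x, cents, features):
    """Assign x to the class whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((x[j] - cj) ** 2 for j, cj in zip(features, c))
    return min(cents, key=lambda label: dist(cents[label]))


# Toy data (hypothetical): 4 samples, 3 features, 2 classes.
X = [[1.0, 5.0, 0.2], [1.2, 4.0, 0.1], [3.0, 5.5, 0.3], [3.2, 4.5, 0.2]]
y = ["a", "a", "b", "b"]

selected = top_features(X, y, k=1)
cents = centroids(X, y, selected)
label = predict([2.9, 4.0, 0.1], cents, selected)
```

Because each feature is scored in isolation, this ranking ignores how the chosen features perform together, which is the limitation the abstract points to.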
A Platform-Independent Software Suite For Statistical Analysis Of High Dimensional Biology Data, David B. Allison, Jacob P. L. Brand, Jode W. Edwards, Gary L. Gadbury, Kyoungmi Kim, Tapan Mehta, Grier P. Page, Amit Patki, Vinodh Srinivasasainagendra, Prinal Trivedi, Jelai Wang, Stanislav O. Zakharkin
Mathematics and Statistics Faculty Research & Creative Works
Many efforts in microarray data analysis are focused on providing tools and methods for the qualitative analysis of microarray data. HDBStat! (High-Dimensional Biology-Statistics) is a software package designed for the analysis of high-dimensional biology data such as microarray data. It was initially developed for the analysis of microarray gene expression data, but it can also be used for some applications in proteomics and other aspects of genomics. HDBStat! provides statisticians and biologists with a flexible and easy-to-use interface for analyzing complex microarray data using a variety of methods for data preprocessing, quality control analysis, and hypothesis testing.