Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 6 of 6
Full-Text Articles in Physical Sciences and Mathematics
ALCTS CRS Holdings Information Forum, 3-4 P.M. January 31, 2015, Connie Foster
Connie Foster
Cecilia Genereux (data management & access/metadata & intellectual access, University of Minnesota Libraries) introduced the session by confessing to a pun intended for her presentation: Alma: To Have and to Hold. The levity quickly shifted into some very detailed analysis of the way the Ex Libris Alma system handled specific types of serials during a migration from Aleph. The University of Minnesota started with Aleph (Ex Libris) in 2002 and moved to Alma on December 26, 2013. Frances McNamara (director, Integrated Library Systems and Administrative and Desktop Systems at University of Chicago) discussed migrating serials data from Horizon to Kulai …
A Comparative Study Of Threshold-Based Feature Selection Techniques, Huanjing Wang, Taghi M. Khoshgoftaar, Jason Van Hulse
Dr. Huanjing Wang
Given high-dimensional software measurement data, researchers and practitioners often use feature (metric) selection techniques to improve the performance of software quality classification models. This paper presents our newly proposed threshold-based feature selection techniques, comparing the performance of these techniques by building classification models using five commonly used classifiers. In order to evaluate the effectiveness of different feature selection techniques, the models are evaluated using eight different performance metrics separately since a given performance metric usually captures only one aspect of the classification performance. All experiments are conducted on three Eclipse data sets with different levels of class imbalance. The …
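The point that a single performance metric captures only one aspect of classification performance can be illustrated with a minimal sketch. The listing does not say which eight metrics the paper uses, so the six below are an assumption chosen from standard confusion-matrix definitions, and the toy labels are hypothetical.

```python
from math import sqrt


def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (1 = fault-prone, 0 = not)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate(actual, predicted):
    """Compute several standard metrics; undefined ratios fall back to 0.0."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    total = tp + tn + fp + fn
    tpr = tp / (tp + fn) if tp + fn else 0.0        # recall on the positive class
    tnr = tn / (tn + fp) if tn + fp else 0.0        # recall on the negative class
    precision = tp / (tp + fp) if tp + fp else 0.0
    f_measure = (2 * precision * tpr / (precision + tpr)
                 if precision + tpr else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "true_positive_rate": tpr,
        "true_negative_rate": tnr,
        "precision": precision,
        "f_measure": f_measure,
        "geometric_mean": sqrt(tpr * tnr),
    }


# Hypothetical imbalanced data: nine clean modules, one fault-prone module,
# and a classifier that labels every module as clean.
actual = [0] * 9 + [1]
predicted = [0] * 10
results = evaluate(actual, predicted)
```

On this toy data the accuracy is high while the true positive rate and geometric mean are zero, which is why evaluating on several metrics separately matters for data sets with class imbalance.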
A Comparative Study Of Filter-Based Feature Ranking Techniques, Huanjing Wang, Taghi M. Khoshgoftaar, Kehan Gao
Dr. Huanjing Wang
One factor that affects the success of machine learning is the presence of irrelevant or redundant information in the training data set. Filter-based feature ranking techniques (rankers) rank the features according to their relevance to the target attribute, and the most relevant features are then chosen to build classification models. In order to evaluate the effectiveness of different feature ranking techniques, a commonly used method is to assess the classification performance of models built with the respective selected feature subsets in terms of a given performance metric (e.g., classification accuracy or misclassification rate). Since a given performance metric usually can …
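The rank-then-select workflow described above can be sketched as follows. The listing does not name the rankers the paper compares, so absolute Pearson correlation with the class label is used here purely as a stand-in relevance score; the function names and toy data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(rows, labels):
    """Return feature indices ordered from most to least relevant."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Descending score; ties are broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores]


def select_top_k(rows, ranking, k):
    """Keep only the k highest-ranked feature columns, in ranked order."""
    keep = ranking[:k]
    return [[row[j] for j in keep] for row in rows]


# Hypothetical toy data: three software metrics per module, label 1 = fault-prone.
rows = [
    [10, 5, 1],
    [20, 5, 0],
    [30, 5, 1],
    [40, 5, 0],
]
labels = [0, 0, 1, 1]

ranking = rank_features(rows, labels)
reduced = select_top_k(rows, ranking, 1)
```

The reduced data set would then be passed to a classifier. Because the ranker scores each feature independently of the others and of the classifier, it counts as a filter rather than a wrapper approach.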
Mining Data From Multiple Software Development Projects, Huanjing Wang, Taghi M. Khoshgoftaar, Kehan Gao, Naeem Seliya
Dr. Huanjing Wang
A large system often goes through multiple software project development cycles, in part due to changes in operation and development environments. For example, rapid turnover of the development team between releases can influence software quality, making it important to mine software project data over multiple system releases when building defect predictors. Data collection of software attributes is often conducted independent of the quality improvement goals, leading to the availability of a large number of attributes for analysis. Given the problems associated with variations in development process, data collection, and quality goals from one release to another emphasizes the importance of …
High-Dimensional Software Engineering Data And Feature Selection, Huanjing Wang, Taghi M. Khoshgoftaar, Kehan Gao
Dr. Huanjing Wang
Software metrics collected during project development play a critical role in software quality assurance. A software practitioner is very keen on learning which software metrics to focus on for software quality prediction. While a concise set of software metrics is often desired, a typical project collects a very large number of metrics. Minimal attention has been devoted to finding the minimum set of software metrics that has the same predictive capability as a larger set of metrics; we strive to answer that question in this paper. We present a comprehensive comparison between seven commonly used filter-based feature ranking techniques (FRT) …
An Empirical Investigation Of Filter Attribute Selection Techniques For Software Quality Classification, Kehan Gao, Taghi M. Khoshgoftaar, Huanjing Wang
Dr. Huanjing Wang
Attribute selection is an important activity in data preprocessing for software quality modeling and other data mining problems. Software quality models have been used to improve the fault detection process. Finding faulty components in a software system during the early stages of the software development process can lead to a more reliable final product and can reduce development and maintenance costs. Some studies have shown that the prediction accuracy of the models improves when irrelevant and redundant features are removed from the original data set. In this study, we investigated four filter attribute selection techniques, Automatic Hybrid Search (AHS), …