Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 9 of 9

Full-Text Articles in Physical Sciences and Mathematics

Stability And Classification Performance Of Feature Selection Techniques, Huanjing Wang, Taghi Khoshgoftaar, Qianhui Liang Dec 2011

Computer Science Faculty Publications

Feature selection techniques can be evaluated based on either model performance or the stability (robustness) of the technique. The ideal situation is to choose a feature selection technique that is robust to change, while also ensuring that models built with the selected features perform well. One domain where feature selection is especially important is software defect prediction, where large numbers of metrics collected from previous software projects are used to help engineers focus their efforts on the most faulty modules. This study presents a comprehensive empirical examination of seven filter-based feature ranking techniques (rankers) applied to …


Quantifying Computer Network Security, Ian Burchett Dec 2011

Masters Theses & Specialist Projects

Simplifying network security data to the point that it is readily accessible and usable by a wider audience is becoming increasingly important as networks grow larger and security conditions and threats become more dynamic and complex, requiring a broader and more varied security staff. To meet the need for a simple way to quantify the security level of a network, this thesis proposes reducing a network’s security risk level to a single metric. Methods for this simplification of an entire network’s security level are applied to several characteristic networks. Identification of computer network port vulnerabilities from NIST’s National Vulnerability Database …


Measuring Stability Of Threshold-Based Feature Selection Techniques, Huanjing Wang, Taghi Khoshgoftaar Nov 2011

Computer Science Faculty Publications

Feature selection has been applied in many domains, such as text mining and software engineering. Ideally a feature selection technique should produce consistent outputs regardless of minor variations in the input data. Researchers have recently begun to examine the stability (robustness) of feature selection techniques. The stability of a feature selection method is defined as the degree of agreement between its outputs to randomly-selected subsets of the same input data. This study evaluated the stability of 11 threshold-based feature ranking techniques (rankers) when applied to 16 real-world software measurement datasets of different sizes. Experimental results demonstrate that AUC …


Measuring Robustness Of Feature Selection Techniques On Software Engineering Datasets, Huanjing Wang, Taghi Khoshgoftaar, Randall Wald Aug 2011

Computer Science Faculty Publications

Feature Selection is a process which identifies irrelevant and redundant features from a high-dimensional dataset (that is, a dataset with many features), and removes these before further analysis is performed. Recently, the robustness (e.g., stability) of feature selection techniques has been studied, to examine the sensitivity of these techniques to changes in their input data. In this study, we investigate the robustness of six commonly used feature selection techniques as the magnitude of change to the datasets and the size of the selected feature subsets are varied. All experiments were conducted on 16 datasets from three real-world software projects. The …


Implementation Of An Inexpensive 3D Scanner, Robert Michael Sivley May 2011

Mahurin Honors College Capstone Experience/Thesis Projects

The purpose of this project was to make 3D scanning technology accessible to the general public at an affordable price. To accomplish this, a scanner was designed that utilized structured-light projection. Coupled with triangulation, this provided an inexpensive, image-based modeling technique. Software was developed to analyze the required images and generate a 3D model. Users can edit analysis parameters during runtime to better optimize results for specific images. The 3D models generated are stored according to the .obj standard and can be opened in any commercial 3D modeling software. Results were positive, but several issues exist with the chosen scanning …


Empirical Methods For Predicting Student Retention- A Summary From The Literature, Matt Bogard May 2011

Economics Faculty Publications

The vast majority of the literature related to the empirical estimation of retention models includes a discussion of the theoretical retention framework established by Bean, Braxton, Tinto, Pascarella, Terenzini and others (see Bean, 1980; Bean, 2000; Braxton, 2000; Braxton et al., 2004; Chapman and Pascarella, 1983; Pascarella and Terenzini, 1978; St. John and Cabrera, 2000; Tinto, 1975). This body of research provides a starting point for the consideration of which explanatory variables to include in any model specification, as well as identifying possible data sources. The literature separates itself into two major camps including research related to the hypothesis testing …


Efficient Schema Extraction From A Collection Of XML Documents, Vijayeandra Parthepan May 2011

Masters Theses & Specialist Projects

The eXtensible Markup Language (XML) has become the standard format for data exchange on the Internet, providing interoperability between different business applications. Such wide use results in large volumes of heterogeneous XML data, i.e., XML documents conforming to different schemas. Although schemas are important in many business applications, they are often missing in XML documents. In this thesis, we present a suite of algorithms that are effective in extracting schema information from a large collection of XML documents. We propose using the cost of NFA simulation to compute the Minimum Description Length to rank the inferred schema. We also studied …


Visual Knowledge Representation Of Conceptual Semantic Networks, Leyla Zhuhadar, Olfa Nasraoui, Robert Wyatt, Rong Yang Jan 2011

Information Systems Faculty Publications

This article presents methods of using visual analysis to represent large amounts of massive, dynamic, ambiguous data allocated in a repository of learning objects. These methods are based on the semantic representation of these resources. We use a graphical model represented as a semantic graph. The formalization of the semantic graph has been intuitively built to solve a real problem, which is browsing and searching for lectures in a vast repository of colleges/courses located at Western Kentucky University. This study combines Formal Concept Analysis (FCA) with Semantic Factoring to decompose complex, vast concepts into their primitives in order to …


UA3/9/2 I.T. Division Annual Report + Tactical Plan, WKU Information Technology Jan 2011

WKU Archives Records

Annual report of WKU Information Technology Division submitted to WKU President Gary Ransdell. Report is housed in UA3/9/2 Subject Files.