Open Access. Powered by Scholars. Published by Universities.®

Databases and Information Systems Commons

Western Kentucky University

Articles 1 - 30 of 42

Full-Text Articles in Databases and Information Systems

Index Bucketing: A Novel Approach To Manipulating Data Structures, Jeffrey Myers Dec 2023

Masters Theses & Specialist Projects

Handling nested data collections in large-scale distributed systems poses considerable challenges in query processing, often resulting in substantial costs and error susceptibility. While substantial efforts have been directed toward overcoming computation hurdles in querying vast data collections within relational databases, scant attention has been devoted to the manipulation and flattening procedures necessary for unnesting these data collections. Flattening operations, integral to unnesting, frequently yield copious duplicate data and entail a loss of information, with no mechanism for reconstructing the original structure. These challenges are exacerbated in scenarios involving skewed, nested data with irregular inner data collections. Processing such data demands an …
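
The duplication and information loss that this abstract attributes to flattening can be illustrated with a minimal Python sketch. It is not the thesis's index bucketing method, which the truncated abstract does not describe; the `flatten` helper and the `orders` data are hypothetical and exist only to show the problem.

```python
from typing import Any


def flatten(records: list[dict[str, Any]], inner_key: str) -> list[dict[str, Any]]:
    """Unnest `inner_key`: emit one flat row per element of the inner collection."""
    rows = []
    for record in records:
        # Every outer field is copied into each emitted row (duplication).
        outer = {k: v for k, v in record.items() if k != inner_key}
        for item in record[inner_key]:
            rows.append({**outer, **item})
    return rows


# Hypothetical skewed data: inner collections of very different sizes.
orders = [
    {"customer": "a", "region": "east", "items": [{"sku": 1}, {"sku": 2}, {"sku": 3}]},
    {"customer": "b", "region": "west", "items": []},
]

flat = flatten(orders, "items")

# The outer fields of customer "a" now appear once per inner item, and
# customer "b" (empty inner collection) has vanished from the flat output,
# so the original nesting cannot be rebuilt from `flat` alone.
customers_after = {row["customer"] for row in flat}
```

Keeping a separate record of which outer entry each flat row came from is one generic way to make the operation reversible; whether the thesis does this is not stated in the visible text.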


Buffer Overflow And Sql Injection In C++, Noah Warren Kapley Apr 2021

Masters Theses & Specialist Projects

Buffer overflows and SQL injection have plagued programmers for many years. A successful buffer overflow, innocuous or not, damages a computer’s permanent memory. This thesis presents safer versions of the C programs that perform string concatenation, string copy, and format get string, a C program which in most cases takes input and output from a keyboard. The safer string concatenation and string copy programs require the programmer to specify the amount of storage space necessary for the program’s execution. This safety mechanism is designed to help programmers avoid overspecifying the amount of storage …


A Proposed Frequency-Based Feature Selection Method For Cancer Classification, Yi Pan Apr 2017

Masters Theses & Specialist Projects

Feature selection is becoming an essential procedure in the data preprocessing step. The feature selection problem can affect the efficiency and accuracy of classification models, and it therefore bears on whether a classification model can perform reliably. In this study, we compared an original feature selection method and a proposed frequency-based feature selection method with four classification models and three filter-based ranking techniques using a cancer dataset. The proposed method was implemented in WEKA, an open-source software package. Performance was evaluated with two measures: Recall and Receiver Operating Characteristic (ROC). Finally, we found the …


Mabic: Mobile Application Builder For Interactive Communication, Huy Manh Nguyen Oct 2016

Masters Theses & Specialist Projects

Web services and mobile technology have advanced to a whole new level. These technologies make modern communication faster and more convenient than traditional methods. People can easily share data, pictures, and video instantly, which also saves time and money. For example, sending an email or text message is cheaper and faster than sending a letter. Interactive communication allows the instant exchange of feedback and enables two-way communication between people, or between people and computers. It increases the engagement of sender and receiver in communication.

Although many systems such as REDCap and Taverna are built for …


Alcts Crs Holdings Information Forum, 3-4 P.M. January 31, 2015, Connie Foster Jan 2015

Connie Foster

Cecilia Genereux (data management & access/metadata & intellectual access, University of Minnesota Libraries) introduced the session by confessing to a pun intended for her presentation: Alma: To Have and to Hold. The levity quickly shifted into some very detailed analysis of the way the Ex Libris Alma system handled specific types of serials during a migration from Aleph. The University of Minnesota started with Aleph (Ex Libris) in 2002 and moved to Alma on December 26, 2013. Frances McNamara (director, Integrated Library Systems and Administrative and Desktop Systems at University of Chicago) discussed migrating serials data from Horizon to Kuali …


Advances Of Scientific Research On Technology Enhanced Learning In Social Networks And Mobile Contexts: Towards High Effective Educational Platforms For Next Generation Education, Leyla Zhuhadar, Miltiadis Lytras, Jacky Xi Zhang, Eugenius Kurilovas Oct 2014

Information Systems Faculty Publications

This editorial presents the latest advances in scientific research on technology-enhanced learning in social networks and mobile contexts. It summarizes the key findings and promotes three main pillars for future scientific contribution to the domain, namely: Enabling Technologies, Educational Strategies, and Next Generation Social Networks for Educational Purposes. It can serve as a position document for scientific debate fostering international collaboration and empirical research in the various aspects of the well-defined agenda. It can also serve as a reference edition for researchers interested in the adoption of social networks in the education sector.


[Sabbatical Report], Huanjing Wang Apr 2014

Sabbatical Reports

My sabbatical leave was conducted during the Spring 2014 semester. The leave was successful because it strengthened my research in the data mining and software engineering domains and resulted in four full-paper publications in peer-reviewed international conferences and one journal paper (to be submitted to a peer-reviewed journal). The purpose of my sabbatical was to complete two main projects: (1) investigate the stability and defect prediction model performance of feature selection techniques together on real-world software metrics data and (2) design a novel, robust, and efficient metric selection method for imbalanced data.


Has Safeer Improved Sacm's Work And Helped Saudi Students In The Usa Resolve Their Needs Quickly, Faisal M. Alzomily Aug 2013

Masters Theses & Specialist Projects

This study examined the efficiency of the Safeer system by gathering and analyzing the perceptions of 131 Saudi students from Bowling Green, KY. The purpose of the study was to ensure that the system is able to perform its function as the bridge between different institutions and Saudi students studying in the US who require assistance in processing their academic requirements. A self-administered survey using a five-point scale was employed. Results were summarized using descriptive statistics at a 95% confidence level. The results confirmed the hypothesis that the use of the Safeer program provides quality service delivery within SACM, which in turn benefits …


Hybrid Methods For Feature Selection, Iunniang Cheng May 2013

Masters Theses & Specialist Projects

Feature selection is one of the important data preprocessing steps in data mining. The feature selection problem involves finding a feature subset such that a classification model built only with this subset would have better predictive accuracy than a model built with the complete set of features. In this study, we propose two hybrid methods for feature selection. The best features are selected through either the hybrid methods or existing feature selection methods. Next, the reduced dataset is used to build classification models using five classifiers. The classification accuracy was evaluated in terms of the area under the Receiver Operating Characteristic …


Ikriya: Simulating Software Quality Enhancement With Selected Replacement Policies, Sindhu Dharani Murthy May 2013

Masters Theses & Specialist Projects

The quality of information systems in any organization helps to determine the efficiency of the organization. Many organizations maintain a custom software portfolio, whose quality is important to the organization. Management would like to optimize the portfolio’s quality. Decisions about software replacement or enhancement are made based on organizational needs and priorities. The development resources allocated help in determining the quality of new software, and should be put to optimal use. Enhancing existing software might sound cheap and easy but it is not always efficient. This thesis proposes a simulation model - iKriya - for this problem. It explores the …


A Hybrid Recommendation System Based On Association Rules, Ahmed Alsalama May 2013

Masters Theses & Specialist Projects

Recommendation systems are widely used in e-commerce applications. The engine of a current recommendation system recommends items to a particular user based on user preferences and previous high ratings. Various recommendation schemes such as collaborative filtering and content-based approaches are used to build a recommendation system. Most current recommendation systems were developed to fit a certain domain such as books, articles, or movies. We propose a hybrid framework recommendation system to be applied on two-dimensional spaces (User × Item) with a large number of users and a small number of items. Moreover, our proposed framework makes use of …


Integrating Expert System And Geographic Information System For Spatial Decision Making, Sriharsha Shesham Dec 2012

Masters Theses & Specialist Projects

Spatial decision making is the process of providing an effective solution to a problem that encompasses semi-structured spatial data. It is a challenging task that involves many factors. For example, in order to build a new industry, an appropriate site must be selected, and several factors have to be taken into consideration. Factors that can affect the decision in this particular case include air pollution, noise pollution, and distance from living areas, all of which make the decision difficult. Geographic information systems (GIS) and expert systems (ES) have many advantages in solving problems in …


Dynamic Scoping For Browser Based Access Control System, Vinaykumar Nadipelly May 2012

Masters Theses & Specialist Projects

We have inorganically increased the use of web applications to the point of using them for almost everything and making them an essential part of our everyday lives. As a result, the enhancement of privacy and security policies for web applications is becoming increasingly essential. The importance and stateless nature of the web infrastructure have made the web a preferred target of attacks. The current web access control system is one reason behind the success of these attacks. The current web consists of two major components, the browser and the server, where an effective access control system needs to be implemented. …


Dynamic Data Extraction And Data Visualization With Application To The Kentucky Mesonet, Anoop Rao Paidipally May 2012

Masters Theses & Specialist Projects

There is a need to integrate large-scale databases, high-performance computing engines, and geographical information system technologies into a user-friendly web interface as a platform for data visualization and customized statistical analysis. We present some concepts and design ideas regarding dynamic data storage and extraction by making use of open-source computing and mapping technologies. We applied our methods to the Kentucky Mesonet automated weather mapping workflow. The main components of the workflow include a web-based interface, a robust database, and computing infrastructure designed for both general users and power users such as modelers and researchers.


Organizational Search In Email Systems, Sruthi Bhushan Pitla May 2012

Masters Theses & Specialist Projects

The storage space for emails has been increasing at a rapid pace. Email systems still serve as very important data repositories in which many users store different kinds of information that they use in their daily activities. Due to the rapidly increasing volume of email data, there is a need to maintain the data in the most efficient way. It is also very important to provide intuitive and flexible search utilities that give better access to the information in email repositories, especially in an enterprise or organizational setting. In order to implement the functionality, we are …


Ensemble Of Feature Selection Techniques For High Dimensional Data, Sri Harsha Vege May 2012

Masters Theses & Specialist Projects

Data mining involves the use of data analysis tools to discover previously unknown, valid patterns and relationships from large amounts of data stored in databases, data warehouses, or other information repositories. Feature selection is an important preprocessing step of data mining that helps increase the predictive performance of a model. The main aim of feature selection is to choose a subset of features with high predictive information and eliminate irrelevant features with little or no predictive information. Using a single feature selection technique may generate local optima.

In this thesis we propose an ensemble approach for feature selection, where multiple …


User Expectations Of Library Genealogy Databases V. What They Actually Get, Rosemary L. Meszaros, Katherine Pennavaria Apr 2012

DLPS Faculty Publications

An analysis and comparison of two genealogical databases: Ancestry.com and Heritagequest.com.


Quantifying Computer Network Security, Ian Burchett Dec 2011

Masters Theses & Specialist Projects

Simplifying network security data so that it is readily accessible and usable by a wider audience is becoming increasingly important as networks grow larger and security conditions and threats become more dynamic and complex, requiring a broader and more varied security staff. To meet the need for a simple way to quantify the security level of a network, this thesis proposes reducing a network’s security risk level to a single metric. Methods for simplifying an entire network’s security level in this way are applied to several characteristic networks. Identification of computer network port vulnerabilities from NIST’s Network Vulnerability Database …


Empirical Methods For Predicting Student Retention- A Summary From The Literature, Matt Bogard May 2011

Economics Faculty Publications

The vast majority of the literature related to the empirical estimation of retention models includes a discussion of the theoretical retention framework established by Bean, Braxton, Tinto, Pascarella, Terenzini and others (see Bean, 1980; Bean, 2000; Braxton, 2000; Braxton et al., 2004; Chapman and Pascarella, 1983; Pascarella and Terenzini, 1978; St. John and Cabrera, 2000; Tinto, 1975). This body of research provides a starting point for the consideration of which explanatory variables to include in any model specification, as well as identifying possible data sources. The literature separates itself into two major camps including research related to the hypothesis testing …


Efficient Schema Extraction From A Collection Of Xml Documents, Vijayeandra Parthepan May 2011

Masters Theses & Specialist Projects

The eXtensible Markup Language (XML) has become the standard format for data exchange on the Internet, providing interoperability between different business applications. Such wide use results in large volumes of heterogeneous XML data, i.e., XML documents conforming to different schemas. Although schemas are important in many business applications, they are often missing in XML documents. In this thesis, we present a suite of algorithms that are effective in extracting schema information from a large collection of XML documents. We propose using the cost of NFA simulation to compute the Minimum Description Length used to rank the inferred schemas. We also studied …


Visual Knowledge Representation Of Conceptual Semantic Networks, Leyla Zhuhadar, Olfa Nasraoui, Robert Wyatt, Rong Yang Jan 2011

Information Systems Faculty Publications

This article presents methods of using visual analysis to represent large amounts of dynamic, ambiguous data held in a repository of learning objects. These methods are based on the semantic representation of these resources. We use a graphical model represented as a semantic graph. The formalization of the semantic graph has been intuitively built to solve a real problem, which is browsing and searching for lectures in a vast repository of colleges/courses located at Western Kentucky University. This study combines Formal Concept Analysis (FCA) with Semantic Factoring to decompose complex, vast concepts into their primitives in order to …


Ua3/9/2 I.T. Division Annual Report + Tactical Plan, Wku Information Technology Jan 2011

WKU Archives Records

Annual report of WKU Information Technology Division submitted to WKU President Gary Ransdell. Report is housed in UA3/9/2 Subject Files.


A Comparative Study Of Threshold-Based Feature Selection Techniques, Huanjing Wang, Taghi M. Khoshgoftaar, Jason Van Hulse Aug 2010

Dr. Huanjing Wang

Given high-dimensional software measurement data, researchers and practitioners often use feature (metric) selection techniques to improve the performance of software quality classification models. This paper presents our newly proposed threshold-based feature selection techniques, comparing the performance of these techniques by building classification models using five commonly used classifiers. In order to evaluate the effectiveness of different feature selection techniques, the models are evaluated using eight different performance metrics separately since a given performance metric usually captures only one aspect of the classification performance. All experiments are conducted on three Eclipse data sets with different levels of class imbalance. The …


A Comparative Study Of Filter-Based Feature Ranking Techniques, Huanjing Wang, Taghi M. Khoshgoftaar, Kehan Gao Aug 2010

Dr. Huanjing Wang

One factor that affects the success of machine learning is the presence of irrelevant or redundant information in the training data set. Filter-based feature ranking techniques (rankers) rank the features according to their relevance to the target attribute and we choose the most relevant features to build classification models subsequently. In order to evaluate the effectiveness of different feature ranking techniques, a commonly used method is to assess the classification performance of models built with the respective selected feature subsets in terms of a given performance metric (e.g., classification accuracy or misclassification rate). Since a given performance metric usually can …
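
The general idea of a filter-based ranker described above (score each feature by its relevance to the target, then keep the top-ranked ones) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not one of the rankers studied in the paper: absolute Pearson correlation is swapped in as the relevance score, and the function names and the small metrics table are hypothetical.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(columns: dict[str, list[float]], target: list[float]) -> list[tuple[str, float]]:
    """Score every feature independently of any classifier, highest score first."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


def select_top_k(columns: dict[str, list[float]], target: list[float], k: int) -> list[str]:
    """Keep the names of the k highest-ranked features."""
    return [name for name, _ in rank_features(columns, target)[:k]]


# Hypothetical software metrics for six modules; target 1 marks a defective module.
metrics = {
    "loc": [10, 20, 30, 40, 50, 60],
    "noise": [3, 1, 2, 3, 1, 2],
    "constant": [5, 5, 5, 5, 5, 5],
}
defective = [0, 0, 0, 1, 1, 1]

ranking = rank_features(metrics, defective)
selected = select_top_k(metrics, defective, k=1)
```

Because the scores are computed without training a classifier, the selected subset can then be handed to any classification model and judged with whichever performance metric is of interest.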


A Comparative Study Of Threshold-Based Feature Selection Techniques, Huanjing Wang, Taghi M. Khoshgoftaar, Jason Van Hulse Aug 2010

Computer Science Faculty Publications

Given high-dimensional software measurement data, researchers and practitioners often use feature (metric) selection techniques to improve the performance of software quality classification models. This paper presents our newly proposed threshold-based feature selection techniques, comparing the performance of these techniques by building classification models using five commonly used classifiers. In order to evaluate the effectiveness of different feature selection techniques, the models are evaluated using eight different performance metrics separately since a given performance metric usually captures only one aspect of the classification performance. All experiments are conducted on three Eclipse data sets with different levels of class imbalance. The …


A Comparative Study Of Filter-Based Feature Ranking Techniques, Huanjing Wang, Taghi M. Khoshgoftaar, Kehan Gao Aug 2010

Computer Science Faculty Publications

One factor that affects the success of machine learning is the presence of irrelevant or redundant information in the training data set. Filter-based feature ranking techniques (rankers) rank the features according to their relevance to the target attribute and we choose the most relevant features to build classification models subsequently. In order to evaluate the effectiveness of different feature ranking techniques, a commonly used method is to assess the classification performance of models built with the respective selected feature subsets in terms of a given performance metric (e.g., classification accuracy or misclassification rate). Since a given performance metric usually can …


Information Technology Implementation Decisions To Support The Kentucky Mesonet, D. Michael Grogan Apr 2010

Masters Theses & Specialist Projects

The Kentucky Mesonet is a high-density, mesoscale network of automated meteorological and climatological sensing platforms being developed across the commonwealth. Data communications, collection, processing, and delivery mechanisms play a critical role in such networks, and the World Meteorological Organization recognizes that “an observing system is not complete unless it is connected to other systems that deliver the data to the users.” This document reviews the implementation steps, decisions, and rationale surrounding communications and computing infrastructure development to support the Mesonet. A general overview of the network and technology-related research is provided followed by a review of pertinent literature related to …


Mining Data From Multiple Software Development Projects, Huanjing Wang, Taghi M. Khoshgoftaar, Kehan Gao, Naeem Seliya Dec 2009

Dr. Huanjing Wang

A large system often goes through multiple software project development cycles, in part due to changes in operation and development environments. For example, rapid turnover of the development team between releases can influence software quality, making it important to mine software project data over multiple system releases when building defect predictors. Data collection of software attributes is often conducted independently of the quality improvement goals, leading to the availability of a large number of attributes for analysis. The problems associated with variations in development process, data collection, and quality goals from one release to another emphasize the importance of …


Mining Data From Multiple Software Development Projects, Huanjing Wang, Taghi M. Khoshgoftaar, Kehan Gao, Naeem Seliya Dec 2009

Computer Science Faculty Publications

A large system often goes through multiple software project development cycles, in part due to changes in operation and development environments. For example, rapid turnover of the development team between releases can influence software quality, making it important to mine software project data over multiple system releases when building defect predictors. Data collection of software attributes is often conducted independently of the quality improvement goals, leading to the availability of a large number of attributes for analysis. The problems associated with variations in development process, data collection, and quality goals from one release to another emphasize the importance of …


High-Dimensional Software Engineering Data And Feature Selection, Huanjing Wang, Taghi M. Khoshgoftaar, Kehan Gao Nov 2009

Dr. Huanjing Wang

Software metrics collected during project development play a critical role in software quality assurance. A software practitioner is very keen on learning which software metrics to focus on for software quality prediction. While a concise set of software metrics is often desired, a typical project collects a very large number of metrics. Minimal attention has been devoted to finding the minimum set of software metrics that has the same predictive capability as a larger set of metrics, and we strive to address that question in this paper. We present a comprehensive comparison between seven commonly used filter-based feature ranking techniques (FRT) …