Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Articles 1 - 11 of 11

Full-Text Articles in Computer Sciences

Comprehensive Evaluation Of Association Measures For Fault Localization, Lucia Lucia, David Lo, Lingxiao Jiang, Aditya Budi Nov 2011

David LO

The statistics and data mining communities have proposed many measures to gauge the strength of association between two variables of interest, such as odds ratio, confidence, Yule-Y, Yule-Q, Kappa, and the Gini index. These association measures have been used in various domains, for example, to evaluate whether a particular medical practice is positively associated with a cure for a disease or whether a particular marketing strategy is positively associated with an increase in revenue. This paper models the problem of locating faults as association between the execution or non-execution of particular program elements with failures. There have been …
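To make the idea in this abstract concrete, here is a minimal sketch of how association measures could score program elements, assuming each element is summarized by a 2x2 table of executed/not-executed versus failing/passing runs. The `Counts` class, the function names, and the 0.5 smoothing constant are illustrative assumptions, not the paper's implementation or its full set of measures.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical 2x2 contingency table for one program element."""
    exec_fail: int  # failing runs that executed the element
    exec_pass: int  # passing runs that executed the element
    skip_fail: int  # failing runs that did not execute the element
    skip_pass: int  # passing runs that did not execute the element


def odds_ratio(c: Counts, eps: float = 0.5) -> float:
    # Assumption: add eps to every cell so empty cells do not divide by zero.
    return ((c.exec_fail + eps) * (c.skip_pass + eps)) / (
        (c.exec_pass + eps) * (c.skip_fail + eps)
    )


def yule_q(c: Counts, eps: float = 0.5) -> float:
    # Yule's Q rescales the odds ratio into the range (-1, 1).
    ratio = odds_ratio(c, eps)
    return (ratio - 1) / (ratio + 1)


def confidence(c: Counts) -> float:
    # Fraction of runs executing the element that ended in failure.
    executed = c.exec_fail + c.exec_pass
    return c.exec_fail / executed if executed else 0.0


def rank_elements(spectra: dict) -> list:
    # Most suspicious element first, scored here by Yule's Q.
    return sorted(spectra, key=lambda name: yule_q(spectra[name]), reverse=True)


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    spectra = {
        "line_20": Counts(exec_fail=2, exec_pass=3, skip_fail=3, skip_pass=2),
        "line_10": Counts(exec_fail=5, exec_pass=0, skip_fail=0, skip_pass=5),
    }
    print(rank_elements(spectra))
```

An element executed only by failing runs receives a large odds ratio and therefore ranks ahead of an element whose execution is unrelated to failure.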


Artificial Intelligence - I: Adaptive Automated Teller Machines - Part II, Ghulam Mujtaba, Tariq Mahmood Jul 2011

International Conference on Information and Communication Technologies

Nowadays, the banking sector is increasingly relying on Automated Teller Machines (ATMs) to provide services to its customers. Although thousands of ATMs exist across many banks and different locations, the GUI and content of a typical ATM interface remain, more or less, the same. For instance, any ATM provides typical options for withdrawal, electronic funds transfer, viewing of mini-statements, etc. However, such a static interface might not be suitable for all ATM customers, e.g., some users might not prefer to view all the options when they access the ATM, or to view specific withdrawal amounts less than, say, …


Artificial Intelligence – I: Adaptive Automated Teller Machines — Part I, Ghulam Mujtaba, Tariq Mahmood Jul 2011

International Conference on Information and Communication Technologies

During the past few years, the banking sector has started providing a variety of services to its customers. One of the most significant of these services has been the introduction of Automated Teller Machines (ATMs) for providing online support to bank customers. The use of ATMs has reached its zenith in every developed country, and thousands of ATM transactions occur daily. In order to increase customers' satisfaction and to provide them with more user-friendly ATM interfaces, it becomes important to mine the ATM transactions to discover useful patterns in customers' interaction behavior. In …


Empirical Methods For Predicting Student Retention- A Summary From The Literature, Matt Bogard May 2011

Economics Faculty Publications

The vast majority of the literature related to the empirical estimation of retention models includes a discussion of the theoretical retention framework established by Bean, Braxton, Tinto, Pascarella, Terenzini and others (see Bean, 1980; Bean, 2000; Braxton, 2000; Braxton et al., 2004; Chapman and Pascarella, 1983; Pascarella and Terenzini, 1978; St. John and Cabrera, 2000; Tinto, 1975). This body of research provides a starting point for considering which explanatory variables to include in any model specification, as well as for identifying possible data sources. The literature separates itself into two major camps including research related to the hypothesis testing …


Efficient Schema Extraction From A Collection Of XML Documents, Vijayeandra Parthepan May 2011

Masters Theses & Specialist Projects

The eXtensible Markup Language (XML) has become the standard format for data exchange on the Internet, providing interoperability between different business applications. Such wide use results in large volumes of heterogeneous XML data, i.e., XML documents conforming to different schemas. Although schemas are important in many business applications, they are often missing in XML documents. In this thesis, we present a suite of algorithms that are effective in extracting schema information from a large collection of XML documents. We propose using the cost of NFA simulation to compute the Minimum Length Description to rank the inferred schema. We also studied …


An Approach To Nearest Neighboring Search For Multi-Dimensional Data, Yong Shi, Li Zhang, Lei Zhu Mar 2011

Faculty and Research Publications

Finding nearest neighbors in large multi-dimensional data has long been a research interest in the data mining field. In this paper, we present our continuing research on similarity search problems. Previously, we explored the meaning of K nearest neighbors from a new perspective in PanKNN [20]. It redefines the distances between data points and a given query point Q, efficiently and effectively selecting the data points that are closest to Q. It can be applied in various data mining fields. Many real data sets contain irrelevant or obstacle information which greatly affects the effectiveness …


Data Mining Based Learning Algorithms For Semi-Supervised Object Identification And Tracking, Michael P. Dessauer Jan 2011

Doctoral Dissertations

Sensor exploitation (SE) is a crucial step in surveillance applications such as airport security and search and rescue operations. It allows localization and identification of movement in urban settings and can significantly boost knowledge gathering, interpretation and action. Data mining techniques offer the promise of precise and accurate knowledge acquisition in high-dimensional data domains (and of diminishing the “curse of dimensionality” prevalent in such datasets), coupled with algorithmic design in feature extraction, discriminative ranking, feature fusion and supervised learning (classification). Consequently, data mining techniques and algorithms can be used to refine and process captured data and to detect, recognize, classify, …


Combining Natural Language Processing And Statistical Text Mining: A Study Of Specialized Versus Common Languages, Jay Jarman Jan 2011

USF Tampa Graduate Theses and Dissertations

This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms, such as association rule mining and decision tree induction, are used to discover classification rules for specific targets. This multi-stage pipeline approach is contrasted with traditional statistical text mining (STM) methods based on term counts and term-by-document frequencies. The aim is to create effective text analytic processes by adapting and combining individual …


Determining A Patient Recovery From A Total Knee Replacement Using Fuzzy Logic And Active Databases, Robert Azarbod Jan 2011

All Graduate Theses, Dissertations, and Other Capstone Projects

The purpose of the knowledge-based system is to predict the rehabilitation timeline of a patient in physical therapy for a total knee replacement. Patients have various attributes that contribute to their rehabilitation rate, such as weight, gender, smoking habit, medications, physical ability, or other medical problems. Any one of these attributes, or a combination of several, will affect the recovery process. The proposed FRTP (Fuzzy Rehabilitation Timeline Predictor) is a fuzzy data mining model that can predict the recovery length of a patient in physical therapy for a total knee replacement and provide feedback to experts for revision of …


A Web Based Fuzzy Data Mining Using Combs Inference Method And Decision Predictor, Shajia Akhter Sharmin Jan 2011

All Graduate Theses, Dissertations, and Other Capstone Projects

Fuzzy logic has become a very popular method of reasoning about a system with approximate rather than precise inputs. When qualitative variables are used to determine decisions, we have to create specific functions in which the membership value of an input can be any number between 0 and 1, instead of only 1 or 0 as in binary logic. As the number of input attributes increases, the number of combinatorial rules increases exponentially and diminishes the performance of the system. This problem is generally known as “combinatorial rule explosion”. The Information Technology Department of Minnesota State University, Mankato …
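The two ideas in this abstract, graded membership and exponential rule growth, can be illustrated with a short sketch. It assumes a triangular membership function and a rule base with one rule per combination of fuzzy sets across all inputs; both choices, and the function names, are illustrative assumptions rather than details taken from the thesis.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Membership degree in [0, 1] for a triangular fuzzy set (assumed shape)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def full_rule_count(num_inputs: int, sets_per_input: int) -> int:
    """Rules needed when every combination of input fuzzy sets gets its own rule."""
    return sets_per_input ** num_inputs


if __name__ == "__main__":
    # A value halfway up the rising edge belongs to the set with degree 0.5.
    print(triangular(2.5, left=0.0, peak=5.0, right=10.0))

    # With 5 fuzzy sets per input, each added input multiplies the rule count by 5.
    for n in (2, 4, 6, 8):
        print(n, full_rule_count(n, sets_per_input=5))
```

Under these assumptions, the rule count is multiplied by the number of fuzzy sets each time an input is added, which is the growth pattern the abstract calls combinatorial rule explosion.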


Parallel Surrogate Detection In Large-Scale Simulations, Lei Jiang Jan 2011

LSU Master's Theses

Simulation has become a useful approach in scientific computing and engineering for its ability to model real natural or human systems. In particular, for complex systems such as hurricanes, wildfire disasters, and real-time road traffic, simulation methods can provide researchers, engineers and decision makers with predicted values to help them take appropriate action. For large-scale problems, the simulations usually take a long time even on supercomputers, making real-time predictions more difficult. Approximation models that mimic the behavior of simulation models but are computationally cheaper, namely "surrogate models", are desired in such scenarios. In the thesis, …