Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 7 of 7

Full-Text Articles in Physical Sciences and Mathematics

Comprehensive Evaluation Of Association Measures For Fault Localization, Lucia Lucia, David Lo, Lingxiao Jiang, Aditya Budi Nov 2011

Comprehensive Evaluation Of Association Measures For Fault Localization, Lucia Lucia, David Lo, Lingxiao Jiang, Aditya Budi

David LO

In statistics and data mining communities, there have been many measures proposed to gauge the strength of association between two variables of interest, such as odds ratio, confidence, Yule-Y, Yule-Q, Kappa, and gini index. These association measures have been used in various domains, for example, to evaluate whether a particular medical practice is associated positively to a cure of a disease or whether a particular marketing strategy is associated positively to an increase in revenue, etc. This paper models the problem of locating faults as association between the execution or non-execution of particular program elements with failures. There have been …


Artificial Intelligence - I: Adaptive Automated Teller Machines - Part Ii, Ghulam Mujtaba, Tariq Mahmood Jul 2011

Artificial Intelligence - I: Adaptive Automated Teller Machines - Part Ii, Ghulam Mujtaba, Tariq Mahmood

International Conference on Information and Communication Technologies

Nowadays, the banking sector is increasingly relying on Automated Teller Machines (ATMs) in order to provide services to its customers. Although thousands of ATMs exist across many banks and different locations, the GUI and content of a typical ATM interface remains, more or less, the same. For instance, any ATM provides typical options for withdrawal, electronic funds transfer, viewing of mini-statements etc. However, such a static interface might not be suitable for all ATM customers, e.g., some users might not prefer to view all the options when they access the ATM, or to view specific withdrawal amounts less than, say, …


Artificial Intelligence – I: Adaptive Automated Teller Machines — Part I, Ghulam Mujtaba, Tariq Mahmood Jul 2011

Artificial Intelligence – I: Adaptive Automated Teller Machines — Part I, Ghulam Mujtaba, Tariq Mahmood

International Conference on Information and Communication Technologies

During the past few years, the banking sector has started providing a variety of services to its customers. One of the most significant of such services has been the introduction of the Automated Teller Machines (ATMs) for providing online support to bank customers. The use of ATMs has reached its zenith in every developed country, and thousands of ATM transactions are occurring on a daily basis. In order to increase the customers' satisfaction and to provide them with more user-friendly ATM interfaces, it becomes important to mine the ATM transactions to discover useful patterns about the customers' interacting behaviors. In …


Empirical Methods For Predicting Student Retention- A Summary From The Literature, Matt Bogard May 2011

Empirical Methods For Predicting Student Retention- A Summary From The Literature, Matt Bogard

Economics Faculty Publications

The vast majority of the literature related to the empirical estimation of retention models includes a discussion of the theoretical retention framework established by Bean, Braxton, Tinto, Pascarella, Terenzini and others (see Bean, 1980; Bean, 2000; Braxton, 2000; Braxton et al, 2004; Chapman and Pascarella, 1983; Pascarell and Ternzini, 1978; St. John and Cabrera, 2000; Tinto, 1975) This body of research provides a starting point for the consideration of which explanatory variables to include in any model specification, as well as identifying possible data sources. The literature separates itself into two major camps including research related to the hypothesis testing …


Efficient Schema Extraction From A Collection Of Xml Documents, Vijayeandra Parthepan May 2011

Efficient Schema Extraction From A Collection Of Xml Documents, Vijayeandra Parthepan

Masters Theses & Specialist Projects

The eXtensible Markup Language (XML) has become the standard format for data exchange on the Internet, providing interoperability between different business applications. Such wide use results in large volumes of heterogeneous XML data, i.e., XML documents conforming to different schemas. Although schemas are important in many business applications, they are often missing in XML documents. In this thesis, we present a suite of algorithms that are effective in extracting schema information from a large collection of XML documents. We propose using the cost of NFA simulation to compute the Minimum Length Description to rank the inferred schema. We also studied …


An Approach To Nearest Neighboring Search For Multi-Dimensional Data, Yong Shi, Li Zhang, Lei Zhu Mar 2011

An Approach To Nearest Neighboring Search For Multi-Dimensional Data, Yong Shi, Li Zhang, Lei Zhu

Faculty and Research Publications

Finding nearest neighbors in large multi-dimensional data has always been one of the research interests in data mining field. In this paper, we present our continuous research on similarity search problems. Previously we have worked on exploring the meaning of K nearest neighbors from a new perspective in PanKNN [20]. It redefines the distances between data points and a given query point Q, efficiently and effectively selecting data points which are closest to Q. It can be applied in various data mining fields. A large amount of real data sets have irrelevant or obstacle information which greatly affects the effectiveness …


Combining Natural Language Processing And Statistical Text Mining: A Study Of Specialized Versus Common Languages, Jay Jarman Jan 2011

Combining Natural Language Processing And Statistical Text Mining: A Study Of Specialized Versus Common Languages, Jay Jarman

USF Tampa Graduate Theses and Dissertations

This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms, such as association rule mining and decision tree induction, are used to discover classification rules for specific targets. This multi-stage pipeline approach is contrasted with traditional statistical text mining (STM) methods based on term counts and term-by-document frequencies. The aim is to create effective text analytic processes by adapting and combining individual …