Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 16 of 16

Full-Text Articles in Physical Sciences and Mathematics

A Direct Algorithm For The K-Nearest-Neighbor Classifier Via Local Warping Of The Distance Metric, Tohkoon Neo Nov 2007

Theses and Dissertations

The k-nearest neighbor (k-NN) pattern classifier is a simple yet effective learner. However, it has a few drawbacks, one of which is the large model size. There are a number of algorithms that are able to condense the model size of the k-NN classifier at the expense of accuracy. Boosting is therefore desirable for increasing the accuracy of these condensed models. Unfortunately, there does not exist a boosting algorithm that works well with k-NN directly. We present a direct boosting algorithm for the k-NN classifier that creates an ensemble of models with locally modified distance weighting. An empirical study conducted ...


Context-Aware Statistical Debugging: From Bug Predictors To Faulty Control Flow Paths, Lingxiao Jiang, Zhendong Su Nov 2007

Research Collection School Of Information Systems

Effective bug localization is important for realizing automated debugging. One attractive approach is to apply statistical techniques on a collection of evaluation profiles of program properties to help localize bugs. Previous research has proposed various specialized techniques to isolate certain program predicates as bug predictors. However, because many bugs may not be directly associated with these predicates, these techniques are often ineffective in localizing bugs. Relevant control flow paths that may contain bug locations are more informative than stand-alone predicates for discovering and understanding bugs. In this paper, we propose an approach to automatically generate such faulty control flow paths ...


Heuristic Weighted Voting, Kristine Perry Monteith Oct 2007

Theses and Dissertations

Selecting an effective method for combining the votes of classifiers in an ensemble can have a significant impact on the overall classification accuracy an ensemble is able to achieve. With some methods, the ensemble cannot even achieve as high a classification accuracy as the most accurate individual classifying component. To address this issue, we present the strategy of Heuristic Weighted Voting, a technique that uses heuristics to determine the confidence that a classifier has in its predictions on an instance by instance basis. Using these heuristics to weight the votes in an ensemble results in an overall average increase in ...
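
The idea of weighting each classifier's vote by its confidence on the current instance can be illustrated with a minimal sketch. The function name `weighted_vote` and the confidence values are hypothetical; the abstract does not say which heuristics produce the confidences, so they are passed in here as plain numbers.

```python
from collections import defaultdict


def weighted_vote(predictions, confidences):
    """Combine class predictions, weighting each vote by a per-instance confidence.

    predictions: one predicted label per classifier.
    confidences: one non-negative weight per classifier (hypothetical values;
    the heuristics that would produce them are not described in the abstract).
    """
    if len(predictions) != len(confidences):
        raise ValueError("need exactly one confidence per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, confidences):
        totals[label] += weight
    # The label with the largest summed weight wins.
    return max(totals, key=totals.get)
```

With equal weights this reduces to plain majority voting; with unequal weights a single confident classifier can outvote several unsure ones.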


A Data-Dependent Distance Measure For Transductive Instance-Based Learning, Jared Lundell, Dan A. Ventura Oct 2007

Faculty Publications

We consider learning in a transductive setting using instance-based learning (k-NN) and present a method for constructing a data-dependent distance “metric” using both labeled training data as well as available unlabeled data (that is to be classified by the model). This new data-driven measure of distance is empirically studied in the context of various instance-based models and is shown to reduce error (compared to traditional models) under certain learning conditions. Generalizations and improvements are suggested.


Limitations And Extensions Of The WoLF-PHC Algorithm, Philip R. Cook Sep 2007

Theses and Dissertations

Policy Hill Climbing (PHC) is a reinforcement learning algorithm that extends Q-learning to learn probabilistic policies for multi-agent games. WoLF-PHC extends PHC with the "win or learn fast" principle. A proof that PHC will diverge in self-play when playing Shapley's game is given, and WoLF-PHC is shown empirically to diverge as well. Various WoLF-PHC based modifications were created, evaluated, and compared in an attempt to obtain convergence to the single shot Nash equilibrium when playing Shapley's game in self-play without using more information than WoLF-PHC uses. Partial Commitment WoLF-PHC (PCWoLF-PHC), which performs best on Shapley's game, is ...
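
As a rough illustration of the "win or learn fast" principle, the sketch below performs one policy update for a single state. It follows the commonly published formulation of WoLF-PHC as an assumption, not necessarily the exact variant studied in the thesis. The function name and the parameters `delta_win` and `delta_lose` are hypothetical, and the Q-learning update and the maintenance of the average policy are left out.

```python
def wolf_phc_policy_step(q, policy, avg_policy, delta_win, delta_lose):
    """One hill-climbing policy step for a single state.

    q, policy, avg_policy: lists indexed by action.
    delta_win < delta_lose: the agent moves cautiously when "winning"
    and quickly when "losing".
    """
    n = len(q)
    if n < 2:
        return list(policy)
    # "Winning" means the current policy has a higher expected value
    # than the long-run average policy under the current Q estimates.
    value = sum(p * v for p, v in zip(policy, q))
    avg_value = sum(p * v for p, v in zip(avg_policy, q))
    delta = delta_win if value > avg_value else delta_lose

    best = max(range(n), key=lambda a: q[a])
    new_policy = list(policy)
    for a in range(n):
        if a == best:
            continue
        # Move probability mass from each non-greedy action to the greedy
        # one, never taking more than the action currently holds.
        step = min(policy[a], delta / (n - 1))
        new_policy[a] -= step
        new_policy[best] += step
    return new_policy
```

Because mass is only moved between actions, the result remains a probability distribution.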


Improving Neural Network Classification Training, Michael Edwin Rimer Sep 2007

Theses and Dissertations

The following work presents a new set of general methods for improving neural network accuracy on classification tasks, grouped under the label of classification-based methods. The central theme of these approaches is to provide problem representations and error functions that more directly improve classification accuracy than conventional learning and error functions. The CB1 algorithm attempts to maximize classification accuracy by selectively backpropagating error only on misclassified training patterns. CB2 incorporates a sliding error threshold to the CB1 algorithm, interpolating between the behavior of CB1 and standard error backpropagation as training progresses in order to avoid prematurely saturated network weights. CB3 ...
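
One loose reading of "selectively backpropagating error only on misclassified training patterns" is sketched below. This is an assumption for illustration, not the thesis's definition of CB1: the function name `selective_error`, the one-hot targets, and the simple difference error are all hypothetical choices.

```python
def selective_error(outputs, target_index):
    """Return per-output error signals for one training pattern.

    If the highest output already belongs to the target class, the pattern
    is treated as correctly classified and contributes no error. Otherwise
    a plain (target - output) error against a one-hot target is returned.
    """
    predicted = max(range(len(outputs)), key=lambda i: outputs[i])
    if predicted == target_index:
        return [0.0] * len(outputs)
    targets = [1.0 if i == target_index else 0.0 for i in range(len(outputs))]
    return [t - o for t, o in zip(targets, outputs)]
```

In a training loop, the returned signals would take the place of the usual output-layer error before backpropagation, so correctly classified patterns leave the weights unchanged.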


Parallelization Of Ant Colony Optimization Via Area Of Expertise Learning, Adrian A. De Freitas Sep 2007

Theses and Dissertations

Ant colony optimization algorithms have long been touted as providing an effective and efficient means of generating high quality solutions to NP-hard optimization problems. Unfortunately, while the structure of the algorithm is easy to parallelize, the nature and amount of communication required for parallel execution has meant that parallel implementations developed suffer from decreased solution quality, slower runtime performance, or both. This thesis explores a new strategy for ant colony parallelization that involves Area of Expertise (AOE) learning. The AOE concept is based on the idea that individual agents tend to gain knowledge of different areas of the search space ...


Solar Activity Detection And Prediction Using Image Processing And Machine Learning Techniques, Gang Fu Aug 2007

Dissertations

The objective of the research in this dissertation is to develop methods for automatic detection and prediction of solar activities, including prominence eruptions, emerging flux regions and solar flares. Image processing and machine learning techniques are applied in this study. These methods can be used for automatic observation of solar activities and prediction of space weather that may have great influence on the near-Earth environment.

The research presented in this dissertation covers the following topics: i) automatic detection of prominence eruptions (PEs), ii) automatic detection of emerging flux regions (EFRs), and iii) automatic prediction of solar flares.

In ...


Predicting Coronary Artery Disease With Medical Profile And Gene Polymorphisms Data, Qiongyu Chen, Guoliang Li, Tze-Yun Leong, Chew-Kiat Heng Aug 2007

Research Collection School Of Information Systems

Coronary artery disease (CAD) is a main cause of death in the world. Finding cost-effective methods to predict CAD is a major challenge in public health. In this paper, we investigate the combined effects of genetic polymorphisms and non-genetic factors on predicting the risk of CAD by applying well known classification methods, such as Bayesian networks, naïve Bayes, support vector machine, k-nearest neighbor, neural networks and decision trees. Our experiments show that all these classifiers are comparable in terms of accuracy, while Bayesian networks have the additional advantage of being able to provide insights into the relationships among the variables ...


Obstacle Avoidance And Path Traversal Using Interactive Machine Learning, Jonathan M. Turner Jul 2007

Theses and Dissertations

Recently there has been a growing interest in using robots in activities that are dangerous or cost-prohibitive for humans to do. Such activities include military uses and space exploration. While robotic hardware is often capable of being used in these types of situations, the ability of human operators to control robots in an effective manner is often limited. This deficiency is often related to the control interface of the robot and the level of autonomy that the control system affords the human operator. This thesis describes a robot control system, called the safe/unsafe system, which gives a human operator ...


Cognitive And Behavioral Model Ensembles For Autonomous Virtual Characters, Jeffrey S. Whiting Jun 2007

Theses and Dissertations

Cognitive and behavioral models have become popular methods to create autonomous self-animating characters. Creating these models presents the following challenges: (1) creating a cognitive or behavioral model is a time-intensive and complex process that must be done by an expert programmer, and (2) the models are created to solve a specific problem in a given environment and, because of their specific nature, cannot be easily reused. Combining existing models would allow an animator, without the need of a programmer, to create new characters in less time and to leverage each model's strengths to increase the ...


Active Learning For Part-Of-Speech Tagging: Accelerating Corpus Annotation, George Busby, Marc Carmen, James Carroll, Robbie Haertel, Deryle W. Lonsdale, Peter McClanahan, Eric K. Ringger, Kevin Seppi Jun 2007

Faculty Publications

In the construction of a part-of-speech annotated corpus, we are constrained by a fixed budget. A fully annotated corpus is required, but we can afford to label only a subset. We train a Maximum Entropy Markov Model tagger from a labeled subset and automatically tag the remainder. This paper addresses the question of where to focus our manual tagging efforts in order to deliver an annotation of highest quality. In this context, we find that active learning is always helpful. We focus on Query by Uncertainty (QBU) and Query by Committee (QBC) and report on experiments with several baselines and ...
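
Query by Uncertainty can be sketched as picking the unlabeled items whose predicted tag distribution is least certain. The abstract does not state which uncertainty measure the paper uses, so Shannon entropy is assumed here, and the function names are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def query_by_uncertainty(distributions, batch_size=1):
    """Return indices of the most uncertain items, most uncertain first.

    distributions: one predicted tag distribution per unlabeled item,
    for example the tagger's posterior over tags for that item.
    """
    ranked = sorted(
        range(len(distributions)),
        key=lambda i: entropy(distributions[i]),
        reverse=True,
    )
    return ranked[:batch_size]
```

The selected indices would be sent to a human annotator, the tagger retrained on the enlarged labeled subset, and the loop repeated until the budget is spent.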


Evolutionary Granular Kernel Machines, Bo Jin May 2007

Computer Science Dissertations

Kernel machines such as Support Vector Machines (SVMs) have been widely used in various data mining applications with good generalization properties. The performance of SVMs on nonlinear problems is highly affected by the choice of kernel function. The complexity of SVM training is mainly related to the size of the training dataset. How to design a powerful kernel, how to speed up SVM training and how to train SVMs with millions of examples are still challenging problems in SVM research. For these important problems, powerful and flexible kernel trees called Evolutionary Granular Kernel Trees (EGKTs) are designed to incorporate prior domain knowledge ...


Towards A Self-Calibrating Video Camera Network For Content Analysis And Forensics, Imran Junejo Jan 2007

Electronic Theses and Dissertations, 2004-2019

Due to growing security concerns, video surveillance and monitoring have received immense attention from both federal agencies and private firms. The main concern is that a single camera, even if allowed to rotate or translate, is not sufficient to cover a large area for video surveillance. A more general solution with a wide range of applications is to allow the deployed cameras to have non-overlapping fields of view (FoV) and, if possible, to allow these cameras to move freely in 3D space. This thesis addresses the issue of how cameras in such a network can be calibrated and how ...


Using Machine Learning Techniques To Create AI Controlled Players For Video Games, Bhuman Soni Jan 2007

Theses : Honours

This study aims to achieve higher replay and entertainment value in a game through human-like AI behaviour in computer controlled characters called bots. In order to achieve that, an artificial intelligence system capable of learning from observation of human play was developed. The artificial intelligence system makes use of machine learning capabilities to control the state change mechanism of the bot. The implemented system was tested by an audience of gamers and compared against bots controlled by static scripts. The data collected was focused on qualitative aspects of replay and entertainment value of the game and subjected to quantitative ...


Knowledge-Based Methods For Automatic Extraction Of Domain-Specific Ontologies, Janardhana R. Punuru Jan 2007

LSU Doctoral Dissertations

Semantic web technology aims at developing methodologies for representing large amounts of knowledge in web-accessible form. The semantics of knowledge should be easy for computer programs to interpret and understand, so that sharing and utilizing knowledge across the Web would be possible. Domain-specific ontologies form the basis for knowledge representation in the semantic web. Research on automated development of ontologies from texts has become increasingly important because manual construction of ontologies is labor intensive and costly, and, at the same time, large amounts of text for individual domains are already available in electronic form. However, automatic extraction of ...