Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 30 of 43

Full-Text Articles in Physical Sciences and Mathematics

Using Logical Specifications For Multi-Objective Reinforcement Learning, Kolby Nottingham Mar 2020

Undergraduate Honors Theses

In the multi-objective reinforcement learning (MORL) paradigm, the relative importance of environment objectives is often unknown prior to training, so agents must learn to specialize their behavior to optimize different combinations of environment objectives that are specified post-training. These are typically linear combinations, so the agent is effectively parameterized by a weight vector that describes how to balance competing environment objectives. However, we show that behaviors can be successfully specified and learned by much more expressive non-linear logical specifications. We test our agent in several environments with various objectives and show that it can generalize to many never-before-seen specifications.
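As a minimal sketch of the linear case that the abstract contrasts against, a weight vector reduces a vector of per-objective rewards to a single scalar. The function name, the two objectives, and all numbers below are hypothetical illustrations and do not come from the thesis.

```python
def scalarize(reward_vector, weights):
    """Linear scalarization: weighted sum of per-objective rewards."""
    if len(reward_vector) != len(weights):
        raise ValueError("reward_vector and weights must have the same length")
    return sum(r * w for r, w in zip(reward_vector, weights))


# Hypothetical per-objective rewards for one step, e.g. (speed, safety).
step_reward = [1.0, -2.0]

# Two post-training specifications expressed as weight vectors.
favor_speed = scalarize(step_reward, [0.75, 0.25])   # 0.75 - 0.5
favor_safety = scalarize(step_reward, [0.25, 0.75])  # 0.25 - 1.5
```

The same step is scored differently under each weight vector, which is why an agent conditioned on the weights can be asked for different behavior after training.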


Machine Learning For Effective Parkinson's Disease Diagnosis, Brennon Brimhall Mar 2020

Undergraduate Honors Theses

Parkinson’s disease is a degenerative neurological condition that affects approximately 10 million people globally. Because there is currently no cure, there is a strong motivation for research into improved and automated diagnostic procedures. Using Random Forests, a computer can effectively learn to diagnose Parkinson’s disease in a patient with high accuracy (94%), precision (95%), and recall (91%) across the data of over 2800 patients. Using similar techniques, I further determine that the most predictive medical tests relate to tremors observed in patients.
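For readers unfamiliar with the three reported metrics, the sketch below shows how accuracy, precision, and recall are derived from confusion-matrix counts. The counts are hypothetical and are not the thesis's data; the function assumes all denominators are nonzero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # share of all predictions that are correct
    precision = tp / (tp + fp)                  # share of positive predictions that are correct
    recall = tp / (tp + fn)                     # share of actual positives that are found
    return accuracy, precision, recall


# Hypothetical counts for illustration only.
acc, prec, rec = classification_metrics(tp=90, fp=10, tn=80, fn=20)
```

In a diagnostic setting, recall is the fraction of patients with the disease that the model identifies, so it is often examined alongside accuracy rather than replaced by it.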


Flow Adaptive Video Object Segmentation, Fanqing Lin Dec 2018

Theses and Dissertations

We tackle the task of semi-supervised video object segmentation, i.e., pixel-level object classification of the images in video sequences using very limited ground-truth training data from the corresponding video. The recently introduced online adaptation of convolutional neural networks for video object segmentation (OnAVOS) has achieved good results by pretraining the network, fine-tuning on the first frame, and training the network at test time using its approximate prediction as newly obtained ground truth. We propose Flow Adaptive Video Object Segmentation (FAVOS), which refines the generated adaptive ground truth for online updates and utilizes temporal consistency between video frames with the help …


Toward Real-Time FLIP Fluid Simulation Through Machine Learning Approximations, Javid Kennon Pack Dec 2018

Theses and Dissertations

Fluids in computer-generated imagery can add an impressive amount of realism to a scene, but they are particularly time-consuming to simulate. To run fluid simulations in real time, recent efforts have used machine learning techniques to approximate the movement of fluids. We explore using machine learning to simulate fluids while also integrating the Fluid-Implicit-Particle (FLIP) simulation method into machine learning fluid simulation approaches.


The OGCleaner: Detecting False-Positive Sequence Homology, Masaki Stanley Fujimoto Jun 2017

Theses and Dissertations

Within bioinformatics, phylogenetics is the study of the evolutionary relationships between different species and organisms. The genetic revolution has caused an explosion in the amount of raw genomic information that is available to scientists for study, but analysis methods have lagged behind. A key task in phylogenetics is identifying homology clusters. Current methods rely on heuristics based on pairwise sequence comparison to identify homology clusters. We propose the Orthology Group Cleaner (the OGCleaner) as a method for cluster-level verification of putative homology clusters in order to create higher quality …


Creating And Automatically Grading Annotated Questions, Alicia Crowder Wood Sep 2016

Theses and Dissertations

We have created a question type that allows teachers to easily create questions, gives students an intuitive experience when answering them, and reduces the time teachers spend grading and providing feedback. This question type, the "annotated" question, allows teachers to test students' knowledge in a particular subject area by having students "annotate" or mark text and video sources to answer questions. Through user testing we determined that, overall, the interface and the implemented system decrease the time it would take a teacher to grade annotated quiz questions. However, there are …


Feature Identification And Reduction For Improved Generalization Accuracy In Secondary-Structure Prediction Using Temporal Context Inputs In Machine-Learning Models, Matthew Benjamin Seeley May 2015

Theses and Dissertations

A protein's properties are influenced by both its amino-acid sequence and its three-dimensional conformation. Ascertaining a protein's sequence is relatively easy using modern techniques, but determining its conformation requires much more expensive and time-consuming techniques. Consequently, it would be useful to identify a method that can accurately predict a protein's secondary-structure conformation using only the protein's sequence data. This problem is not trivial, however, because identical amino-acid subsequences in different contexts sometimes have disparate secondary structures, while highly dissimilar amino-acid subsequences sometimes have identical secondary structures. We propose (1) to develop a set of metrics that facilitates better comparisons between …


Using Instance-Level Meta-Information To Facilitate A More Principled Approach To Machine Learning, Michael Reed Smith Apr 2015

Theses and Dissertations

As the capability for capturing and storing data increases and becomes more ubiquitous, an increasing number of organizations are looking to use machine learning techniques as a means of understanding and leveraging their data. However, the success of applying machine learning techniques depends on which learning algorithm is selected, the hyperparameters that are provided to the selected learning algorithm, and the data that is supplied to the learning algorithm. Even among machine learning experts, selecting an appropriate learning algorithm, setting its associated hyperparameters, and preprocessing the data can be a challenging task and is generally left to the expertise of …


Intelligent Indexing: A Semi-Automated, Trainable System For Field Labeling, Robert T. Clawson Sep 2014

Theses and Dissertations

We present Intelligent Indexing: a general, scalable, collaborative approach to indexing and transcription of non-machine-readable documents that exploits visual consensus and group labeling while harnessing human recognition and domain expertise. In our system, indexers work directly on the page, and with minimal context switching can navigate the page, enter labels, and interact with the recognition engine. Interaction with the recognition engine occurs through preview windows that allow the indexer to quickly verify and correct recommendations. This interaction is far superior to conventional, tedious, inefficient post-correction and editing. Intelligent Indexing is a trainable system that improves over time and can provide …


Musical Motif Discovery In Non-Musical Media, Daniel S. Johnson Jun 2014

Theses and Dissertations

Many music composition algorithms attempt to compose music in a particular style. The resulting music is often impressive and indistinguishable from the style of the training data, but it tends to lack significant innovation. In an effort to increase innovation in the selection of pitches and rhythms, we present a system that discovers musical motifs by coupling machine learning techniques with an inspirational component. The inspirational component allows for the discovery of musical motifs that are unlikely to be produced by a generative model, while the machine learning component harnesses innovation. Candidate motifs are extracted from non-musical media such as …


Ensemble Methods For Historical Machine-Printed Document Recognition, William B. Lund Apr 2014

Theses and Dissertations

The usefulness of digitized documents is directly related to the quality of the extracted text. Optical Character Recognition (OCR) has reached a point where well-formatted and clean machine-printed documents are easily recognizable by current commercial OCR products; however, older or degraded machine-printed documents present problems to OCR engines, resulting in word error rates (WER) that severely limit either automated or manual use of the extracted text. Major archives of historical machine-printed documents are being assembled around the globe, requiring an accurate transcription of the text for the automated creation of descriptive metadata, full-text searching, and information extraction. Given document …


Practical Cost-Conscious Active Learning For Data Annotation In Annotator-Initiated Environments, Robbie A. Haertel Aug 2013

Theses and Dissertations

Many projects exist whose purpose is to augment raw data with annotations that increase the usefulness of the data. The number of these projects is rapidly growing and, in the age of “big data,” the amount of data to be annotated is likewise growing within each project. One common use of such data is in supervised machine learning, which requires labeled data to train a predictive model. Annotation is often a very expensive proposition, particularly for structured data. The purpose of this dissertation is to explore methods of reducing the cost of creating such data sets, including annotated text corpora. We …


Probabilistic Explicit Topic Modeling, Joshua Aaron Hansen Apr 2013

Theses and Dissertations

Latent Dirichlet Allocation (LDA) is widely used for automatic discovery of latent topics in document corpora. However, output from analysis using an LDA topic model suffers from a lack of identifiability between topics not only across corpora, but across runs of the algorithm. The output is also isolated from enriching information from knowledge sources such as Wikipedia and is difficult for humans to interpret due to a lack of meaningful topic labels. This thesis introduces two methods for probabilistic explicit topic modeling that address these issues: Latent Dirichlet Allocation with Static Topic-Word Distributions (LDA-STWD), and Explicit Dirichlet Allocation (EDA). LDA-STWD …


Bayesian Text Analytics For Document Collections, Daniel David Walker Nov 2012

Theses and Dissertations

Modern document collections are too large to annotate and curate manually. As increasingly large amounts of data become available, historians, librarians, and other scholars increasingly need to rely on automated systems to efficiently and accurately analyze the contents of their collections and to find new and interesting patterns therein. Modern techniques in Bayesian text analytics are becoming widespread and have the potential to revolutionize the way that research is conducted. Much work has been done in the document modeling community towards this end, though most of it is focused on modern, relatively clean text data. We present research for improved …


A Confidence-Prioritization Approach To Data Processing In Noisy Data Sets And Resulting Estimation Models For Predicting Streamflow Diel Signals In The Pacific Northwest, Nathaniel Lee Gustafson Aug 2012

Theses and Dissertations

Streams in small watersheds are often known to exhibit diel fluctuations, in which streamflow oscillates on a 24-hour cycle. Streamflow diel fluctuations, which we investigate in this study, are an informative indicator of environmental processes. However, in environmental data sets, as in many others, there is a range of noise associated with individual data points. Some points are extracted under relatively clear and defined conditions, while others may include a range of known or unknown confounding factors, which may decrease those points' validity. These points may or may not remain useful for training, depending on how much uncertainty they …


Practical Improvements In Applied Spectral Learning, Adam C. Drake Jun 2010

Theses and Dissertations

Spectral learning algorithms, which learn an unknown function by learning a spectral representation of the function, have been widely used in computational learning theory to prove many interesting learnability results. These algorithms have also been successfully used in real-world applications. However, previous work has left open many questions about how to best use these methods in real-world learning scenarios. This dissertation presents several significant advances in real-world spectral learning. It presents new algorithms for finding large spectral coefficients (a key sub-problem in spectral learning) that allow spectral learning methods to be applied to much larger problems and to a wider …


Transformation Learning: Modeling Transferable Transformations In High-Dimensional Data, Christopher R. Wilson May 2010

Theses and Dissertations

The goal of learning transfer is to apply knowledge gained from one problem to a separate related problem. Transformation learning is a proposed approach to computational learning transfer that focuses on modeling high-level transformations that are well suited for transfer. By using a high-level representation of transferable data, transformation learning facilitates both shallow transfer (intra-domain) and deep transfer (inter-domain) scenarios. Transformations can be discovered in data using manifold learning to order data instances according to the transformations they represent. For high-dimensional data representable with coordinate systems, such as images and sounds, data instances can be decomposed into small sub-instances based …


A Bayesian Decision Theoretical Approach To Supervised Learning, Selective Sampling, And Empirical Function Optimization, James Lamond Carroll Mar 2010

Theses and Dissertations

Many have used the principles of statistics and Bayesian decision theory to model specific learning problems. It is less common to see models of the processes of learning in general. One exception is the model of the supervised learning process known as the "Extended Bayesian Formalism" or EBF. This model is descriptive, in that it can describe and compare learning algorithms. Thus the EBF is capable of modeling both effective and ineffective learning algorithms. We extend the EBF to model unsupervised learning, semi-supervised learning, supervised learning, and empirical function optimization. We also generalize the utility model of the EBF to …


Noninvasive Estimation Of Pulmonary Artery Pressure Using Heart Sound Analysis, Aaron W. Dennis Dec 2009

Theses and Dissertations

Right-heart catheterization is the most accurate method for estimating pulmonary artery pressure (PAP). Because it is an invasive procedure, it is expensive, exposes patients to the risk of infection, and is not suited for long-term monitoring situations. Medical researchers have shown that PAP influences the characteristics of heart sounds. This suggests that heart sound analysis is a potential noninvasive solution to the PAP estimation problem. This thesis describes the development of a prototype system, called PAPEr, which estimates PAP noninvasively using heart sound analysis. PAPEr uses patient data with machine learning algorithms to build models of how PAP affects heart …


Real-Time Automatic Price Prediction For eBay Online Trading, Ilya Igorevitch Raykhel Nov 2008

Theses and Dissertations

While Machine Learning is one of the most popular research areas in Computer Science, there are still only a few deployed applications intended for use by the general public. We have developed an exemplary application that can be directly applied to eBay trading. Our system predicts how much an item would sell for on eBay based on that item's attributes. We ran our experiments on the eBay laptop category, with prior trades used as training data. The system implements a feature-weighted k-Nearest Neighbor algorithm, using genetic algorithms to determine feature weights. Our results demonstrate an average prediction error of 16%; …
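A feature-weighted k-nearest-neighbor predictor of the kind the abstract names can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: the features, prices, and weights are hypothetical, the weights are set by hand rather than found by a genetic algorithm, and the prediction is a plain average of the k nearest prior sale prices.

```python
import math


def weighted_distance(a, b, weights):
    """Euclidean distance with a per-feature weight on each squared difference."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def knn_predict(query, examples, weights, k=3):
    """Average the prices of the k examples closest to the query.

    examples is a list of (features, price) pairs.
    """
    nearest = sorted(
        examples, key=lambda ex: weighted_distance(query, ex[0], weights)
    )[:k]
    return sum(price for _, price in nearest) / len(nearest)


# Hypothetical prior sales: (ram_gb, screen_inches) -> sale price.
past_sales = [
    ((4, 15), 300.0),
    ((8, 15), 450.0),
    ((16, 15), 700.0),
    ((8, 13), 500.0),
]

# Hand-picked weights (an assumption): RAM counts more than screen size.
feature_weights = [1.0, 0.1]

estimate = knn_predict((8, 14), past_sales, feature_weights, k=2)
```

With these weights the two 8 GB listings are the nearest neighbors, so the estimate is the mean of their prices. A search procedure such as a genetic algorithm would tune `feature_weights` to minimize prediction error on held-out sales.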


Assessing The Costs Of Sampling Methods In Active Learning For Annotation, James Carroll, Robbie Haertel, Peter McClanahan, Eric K. Ringger, Kevin Seppi Jun 2008

Faculty Publications

Traditional Active Learning (AL) techniques assume that the annotation of each datum costs the same. This is not the case when annotating sequences; some sequences will take longer than others. We show that the AL technique which performs best depends on how cost is measured. Applying an hourly cost model based on the results of an annotation user study, we approximate the amount of time necessary to annotate a given sentence. This model allows us to evaluate the effectiveness of AL sampling methods in terms of time spent in annotation. We achieve a 77% reduction in hours from a random …


Improving Liquid State Machines Through Iterative Refinement Of The Reservoir, R David Norton Mar 2008

Theses and Dissertations

Liquid State Machines (LSMs) exploit the power of recurrent spiking neural networks (SNNs) without training the SNN. Instead, a reservoir, or liquid, is randomly created which acts as a filter for a readout function. We develop three methods for iteratively refining a randomly generated liquid to create a more effective one. First, we apply Hebbian learning to LSMs by building the liquid with spike-time dependent plasticity (STDP) synapses. Second, we create an eligibility-based reinforcement learning algorithm for synaptic development. Third, we apply principles of Hebbian learning and reinforcement learning to create a new algorithm called separation-driven synaptic modification …


Learning Policies For Embodied Virtual Agents Through Demonstration, Jonathan Dinerstein, Parris K. Egbert, Dan A. Ventura Jan 2008

Faculty Publications

Although many powerful AI and machine learning techniques exist, it remains difficult to quickly create AI for embodied virtual agents that produces visually lifelike behavior. This is important for applications (e.g., games, simulators, interactive displays) where an agent must behave in a manner that appears human-like. We present a novel technique for learning reactive policies that mimic demonstrated human behavior. The user demonstrates the desired behavior by dictating the agent’s actions during an interactive animation. Later, when the agent is to behave autonomously, the recorded data is generalized to form a continuous state-to-action mapping. Combined with an appropriate animation algorithm …


A Direct Algorithm For The K-Nearest-Neighbor Classifier Via Local Warping Of The Distance Metric, Tohkoon Neo Nov 2007

Theses and Dissertations

The k-nearest neighbor (k-NN) pattern classifier is a simple yet effective learner. However, it has a few drawbacks, one of which is the large model size. There are a number of algorithms that are able to condense the model size of the k-NN classifier at the expense of accuracy. Boosting is therefore desirable for increasing the accuracy of these condensed models. Unfortunately, there does not exist a boosting algorithm that works well with k-NN directly. We present a direct boosting algorithm for the k-NN classifier that creates an ensemble of models with locally modified distance weighting. An empirical study conducted …


Heuristic Weighted Voting, Kristine Perry Monteith Oct 2007

Theses and Dissertations

Selecting an effective method for combining the votes of classifiers in an ensemble can have a significant impact on the overall classification accuracy an ensemble is able to achieve. With some methods, the ensemble cannot even achieve as high a classification accuracy as the most accurate individual classifying component. To address this issue, we present the strategy of Heuristic Weighted Voting, a technique that uses heuristics to determine the confidence that a classifier has in its predictions on an instance-by-instance basis. Using these heuristics to weight the votes in an ensemble results in an overall average increase in …
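The general idea of weighting each classifier's vote by a per-instance confidence can be sketched as below. The confidence values here are hypothetical placeholders; the thesis's actual heuristics for computing them are not reproduced.

```python
from collections import defaultdict


def weighted_vote(predictions):
    """Return the label with the largest total confidence.

    predictions is a list of (label, confidence) pairs, one per classifier,
    for a single instance.
    """
    totals = defaultdict(float)
    for label, confidence in predictions:
        totals[label] += confidence
    return max(totals, key=totals.get)


# Hypothetical votes from three classifiers on one instance.
votes = [("spam", 0.9), ("ham", 0.4), ("ham", 0.3)]
winner = weighted_vote(votes)
```

An unweighted majority vote would pick "ham" (two votes to one), while the confidence-weighted vote picks "spam" because one confident classifier outweighs two uncertain ones.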


A Data-Dependent Distance Measure For Transductive Instance-Based Learning, Jared Lundell, Dan A. Ventura Oct 2007

Faculty Publications

We consider learning in a transductive setting using instance-based learning (k-NN) and present a method for constructing a data-dependent distance “metric” using both labeled training data as well as available unlabeled data (that is to be classified by the model). This new data-driven measure of distance is empirically studied in the context of various instance-based models and is shown to reduce error (compared to traditional models) under certain learning conditions. Generalizations and improvements are suggested.


Limitations And Extensions Of The WoLF-PHC Algorithm, Philip R. Cook Sep 2007

Theses and Dissertations

Policy Hill Climbing (PHC) is a reinforcement learning algorithm that extends Q-learning to learn probabilistic policies for multi-agent games. WoLF-PHC extends PHC with the "win or learn fast" principle. A proof that PHC will diverge in self-play when playing Shapley's game is given, and WoLF-PHC is shown empirically to diverge as well. Various WoLF-PHC based modifications were created, evaluated, and compared in an attempt to obtain convergence to the single shot Nash equilibrium when playing Shapley's game in self-play without using more information than WoLF-PHC uses. Partial Commitment WoLF-PHC (PCWoLF-PHC), which performs best on Shapley's game, is tested on other …
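The policy update at the core of PHC, and the "win or learn fast" choice of step size, can be sketched for a single state as follows. This is a simplified illustration based on the commonly described form of the algorithm, not code from the thesis: the Q-learning update and the running average of the policy are omitted, and all names and numbers are hypothetical.

```python
def phc_policy_step(policy, q_values, delta):
    """Shift probability mass toward the greedy action by at most delta in total."""
    n = len(policy)
    greedy = max(range(n), key=lambda a: q_values[a])
    new_policy = list(policy)
    for a in range(n):
        if a != greedy:
            # Never remove more probability than the action currently holds.
            step = min(policy[a], delta / (n - 1))
            new_policy[a] -= step
            new_policy[greedy] += step
    return new_policy


def wolf_delta(policy, avg_policy, q_values, delta_win, delta_lose):
    """Use the small step when winning and the large step when losing.

    "Winning" means the current policy has a higher expected value than the
    average policy under the current Q-value estimates.
    """
    value_now = sum(p * q for p, q in zip(policy, q_values))
    value_avg = sum(p * q for p, q in zip(avg_policy, q_values))
    return delta_win if value_now > value_avg else delta_lose


# Hypothetical single-state example with three actions.
policy = [0.5, 0.3, 0.2]
average_policy = [0.2, 0.3, 0.5]
q_values = [1.0, 0.0, 2.0]

step_size = wolf_delta(policy, average_policy, q_values, delta_win=0.1, delta_lose=0.4)
updated = phc_policy_step(policy, q_values, step_size)
```

Here the current policy is worth less than the average policy, so the larger "losing" step is used and the policy moves quickly toward the action with the highest Q-value while remaining a valid probability distribution.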


Improving Neural Network Classification Training, Michael Edwin Rimer Sep 2007

Theses and Dissertations

The following work presents a new set of general methods for improving neural network accuracy on classification tasks, grouped under the label of classification-based methods. The central theme of these approaches is to provide problem representations and error functions that more directly improve classification accuracy than conventional learning and error functions. The CB1 algorithm attempts to maximize classification accuracy by selectively backpropagating error only on misclassified training patterns. CB2 incorporates a sliding error threshold to the CB1 algorithm, interpolating between the behavior of CB1 and standard error backpropagation as training progresses in order to avoid prematurely saturated network weights. CB3 …


Obstacle Avoidance And Path Traversal Using Interactive Machine Learning, Jonathan M. Turner Jul 2007

Theses and Dissertations

Recently there has been a growing interest in using robots in activities that are dangerous or cost prohibitive for humans to do. Such activities include military uses and space exploration. While robotic hardware is often capable of being used in these types of situations, the ability of human operators to control robots in an effective manner is often limited. This deficiency is often related to the control interface of the robot and the level of autonomy that control system affords the human operator. This thesis describes a robot control system, called the safe/unsafe system, which gives a human operator the …


Cognitive And Behavioral Model Ensembles For Autonomous Virtual Characters, Jeffrey S. Whiting Jun 2007

Theses and Dissertations

Cognitive and behavioral models have become popular methods to create autonomous self-animating characters. Creating these models presents the following challenges: (1) creating a cognitive or behavioral model is a time-intensive and complex process that must be done by an expert programmer; (2) the models are created to solve a specific problem in a given environment and, because of their specific nature, cannot be easily reused. Combining existing models would allow an animator, without the need of a programmer, to create new characters in less time and to leverage each model's strengths to increase the character's …