Digital Commons Network

Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences · Brigham Young University · Theses/Dissertations · Keyword: Reinforcement learning

Articles 1 - 6 of 6

Full-Text Articles in Entire DC Network

Using Logical Specifications For Multi-Objective Reinforcement Learning, Kolby Nottingham Mar 2020

Undergraduate Honors Theses

In the multi-objective reinforcement learning (MORL) paradigm, the relative importance of environment objectives is often unknown prior to training, so agents must learn to specialize their behavior to optimize different combinations of environment objectives that are specified post-training. These are typically linear combinations, so the agent is effectively parameterized by a weight vector that describes how to balance competing environment objectives. However, we show that behaviors can be successfully specified and learned by much more expressive non-linear logical specifications. We test our agent in several environments with various objectives and show that it can generalize to many never-before-seen specifications.
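For context, below is a minimal sketch of the linear-scalarization baseline this abstract argues beyond: a weight vector w collapses the vector-valued reward into a scalar, and the tabular Q-function is conditioned on w so one agent can specialize to many post-training weightings. All names, shapes, and hyperparameters here are illustrative assumptions, not taken from the thesis.

```python
from collections import defaultdict
import numpy as np

N_ACTIONS = 4                    # illustrative action count
Q = defaultdict(float)           # keyed by (state, weights, action)

def q_update(state, action, reward_vec, next_state, w,
             alpha=0.1, gamma=0.99):
    """One tabular Q-learning backup on the w-scalarized reward."""
    r = float(np.dot(w, reward_vec))    # linear combination of objectives
    wk = tuple(w)                       # condition the table on the weights
    best_next = max(Q[(next_state, wk, a)] for a in range(N_ACTIONS))
    td_error = r + gamma * best_next - Q[(state, wk, action)]
    Q[(state, wk, action)] += alpha * td_error
```

The thesis's contribution is to replace the weight vector w with a more expressive non-linear logical specification; the sketch shows only the linear baseline it generalizes.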


Improving Liquid State Machines Through Iterative Refinement Of The Reservoir, R David Norton Mar 2008

Theses and Dissertations

Liquid State Machines (LSMs) exploit the power of recurrent spiking neural networks (SNNs) without training the SNN. Instead, a reservoir, or liquid, is randomly created which acts as a filter for a readout function. We develop three methods for iteratively refining a randomly generated liquid to create a more effective one. First, we apply Hebbian learning to LSMs by building the liquid with spike-timing-dependent plasticity (STDP) synapses. Second, we create an eligibility-based reinforcement learning algorithm for synaptic development. Third, we apply principles of Hebbian learning and reinforcement learning to create a new algorithm called separation driven synaptic modification …
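For reference, here is a minimal sketch of the pair-based STDP rule that liquid construction with plastic synapses typically relies on: a synapse strengthens when the presynaptic spike precedes the postsynaptic spike and weakens otherwise. The amplitudes and time constant are illustrative assumptions, not the thesis's tuned values.

```python
import math

def stdp_dw(t_pre: float, t_post: float,
            a_plus=0.01, a_minus=0.012, tau=20.0) -> float:
    """Weight change for one pre/post spike pair (times in ms)."""
    dt = t_post - t_pre
    if dt > 0:      # pre fired before post: strengthen (causal pairing)
        return a_plus * math.exp(-dt / tau)
    return -a_minus * math.exp(dt / tau)    # otherwise: weaken
```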


Limitations And Extensions Of The WoLF-PHC Algorithm, Philip R. Cook Sep 2007

Theses and Dissertations

Policy Hill Climbing (PHC) is a reinforcement learning algorithm that extends Q-learning to learn probabilistic policies for multi-agent games. WoLF-PHC extends PHC with the "win or learn fast" principle. A proof that PHC will diverge in self-play when playing Shapley's game is given, and WoLF-PHC is shown empirically to diverge as well. Various WoLF-PHC-based modifications were created, evaluated, and compared in an attempt to obtain convergence to the single-shot Nash equilibrium when playing Shapley's game in self-play, without using more information than WoLF-PHC uses. Partial Commitment WoLF-PHC (PCWoLF-PHC), which performs best on Shapley's game, is tested on other …
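For context, a minimal sketch of the WoLF-PHC policy update for a single state, following Bowling and Veloso's published algorithm: the policy hill-climbs toward the greedy action, using the slow rate delta_win when "winning" (the current policy outperforms the running-average policy under the current Q-values) and the fast rate delta_lose otherwise. Variable names and the default rates are our assumptions.

```python
import numpy as np

def wolf_phc_step(Q, pi, pi_avg, count, delta_win=0.01, delta_lose=0.04):
    """One WoLF-PHC policy update for one state. Q, pi, pi_avg are 1-D
    arrays over actions; count is how many times this state was updated."""
    count += 1
    pi_avg = pi_avg + (pi - pi_avg) / count      # running-average policy
    winning = float(pi @ Q) > float(pi_avg @ Q)  # "win or learn fast" test
    delta = delta_win if winning else delta_lose
    greedy = int(np.argmax(Q))
    moved = 0.0
    for a in range(len(pi)):
        if a != greedy:
            step = min(pi[a], delta / (len(pi) - 1))
            pi[a] -= step                        # drain non-greedy actions
            moved += step
    pi[greedy] += moved                          # shift mass to greedy action
    return pi, pi_avg, count
```

The thesis's negative result is that even this win/lose rate switching fails to converge in self-play on Shapley's game.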


Learning Successful Strategies In Repeated General-Sum Games, Jacob W. Crandall Dec 2005

Theses and Dissertations

Many environments in which an agent can use reinforcement learning techniques to learn profitable strategies are affected by other learning agents. These situations can be modeled as general-sum games. When playing repeated general-sum games with other learning agents, the goal of a self-interested learning agent is to maximize its own payoffs over time. Traditional reinforcement learning algorithms learn myopic strategies in these games. As a result, they learn strategies that produce undesirable results in many games. In this dissertation, we develop and analyze algorithms that learn non-myopic strategies when playing many important infinitely repeated general-sum games. We show that, in …
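A one-screen illustration of the myopia problem, using the standard prisoner's dilemma payoffs (values chosen by us for illustration, not drawn from the dissertation): a learner that best-responds to single-round payoffs always defects, even though mutual cooperation pays more per round over the repeated game.

```python
C, D = 0, 1
PAYOFF = {  # (my action, their action) -> my payoff
    (C, C): 3, (C, D): 0,
    (D, C): 5, (D, D): 1,
}

def myopic_best_response(their_action: int) -> int:
    """Maximize this round's payoff only: defecting is dominant either way."""
    return max((C, D), key=lambda a: PAYOFF[(a, their_action)])

assert myopic_best_response(C) == D and myopic_best_response(D) == D
# Mutual defection yields 1 per round versus 3 for mutual cooperation, so
# two myopic learners lock into the worse (D, D) outcome over time.
```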


Improving And Extending Behavioral Animation Through Machine Learning, Jonathan J. Dinerstein Apr 2005

Theses and Dissertations

Behavioral animation has become popular for creating virtual characters that are autonomous agents and thus self-animating. This is useful for lessening the workload of human animators, populating virtual environments with interactive agents, etc. Unfortunately, current behavioral animation techniques suffer from three key problems: (1) deliberative behavioral models (i.e., cognitive models) are slow to execute; (2) interactive virtual characters cannot adapt online in response to interaction with a human user; (3) programming behavioral models is a difficult and time-intensive process. This dissertation presents a collection of papers that seek to overcome each of these problems. Specifically, these issues are alleviated …


Solving Large MDPs Quickly With Partitioned Value Iteration, David Wingate Jun 2004

Theses and Dissertations

Value iteration is not typically considered a viable algorithm for solving large-scale MDPs because it converges too slowly. However, its performance can be dramatically improved by eliminating redundant or useless backups, and by backing up states in the right order. We present several methods designed to help structure value dependency, along with a systematic study of companion prioritization techniques that focus computation in useful regions of the state space. To scale to ever-larger problems, we evaluate all enhancements and methods in the context of parallelizability. Using the enhancements, we discover that in many instances the limiting …
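Below is a minimal sketch of a prioritized backup scheme in the spirit the abstract describes, where states are re-queued by how much their values recently changed rather than swept uniformly. It works at the state level rather than the thesis's partition level, for brevity, and the MDP encoding and priority rule are illustrative assumptions, not the thesis's algorithm.

```python
import heapq
from collections import defaultdict

def prioritized_vi(states, actions, P, R, gamma=0.95, theta=1e-6):
    """Value iteration with priority-ordered backups.
    P[s][a]: list of (prob, next_state) pairs; R[s][a]: immediate reward."""
    V = {s: 0.0 for s in states}
    preds = defaultdict(set)                  # which states feed into s?
    for s in states:
        for a in actions:
            for p, s2 in P[s][a]:
                if p > 0:
                    preds[s2].add(s)
    heap = [(-1.0, s) for s in states]        # seed: back up every state once
    heapq.heapify(heap)
    queued = set(states)
    while heap:
        _, s = heapq.heappop(heap)
        queued.discard(s)
        new_v = max(R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                    for a in actions)
        change = abs(new_v - V[s])
        V[s] = new_v
        if change > theta:                    # a real change makes the backups
            for sp in preds[s]:               # of predecessor states stale
                if sp not in queued:
                    heapq.heappush(heap, (-change, sp))
                    queued.add(sp)
    return V
```

Ordering backups this way concentrates work where values are still moving, which is the same intuition the abstract's prioritization study formalizes.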