
Brigham Young University: Reinforcement learning

Full-Text Articles in Physical Sciences and Mathematics

Articles 1 - 4 of 4

Variable Resolution Discretization In The Joint Space, Christopher K. Monson, Kevin Seppi, David Wingate, Todd S. Peterson Dec 2004

Faculty Publications

We present JoSTLe, an algorithm that performs value iteration on control problems with continuous actions, allowing this useful reinforcement learning technique to be applied to problems where a priori action discretization is inadequate. The algorithm extends a variable resolution technique that works for problems with continuous states and discrete actions. Results indicate that JoSTLe is a promising step toward reinforcement learning in fully continuous domains.
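
The abstract contrasts JoSTLe with fixed a priori discretization; as a point of reference, below is a minimal sketch of that baseline: value iteration over uniform state and action grids for a toy one-dimensional control problem. The dynamics, reward, grid sizes, and discount factor are all hypothetical stand-ins, not material from the paper.

```python
import numpy as np

# Baseline for contrast with JoSTLe: value iteration over fixed, uniform
# discretizations of a continuous state and action space. The dynamics,
# reward, grid sizes, and discount are toy assumptions, not the paper's.

N_STATES, N_ACTIONS = 50, 9
states = np.linspace(-1.0, 1.0, N_STATES)    # a priori state grid
actions = np.linspace(-1.0, 1.0, N_ACTIONS)  # a priori action grid
GAMMA = 0.95

def step(s, a):
    """Toy deterministic dynamics: the action nudges the state."""
    return np.clip(s + 0.1 * a, -1.0, 1.0)

def reward(s, a):
    """Toy reward: stay near the origin using small actions."""
    return -(s ** 2) - 0.01 * (a ** 2)

def nearest(x, grid):
    """Index of the grid point closest to x."""
    return int(np.argmin(np.abs(grid - x)))

V = np.zeros(N_STATES)
for sweep in range(500):
    # Bellman backup over the fixed, discretized action set
    V_new = np.array([max(reward(s, a) + GAMMA * V[nearest(step(s, a), states)]
                          for a in actions)
                      for s in states])
    delta = np.max(np.abs(V_new - V))
    V = V_new
    if delta < 1e-6:                 # converged
        break
```

Per the abstract, JoSTLe's improvement over this baseline is to discretize the joint state-action space at variable resolution rather than committing in advance to fixed grids like `states` and `actions` above.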


Incremental Policy Learning: An Equilibrium Selection Algorithm For Reinforcement Learning Agents With Common Interests, Nancy Fulda, Dan A. Ventura Jul 2004

Faculty Publications

We present an equilibrium selection algorithm for reinforcement learning agents that incrementally adjusts the probability of executing each action based on the desirability of the outcome obtained in the last time step. The algorithm assumes that at least one coordination equilibrium exists and requires that the agents have a heuristic for determining whether the equilibrium was obtained. In deterministic environments with one or more strict coordination equilibria, the algorithm will learn to play an optimal equilibrium as long as the heuristic is accurate. Empirical data demonstrate that the algorithm is also effective in stochastic environments and is able …
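
As a rough sketch of the incremental adjustment described above, an agent could nudge the probability of its most recent action up when its heuristic reports that the coordination equilibrium was obtained, and down otherwise. The additive update, the step size `eta`, and the function names are illustrative assumptions, not the paper's exact rule.

```python
import random

# Sketch of incremental policy adjustment: raise the last action's
# probability after a desirable outcome, lower it after an undesirable
# one. Update rule and step size are assumptions for illustration.

def select_action(probs):
    """Draw an action index from the current distribution."""
    return random.choices(range(len(probs)), weights=probs)[0]

def incremental_policy_step(probs, last_action, equilibrium_obtained, eta=0.1):
    """Nudge the last action's probability up on success, down otherwise."""
    sign = 1.0 if equilibrium_obtained else -1.0
    probs = list(probs)
    probs[last_action] = max(probs[last_action] + sign * eta, 1e-3)
    total = sum(probs)               # renormalize to a valid distribution
    return [p / total for p in probs]
```

In a repeated common-interest game, each agent would draw an action with `select_action`, observe the joint outcome, apply its equilibrium heuristic, and call `incremental_policy_step`; if the heuristic is accurate, probability mass in this sketch accumulates on actions that produce the equilibrium.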


Target Sets: A Tool For Understanding And Predicting The Behavior Of Interacting Q-Learners, Nancy Fulda, Dan A. Ventura Sep 2003

Faculty Publications

Reinforcement learning agents that interact in a common environment frequently affect each other's perceived transition and reward distributions. This can result in convergence of the agents to a sub-optimal equilibrium or even to a solution that is not an equilibrium at all. Several modifications to the Q-learning algorithm have been proposed that enable agents to converge to optimal equilibria under specified conditions. This paper presents the concept of target sets as an aid to understanding why these modifications have been successful and as a tool to assist in the development of new modifications which are applicable in a wider range …
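
The failure mode described above is easy to reproduce. The sketch below (an illustration, not material from the paper) runs two independent epsilon-greedy Q-learners on the climbing game of Claus and Boutilier, a common-payoff matrix game whose -30 penalties around the optimal joint action (0, 0) make that action risky to explore; with these illustrative parameters, the greedy joint policy usually settles on a sub-optimal equilibrium instead.

```python
import random

# Two independent Q-learners on the climbing game (Claus & Boutilier).
# The penalties flanking the optimal joint action (0, 0) pull exploring
# agents toward safer, sub-optimal equilibria. Learning rate and
# exploration settings are illustrative assumptions.

PAYOFF = [[ 11, -30,   0],
          [-30,   7,   6],
          [  0,   0,   5]]
ALPHA, EPSILON = 0.1, 0.2

def egreedy(q):
    """Epsilon-greedy action selection over stateless Q-values."""
    if random.random() < EPSILON:
        return random.randrange(len(q))
    return max(range(len(q)), key=lambda a: q[a])

q1, q2 = [0.0] * 3, [0.0] * 3        # one Q-value per action, no state
for _ in range(10000):
    a1, a2 = egreedy(q1), egreedy(q2)
    r = PAYOFF[a1][a2]               # common payoff shared by both agents
    q1[a1] += ALPHA * (r - q1[a1])   # stateless Q-update (gamma = 0)
    q2[a2] += ALPHA * (r - q2[a2])

print("greedy joint action:",
      max(range(3), key=lambda a: q1[a]),
      max(range(3), key=lambda a: q2[a]))
```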


Dynamic Joint Action Perception For Q-Learning Agents, Nancy Fulda, Dan A. Ventura Jun 2003

Faculty Publications

Q-learning is a reinforcement learning algorithm that learns expected utilities for state-action transitions through successive interactions with the environment. Its simplicity and its convergence properties have made it a popular algorithm for study. However, its non-parametric representation of utilities limits its effectiveness in environments with large amounts of perceptual input. For example, in multiagent systems, each agent may need to consider the action selections of its counterparts in order to learn effective behaviors. This creates a joint action space that grows exponentially with the number of agents in the system. In such situations, the Q-learning algorithm quickly …
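
For reference, the update the abstract refers to is the standard tabular Q-learning backup, Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)). The sketch below implements it and tallies how the table grows when actions are joint; all names and constants are illustrative, not specific to this paper.

```python
from collections import defaultdict

# Standard tabular Q-learning backup (textbook form). Q maps
# (state, action) pairs to learned expected utilities.
ALPHA, GAMMA = 0.1, 0.9
Q = defaultdict(float)

def q_update(s, a, r, s_next, actions):
    """Apply one backup after observing the transition (s, a, r, s')."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

# Hypothetical transition, just to show the call:
q_update("s0", "left", 1.0, "s1", ["left", "right"])

# A joint action learner keys its table on the tuple of all agents'
# actions, so per-state table size grows as |A| ** n_agents:
for n_agents in (1, 2, 4, 8):
    print(n_agents, "agents x 5 actions ->",
          5 ** n_agents, "joint actions per state")
```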