Open Access. Powered by Scholars. Published by Universities.®

Full-Text Articles in Physical Sciences and Mathematics

Target Sets: A Tool For Understanding And Predicting The Behavior Of Interacting Q-Learners, Nancy Fulda, Dan A. Ventura Sep 2003

Faculty Publications

Reinforcement learning agents that interact in a common environment frequently affect each other's perceived transition and reward distributions. This can cause the agents to converge to a sub-optimal equilibrium, or even to a solution that is not an equilibrium at all. Several modifications to the Q-learning algorithm have been proposed that enable agents to converge to optimal equilibria under specified conditions. This paper presents the concept of target sets as an aid to understanding why these modifications have been successful and as a tool to assist in the development of new modifications which are applicable in a wider range …


Dynamic Joint Action Perception For Q-Learning Agents, Nancy Fulda, Dan A. Ventura Jun 2003

Faculty Publications

Q-learning is a reinforcement learning algorithm that learns expected utilities for state-action transitions through successive interactions with the environment. Its simplicity and its convergence properties have made it a popular subject of study. However, its non-parametric representation of utilities limits its effectiveness in environments with large amounts of perceptual input. For example, in multiagent systems, each agent may need to consider the action selections of its counterparts in order to learn effective behaviors. This creates a joint action space which grows exponentially with the number of agents in the system. In such situations, the Q-learning algorithm quickly …
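
As a rough illustration of the two ideas in this abstract, the sketch below shows a standard tabular Q-learning update and counts how many joint actions a group of agents would face. It is not taken from the paper: the function names, the two-action set, and the values of `alpha` and `gamma` are all assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import product


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step (standard textbook form, not the paper's variant).

    Moves Q(s, a) toward reward + gamma * max over a' of Q(s', a').
    alpha (learning rate) and gamma (discount) are illustrative values.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def joint_actions(actions, n_agents):
    """Enumerate every combination of individual actions for n_agents agents."""
    return list(product(actions, repeat=n_agents))


# Hypothetical two-action problem; unseen state-action pairs default to 0.0.
q = defaultdict(float)
actions = ["left", "right"]
updated = q_update(q, "s0", "left", 1.0, "s1", actions)

# The joint action space has len(actions) ** n_agents entries.
sizes = [len(joint_actions(actions, n)) for n in (1, 2, 3, 4)]
```

With two actions per agent, the joint action count doubles with each added agent, which is the exponential growth the abstract describes: a table indexed by joint actions needs one entry per state for each of those combinations.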