Physical Sciences and Mathematics Commons

Articles 1 - 8 of 8

Full-Text Articles in Physical Sciences and Mathematics

Using Taint Analysis And Reinforcement Learning (Tarl) To Repair Autonomous Robot Software, Damian Lyons, Saba Zahra May 2020

Faculty Publications

It is important to be able to establish formal performance bounds for autonomous systems. However, formal verification techniques require a model of the environment in which the system operates; this is a challenge for autonomous systems, especially those expected to operate over longer timescales. This paper describes work in progress to automate the monitoring and repair of ROS-based autonomous robot software written for an a priori partially known and possibly incorrect environment model. A taint analysis method is used to automatically extract the data-flow sequence from input topic to published topic, and to instrument that code. A unique reinforcement learning approximation of MDP utility …
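
As a point of reference for the first ingredient, here is a minimal sketch of forward taint propagation over a straight-line sequence of assignments. It is illustrative only, not the authors' TARL implementation, and the ROS-style names (scan_msg, cmd_vel) are hypothetical.

    # Minimal forward taint-propagation sketch: variables derived from a
    # tainted input are marked tainted; the recorded flow is the data-flow
    # sequence from the input topic to the value that gets published.
    def propagate_taint(statements, tainted_sources):
        """statements: list of (target, operands) assignment tuples."""
        tainted = set(tainted_sources)
        flow = []
        for target, operands in statements:
            if any(op in tainted for op in operands):
                tainted.add(target)
                flow.append((target, operands))
        return tainted, flow

    # Toy data flow from a subscribed topic value to a published command.
    stmts = [
        ("range", ["scan_msg"]),         # read from the input topic
        ("error", ["range", "goal"]),    # derived intermediate value
        ("cmd_vel", ["error", "gain"]),  # value reaching the publish call
    ]
    tainted, flow = propagate_taint(stmts, {"scan_msg"})
    print(tainted)  # {'scan_msg', 'range', 'error', 'cmd_vel'}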


Dynamic Coalition Formation Under Uncertainty, Daylon J. Hooper, Gilbert L. Peterson, Brett J. Borghetti Oct 2009

Faculty Publications

Coalition formation algorithms are generally not applicable to real-world robotic collectives since they lack mechanisms to handle uncertainty. Those mechanisms that do address uncertainty either deflect it by soliciting information from others or apply reinforcement learning to select an agent type from within a set. This paper presents a coalition formation mechanism that directly addresses uncertainty while allowing the agent types to fall outside of a known set. The agent types are captured through a novel agent modeling technique that handles uncertainty through a belief-based evaluation mechanism. This technique allows for uncertainty in environmental data, agent type, coalition value, and …
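
The belief-based evaluation the abstract mentions can be pictured as a Bayesian re-weighting over hypothesized agent types. The sketch below is one plausible reading, not the paper's mechanism; the type names and likelihood numbers are invented for illustration.

    # Re-weight the belief over candidate agent types after observing a
    # behavior; an observation that no known type explains leaves the
    # prior untouched (the true type may fall outside the known set).
    def update_belief(belief, likelihoods):
        """belief: {type: prob}; likelihoods: {type: P(obs | type)}."""
        posterior = {t: belief[t] * likelihoods.get(t, 0.0) for t in belief}
        total = sum(posterior.values())
        if total == 0.0:
            return belief
        return {t: p / total for t, p in posterior.items()}

    belief = {"forager": 0.5, "scout": 0.5}
    belief = update_belief(belief, {"forager": 0.2, "scout": 0.8})
    print(belief)  # mass shifts toward 'scout'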


Fuzzy State Aggregation And Policy Hill Climbing For Stochastic Environments, Dean C. Wardell, Gilbert L. Peterson Sep 2006

Faculty Publications

Reinforcement learning is one of the more attractive machine learning technologies, due to its unsupervised learning structure and ability to continually learn even as the operating environment changes. Additionally, applying reinforcement learning to multiple cooperative software agents (a multi-agent system) not only allows each individual agent to learn from its own experience, but also opens up the opportunity for the individual agents to learn from the other agents in the system, thus accelerating the rate of learning. This research presents the novel use of fuzzy state aggregation, as the means of function approximation, combined with the fastest policy hill …
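
As a rough picture of fuzzy state aggregation used as a function approximator (a generic sketch, not the paper's exact formulation): each state has membership degrees in a few fuzzy regions, and the Q-value is the membership-weighted sum of per-region values, so a temporal-difference update spreads credit across regions in proportion to membership. The region centers and width below are arbitrary.

    import numpy as np

    centers = np.array([0.0, 0.5, 1.0])   # fuzzy region centers (assumed)
    q = np.zeros((len(centers), 2))       # per-region values for 2 actions

    def memberships(s, width=0.5):
        """Normalized triangular membership of state s in each region.
        Assumes states lie in [0, 1], so some membership is nonzero."""
        m = np.maximum(0.0, 1.0 - np.abs(s - centers) / width)
        return m / m.sum()

    def q_value(s, a):
        return memberships(s) @ q[:, a]

    def td_update(s, a, r, s_next, alpha=0.1, gamma=0.9):
        target = r + gamma * max(q_value(s_next, b) for b in range(2))
        q[:, a] += alpha * (target - q_value(s, a)) * memberships(s)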


Fuzzy State Aggregation And Off-Policy Reinforcement Learning For Stochastic Environments, Dean C. Wardell, Gilbert L. Peterson May 2006

Faculty Publications

Reinforcement learning is one of the more attractive machine learning technologies, due to its unsupervised learning structure and ability to continually learn even as its operating environment changes. This ability to learn in an unsupervised manner in a changing environment is applicable in complex domains through the use of function approximation of the domain's policy. The function approximation presented here is fuzzy state aggregation. This article presents the use of fuzzy state aggregation with the current policy hill climbing methods of Win or Learn Fast (WoLF) and policy-dynamics-based WoLF (PD-WoLF), exceeding the learning rate …
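
WoLF-style policy hill climbing is well documented in the literature; the stateless sketch below follows the standard formulation (learn fast with delta_lose when the current policy underperforms the running average policy). Parameter values are illustrative, it is not the article's code, and it omits the fuzzy state aggregation layer.

    import numpy as np

    n_actions = 2
    Q = np.zeros(n_actions)
    pi = np.ones(n_actions) / n_actions      # current mixed policy
    pi_avg = pi.copy()                       # running average policy
    steps = 0
    alpha, delta_win, delta_lose = 0.1, 0.01, 0.04

    def wolf_phc_step(a, r):
        """One update after taking action a and receiving reward r."""
        global steps
        Q[a] += alpha * (r - Q[a])                 # stateless Q update
        steps += 1
        pi_avg[:] += (pi - pi_avg) / steps         # track average policy
        winning = pi @ Q > pi_avg @ Q              # the WoLF criterion
        delta = delta_win if winning else delta_lose
        greedy = int(np.argmax(Q))
        for b in range(n_actions):                 # climb toward greedy action
            change = delta if b == greedy else -delta / (n_actions - 1)
            pi[b] = np.clip(pi[b] + change, 0.0, 1.0)
        pi[:] /= pi.sum()                          # keep it a distribution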


Variable Resolution Discretization In The Joint Space, Christopher K. Monson, Kevin Seppi, David Wingate, Todd S. Peterson Dec 2004

Faculty Publications

We present JoSTLe, an algorithm that performs value iteration on control problems with continuous actions, allowing this useful reinforcement learning technique to be applied to problems where a priori action discretization is inadequate. The algorithm is an extension of a variable resolution technique that works for problems with continuous states and discrete actions. Results indicate that JoSTLe is a promising step toward reinforcement learning in a fully continuous domain.
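
The variable resolution idea JoSTLe extends can be sketched in one dimension: keep splitting a cell while the value estimates at its endpoints disagree, so resolution concentrates where the value function changes quickly. This is only a caricature of the technique, not the paper's algorithm; the tolerance, depth bound, and test function are arbitrary, and endpoint sampling is a simplification.

    # Split [lo, hi] recursively wherever the value function varies by
    # more than `tol`, yielding fine cells only where they are needed.
    def refine(lo, hi, value, tol=0.1, depth=0, max_depth=8):
        if depth < max_depth and abs(value(hi) - value(lo)) > tol:
            mid = (lo + hi) / 2.0
            return (refine(lo, mid, value, tol, depth + 1, max_depth)
                    + refine(mid, hi, value, tol, depth + 1, max_depth))
        return [(lo, hi)]

    # A value function with a sharp feature near 0.7 gets fine cells there.
    cells = refine(0.0, 1.0, lambda s: 1.0 if s > 0.7 else 0.0)
    print(len(cells), cells[:3])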


Incremental Policy Learning: An Equilibrium Selection Algorithm For Reinforcement Learning Agents With Common Interests, Nancy Fulda, Dan A. Ventura Jul 2004

Faculty Publications

We present an equilibrium selection algorithm for reinforcement learning agents that incrementally adjusts the probability of executing each action based on the desirability of the outcome obtained in the last time step. The algorithm assumes that at least one coordination equilibrium exists and requires that the agents have a heuristic for determining whether or not the equilibrium was obtained. In deterministic environments with one or more strict coordination equilibria, the algorithm will learn to play an optimal equilibrium as long as the heuristic is accurate. Empirical data demonstrate that the algorithm is also effective in stochastic environments and is able …
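
The incremental adjustment described in the first sentence can be written down directly; the update form and step size below are illustrative rather than the paper's exact rule, and the two-action desirability heuristic is a toy.

    import random

    # Raise the probability of the last action if its outcome was
    # desirable, lower it otherwise, then renormalize.
    def adjust(pi, action, desirable, step=0.1):
        if desirable:
            pi[action] += step * (1.0 - pi[action])
        else:
            pi[action] -= step * pi[action]
        total = sum(pi.values())
        return {a: p / total for a, p in pi.items()}

    pi = {"left": 0.5, "right": 0.5}
    a = random.choices(list(pi), weights=list(pi.values()))[0]
    pi = adjust(pi, a, desirable=(a == "right"))  # toy desirability heuristic
    print(pi)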


Target Sets: A Tool For Understanding And Predicting The Behavior Of Interacting Q-Learners, Nancy Fulda, Dan A. Ventura Sep 2003

Faculty Publications

Reinforcement learning agents that interact in a common environment frequently affect each other's perceived transition and reward distributions. This can result in convergence of the agents to a sub-optimal equilibrium, or even to a solution that is not an equilibrium at all. Several modifications to the Q-learning algorithm have been proposed which enable agents to converge to optimal equilibria under specified conditions. This paper presents the concept of target sets as an aid to understanding why these modifications have been successful and as a tool to assist in the development of new modifications which are applicable in a wider range …
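
The interaction effect in the first sentence is easy to reproduce: two independent Q-learners in a 2x2 coordination game each perceive a reward distribution that depends on the other's current policy. The toy below illustrates that problem only; it is not the target-set analysis, and the payoffs are invented.

    import random

    payoff = {("a", "a"): 10, ("b", "b"): 5, ("a", "b"): 0, ("b", "a"): 0}
    Q = [dict.fromkeys("ab", 0.0), dict.fromkeys("ab", 0.0)]
    alpha, epsilon = 0.1, 0.2

    for _ in range(2000):
        acts = tuple(
            random.choice("ab") if random.random() < epsilon
            else max(Q[i], key=Q[i].get)
            for i in range(2)
        )
        r = payoff[acts]
        for i in range(2):  # each agent updates on its own action only
            Q[i][acts[i]] += alpha * (r - Q[i][acts[i]])

    print(Q)  # can settle on (b, b) even though (a, a) pays more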


Dynamic Joint Action Perception For Q-Learning Agents, Nancy Fulda, Dan A. Ventura Jun 2003

Faculty Publications

Q-learning is a reinforcement learning algorithm that learns expected utilities for state-action transitions through successive interactions with the environment. Its simplicity and convergence properties have made it a popular algorithm for study. However, its non-parametric representation of utilities limits its effectiveness in environments with large amounts of perceptual input. For example, in multiagent systems, each agent may need to consider the action selections of its counterparts in order to learn effective behaviors. This creates a joint action space which grows exponentially with the number of agents in the system. In such situations, the Q-learning algorithm quickly …
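
The exponential growth the abstract points to is easy to quantify; the state and action counts below are generic, not taken from the paper.

    # A Q-table over joint actions grows exponentially with the number
    # of agents: |S| * |A|^n entries for n agents.
    n_states, n_actions = 100, 4
    for n_agents in range(1, 6):
        entries = n_states * n_actions ** n_agents
        print(f"{n_agents} agent(s) -> {entries:,} Q-entries")
    # 1 agent -> 400 entries; 5 agents -> 102,400 for the same state space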