Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 3 of 3
Full-Text Articles in Physical Sciences and Mathematics
Virtual Robot Climbing Using Reinforcement Learning, Ujjawal Garg
Master's Projects
Reinforcement Learning (RL) is a field of Artificial Intelligence that has gained a lot of attention in recent years. In this project, RL research was used to design and train an agent to climb and navigate through an environment with slopes. We compared and evaluated the performance of two state-of-the-art reinforcement learning algorithms for locomotion-related tasks, Deep Deterministic Policy Gradients (DDPG) and Trust Region Policy Optimisation (TRPO). We observed that, on average, training with TRPO was three times faster than with DDPG, and also much more stable for the locomotion control tasks that we experimented with. We conducted experiments and …
Influencing Exploration In Actor-Critic Reinforcement Learning Algorithms, Andrew R. Gough
Master's Theses
Reinforcement Learning (RL) is a subset of machine learning primarily concerned with goal-directed learning and optimal decision making. RL agents learn from a reward signal discovered through trial and error in complex, uncertain environments, with the goal of maximizing positive reward. RL approaches need to scale up as they are applied to more complex environments with extremely large state spaces. Inefficient exploration methods cannot sufficiently explore complex environments in a reasonable amount of time, so optimal policies go unrealized and RL agents fail to solve the environment.
This thesis proposes a novel variant of the Actor-Advantage …
Improving Asynchronous Advantage Actor Critic With A More Intelligent Exploration Strategy, James B. Holliday
Graduate Theses and Dissertations
We propose a simple and efficient modification to the Asynchronous Advantage Actor Critic (A3C) algorithm that improves training. In 2016, Google’s DeepMind set a new standard for state-of-the-art reinforcement learning performance with the introduction of the A3C algorithm. The goal of this research is to show that A3C can be improved by the use of a novel exploration strategy we call “Follow then Forage Exploration” (FFE). FFE forces the agent to follow the best known path at the beginning of a training episode; later in the episode, the agent is forced to “forage” and explore randomly. In …
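The follow-then-forage idea described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not say how the switch point is chosen or how random exploration is performed, so the fixed step threshold `follow_steps`, the uniform random action choice, and the names `ffe_action`, `run_episode`, and `best_path` are all assumptions made for illustration.

```python
import random


def ffe_action(step, best_path, follow_steps, n_actions, rng=random):
    """Pick an action for one step of an episode (hypothetical helper).

    Assumption: the best known path is stored as a list of discrete actions,
    and the switch from "follow" to "forage" happens at a fixed step count.
    """
    if step < follow_steps and step < len(best_path):
        return best_path[step]       # follow phase: replay the best known path
    return rng.randrange(n_actions)  # forage phase: uniform random exploration


def run_episode(best_path, follow_steps, n_actions, episode_len, seed=0):
    """Return the action sequence an FFE-style agent would take (sketch only)."""
    rng = random.Random(seed)
    return [ffe_action(t, best_path, follow_steps, n_actions, rng)
            for t in range(episode_len)]


# Follow the first three stored actions, then explore for the rest of the episode.
actions = run_episode(best_path=[2, 2, 1, 0], follow_steps=3,
                      n_actions=4, episode_len=10)
```

In a full A3C setup, the forage phase would presumably interact with the policy network's own action sampling; the sketch replaces that with a plain random choice to keep the two phases visible.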