Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 3 of 3

Full-Text Articles in Computer Sciences

Virtual Robot Climbing Using Reinforcement Learning, Ujjawal Garg Dec 2018

Virtual Robot Climbing Using Reinforcement Learning, Ujjawal Garg

Master's Projects

Reinforcement Learning (RL) is a field of Artificial Intelligence that has gained a lot of attention in recent years. In this project, RL research was used to design and train an agent to climb and navigate through an environment with slopes. We compared and evaluated the performance of two state-of-the-art reinforcement learning algorithms for locomotion related tasks, Deep Deterministic Policy Gradients (DDPG) and Trust Region Policy Optimisation (TRPO). We observed that, on an average, training with TRPO was three times faster than DDPG, and also much more stable for the locomotion control tasks that we experimented. We conducted experiments and …


Influencing Exploration In Actor-Critic Reinforcement Learning Algorithms, Andrew R. Gough Jun 2018

Influencing Exploration In Actor-Critic Reinforcement Learning Algorithms, Andrew R. Gough

Master's Theses

Reinforcement Learning (RL) is a subset of machine learning primarily concerned with goal-directed learning and optimal decision making. RL agents learn based on a reward signal discovered from trial and error in complex, uncertain environments with the goal of maximizing positive reward signals. RL approaches need to scale up as they are applied to more complex environments with extremely large state spaces. Inefficient exploration methods cannot sufficiently explore complex environments in a reasonable amount of time, and optimal policies will be unrealized resulting in RL agents failing to solve an environment.

This thesis proposes a novel variant of the Actor-Advantage …


Improving Asynchronous Advantage Actor Critic With A More Intelligent Exploration Strategy, James B. Holliday May 2018

Improving Asynchronous Advantage Actor Critic With A More Intelligent Exploration Strategy, James B. Holliday

Graduate Theses and Dissertations

We propose a simple and efficient modification to the Asynchronous Advantage Actor Critic (A3C)

algorithm that improves training. In 2016 Google’s DeepMind set a new standard for state-of-theart

reinforcement learning performance with the introduction of the A3C algorithm. The goal of

this research is to show that A3C can be improved by the use of a new novel exploration strategy we

call “Follow then Forage Exploration” (FFE). FFE forces the agents to follow the best known path

at the beginning of a training episode and then later in the episode the agent is forced to “forage”

and explores randomly. In …