Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Western University

Series

2020

Deep learning

Articles 1 - 1 of 1

Full-Text Articles in Engineering

Noisy Importance Sampling Actor-Critic: An Off-Policy Actor-Critic With Experience Replay, Miriam A M Capretz, Norman Tasfi, Jul 2020

Electrical and Computer Engineering Publications

This paper presents Noisy Importance Sampling Actor-Critic (NISAC), a set of empirically validated modifications to the advantage actor-critic algorithm (A2C), allowing off-policy reinforcement learning and increased performance. NISAC uses additive action space noise, aggressive truncation of importance sample weights, and large batch sizes. We see that additive noise drastically changes how off-sample experience is weighted for policy updates. The modified algorithm achieves an increase in convergence speed and sample efficiency compared to both the on-policy actor-critic A2C and the importance weighted off-policy actor-critic algorithm. In comparison to state-of-the-art (SOTA) methods, such as actor-critic with experience replay (ACER), NISAC nears the …
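
As a rough illustration of two ingredients named in the abstract, truncated importance weights and additive noise, the sketch below computes per-sample weights min(pi/mu, c) and perturbs a discrete action distribution. It is not the authors' implementation: the abstract does not specify the truncation threshold, the noise distribution, or where the noise is applied, so the function names, the `clip=1.0` default, and the uniform noise with renormalisation are all assumptions made for illustration only.

```python
import random


def truncated_importance_weights(target_probs, behaviour_probs, clip=1.0):
    """Return min(pi / mu, clip) for each sampled action.

    target_probs:    probability of each sampled action under the current policy (pi)
    behaviour_probs: probability of the same action under the policy that
                     generated the replayed experience (mu)
    clip:            truncation threshold (assumed value, not taken from the paper)
    """
    weights = []
    for pi, mu in zip(target_probs, behaviour_probs):
        if mu <= 0.0:
            raise ValueError("behaviour probability must be positive")
        weights.append(min(pi / mu, clip))
    return weights


def add_noise_and_renormalise(probs, scale, rng):
    """Hypothetical noise model: add uniform noise in [0, scale] to each
    action probability, then renormalise so the result sums to 1."""
    noisy = [p + rng.uniform(0.0, scale) for p in probs]
    total = sum(noisy)
    return [p / total for p in noisy]


# Three replayed samples: raw ratios are about 3.0, 0.5 and 1.0.
pi_probs = [0.9, 0.2, 0.5]
mu_probs = [0.3, 0.4, 0.5]
weights = truncated_importance_weights(pi_probs, mu_probs, clip=1.0)

# Perturb a behaviour distribution over three discrete actions.
rng = random.Random(0)
noisy_policy = add_noise_and_renormalise([0.7, 0.2, 0.1], scale=0.1, rng=rng)
```

With a threshold of 1.0, any sample that the current policy favours more than the behaviour policy did contributes with weight 1, while samples it favours less are down-weighted. Perturbing the behaviour probabilities shifts these ratios, which is one plausible reading of how noise changes the weighting of replayed experience.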