Full-Text Articles in Engineering
Noisy Importance Sampling Actor-Critic: An Off-Policy Actor-Critic With Experience Replay, Miriam A M Capretz, Norman Tasfi
Electrical and Computer Engineering Publications
This paper presents Noisy Importance Sampling Actor-Critic (NISAC), a set of empirically validated modifications to the advantage actor-critic algorithm (A2C) that allow off-policy reinforcement learning and improve performance. NISAC uses additive action-space noise, aggressive truncation of importance sampling weights, and large batch sizes. We find that additive noise drastically changes how off-sample experience is weighted for policy updates. The modified algorithm converges faster and is more sample-efficient than both the on-policy actor-critic A2C and the importance-weighted off-policy actor-critic algorithm. In comparison to state-of-the-art (SOTA) methods, such as actor-critic with experience replay (ACER), NISAC nears the …
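As an illustration of the truncated importance weights the abstract mentions, the following is a minimal sketch of the general technique, not the paper's implementation. It assumes the weight for each replayed action is the ratio of the current policy's probability to the behavior policy's probability, capped at a threshold. The function name `truncated_importance_weights` and the default `clip=1.0` are hypothetical choices for this example; the abstract does not state the threshold NISAC uses, and the additive action-space noise is not modeled here.

```python
import math


def truncated_importance_weights(target_log_probs, behavior_log_probs, clip=1.0):
    """Return min(clip, pi(a|s) / mu(a|s)) for each replayed action.

    target_log_probs:   log pi(a|s) under the current (target) policy.
    behavior_log_probs: log mu(a|s) under the policy that generated the data.
    clip:               truncation threshold (assumed value, not from the paper).
    """
    weights = []
    for log_pi, log_mu in zip(target_log_probs, behavior_log_probs):
        # Compute the ratio in log space to avoid dividing small probabilities.
        ratio = math.exp(log_pi - log_mu)
        # Truncate from above so one sample cannot dominate the policy update.
        weights.append(min(clip, ratio))
    return weights


if __name__ == "__main__":
    # Two replayed actions: the first is now more likely than when it was
    # sampled (ratio about 2), the second less likely (ratio about 0.25).
    log_pi = [math.log(0.5), math.log(0.1)]
    log_mu = [math.log(0.25), math.log(0.4)]
    print(truncated_importance_weights(log_pi, log_mu))
```

In this sketch the first weight is capped at the threshold, while the second passes through unchanged at roughly 0.25. A lower `clip` truncates more aggressively, trading bias for lower variance in the off-policy update.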