Policy Gradient Methods: Analysis, Misconceptions, And Improvements, Christopher P. Nota
Doctoral Dissertations
Policy gradient methods are a class of reinforcement learning algorithms that optimize a parametric policy by maximizing an objective function that directly measures the policy's performance. Although these methods are used in many high-profile applications of reinforcement learning, their conventional use in practice deviates from existing theory. This thesis presents a comprehensive mathematical analysis of policy gradient methods, uncovering misconceptions and suggesting novel solutions to improve their performance. We first demonstrate that the update rule used by most policy gradient methods does not correspond to the gradient of any objective function due to the way the …