Open Access. Powered by Scholars. Published by Universities.®

Operations Research, Systems Engineering and Industrial Engineering Commons

Civil and Environmental Engineering

Engineering Management and Systems Engineering Faculty Research & Creative Works

Series

Deep reinforcement learning

Full-Text Articles in Operations Research, Systems Engineering and Industrial Engineering

Deep Reinforcement Learning For Approximate Policy Iteration: Convergence Analysis And A Post-Earthquake Disaster Response Case Study, Abhijit Gosavi, L. (Lesley) H. Sneed, L. A. Spearing Jan 2023

Approximate policy iteration (API) is a class of reinforcement learning (RL) algorithms that seek to solve the long-run discounted reward Markov decision process (MDP), via the policy iteration paradigm, without learning the transition model in the underlying Bellman equation. Unfortunately, these algorithms suffer from a defect known as chattering, in which the solution (policy) delivered in each iteration of the algorithm oscillates between improved and worsened policies, leading to sub-optimal behavior. Two causes for this that have been traced to the crucial policy improvement step are: (i) the inaccuracies in the policy improvement function and (ii) the exploration/exploitation tradeoff integral …
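
To make the policy iteration paradigm described in the abstract concrete, the following is a minimal tabular sketch in Python. It is not the deep RL algorithm from the article, and it does not reproduce chattering. It only illustrates the general loop of sample-based policy evaluation followed by greedy policy improvement, where the learner sees simulated transitions rather than the transition model. The two-state MDP, its rewards and probabilities, and all function names and parameters (`step`, `evaluate_policy`, `improve_policy`, `n_samples`, `alpha`) are assumptions made up for illustration.

```python
import random

GAMMA = 0.9                      # discount factor (assumed value)
N_STATES, N_ACTIONS = 2, 2

# Hypothetical toy MDP, known only to the simulator, never to the learner.
REWARD = [[0.0, 1.0], [0.0, 2.0]]         # REWARD[state][action]
P_NEXT_IS_1 = [[0.2, 0.8], [0.2, 0.8]]    # P(next state = 1 | state, action)


def step(state, action, rng):
    """Simulate one transition; the learner only sees its output."""
    next_state = 1 if rng.random() < P_NEXT_IS_1[state][action] else 0
    return next_state, REWARD[state][action]


def evaluate_policy(policy, q, rng, n_samples=4000, alpha=0.05):
    """Estimate Q-values of a fixed policy from sampled transitions."""
    for _ in range(n_samples):
        # Sample state-action pairs uniformly so every pair is explored.
        s = rng.randrange(N_STATES)
        a = rng.randrange(N_ACTIONS)
        s_next, r = step(s, a, rng)
        # Temporal-difference target: follow the fixed policy from s_next.
        target = r + GAMMA * q[s_next][policy[s_next]]
        q[s][a] += alpha * (target - q[s][a])
    return q


def improve_policy(q):
    """Greedy policy improvement with respect to the estimated Q-values."""
    return [max(range(N_ACTIONS), key=lambda a: q[s][a])
            for s in range(N_STATES)]


def approximate_policy_iteration(n_iterations=5, seed=0):
    rng = random.Random(seed)
    policy = [0] * N_STATES
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    history = []                 # policy delivered after each iteration
    for _ in range(n_iterations):
        q = evaluate_policy(policy, q, rng)
        policy = improve_policy(q)
        history.append(list(policy))
    return policy, history


if __name__ == "__main__":
    final_policy, history = approximate_policy_iteration()
    print("policies per iteration:", history)
    print("final policy:", final_policy)
```

In this toy problem, action 1 yields the larger reward in both states and is also more likely to lead to the higher-reward state, so the greedy step has a wide margin to work with. The `history` list records the policy delivered in each iteration, which is the quantity one would inspect for oscillation in harder problems where the evaluation step is less accurate.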