Physical Sciences and Mathematics | Open Access Articles

Policy Gradient Methods: Analysis, Misconceptions, And Improvements, Christopher P. Nota Mar 2024

Doctoral Dissertations

Policy gradient methods are a class of reinforcement learning algorithms that optimize a parametric policy by maximizing an objective function that directly measures the performance of the policy. Although policy gradient methods are used in many high-profile applications of reinforcement learning, their conventional use in practice deviates from existing theory. This thesis presents a comprehensive mathematical analysis of policy gradient methods, uncovering misconceptions and suggesting novel solutions to improve their performance. We first demonstrate that the update rule used by most policy gradient methods does not correspond to the gradient of any objective function due to the way the …

Rigorous Experimentation For Reinforcement Learning, Scott M. Jordan Apr 2023

Doctoral Dissertations

Scientific fields make advancements by leveraging the knowledge created by others to push the boundary of understanding. The primary tool in many fields for generating knowledge is empirical experimentation. Although such experimentation is common, generating accurate knowledge from it is often challenging because inherent randomness in execution and confounding variables can obscure the correct interpretation of the results. As such, researchers must hold themselves and others to a high degree of rigor when designing experiments. Unfortunately, most reinforcement learning (RL) experiments lack this rigor, making the knowledge generated from experiments dubious. This dissertation proposes methods to address central issues in …

Decision-Analytic Models Using Reinforcement Learning To Inform Dynamic Sequential Decisions In Public Policy, Seyedeh Nazanin Khatami Mar 2022

Doctoral Dissertations

We developed decision-analytic models specifically suited for long-term sequential decision-making in the context of large-scale dynamic stochastic systems, focusing on public policy investment decisions. We found that while machine learning and artificial intelligence algorithms provide the most suitable frameworks for such analyses, multiple challenges arise in their successful adaptation. We address three specific challenges in two public sectors, public health and climate policy, through the following three essays. In Essay I, we developed a reinforcement learning (RL) model to identify the optimal sequence of testing and retention-in-care interventions to inform the national strategic plan “Ending the HIV Epidemic in the US”. …

Improving Reinforcement Learning Techniques By Leveraging Prior Experience, Francisco M. Garcia Jul 2020

Doctoral Dissertations

In this dissertation we develop techniques to leverage prior knowledge for improving the learning speed of existing reinforcement learning (RL) algorithms. RL systems can be expensive to train, which limits their applicability when a large number of agents need to be trained to solve a large number of tasks, a situation that often occurs in industry and is often ignored in the RL literature. In this thesis, we develop three methods to leverage the experience obtained from solving a small number of tasks to improve an agent's ability to learn on new tasks the agent might face in the future. …

Safe Reinforcement Learning, Philip S. Thomas Nov 2015

Doctoral Dissertations

This dissertation proposes and presents solutions to two new problems that fall within the broad scope of reinforcement learning (RL) research. The first problem, high confidence off-policy evaluation (HCOPE), requires an algorithm to use historical data from one or more behavior policies to compute a high confidence lower bound on the performance of an evaluation policy. For the first time, this allows us to provide the user of any RL algorithm with confidence that a newly proposed policy (which has never actually been used) will perform well. The second problem is to construct what we call a safe reinforcement learning …

Autonomous Robot Skill Acquisition, George D. Konidaris May 2011

Open Access Dissertations

Among the most impressive aspects of human intelligence is skill acquisition: the ability to identify important behavioral components, retain them as skills, refine them through practice, and apply them in new task contexts. Skill acquisition underlies both our ability to choose to spend time and effort to specialize at particular tasks, and our ability to collect and exploit previous experience to become able to solve harder and harder problems over time with less and less cognitive effort.

Hierarchical reinforcement learning provides a theoretical basis for skill acquisition, including principled methods for learning new skills and deploying them during problem solving. …
