Open Access. Powered by Scholars. Published by Universities.®

Reinforcement Learning

Full-Text Articles in Physical Sciences and Mathematics

Scheduling Allocation And Inventory Replenishment Problems Under Uncertainty: Applications In Managing Electric Vehicle And Drone Battery Swap Stations, Amin Asadi Jan 2021


Graduate Theses and Dissertations

In this dissertation, motivated by the growth of electric vehicle (EV) and drone applications, we propose novel optimization problems and solution techniques for managing the operations at EV and drone battery swap stations. In Chapter 2, we introduce a novel class of stochastic scheduling allocation and inventory replenishment problems (SAIRPs), which determine the recharging, discharging, and replacement decisions at a swap station over time to maximize the expected total profit. We use a Markov decision process (MDP) to model SAIRPs facing uncertain demands, varying costs, and battery degradation. Considering battery degradation is crucial, as it relaxes the assumption that charging/discharging batteries do not …
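The abstract's MDP formulation can be illustrated with a deliberately small sketch. This is a hypothetical toy, not the dissertation's actual SAIRP model: the state is the number of charged batteries on hand, the action is how many depleted batteries to recharge, demand is random, and value iteration recovers the optimal expected discounted profit. All capacities, prices, and demand probabilities below are assumed for illustration.

```python
import numpy as np

# Toy SAIRP-style MDP (assumed parameters, not from the dissertation):
# state s = charged batteries on hand, action a = batteries to recharge,
# random swap demand each period, unmet demand is lost.
CAP = 5                       # station capacity (assumed)
DEMAND_P = [0.3, 0.4, 0.3]    # P(demand = 0, 1, 2) (assumed)
PRICE, CHARGE_COST = 4.0, 1.0 # revenue per swap, cost per recharge (assumed)
GAMMA = 0.95                  # discount factor

V = np.zeros(CAP + 1)
for _ in range(500):          # value iteration to near convergence
    V_new = np.empty_like(V)
    for s in range(CAP + 1):
        best = -np.inf
        for a in range(CAP + 1 - s):      # cannot exceed capacity
            q = -CHARGE_COST * a
            for d, p in enumerate(DEMAND_P):
                sold = min(s + a, d)      # swaps actually served
                q += p * (PRICE * sold + GAMMA * V[s + a - sold])
            best = max(best, q)
        V_new[s] = best
    V = V_new
```

After convergence, `V[s]` estimates the expected discounted profit from holding `s` charged batteries; in this toy, more charged inventory is never worth less.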


Improving Asynchronous Advantage Actor Critic With A More Intelligent Exploration Strategy, James B. Holliday May 2018


Graduate Theses and Dissertations

We propose a simple and efficient modification to the Asynchronous Advantage Actor Critic (A3C) algorithm that improves training. In 2016, Google's DeepMind set a new standard for state-of-the-art reinforcement learning performance with the introduction of the A3C algorithm. The goal of this research is to show that A3C can be improved by a novel exploration strategy we call "Follow then Forage Exploration" (FFE). FFE forces an agent to follow the best known path at the beginning of a training episode and then, later in the episode, to "forage" and explore randomly. In …
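The follow-then-forage idea described in the abstract can be sketched as an episode-position-dependent action rule. This is a rough illustration under assumed details (the switch point and the uniform-random forage rule are assumptions, not the thesis's exact mechanism):

```python
import random

def ffe_action(q_values, step, switch_step):
    """Follow-then-forage sketch (assumed form, not the thesis's exact rule):
    act greedily with respect to current value estimates early in the
    episode ("follow"), then explore uniformly at random ("forage")."""
    if step < switch_step:  # "follow" phase: exploit the best known path
        return max(range(len(q_values)), key=q_values.__getitem__)
    return random.randrange(len(q_values))  # "forage" phase: explore

# usage: a 10-step episode that switches to foraging at step 6
q = [0.1, 0.9, 0.3]
actions = [ffe_action(q, t, switch_step=6) for t in range(10)]
```

This contrasts with epsilon-greedy, which mixes exploitation and exploration uniformly across the episode rather than front-loading exploitation.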