Open Access. Powered by Scholars. Published by Universities.®

Operations Research, Systems Engineering and Industrial Engineering Commons

Reinforcement Learning

Articles 1 - 13 of 13

Full-Text Articles in Operations Research, Systems Engineering and Industrial Engineering

Using Reinforcement Learning To Improve Network Reliability Through Optimal Resource Allocation, Henley Wells Dec 2022

Graduate Theses and Dissertations

Networks provide a variety of critical services to society (e.g., power grid, telecommunication, water, transportation) but are prone to disruption. With this motivation, we study a sequential decision problem in which an initial network is improved over time (e.g., by adding or increasing the reliability of edges) and rewards are gained over time as a function of the network’s all-terminal reliability. The actions during each time period are limited due to availability of resources such as time, money, or labor. To solve this problem, we utilized a Deep Reinforcement Learning (DRL) approach implemented within OpenAI-Gym using Stable Baselines. A Proximal …


Developing Novel Optimization And Machine Learning Frameworks To Improve And Assess The Safety Of Workplaces, Amin Aghalari Aug 2022

Theses and Dissertations

This study proposes several decision-making tools utilizing optimization and machine learning frameworks to assess and improve the safety of workplaces. The first chapter of this study presents a novel mathematical model to optimally locate a set of detectors to minimize the expected number of casualties in a given threat area. The problem is formulated as a nonlinear binary integer programming model and then solved with a linearized branch-and-bound algorithm. Several sensitivity analyses illustrate the model's robustness and draw key managerial insights. One of the prevailing threats in the last decades, Active Shooting (AS) violence, poses a serious threat to …


Hierarchical Value Decomposition For Effective On-Demand Ride Pooling, Hao Jiang, Pradeep Varakantham May 2022

Research Collection School Of Computing and Information Systems

On-demand ride-pooling (e.g., UberPool, GrabShare) services focus on serving multiple different customer requests using each vehicle, i.e., an empty or partially filled vehicle can be assigned requests from different passengers with different origins and destinations. On the other hand, in Taxi on Demand (ToD) services (e.g., UberX), one vehicle is assigned to only one request at a time. On-demand ride pooling is not only beneficial to customers (lower cost), drivers (higher revenue per trip) and aggregation companies (higher revenue), but is also of crucial importance to the environment as it reduces the number of vehicles required on the roads. Since …


Decision-Analytic Models Using Reinforcement Learning To Inform Dynamic Sequential Decisions In Public Policy, Seyedeh Nazanin Khatami Mar 2022

Doctoral Dissertations

We developed decision-analytic models specifically suited for long-term sequential decision-making in the context of large-scale dynamic stochastic systems, focusing on public policy investment decisions. We found that while machine learning and artificial intelligence algorithms provide the most suitable frameworks for such analyses, multiple challenges arise in their successful adaptation. We address three specific challenges in two public sectors, public health and climate policy, through the following three essays. In Essay I, we developed a reinforcement learning (RL) model to identify the optimal sequence of testing and retention-in-care interventions to inform the national strategic plan “Ending the HIV Epidemic in the US”. …


Scheduling Allocation And Inventory Replenishment Problems Under Uncertainty: Applications In Managing Electric Vehicle And Drone Battery Swap Stations, Amin Asadi Jan 2021

Graduate Theses and Dissertations

In this dissertation, motivated by electric vehicle (EV) and drone application growth, we propose novel optimization problems and solution techniques for managing the operations at EV and drone battery swap stations. In Chapter 2, we introduce a novel class of stochastic scheduling allocation and inventory replenishment problems (SAIRP), which determines the recharging, discharging, and replacement decisions at a swap station over time to maximize the expected total profit. We use a Markov Decision Process (MDP) to model SAIRPs facing uncertain demands, varying costs, and battery degradation. Considering battery degradation is crucial as it relaxes the assumption that charging/discharging batteries do not …


Sky Surveys Scheduling Using Reinforcement Learning, Andres Felipe Alba Hernandez Jan 2019

Graduate Research Theses & Dissertations

Modern cosmic sky surveys (e.g., CMB S4, DES, LSST) collect a complex diversity of astronomical objects. Each class of objects presents different requirements for observation time and sensitivity. For determining the best sequence of exposures for mapping the sky systematically, conventional scheduling methods do not optimize the use of survey time and resources. Dynamic sky survey scheduling is an NP-hard problem that has therefore been treated primarily with heuristic methods. We present an alternative scheduling method based on reinforcement learning (RL) that aims to optimize the use of telescope resources for scheduling sky surveys.

We present an exploration of …


A Bounded Actor-Critic Algorithm For Reinforcement Learning, Ryan Jacob Lawhead Jan 2017

Masters Theses

"This thesis presents a new actor-critic algorithm from the domain of reinforcement learning to solve Markov and semi-Markov decision processes (or problems) in the field of airline revenue management (ARM). The ARM problem is one of control optimization in which a decision-maker must accept or reject a customer based on a requested fare. This thesis focuses on the so-called single-leg version of the ARM problem, which can be cast as a semi-Markov decision process (SMDP). Large-scale Markov decision processes (MDPs) and SMDPs suffer from the curses of dimensionality and modeling, making it difficult to create the transition probability matrices (TPMs) …


A New Reinforcement Learning Algorithm With Fixed Exploration For Semi-Markov Decision Processes, Angelo Michael Encapera Jan 2017

Masters Theses

"Artificial intelligence or machine learning techniques are currently being widely applied for solving problems within the field of data analytics. This work presents and demonstrates the use of a new machine learning algorithm for solving semi-Markov decision processes (SMDPs). SMDPs are encountered in the domain of Reinforcement Learning to solve control problems in discrete-event systems. The new algorithm developed here is called iSMART, an acronym for imaging Semi-Markov Average Reward Technique. The algorithm uses a constant exploration rate, unlike its precursor R-SMART, which required exploration decay. The major difference between R-SMART and iSMART is that the latter uses, in addition …


Quantum Inspired Algorithms For Learning And Control Of Stochastic Systems, Karthikeyan Rajagopal Jan 2015

Doctoral Dissertations

"Motivated by the limitations of the current reinforcement learning and optimal control techniques, this dissertation proposes quantum theory inspired algorithms for learning and control of both single-agent and multi-agent stochastic systems.

A common problem encountered in traditional reinforcement learning techniques is the exploration-exploitation trade-off. To address this issue, an action selection procedure inspired by a quantum search algorithm called Grover's iteration is developed. This procedure does not require an explicit design parameter to specify the relative frequency of explorative/exploitative actions.

The second part of this dissertation extends the powerful adaptive critic design methodology to solve finite horizon stochastic optimal …


Understanding The Electricity-Water-Climate Change Nexus Using A Stochastic Optimization Approach, Ivan Saavedra Antolinez May 2014

Theses and Dissertations

Climate change has been shown to cause droughts (among other catastrophic weather events) and it is shown to be exacerbated by the increasing levels of greenhouse gas emissions on our planet. In May 2013, CO2 daily average concentration over the Pacific Ocean at Mauna Loa Observatory reached a dangerous milestone of 400 ppm, which has not been experienced in thousands of years in the earth's climate. These levels were attributed to the ever-increasing human activity over the last 5-6 decades. Electric power generators are documented by the U.S. Department of Energy to be the largest users of ground and …


Revenue Management For Make-To-Order And Make-To-Stock Systems, Jiao Wang May 2011

Doctoral Dissertations

With the success of Revenue Management (RM) techniques over the past three decades in various segments of the service industry, many manufacturing firms have started exploring innovative RM technologies to improve their profits. This dissertation studies RM for make-to-order (MTO) and make-to-stock (MTS) systems.

We start with a problem faced by an MTO firm that has the ability to reject or accept orders and to set prices and lead-times to influence demand. The firm must decide which orders to accept or reject and trade off the price, lead-time and potential for increased demand against capacity constraints, …


A Suite Of Robust Controllers For The Manipulation Of Microscale Objects, Qinmin Yang, Jagannathan Sarangapani Feb 2008

Electrical and Computer Engineering Faculty Research & Creative Works

A suite of novel robust controllers is introduced for the pickup operation of microscale objects in a microelectromechanical system (MEMS). In MEMS, adhesive, surface tension, friction, and van der Waals forces are dominant. Moreover, these forces are typically unknown. The proposed robust controller overcomes the unknown contact dynamics and ensures its performance in the presence of actuator constraints by assuming that the upper bounds on these forces are known. On the other hand, for the robust adaptive critic-based neural network (NN) controller, the unknown dynamic forces are estimated online. It consists of an action NN for compensating the unknown system …


Neural Network-Based Output Feedback Controller For Lean Operation Of Spark Ignition Engines, Brian C. Kaul, Jagannathan Sarangapani, J. A. Drallmeier, Jonathan B. Vance, Pingan He Jan 2006

Electrical and Computer Engineering Faculty Research & Creative Works

Spark ignition (SI) engines running at very lean conditions demonstrate significant nonlinear behavior by exhibiting cycle-to-cycle dispersion of heat release, even though such operation can significantly reduce NOx emissions and improve fuel efficiency by as much as 5-10%. A suite of neural network (NN) controllers, with and without reinforcement learning and employing output feedback, has shown the ability to reduce the nonlinear cyclic dispersion observed under lean operating conditions. The neural network controllers consist of three NNs: a) an NN observer to estimate the states of the engine such as total fuel and air; b) a second NN for generating virtual input; …