Open Access. Powered by Scholars. Published by Universities.®

Operations Research, Systems Engineering and Industrial Engineering Commons

Missouri University of Science and Technology

Computer Sciences

Reinforcement Learning

Articles 1 - 5 of 5

Full-Text Articles in Operations Research, Systems Engineering and Industrial Engineering

A Bounded Actor-Critic Algorithm For Reinforcement Learning, Ryan Jacob Lawhead Jan 2017

Masters Theses

"This thesis presents a new actor-critic algorithm from the domain of reinforcement learning to solve Markov and semi-Markov decision processes (or problems) in the field of airline revenue management (ARM). The ARM problem is one of control optimization in which a decision-maker must accept or reject a customer based on a requested fare. This thesis focuses on the so-called single-leg version of the ARM problem, which can be cast as a semi-Markov decision process (SMDP). Large-scale Markov decision processes (MDPs) and SMDPs suffer from the curses of dimensionality and modeling, making it difficult to create the transition probability matrices (TPMs) …
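The abstract describes an actor-critic scheme for an accept/reject SMDP. A minimal classical sketch of such an update is below; the function names, the Boltzmann actor, the SMART-style average-reward TD error, and all gains are illustrative assumptions, not the thesis's algorithm.

```python
import math

# Hypothetical sketch of an actor-critic accept/reject update for a
# single-leg revenue-management SMDP: state = index into a small state
# space, action in {0: reject, 1: accept}. All names and parameters
# are illustrative placeholders.

def softmax_policy(prefs, state):
    """Boltzmann (softmax) actor over the two actions in a given state."""
    exps = [math.exp(p) for p in prefs[state]]
    total = sum(exps)
    return [e / total for e in exps]

def actor_critic_step(prefs, values, state, action, reward, sojourn,
                      next_state, alpha=0.1, beta=0.01, rho=0.0):
    """One average-reward SMDP update: the TD error charges the average
    reward rate rho for the sojourn time, as in SMART-style algorithms."""
    td_error = reward - rho * sojourn + values[next_state] - values[state]
    values[state] += alpha * td_error          # critic update
    prefs[state][action] += beta * td_error    # actor update
    return td_error
```

A positive TD error after accepting a fare raises both the state's value estimate and the actor's preference for accepting in that state.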


A New Reinforcement Learning Algorithm With Fixed Exploration For Semi-Markov Decision Processes, Angelo Michael Encapera Jan 2017

Masters Theses

"Artificial intelligence or machine learning techniques are currently being widely applied for solving problems within the field of data analytics. This work presents and demonstrates the use of a new machine learning algorithm for solving semi-Markov decision processes (SMDPs). SMDPs are encountered in the domain of Reinforcement Learning to solve control problems in discrete-event systems. The new algorithm developed here is called iSMART, an acronym for imaging Semi-Markov Average Reward Technique. The algorithm uses a constant exploration rate, unlike its precursor R-SMART, which required exploration decay. The major difference between R-SMART and iSMART is that the latter uses, in addition …
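The abstract's key distinction is a constant exploration rate (iSMART) versus the decaying schedule its precursor R-SMART required. A small sketch of that contrast, with an epsilon-greedy selector; the function names and the particular decay schedule are illustrative assumptions.

```python
import random

# Contrast a constant exploration rate (iSMART-style, per the abstract)
# with a decaying schedule (R-SMART-style). The 1/(1 + decay*step)
# schedule is an illustrative choice, not the thesis's.

def constant_epsilon(step, eps=0.1):
    return eps                             # fixed for all time steps

def decaying_epsilon(step, eps0=1.0, decay=0.001):
    return eps0 / (1.0 + decay * step)     # decays toward zero

def epsilon_greedy(q_values, state, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    actions = list(range(len(q_values[state])))
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_values[state][a])
```

With a constant rate the agent keeps probing alternatives indefinitely, which removes the need to tune a decay schedule.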


Quantum Inspired Algorithms For Learning And Control Of Stochastic Systems, Karthikeyan Rajagopal Jan 2015

Doctoral Dissertations

"Motivated by the limitations of the current reinforcement learning and optimal control techniques, this dissertation proposes quantum theory inspired algorithms for learning and control of both single-agent and multi-agent stochastic systems.

A common problem encountered in traditional reinforcement learning techniques is the exploration-exploitation trade-off. To address this issue, an action-selection procedure inspired by Grover's iteration, a quantum search algorithm, is developed. This procedure does not require an explicit design parameter to specify the relative frequency of explorative and exploitative actions.
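The mechanism behind such quantum-inspired selection can be sketched classically: action probabilities are squared amplitudes, and a reinforced action's amplitude is boosted by Grover-style iterations (oracle phase flip plus inversion about the mean). This is an illustrative simulation under those assumptions, not the dissertation's procedure.

```python
# Classical simulation of Grover-style amplitude amplification over a
# discrete action set. Amplitude vectors, the single "good" action, and
# the iteration count are illustrative placeholders.

def grover_amplify(amplitudes, good_action, iterations=1):
    """Boost the amplitude of good_action via Grover iterations."""
    amps = list(amplitudes)
    for _ in range(iterations):
        amps[good_action] = -amps[good_action]   # oracle: phase flip
        mean = sum(amps) / len(amps)
        amps = [2 * mean - a for a in amps]      # inversion about the mean
    return amps

def action_probabilities(amplitudes):
    """Born rule: the probability of each action is its squared amplitude."""
    return [a * a for a in amplitudes]
```

Because amplification reshapes the whole distribution at once, exploration and exploitation are balanced implicitly rather than through a hand-set frequency parameter.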

The second part of this dissertation extends the powerful adaptive critic design methodology to solve finite horizon stochastic optimal …


A Suite Of Robust Controllers For The Manipulation Of Microscale Objects, Qinmin Yang, Jagannathan Sarangapani Feb 2008

Electrical and Computer Engineering Faculty Research & Creative Works

A suite of novel robust controllers is introduced for the pickup operation of microscale objects in a microelectromechanical system (MEMS). In MEMS, adhesive, surface tension, friction, and van der Waals forces are dominant. Moreover, these forces are typically unknown. The proposed robust controller overcomes the unknown contact dynamics and ensures its performance in the presence of actuator constraints by assuming that the upper bounds on these forces are known. On the other hand, for the robust adaptive critic-based neural network (NN) controller, the unknown dynamic forces are estimated online. It consists of an action NN for compensating the unknown system …
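The abstract contrasts two strategies for the unknown micro-forces: a robust law that assumes a known upper bound, and an adaptive law that estimates the force online. A minimal one-dimensional sketch of that contrast; the gains, the sign-function robust term, and the gradient-style adaptation law are illustrative assumptions, not the paper's equations.

```python
# Two control strategies for an unknown disturbance force, sketched in 1-D.
# robust_control: cancel the worst case using a known bound (first strategy).
# adaptive_control: estimate the force online instead (second strategy).
# All gains and the adaptation law are illustrative placeholders.

def robust_control(error, force_bound, k=2.0):
    """Dominate the unknown force with its known upper bound."""
    sign = 1.0 if error > 0 else (-1.0 if error < 0 else 0.0)
    return -k * error - force_bound * sign

def adaptive_control(error, force_estimate, k=2.0, gamma=0.5):
    """Use an online estimate; return the control and the updated estimate."""
    u = -k * error - force_estimate
    force_estimate += gamma * error        # gradient-style adaptation law
    return u, force_estimate
```

The robust law trades conservatism (it always assumes the worst-case force) for simplicity, while the adaptive law refines its estimate as tracking data accumulates.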


Neural Network-Based Output Feedback Controller For Lean Operation Of Spark Ignition Engines, Brian C. Kaul, Jagannathan Sarangapani, J. A. Drallmeier, Jonathan B. Vance, Pingan He Jan 2006

Electrical and Computer Engineering Faculty Research & Creative Works

Spark ignition (SI) engines running at very lean conditions demonstrate significant nonlinear behavior by exhibiting cycle-to-cycle dispersion of heat release, even though such operation can significantly reduce NOx emissions and improve fuel efficiency by as much as 5-10%. A suite of neural network (NN) controllers, with and without reinforcement learning and employing output feedback, has shown the ability to reduce the nonlinear cyclic dispersion observed under lean operating conditions. The neural network controllers consist of three NNs: a) an NN observer to estimate the states of the engine, such as total fuel and air; b) a second NN for generating a virtual input; …
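The output-feedback architecture listed above can be sketched structurally: an observer network reconstructs unmeasured states from the engine output, and a second network maps the estimate to a virtual control input. The tiny linear "networks," their weights, and the data flow here are illustrative placeholders, not the paper's controller.

```python
# Structural sketch of the observer-plus-controller pipeline the abstract
# enumerates. A one-layer linear map stands in for each NN purely for
# illustration; real controllers would use trained multilayer networks.

def nn(weights, inputs):
    """Minimal one-layer linear 'network' used as a placeholder."""
    return sum(w * x for w, x in zip(weights, inputs))

def control_step(output, prev_estimate, w_obs, w_ctrl):
    """One cycle: observe the state from the output, then form the
    virtual input from the estimate and the measured output."""
    state_estimate = nn(w_obs, [output, prev_estimate])   # observer NN
    virtual_input = nn(w_ctrl, [state_estimate, output])  # controller NN
    return state_estimate, virtual_input
```

Only the engine output is measured; everything downstream of the observer works from the estimate, which is what makes this an output-feedback design.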