Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Theses/Dissertations

Reinforcement Learning

Articles 1 - 28 of 28

Full-Text Articles in Engineering

A Study Of Deep Reinforcement Learning In Autonomous Racing Using DeepRacer Car, Mukesh Ghimire May 2021

Honors Theses

Reinforcement learning is thought to be a promising branch of machine learning that has the potential to help us develop an Artificial General Intelligence (AGI) machine. Among the main families of machine learning algorithms (supervised, semi-supervised, unsupervised, and reinforcement learning), reinforcement learning differs in that it explores the environment without prior knowledge and determines the optimal action. This study attempts to understand the concept behind reinforcement learning and the mathematics behind it, and to see it in action by deploying the trained model in Amazon's DeepRacer car. DeepRacer, a 1/18th-scale autonomous car, is the agent which ...


Reinforcement Learning-Based Access Schemes In Cognitive Radio Networks, Ehab Maged Elguindy Jan 2021

Theses and Dissertations

In this thesis, we propose different MAC protocols based on three Reinforcement Learning (RL) approaches, namely Q-Learning, Deep Q-Network (DQN), and Deep Deterministic Policy Gradient (DDPG). We exploit the primary user (PU) feedback, in the form of ARQ and CQI bits, to enhance the performance of the secondary user (SU) MAC protocols. This use of PU feedback can be applied on top of any SU sensing-based MAC protocol. Our proposed model relies on two main pillars, namely, an infinite-state Partially Observable Markov Decision Process (POMDP) to model the system dynamics and a queuing-theoretic model for the PU queue; the ...


Reinforcement Learning Approach For Inspect/Correct Tasks, Hoda Nasereddin Dec 2020

LSU Doctoral Dissertations

In this research, we focus on the application of reinforcement learning (RL) in automated agent tasks involving considerable target variability (i.e., characterized by stochastic distributions); in particular, learning of inspect/correct tasks. Examples include automated identification and correction of rivet failures in airplane maintenance procedures, and automated cleaning of surgical instruments in a hospital sterilization processing department. The location of defects and the corrective action to be taken for each vary from one task episode to the next. What needs to be learned are optimal stochastic strategies rather than optimization of any one single defect type and location. RL has been widely applied in ...


Artificial Intelligence Enabled Distributed Edge Computing For Internet Of Things Applications, Georgios Fragkos Nov 2020

Electrical and Computer Engineering ETDs

Artificial Intelligence (AI) based techniques are typically used to model decision-making in terms of strategies and mechanisms that can lead to optimal payoffs for a number of interacting entities, which often exhibit competitive behaviors. In this thesis, an AI-enabled multi-access edge computing (MEC) framework is proposed, supported by computing-equipped Unmanned Aerial Vehicles (UAVs) to facilitate Internet of Things (IoT) applications. Initially, the problem of determining the IoT nodes' optimal data offloading strategies to the UAV-mounted MEC servers, while accounting for the IoT nodes’ communication and computation overhead, is formulated based on a game-theoretic model. The existence of at least one Pure ...


A Comprehensive And Modular Robotic Control Framework For Model-Less Control Law Development Using Reinforcement Learning For Soft Robotics, Charles Sullivan Jan 2020

Open Access Theses & Dissertations

Soft robotics is a growing field in robotics research. Heavily inspired by biological systems, these robots are made of softer, non-linear materials such as elastomers and are actuated using several novel methods, from fluidic actuation channels to shape-changing materials such as electro-active polymers. Highly non-linear materials make modeling difficult, and sensors are still an area of active research. These issues have often rendered typical control and modeling techniques inadequate for soft robotics. Reinforcement learning is a branch of machine learning that focuses on model-less control by mapping states to actions that maximize a specific reward signal. Reinforcement learning has ...


A Comparative Analysis Of Reinforcement Learning Applied To Task-Space Reaching With A Robotic Manipulator With And Without Gravity Compensation, Jonathan Fugal Jan 2020

Theses and Dissertations--Electrical and Computer Engineering

Advances in computing power in recent years have facilitated developments in autonomous robotic systems. These robotic systems can be used in prosthetic limbs, warehouse packaging and sorting, assembly line production, as well as many other applications. Designing these autonomous systems typically requires robotic system and world models (for classical control-based strategies) or time-consuming and computationally expensive training (for learning-based strategies). Often these requirements are difficult to fulfill. There are ways to combine classical control and learning-based strategies that can mitigate both requirements. One of these ways is to use a gravity-compensated torque control with reinforcement ...


Landing Throttleable Hybrid Rockets With Hierarchical Reinforcement Learning In A Simulated Environment, Francesco Alessandro Stefano Mikulis-Borsoi Jan 2020

Honors Theses and Capstones

In this paper, I develop a hierarchical Markov Decision Process (MDP) structure for completing the task of vertical rocket landing. I start by covering the background of this problem, and formally defining its constraints. In order to reduce mistakes while formulating different MDPs, I define and develop the criteria for a standardized MDP definition format. I then decompose the problem into several sub-problems of vertical landing, namely velocity control and vertical stability control. By exploiting MDP coupling and symmetrical properties, I am able to significantly reduce the size of the state space compared to a unified MDP formulation. This paper ...


Satisfaction-Aware Data Offloading In Surveillance Systems, Marcos Paul Torres Nov 2019

Electrical and Computer Engineering ETDs

This thesis exploits the computing capabilities of Fully Autonomous Aerial Systems (FAAS) and Mobile Edge Computing (MEC) servers to introduce a novel data offloading framework, based on satisfaction games, that supports energy- and time-efficient video processing in surveillance systems. A surveillance system is introduced consisting of Areas of Interest (AoIs), where a MEC server is associated with each AoI, and a FAAS is flying above the AoIs to support the IP cameras' computing demands. Each IP camera adopts a utility function capturing its Quality of Service (QoS) considering the experienced time and energy overhead to offload and process remotely or ...


Artificial Intelligence Empowered UAVs Data Offloading In Mobile Edge Computing, Nicholas Alexander Kemp Nov 2019

Electrical and Computer Engineering ETDs

The advances introduced by Unmanned Aerial Vehicles (UAVs) are manifold and have paved the path for the full integration of UAVs, as intelligent objects, into the Internet of Things (IoT). This paper brings artificial intelligence into the UAVs' data offloading process in a multi-server Mobile Edge Computing (MEC) environment, by adopting principles and concepts from game theory and reinforcement learning. Initially, the autonomous MEC server selection for partial data offloading is performed by the UAVs, based on the theory of stochastic learning automata. A non-cooperative game among the UAVs is then formulated to determine the UAVs' data to be ...


Enhancing The Performance Of Energy Harvesting Wireless Communications Using Optimization And Machine Learning, Ala'eddin A. Masadeh Jan 2019

Graduate Theses and Dissertations

The motivation behind this thesis is to provide efficient solutions for energy harvesting communications. Firstly, an energy harvesting underlay cognitive radio relaying network is investigated. In this context, the secondary network is an energy harvesting network. Closed-form expressions are derived for the transmission powers of the secondary source and relay that maximize the secondary network throughput. Secondly, a practical scenario in terms of information availability about the environment is investigated. We consider a communications system with a source capable of harvesting solar energy. Two cases are considered based on the availability of knowledge about the underlying processes. When this knowledge is available, an ...


Reinforcement Learning And Game Theory For Smart Grid Security, Shuva Paul Jan 2019

Electronic Theses and Dissertations

This dissertation focuses on one of the most critical and complicated challenges facing electric power transmission and distribution systems: their vulnerability to failures and attacks. Large-scale power outages in Australia (2016), Ukraine (2015), India (2013), Nigeria (2018), and the United States (2011, 2003) have demonstrated the vulnerability of power grids to cyber and physical attacks and failures. These incidents clearly indicate the necessity of extensive research efforts to protect the power system from external intrusion and to reduce the damage from post-attack effects. We analyze the vulnerability of smart power grids to cyber and physical attacks and ...


Design And Investigation Of Genetic Algorithmic And Reinforcement Learning Approaches To Wire Crossing Reductions For pNML Devices, Alexander Keith Gunter Jan 2019

Electronic Theses and Dissertations

Perpendicular nanomagnet logic (pNML) is an emerging post-CMOS technology which encodes binary data in the polarization of single-domain nanomagnets and performs operations via fringing field interactions. Currently, there is no complete top-down workflow for pNML. Researchers must instead simultaneously handle place-and-route, timing, and logic minimization by hand. These tasks include multiple NP-Hard subproblems, and the lack of automated tools for solving them for pNML precludes the design of large-scale pNML circuits.


Reinforcement Learning In Robotic Task Domains With Deictic Descriptor Representation, Harry Paul Moore Oct 2018

LSU Doctoral Dissertations

In the field of reinforcement learning, robot task learning in a specific environment with a Markov decision process backdrop has seen much success. But, extending these results to learning a task for an environment domain has not been as fruitful, even for advanced methodologies such as relational reinforcement learning. In our research into robot learning in environment domains, we utilize a form of deictic representation for the robot’s description of the task environment. However, the non-Markovian nature of the deictic representation leads to perceptual aliasing and conflicting actions, invalidating standard reinforcement learning algorithms. To circumvent this difficulty, several past ...


Mastering The Game Of Gomoku Without Human Knowledge, Yuan Wang Jun 2018

Master's Theses

Gomoku, also called Five in a Row, is one of the earliest checkerboard games invented by humans. For a long time, it has brought countless pleasures to us. We humans, as players, have also created many skills for playing it. Scientists normalize and enter these skills into the computer so that the computer knows how to play Gomoku. However, the computer just plays by following the pre-entered skills; it doesn’t know how to develop these skills by itself. Inspired by Google’s AlphaGo Zero, in this thesis, by combining the technologies of Monte Carlo Tree Search, Deep Neural Networks ...


Adaptive Dynamic Programming With Eligibility Traces And Complexity Reduction Of High-Dimensional Systems, Seaar Jawad Kadhim Al-Dabooni Jan 2018

Doctoral Dissertations

This dissertation investigates the application of a variety of computational intelligence techniques, particularly clustering and adaptive dynamic programming (ADP) designs, especially heuristic dynamic programming (HDP) and dual heuristic programming (DHP). Moreover, a one-step temporal-difference (TD(0)) and n-step TD (TD(λ)) with their gradients are utilized as learning algorithms to train and online-adapt the families of ADP. The dissertation is organized into seven papers. The first paper demonstrates the robustness of model order reduction (MOR) for simulating complex dynamical systems. Agglomerative hierarchical clustering based on performance evaluation is introduced for MOR. This method computes the reduced order denominator of ...


Adaptive Interventions Treatment Modelling And Regimen Optimization Using Sequential Multiple Assignment Randomized Trials (SMART) And Q-Learning, Abiral Baniya Jan 2018

Electronic Theses and Dissertations

Current pharmacological practice focuses on a single best treatment for a disease, which is impractical because the same treatment may not work the same way for every patient. Thus, there is a need to shift towards a patient-centric rather than disease-centric approach, in which personal characteristics of a patient or biomarkers are used to determine the tailored optimal treatment. The research area of personalized medicine contradicts the “one size fits all” concept. The Sequential Multiple Assignment Randomized Trial (SMART) is a multi-stage trial design used to inform the development of dynamic treatment regimens (DTRs). In SMART, a subject ...


Online Learning With Bandits For Coverage, Mahmuda Rahman Dec 2017

Dissertations - ALL

With the rapid growth in velocity and volume, streaming data compels decision support systems to predict a small number of unique data points in due time that can represent a massive amount of correlated data without much loss of precision. In this work, we formulate this problem as the online set coverage problem and propose its solution for recommendation systems and the patrol assignment problem.

We propose a novel online reinforcement learning algorithm inspired by the Multi-Armed Bandit problem to solve the online recommendation system problem. We introduce a graph-based mechanism to improve the user coverage by recommended items ...


Risk-Aware Navigation For UAV Digital Data Collection, Zhi Xing Aug 2017

Dissertations - ALL

This thesis studies the navigation task for autonomous UAVs to collect digital data in a risky environment. Three problem formulations are proposed according to different real-world situations. First, we focus on uniform probabilistic risk and assume the UAV has an unlimited amount of energy. With these assumptions, we provide the graph-based Data-collecting Robot Problem (DRP) model, and propose heuristic planning solutions that consist of a clustering step and a tour building step. Experiments show our methods provide high-quality solutions with high expected reward. Second, we investigate non-uniform probabilistic risk and the limited energy capacity of the UAV. We present the Data-collection Problem (DCP) to ...


Multi-Scale Spatial Cognition Models And Bio-Inspired Robot Navigation, Martin I. Llofriu Alonso Jun 2017

Graduate Theses and Dissertations

The rodent navigation system has been the focus of study for over a century. Recent discoveries have provided insight into the inner workings of this system. Since then, computational approaches have been used to test hypotheses, as well as to improve robot navigation and learning by taking inspiration from the rodent navigation system.

This dissertation focuses on the study of the multi-scale representation of the rat’s current location found in the rat hippocampus. It first introduces a model that uses these different scales in the Morris maze task to show their advantages. The generalization power of larger scales ...


A New Reinforcement Learning Algorithm With Fixed Exploration For Semi-Markov Decision Processes, Angelo Michael Encapera Jan 2017

Masters Theses

Artificial intelligence or machine learning techniques are currently being widely applied for solving problems within the field of data analytics. This work presents and demonstrates the use of a new machine learning algorithm for solving semi-Markov decision processes (SMDPs). SMDPs are encountered in the domain of Reinforcement Learning to solve control problems in discrete-event systems. The new algorithm developed here is called iSMART, an acronym for imaging Semi-Markov Average Reward Technique. The algorithm uses a constant exploration rate, unlike its precursor R-SMART, which required exploration decay. The major difference between R-SMART and iSMART is that the latter uses, in addition ...


A Bounded Actor-Critic Algorithm For Reinforcement Learning, Ryan Jacob Lawhead Jan 2017

Masters Theses

This thesis presents a new actor-critic algorithm from the domain of reinforcement learning to solve Markov and semi-Markov decision processes (or problems) in the field of airline revenue management (ARM). The ARM problem is one of control optimization in which a decision-maker must accept or reject a customer based on a requested fare. This thesis focuses on the so-called single-leg version of the ARM problem, which can be cast as a semi-Markov decision process (SMDP). Large-scale Markov decision processes (MDPs) and SMDPs suffer from the curses of dimensionality and modeling, making it difficult to create the transition probability matrices (TPMs ...


Neuron Clustering For Mitigating Catastrophic Forgetting In Supervised And Reinforcement Learning, Benjamin Frederick Goodrich Dec 2015

Doctoral Dissertations

Neural networks have had many great successes in recent years, particularly with the advent of deep learning and many novel training techniques. One issue that has affected neural networks and prevented them from performing well in more realistic online environments is that of catastrophic forgetting. Catastrophic forgetting affects supervised learning systems when input samples are temporally correlated or are non-stationary. However, most real-world problems are non-stationary in nature, resulting in prolonged periods of time separating inputs drawn from different regions of the input space.

Reinforcement learning represents a worst-case scenario when it comes to precipitating catastrophic forgetting in neural networks ...


On The Selection Of Just-In-Time Interventions, Luis Gabriel Jaimes Mar 2015

Graduate Theses and Dissertations

A deeper understanding of human physiology, combined with improvements in sensing technologies, is fulfilling the vision of affective computing, where applications monitor and react to changes in affect. Further, the proliferation of commodity mobile devices is extending these applications into the natural environment, where they become a pervasive part of our daily lives. This work examines one such pervasive affective computing application with significant implications for long-term health and quality of life: adaptive just-in-time interventions (AJITIs). We discuss fundamental components needed to design AJITIs based on one kind of affective data, namely stress. Chronic stress has significant long-term behavioral and ...


Quantum Inspired Algorithms For Learning And Control Of Stochastic Systems, Karthikeyan Rajagopal Jan 2015

Doctoral Dissertations

Motivated by the limitations of the current reinforcement learning and optimal control techniques, this dissertation proposes quantum theory inspired algorithms for learning and control of both single-agent and multi-agent stochastic systems.

A common problem encountered in traditional reinforcement learning techniques is the exploration-exploitation trade-off. To address this issue, an action selection procedure inspired by a quantum search algorithm called Grover's iteration is developed. This procedure does not require an explicit design parameter to specify the relative frequency of explorative/exploitative actions.

The second part of this dissertation extends the powerful adaptive critic design methodology to solve finite horizon ...


Understanding The Electricity-Water-Climate Change Nexus Using A Stochastic Optimization Approach, Ivan Saavedra Antolinez May 2014

Theses and Dissertations

Climate change has been shown to cause droughts (among other catastrophic weather events) and it is shown to be exacerbated by the increasing levels of greenhouse gas emissions on our planet. In May 2013, CO2 daily average concentration over the Pacific Ocean at Mauna Loa Observatory reached a dangerous milestone of 400 ppm, which has not been experienced in thousands of years in the earth's climate. These levels were attributed to the ever-increasing human activity over the last 5-6 decades. Electric power generators are documented by the U.S. Department of Energy to be the largest users of ...


Policy Based Reinforcement Learning Approach Of Jobshop Scheduling With High Level Deadlock Detection, Mengmeng Chen Jan 2014

Graduate Theses and Dissertations

We present a policy-based reinforcement learning scheduling algorithm with high-level deadlock detection for job-shop discrete manufacturing systems without buffers. Deadlock is a highly undesirable phenomenon resulting from resource sharing and competition. Hence, we first propose detection algorithms for second- and third-level deadlocks. Subsequently, based on these high-level deadlock detection algorithms, a new policy-based reinforcement learning scheduling algorithm is developed in the context of buffer-less job-shop systems. Applying our reinforcement learning scheduling algorithm to a set of 40 widely used buffer-less job-shop benchmarks yields satisfactory makespans, which, to our knowledge ...


A Comparison Of The Performance Of Neural Q-Learning And Soar-RL On A Derivative Of The Block Design (BD)/Block Design Multiple Choice (BDMC) Subtests On The WISC-IV Intelligence Test, Charreau Bell Dec 2011

All Theses

Teaching an autonomous agent to perform tasks that are simple to humans can be complex, especially when the task requires successive steps, has a low likelihood of successful completion with a brute force approach, and when the solution space is too large or too complex to be explicitly encoded. Reinforcement learning algorithms are particularly suited to such situations, and are based on rewards that help the agent to find the optimal action to execute given a certain state. The task investigated in this thesis is a modified form of the Block Design (BD) and Block Design Multiple Choice (BDMC) subtests ...


Revenue Management For Make-To-Order And Make-To-Stock Systems, Jiao Wang May 2011

Doctoral Dissertations

With the success of Revenue Management (RM) techniques over the past three decades in various segments of the service industry, many manufacturing firms have started exploring innovative RM technologies to improve their profits. This dissertation studies RM for make-to-order (MTO) and make-to-stock (MTS) systems.

We start with a problem faced by an MTO firm that has the ability to reject or accept orders and to set prices and lead-times to influence demand. The firm must decide which orders to accept or reject and trade off price, lead-time, and the potential for increased demand against capacity constraints ...