Computer Engineering | Open Access Articles | Digital Commons Network™

Reinforcement Learning In Robotic Task Domains With Deictic Descriptor Representation, Harry Paul Moore

LSU Doctoral Dissertations

In the field of reinforcement learning, robot task learning in a specific environment with a Markov decision process backdrop has seen much success. Extending these results to learning a task for an environment domain, however, has not been as fruitful, even for advanced methodologies such as relational reinforcement learning. In our research into robot learning in environment domains, we utilize a form of deictic representation for the robot’s description of the task environment. The non-Markovian nature of the deictic representation, though, leads to perceptual aliasing and conflicting actions, invalidating standard reinforcement learning algorithms. To circumvent this difficulty, several past research …

Mastering The Game Of Gomoku Without Human Knowledge, Yuan Wang

Master's Theses

Gomoku, also called Five in a Row, is one of the earliest checkerboard games invented by humans. It has long brought players great pleasure, and human players have developed many skills for playing it. Scientists normalize these skills and enter them into the computer so that the computer knows how to play Gomoku. However, the computer merely follows the pre-entered skills; it does not know how to develop them by itself. Inspired by Google’s AlphaGo Zero, in this thesis, by combining the technologies of Monte Carlo Tree Search, Deep Neural Networks, and Reinforcement …

Adaptive Dynamic Programming With Eligibility Traces And Complexity Reduction Of High-Dimensional Systems, Seaar Jawad Kadhim Al-Dabooni

Doctoral Dissertations

"This dissertation investigates the application of a variety of computational intelligence techniques, particularly clustering and adaptive dynamic programming (ADP) designs especially heuristic dynamic programming (HDP) and dual heuristic programming (DHP). Moreover, a one-step temporal-difference (TD(0)) and n-step TD (TD(λ)) with their gradients are utilized as learning algorithms to train and online-adapt the families of ADP. The dissertation is organized into seven papers. The first paper demonstrates the robustness of model order reduction (MOR) for simulating complex dynamical systems. Agglomerative hierarchical clustering based on performance evaluation is introduced for MOR. This method computes the reduced order denominator of the transfer …
