Entire DC Network | Open Access Articles | Digital Commons Network™

Team Air Combat Using Model-Based Reinforcement Learning, David A. Mottice

Theses and Dissertations

We formulate the first generalized air combat maneuvering problem (ACMP), called the MvN ACMP, wherein M friendly autonomous unmanned combat aerial vehicles (AUCAVs) engage N enemy AUCAVs, and develop a Markov decision process (MDP) model to control the team of M Blue AUCAVs. The MDP model leverages a 5-degree-of-freedom aircraft state transition model and formulates a directed energy weapon capability. A model-based reinforcement learning approach is adopted, wherein an approximate policy iteration algorithmic strategy is implemented to attain high-quality approximate policies relative to a high-performing benchmark policy. The approximate dynamic programming (ADP) algorithm uses a multi-layer neural network as the regression mechanism for value function approximation. One-versus-one …

Monte Carlo Tree Search Applied To A Modified Pursuit/Evasion Scotland Yard Game With Rendezvous Spaceflight Operation Applications, Joshua A. Daughtery

Theses and Dissertations

This thesis takes the Scotland Yard board game and modifies its rules to mimic important aspects of space in order to facilitate the creation of artificial intelligence for space asset pursuit/evasion scenarios. Space has become a physical warfighting domain. To combat threats, an understanding of the tactics, techniques, and procedures must be captured and studied. Games and simulations are effective tools to capture data lacking historical context. Artificial intelligence and machine learning models can use simulations to develop proper defensive and offensive tactics, techniques, and procedures capable of protecting systems against potential threats. Monte Carlo Tree Search is a bandit-based …

Fuzzy State Aggregation And Off-Policy Reinforcement Learning For Stochastic Environments, Dean C. Wardell, Gilbert L. Peterson

Faculty Publications

Reinforcement learning is one of the more attractive machine learning technologies, due to its unsupervised learning structure and ability to continually learn even as the environment it is operating in changes. This ability to learn in an unsupervised manner in a changing environment is applicable in complex domains through the use of function approximation of the domain’s policy. The function approximation presented here is that of fuzzy state aggregation. This article presents the use of fuzzy state aggregation with the current policy hill climbing methods of Win or Lose Fast (WoLF) and policy-dynamics based WoLF (PD-WoLF), exceeding the learning rate …
