Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 151 - 180 of 191
Full-Text Articles in Physical Sciences and Mathematics
Creating Autonomous Adaptive Agents In A Real-Time First-Person Shooter Computer Game, Di Wang, Ah-Hwee Tan
Research Collection School Of Computing and Information Systems
Games are good test-beds to evaluate AI methodologies. In recent years, there has been a vast amount of research dealing with real-time computer games other than the traditional board games or card games. This paper illustrates how we create agents by employing FALCON, a self-organizing neural network that performs reinforcement learning, to play a well-known first-person shooter computer game called Unreal Tournament. Rewards used for learning are either obtained from the game environment or estimated using the temporal difference learning scheme. In this way, the agents are able to acquire proper strategies and discover the effectiveness of different weapons without …
Memory-Guided Exploration In Reinforcement Learning, James L. Carroll, Todd Peterson
Journal of Undergraduate Research
Traditional reinforcement learning techniques learn a single task by giving the agent positive and negative rewards. In one type of reinforcement learning, called Q-learning, the agent stores Q-values, which are the expected reward for performing an action in a given state. Task transfer is a method of transferring information learned in one task to another related task. Most work in transfer has focused on classification techniques. The purpose of our research has been to extend classification techniques to reinforcement learning.
Reinforcement Learning Task Clustering, James Carroll, Todd Peterson
Journal of Undergraduate Research
Reinforcement learning is a process whereby actions are acquired using reinforcement signals. A signal is given to an autonomous agent indicating how well that agent is performing an action. The agent then attempts to maximize this reinforcement signal. One common method in reinforcement learning is Q-learning, where the agent attempts to learn Q(s,a), the expected temporally discounted value of performing action a in state s. This function is updated according to:
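The update rule cut off above is presumably the standard tabular Q-learning rule, Q(s,a) <- Q(s,a) + alpha * (r + gamma * max over a' of Q(s',a') - Q(s,a)). A minimal sketch of one update step (the function and state names are illustrative, not from the article):

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]

Q = defaultdict(float)          # unvisited (state, action) pairs default to 0
actions = ["left", "right"]
q_update(Q, "s0", "right", 1.0, "s1", actions)  # Q[("s0","right")] becomes 0.1
```

The learning rate alpha controls how far each sample moves the estimate, and gamma is the temporal discount the abstract refers to.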
Complementary Layered Learning, Sean Mondesire
Electronic Theses and Dissertations
Layered learning is a machine learning paradigm used to develop autonomous robotic-based agents by decomposing a complex task into simpler subtasks that are learned sequentially. Although the paradigm continues to have success in multiple domains, performance can be unexpectedly unsatisfactory. Using Boolean-logic problems and autonomous agent navigation, we show poor performance is due to the learner forgetting how to perform earlier learned subtasks too quickly (favoring plasticity) or having difficulty learning new things (favoring stability). We demonstrate that this imbalance can hinder learning so that task performance is no better than that of a suboptimal learning technique, monolithic learning, which …
Adaptive Computer‐Generated Forces For Simulator‐Based Training, Expert Systems With Applications, Teck-Hou Teng, Ah-Hwee Tan, Loo-Nin Teow
Research Collection School Of Computing and Information Systems
Simulator-based training is in constant pursuit of an increasing level of realism. The transition from doctrine-driven computer-generated forces (CGF) to adaptive CGF represents one such effort. The use of doctrine-driven CGF is fraught with challenges such as modeling of complex expert knowledge and adapting to the trainees’ progress in real time. Therefore, this paper reports on how the use of adaptive CGF can overcome these challenges. Using a self-organizing neural network to implement the adaptive CGF, air combat maneuvering strategies are learned incrementally and generalized in real time. The state space and action space are extracted from the same hierarchical doctrine …
Reinforcement Learning With Motivations For Realistic Agents, Jacquelyne T. Forgette
Electronic Thesis and Dissertation Repository
Believable virtual humans have important applications in various fields, including computer based video games. The challenge in programming video games is to produce a non-player controlled character that is autonomous, and capable of action selections that appear human. In this thesis, motivations are used as a basis for learning using reinforcements. With motives driving the decisions of the agents, their actions will appear less structured and repetitious, and more human in nature. This will also allow developers to easily create game agents with specific motivations, based mostly on their narrative purposes. With minimum and maximum desirable motive values, the agents …
Actor-Critic-Based Ink Drop Spread As An Intelligent Controller, Hesam Sagha, Iman Esmaili Paeen Afrakoti, Saeed Bagherishouraki
Turkish Journal of Electrical Engineering and Computer Sciences
This paper introduces an innovative adaptive controller based on the actor-critic method. The proposed approach employs the ink drop spread (IDS) method as its main engine. The IDS method is a new trend in soft-computing approaches that is a universal fuzzy modeling technique and has also been used as a supervised controller. Its process is very similar to the processing system of the human brain. The proposed actor-critic method uses an IDS structure as an actor and a 2-dimensional plane, representing control variable states, as a critic that estimates the lifetime goodness of each state. This method is fast, simple, …
Self-Regulating Action Exploration In Reinforcement Learning, Teck-Hou Teng, Ah-Hwee Tan
Research Collection School Of Computing and Information Systems
The basic tenet of a learning process is for an agent to learn for only as much and as long as it is necessary. With reinforcement learning, the learning process is divided between exploration and exploitation. Given the complexity of the problem domain and the randomness of the learning process, the exact duration of the reinforcement learning process can never be known with certainty. Using an inaccurate number of training iterations leads either to the non-convergence or the over-training of the learning agent. This work addresses such issues by proposing a technique to self-regulate the exploration rate and training duration …
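The excerpt does not give the self-regulation mechanism itself; for contrast, the baseline such work improves on is a hand-tuned exploration schedule, e.g. epsilon-greedy action selection with a fixed decay. An illustrative sketch (not the paper's method):

```python
import random

def epsilon_greedy(q_values, epsilon):
    """Explore with probability epsilon, otherwise exploit the greedy action."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# A fixed, hand-tuned decay schedule: exactly the kind of guesswork
# (how fast to decay, how many iterations to train) that the paper's
# self-regulating approach aims to remove.
epsilon, decay, floor = 1.0, 0.99, 0.05
for episode in range(200):
    epsilon = max(floor, epsilon * decay)   # epsilon ends near 0.134
```

Decaying too fast risks non-convergence; decaying too slowly wastes training on exploration, which is the over-/under-training tension the abstract describes.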
Efficient Reinforcement Learning In Multiple-Agent Systems And Its Application In Cognitive Radio Networks, Jing Zhang
Dissertations
The objective of reinforcement learning in multiple-agent systems is to find an efficient learning method for the agents to behave optimally. Finding Nash equilibrium has become the common learning target for optimality. However, finding Nash equilibrium is a PPAD (Polynomial Parity Arguments on Directed graphs)-complete problem. The conventional methods can find Nash equilibrium for some special types of Markov games.
This dissertation proposes a new reinforcement learning algorithm to improve the search efficiency and effectiveness for multiple-agent systems. This algorithm is based on the definition of Nash equilibrium and utilizes the greedy and rational features of the agents. When …
Motivated Learning For The Development Of Autonomous Agents, Janusz A. Starzyk, James T. Graham, Pawel Raif, Ah-Hwee Tan
Research Collection School Of Computing and Information Systems
A new machine learning approach known as motivated learning (ML) is presented in this work. Motivated learning drives a machine to develop abstract motivations and choose its own goals. ML also provides a self-organizing system that controls a machine’s behavior based on competition between dynamically-changing pain signals. This provides an interplay of externally driven and internally generated control signals. It is demonstrated that ML not only yields a more sophisticated learning mechanism and system of values than reinforcement learning (RL), but is also more efficient in learning complex relations and delivers better performance than RL in dynamically changing environments. In …
Self‐Regulating Action Exploration In Reinforcement Learning, Teck-Hou Teng, Ah-Hwee Tan, Yuan-Sin Tan
Research Collection School Of Computing and Information Systems
The basic tenet of a learning process is for an agent to learn for only as much and as long as it is necessary. With reinforcement learning, the learning process is divided between exploration and exploitation. Given the complexity of the problem domain and the randomness of the learning process, the exact duration of the reinforcement learning process can never be known with certainty. Using an inaccurate number of training iterations leads either to the non-convergence or the over-training of the learning agent. This work addresses such issues by proposing a technique to self-regulate the exploration rate and training duration …
Cooperative Reinforcement Learning In Topology-Based Multi-Agent Systems, Dan Xiao, Ah-Hwee Tan
Research Collection School Of Computing and Information Systems
Topology-based multi-agent systems (TMAS), wherein agents interact with one another according to their spatial relationship in a network, are well suited for problems with topological constraints. In a TMAS, however, each agent may have a different state space, which can be rather large. Consequently, traditional approaches to multi-agent cooperative learning may not be able to scale up with the complexity of the network topology. In this paper, we propose a cooperative learning strategy, under which autonomous agents are assembled in a binary tree formation (BTF). By constraining the interaction between agents, we effectively unify the state space of individual …
A Hybrid Agent Architecture Integrating Desire, Intention And Reinforcement Learning, Ah-Hwee Tan, Yew-Soon Ong, Akejariyawong Tapanuj
Research Collection School Of Computing and Information Systems
This paper presents a hybrid agent architecture that integrates the behaviours of BDI agents, specifically desire and intention, with a neural network based reinforcement learner known as Temporal Difference Fusion Architecture for Learning and COgNition (TD-FALCON). With the explicit maintenance of goals, the agent performs reinforcement learning with the awareness of its objectives instead of relying on external reinforcement signals. More importantly, the intention module equips the hybrid architecture with deliberative planning capabilities, enabling the agent to purposefully maintain an agenda of actions to perform and reducing the need to constantly sense the environment. Through reinforcement learning, plans can also be …
Higher-Level Application Of Adaptive Dynamic Programming/Reinforcement Learning – A Next Phase For Controls And System Identification?, George G. Lendaris
Systems Science Friday Noon Seminar Series
Humans have the ability to make use of experience while performing system identification and selecting control actions for changing situations. In contrast to current technological implementations, which slow down as more knowledge is stored, human processing speeds up and becomes more effective as more experience is gained. An emerging experience-based (“higher level”) approach promises to endow our technology with enhanced efficiency and effectiveness.
The notions of context and context discernment are important to understanding this human ability. These are defined as appropriate to controls and system-identification. Some general background on controls, Dynamic Programming, and Adaptive Critic leading to Adaptive Dynamic …
Reinforcement Learning Of Competitive And Cooperative Skills In Soccer Agents, Jinsong Leng, Chee Lim
Research outputs 2011
The main aim of this paper is to provide a comprehensive numerical analysis on the efficiency of various reinforcement learning (RL) techniques in an agent-based soccer game. SoccerBots is employed as a simulation testbed to analyze the effectiveness of RL techniques under various scenarios. A hybrid agent teaming framework for investigating agent team architecture, learning abilities, and other specific behaviours is presented. Novel RL algorithms to verify the competitive and cooperative learning abilities of goal-oriented agents for decision-making are developed. In particular, the tile coding (TC) technique, a function approximation approach, is used to prevent the state space from growing exponentially, hence avoiding …
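Tile coding, the function approximation approach mentioned above, hashes a continuous state into several overlapping tilings that are offset from one another, so nearby states share most of their active features. A minimal 1-D sketch (the parameters are illustrative, not the SoccerBots configuration):

```python
def tile_indices(x, n_tilings=4, n_tiles=8, lo=0.0, hi=1.0):
    """Return the active feature index in each offset tiling for state x."""
    width = (hi - lo) / n_tiles
    active = []
    for t in range(n_tilings):
        offset = t * width / n_tilings          # shift each tiling slightly
        idx = int((x - lo + offset) / width)
        idx = min(idx, n_tiles)                 # clamp the top edge
        active.append(t * (n_tiles + 1) + idx)  # unique index per tiling
    return active

tile_indices(0.50)   # [4, 13, 22, 31]
tile_indices(0.51)   # same tiles: nearby states generalize to each other
```

A linear value estimate is then just the sum of one weight per active index, so the parameter count grows with the tiling resolution rather than exponentially with the raw state description.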
An Exploration Of Multi-Agent Learning Within The Game Of Sheephead, Brady Brau
All Graduate Theses, Dissertations, and Other Capstone Projects
In this paper, we examine a machine learning technique presented by Ishii et al. used to allow for learning in a multi-agent environment and apply an adaptation of this learning technique to the card game Sheephead. We then evaluate the effectiveness of our adaptation by running simulations against rule-based opponents. Multi-agent learning presents several layers of complexity on top of single-agent learning in a stationary environment. This added complexity and increased state space are just beginning to be addressed by researchers. We utilize techniques used by Ishii et al. to facilitate this multi-agent learning. We model the environment of …
Proto-Transfer Learning In Markov Decision Processes Using Spectral Methods, Kimberly Ferguson, Sridhar Mahadevan
Sridhar Mahadevan
In this paper we introduce proto-transfer learning, a new framework for transfer learning. We explore solutions to transfer learning within reinforcement learning through the use of spectral methods. Proto-value functions (PVFs) are basis functions computed from a spectral analysis of random walks on the state space graph. They naturally lead to the ability to transfer knowledge and representation between related tasks or domains. We investigate task transfer by using the same PVFs in Markov decision processes (MDPs) with different reward functions. Additionally, our experiments in domain transfer explore applying the Nyström method for interpolation of PVFs between MDPs of different …
Scheduling Straight-Line Code Using Reinforcement Learning And Rollouts, Amy Mcgovern, Eliot Moss, Andrew G. Barto
Andrew G. Barto
The execution order of a block of computer instructions on a pipelined machine can make a difference in its running time by a factor of two or more. In order to achieve the best possible speed, compilers use heuristic schedulers appropriate to each specific architecture implementation. However, these heuristic schedulers are time-consuming and expensive to build. We present empirical results using both rollouts and reinforcement learning to construct heuristics for scheduling basic blocks. In simulation, the rollout scheduler outperformed a commercial scheduler, and the reinforcement learning scheduler performed almost as well as the commercial scheduler.
A Biologically-Inspired Cognitive Agent Model Integrating Declarative Knowledge And Reinforcement Learning, Ah-Hwee Tan, Gee-Wah Ng
Research Collection School Of Computing and Information Systems
The paper proposes a biologically-inspired cognitive agent model, known as FALCON-X, based on an integration of the Adaptive Control of Thought (ACT-R) architecture and a class of self-organizing neural networks called fusion Adaptive Resonance Theory (fusion ART). By replacing the production system of ACT-R with a fusion ART model, FALCON-X integrates high-level deliberative cognitive behaviors and real-time learning abilities, based on biologically plausible neural pathways. We illustrate how FALCON-X, consisting of a core inference area interacting with the associated intentional, declarative, perceptual, motor and critic memory modules, can be used to build virtual robots for battles in a simulated RoboCode …
Global Optimization For Value Function Approximation, Marek Petrik, Shlomo Zilberstein
Shlomo Zilberstein
Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provides strong a priori guarantees on both robust and expected policy loss by minimizing specific norms of the Bellman residual. Solving a bilinear program optimally is NP-hard, but this is unavoidable because the Bellman-residual minimization itself is NP-hard. We describe and analyze both optimal and approximate algorithms for solving bilinear programs. The analysis shows that this algorithm offers a convergent generalization …
A Self-Organizing Neural Architecture Integrating Desire, Intention And Reinforcement Learning, Ah-Hwee Tan, Yu-Hong Feng, Yew-Soon Ong
Research Collection School Of Computing and Information Systems
This paper presents a self-organizing neural architecture that integrates the features of belief, desire, and intention (BDI) systems with reinforcement learning. Based on fusion Adaptive Resonance Theory (fusion ART), the proposed architecture provides a unified treatment for both intentional and reactive cognitive functionalities. Operating with a sense-act-learn paradigm, the low level reactive module is a fusion ART network that learns action and value policies across the sensory, motor, and feedback channels. During performance, the actions executed by the reactive module are tracked by a high level intention module (also a fusion ART network) that learns to associate sequences of actions …
Motivated Learning As An Extension Of Reinforcement Learning, Janusz Starzyk, Pawel Raif, Ah-Hwee Tan
Research Collection School Of Computing and Information Systems
We have developed a unified framework to conduct computational experiments with both learning systems: motivated learning based on a goal creation system, and reinforcement learning using the Q-learning algorithm. Future work includes combining motivated learning, to set abstract motivations and manage goals, with reinforcement learning, to learn proper actions. This will allow testing of motivated learning on typical reinforcement learning benchmarks with large dimensionality of the state/action spaces.
Dynamic Coalition Formation Under Uncertainty, Daylon J. Hooper, Gilbert L. Peterson, Brett J. Borghetti
Faculty Publications
Coalition formation algorithms are generally not applicable to real-world robotic collectives since they lack mechanisms to handle uncertainty. Those mechanisms that do address uncertainty either deflect it by soliciting information from others or apply reinforcement learning to select an agent type from within a set. This paper presents a coalition formation mechanism that directly addresses uncertainty while allowing the agent types to fall outside of a known set. The agent types are captured through a novel agent modeling technique that handles uncertainty through a belief-based evaluation mechanism. This technique allows for uncertainty in environmental data, agent type, coalition value, and …
A Survey Of Transfer Learning Methods For Reinforcement Learning, Nicholas Bone
Computer Science Graduate and Undergraduate Student Scholarship
Transfer Learning (TL) is the branch of Machine Learning concerned with improving performance on a target task by leveraging knowledge from a related (and usually already learned) source task. TL is potentially applicable to any learning task, but in this survey we consider TL in a Reinforcement Learning (RL) context. TL is inspired by psychology; humans constantly apply previous knowledge to new tasks, but such transfer has traditionally been very difficult for—or ignored by—machine learning applications. The goals of TL are to facilitate faster and better learning of new tasks by applying past experience where appropriate, and to enable autonomous …
Self-Organizing Neural Models Integrating Rules And Reinforcement Learning, Teck-Hou Teng, Zhong-Ming Tan, Ah-Hwee Tan
Research Collection School Of Computing and Information Systems
Traditional approaches to integrating knowledge into neural networks are concerned mainly with supervised learning. This paper presents how a family of self-organizing neural models known as fusion architecture for learning, cognition and navigation (FALCON) can incorporate a priori knowledge and perform knowledge refinement and expansion through reinforcement learning. Symbolic rules are formulated based on pre-existing know-how and inserted into FALCON as a priori knowledge. The availability of knowledge enables FALCON to start performing earlier in the initial learning trials. Through a temporal-difference (TD) learning method, the inserted rules can be refined and expanded according to the evaluative feedback signals received …
Improving Liquid State Machines Through Iterative Refinement Of The Reservoir, R David Norton
Theses and Dissertations
Liquid State Machines (LSMs) exploit the power of recurrent spiking neural networks (SNNs) without training the SNN. Instead, a reservoir, or liquid, is randomly created which acts as a filter for a readout function. We develop three methods for iteratively refining a randomly generated liquid to create a more effective one. First, we apply Hebbian learning to LSMs by building the liquid with spike-timing-dependent plasticity (STDP) synapses. Second, we create an eligibility based reinforcement learning algorithm for synaptic development. Third, we apply principles of Hebbian learning and reinforcement learning to create a new algorithm called separation driven synaptic modification …
Integrating Temporal Difference Methods And Self‐Organizing Neural Networks For Reinforcement Learning With Delayed Evaluative Feedback, Ah-Hwee Tan, Ning Lu, Dan Xiao
Research Collection School Of Computing and Information Systems
This paper presents a neural architecture for learning category nodes encoding mappings across multimodal patterns involving sensory inputs, actions, and rewards. By integrating adaptive resonance theory (ART) and temporal difference (TD) methods, the proposed neural model, called TD fusion architecture for learning, cognition, and navigation (TD-FALCON), enables an autonomous agent to adapt and function in a dynamic environment with immediate as well as delayed evaluative feedback (reinforcement) signals. TD-FALCON learns the value functions of the state-action space estimated through on-policy and off-policy TD learning methods, specifically state-action-reward-state-action (SARSA) and Q-learning. The learned value functions are then used to determine the …
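The on-policy and off-policy methods named above differ only in how their TD targets bootstrap from the next state: SARSA uses the action the agent actually takes next, while Q-learning uses the greedy one. A minimal sketch of the two targets (illustrative names, not TD-FALCON's internals):

```python
def sarsa_target(r, Q, s_next, a_next, gamma=0.9):
    """On-policy TD target: bootstrap from the action actually taken."""
    return r + gamma * Q[(s_next, a_next)]

def q_learning_target(r, Q, s_next, actions, gamma=0.9):
    """Off-policy TD target: bootstrap from the greedy next action."""
    return r + gamma * max(Q[(s_next, a)] for a in actions)

Q = {("s1", "a"): 0.5, ("s1", "b"): 2.0}
sarsa_target(1.0, Q, "s1", "a")              # 1.45 (follows the policy)
q_learning_target(1.0, Q, "s1", ["a", "b"])  # 2.8  (assumes greedy play)
```

Either target then drives the usual update Q(s,a) += alpha * (target - Q(s,a)), which is how delayed rewards propagate backwards through the state-action space.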
Implementation Of Reinforcement Learning In Game Strategy Design, Chien-Yu Lin
Theses Digitization Project
The purpose of this study is to apply reinforcement learning to the design of game strategy. In the gaming industry, the strategy used by computers to win a game is usually pre-programmed by game designers according to game patterns or a set of rules.
Limitations And Extensions Of The Wolf-Phc Algorithm, Philip R. Cook
Theses and Dissertations
Policy Hill Climbing (PHC) is a reinforcement learning algorithm that extends Q-learning to learn probabilistic policies for multi-agent games. WoLF-PHC extends PHC with the "win or learn fast" principle. A proof that PHC will diverge in self-play when playing Shapley's game is given, and WoLF-PHC is shown empirically to diverge as well. Various WoLF-PHC-based modifications were created, evaluated, and compared in an attempt to obtain convergence to the single-shot Nash equilibrium when playing Shapley's game in self-play without using more information than WoLF-PHC uses. Partial Commitment WoLF-PHC (PCWoLF-PHC), which performs best on Shapley's game, is tested on other …
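For context, the core PHC update with WoLF's variable learning rate can be sketched as below. This is a compressed illustration under assumed parameter names; it omits the Q-value update and the thesis's modifications such as PCWoLF-PHC:

```python
def wolf_phc_step(pi, avg_pi, q, d_win=0.01, d_lose=0.04):
    """One WoLF-PHC policy update for a single state.
    pi / avg_pi: current and average policies (action probabilities);
    q: Q-values per action. "Win or learn fast": step slowly while
    winning, quickly while losing."""
    n = len(pi)
    winning = (sum(p * v for p, v in zip(pi, q))
               > sum(p * v for p, v in zip(avg_pi, q)))
    delta = d_win if winning else d_lose
    best = max(range(n), key=lambda a: q[a])
    # Hill-climb toward the greedy action, keeping pi a distribution.
    pi = [min(1.0, max(0.0, p + (delta if a == best else -delta / (n - 1))))
          for a, p in enumerate(pi)]
    total = sum(pi)
    return [p / total for p in pi]

wolf_phc_step([0.5, 0.5], [0.5, 0.5], [1.0, 0.0])  # losing -> [0.54, 0.46]
```

The asymmetry between d_win and d_lose is the whole WoLF idea: cautious steps while ahead, fast correction while behind, which is what gives WoLF-PHC better convergence behavior than plain PHC in many games (though, per the abstract, not in Shapley's game).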
Reinforcement Learning Neural-Network-Based Controller For Nonlinear Discrete-Time Systems With Input Constraints, Pingan He, Jagannathan Sarangapani
Electrical and Computer Engineering Faculty Research & Creative Works
A novel adaptive-critic-based neural network (NN) controller in discrete time is designed to deliver a desired tracking performance for a class of nonlinear systems in the presence of actuator constraints. The constraints of the actuator are treated in the controller design as the saturation nonlinearity. The adaptive critic NN controller architecture based on state feedback includes two NNs: the critic NN is used to approximate the "strategic" utility function, whereas the action NN is employed to minimize both the strategic utility function and the unknown nonlinear dynamic estimation errors. The critic and action NN weight updates are derived by minimizing …