Physical Sciences and Mathematics Commons

Full-Text Articles in Physical Sciences and Mathematics

An Efficient Approach To Model-Based Hierarchical Reinforcement Learning, Zhuoru Li, Akshay Narayan, Tze-Yun Leong Feb 2017

Research Collection School Of Computing and Information Systems

We propose a model-based approach to hierarchical reinforcement learning that exploits shared knowledge and selective execution at different levels of abstraction, to efficiently solve large, complex problems. Our framework adopts a new transition dynamics learning algorithm that identifies the common action-feature combinations of the subtasks, and evaluates the subtask execution choices through simulation. The framework is sample efficient, and tolerates uncertain and incomplete problem characterization of the subtasks. We test the framework on common benchmark problems and complex simulated robotic environments. It compares favorably against state-of-the-art algorithms, and scales well to very large problems.
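
To make the two ingredients the abstract names concrete — estimating transition dynamics from experience and evaluating subtask choices by simulated rollouts — here is a minimal count-based sketch in Python. It is a generic stand-in, not the authors' algorithm, and every name in it is illustrative.

```python
import random
from collections import defaultdict

class TabularModel:
    """Count-based estimate of transition dynamics and rewards; a generic
    stand-in for the paper's dynamics learner (all names hypothetical)."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))  # (s, a) -> {s': n}
        self.reward_sum = defaultdict(float)
        self.visits = defaultdict(int)

    def update(self, s, a, r, s_next):
        """Record one observed transition."""
        self.counts[(s, a)][s_next] += 1
        self.reward_sum[(s, a)] += r
        self.visits[(s, a)] += 1

    def sample_next(self, s, a):
        """Draw s' from the empirical distribution, for simulated rollouts."""
        dist = self.counts[(s, a)]
        total = sum(dist.values())
        if total == 0:
            return None  # (s, a) never observed
        pick = random.randrange(total)
        for s_next, n in dist.items():
            pick -= n
            if pick < 0:
                return s_next

    def mean_reward(self, s, a):
        """Average observed reward, used when scoring subtask choices."""
        return self.reward_sum[(s, a)] / max(1, self.visits[(s, a)])
```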


Towards Autonomous Behavior Learning Of Non-Player Characters In Games, Shu Feng, Ah-Hwee Tan Sep 2016

Research Collection School Of Computing and Information Systems

Non-Player-Characters (NPCs), as found in computer games, can be modelled as intelligent systems, which serve to improve the interactivity and playability of the games. Although reinforcement learning (RL) has been a promising approach to creating the behavior models of NPCs, an initial stage of exploration and low performance is typically required. On the other hand, imitative learning (IL) is an effective approach to pre-building an NPC’s behavior model by observing the opponent’s actions, but learning by imitation limits the agent’s performance to that of its opponents. In view of their complementary strengths, this paper proposes a computational model …
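
The complementary combination the abstract hints at can be sketched as two phases: seed a value table from observed opponent behaviour, then refine it with ordinary Q-learning so the agent is no longer capped at its teacher's level. This is a plain tabular illustration under invented names, not the paper's actual model.

```python
from collections import defaultdict

def pretrain_from_demos(demos, bonus=1.0):
    """Imitative phase: bias the Q-table toward actions seen in the
    opponent's trajectories. `demos` is a list of (state, action) pairs."""
    Q = defaultdict(float)
    for s, a in demos:
        Q[(s, a)] += bonus
    return Q

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """RL phase: standard Q-learning refinement of the pretrained table."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
```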


Reinforcement Learning Framework For Modeling Spatial Sequential Decisions Under Uncertainty: (Extended Abstract), Truc Viet Le, Siyuan Liu, Hoong Chuin Lau May 2016

Research Collection School Of Computing and Information Systems

We consider the problem of trajectory prediction, where a trajectory is an ordered sequence of location visits and corresponding timestamps. The problem arises when an agent makes sequential decisions to visit a set of spatial locations of interest. Each location bears a stochastic utility and the agent has a limited budget to spend. Given the agent's observed partial trajectory, our goal is to predict the remaining trajectory. We propose a solution framework to the problem considering both the uncertainty of utility and the budget constraint. We use reinforcement learning (RL) to model the underlying decision processes and inverse RL to …
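
A drastically simplified picture of the prediction task — setting aside the RL/inverse-RL machinery and simply rolling out the remaining visits greedily under the budget — looks like this; `utility` and `cost` are stand-ins for the quantities the paper actually learns.

```python
def predict_remaining(current, budget, utility, cost, candidates):
    """Greedy rollout of the remaining trajectory: repeatedly pick the
    affordable location with the best utility per unit cost. Assumes
    strictly positive costs; all names are illustrative."""
    path, remaining = [], set(candidates)
    while remaining:
        feasible = [loc for loc in remaining if cost(current, loc) <= budget]
        if not feasible:
            break
        nxt = max(feasible, key=lambda loc: utility[loc] / cost(current, loc))
        budget -= cost(current, nxt)
        path.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return path
```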


Adaptive Duty Cycling In Sensor Networks With Energy Harvesting Using Continuous-Time Markov Chain And Fluid Models, Ronald Wai Hong Chan, Pengfei Zhang, Ido Nevat, Sai Ganesh Nagarajan, Alvin Cerdena Valera, Hwee Xian Tan Dec 2015

Research Collection School Of Computing and Information Systems

The dynamic and unpredictable nature of energy harvesting sources available for wireless sensor networks, and the time variation in network statistics like packet transmission rates and link qualities, necessitate the use of adaptive duty cycling techniques. Such adaptive control allows sensor nodes to achieve long-run energy neutrality, where energy supply and demand are balanced in a dynamic environment such that the nodes function continuously. In this paper, we develop a new framework enabling an adaptive duty cycling scheme for sensor networks that takes into account the node battery level, ambient energy that can be harvested, and application-level QoS requirements. We …
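
As a toy illustration of the modeling style (not the paper's actual model), the snippet below simulates a harvesting source as a two-state continuous-time Markov chain with exponential holding times, and lets the node's duty cycle track its battery level; every rate and unit here is invented.

```python
import math
import random

def exp_holding_time(rate):
    """Exponential holding time for a CTMC state (inverse-CDF sampling)."""
    return -math.log(1.0 - random.random()) / rate

def simulate(hours=24.0, battery=0.5, rate_on=2.0, rate_off=1.0,
             harvest=0.3, drain=0.1):
    """Two-state ON/OFF harvesting source; duty cycle adapts to battery."""
    t, source_on = 0.0, True
    while t < hours:
        dt = exp_holding_time(rate_on if source_on else rate_off)
        duty = min(1.0, max(0.1, battery))        # adaptive duty cycling
        inflow = harvest * dt if source_on else 0.0
        battery = min(1.0, max(0.0, battery + inflow - drain * duty * dt))
        t += dt
        source_on = not source_on                 # jump to the other state
    return battery
```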


A Comparative Study Between Motivated Learning And Reinforcement Learning, James T. Graham, Janusz A. Starzyk, Zhen Ni, Haibo He, T.-H. Teng, Ah-Hwee Tan Jul 2015

Research Collection School Of Computing and Information Systems

This paper analyzes advanced reinforcement learning techniques and compares some of them to motivated learning. Motivated learning is briefly discussed, indicating its relation to reinforcement learning. A black box scenario for comparative analysis of learning efficiency in autonomous agents is developed and described. This is used to analyze selected algorithms. Reported results demonstrate that, in the selected category of problems, motivated learning outperformed all of the reinforcement learning algorithms we compared it with.


Integrating Motivated Learning And K-Winner-Take-All To Coordinate Multi-Agent Reinforcement Learning, Teck-Hou Teng, Ah-Hwee Tan, Janusz Starzyk, Yuan-Sin Tan, Loo-Nin Teow Aug 2014

Research Collection School Of Computing and Information Systems

This work addresses the coordination issue in distributed optimization problems (DOPs), where multiple distinct and time-critical tasks are performed to satisfy a global objective function. The performance of these tasks has to be coordinated due to the sharing of consumable resources and the dependency on non-consumable resources. Knowing that it can be sub-optimal to predefine the performance of the tasks for large DOPs, the multi-agent reinforcement learning (MARL) framework is adopted, wherein an agent is used to learn the performance of each distinct task using reinforcement learning. To coordinate MARL, we propose a novel coordination strategy integrating Motivated Learning (ML) …
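
The k-winner-take-all idea, stripped to its core, is simply "let the k agents with the strongest bids act this cycle while the rest are suppressed". A generic sketch with invented names, not the paper's neural implementation:

```python
def k_winner_take_all(bids, k):
    """Select the k agents with the strongest bids (e.g., learned task
    urgency) to access the shared resources; the others wait."""
    ranked = sorted(bids, key=bids.get, reverse=True)
    return set(ranked[:k])

# Example: with k = 2, agents 'a' and 'c' win this cycle.
winners = k_winner_take_all({"a": 0.9, "b": 0.2, "c": 0.7}, k=2)
```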


Creating Autonomous Adaptive Agents In A Real-Time First-Person Shooter Computer Game, Di Wang, Ah-Hwee Tan Jul 2014

Research Collection School Of Computing and Information Systems

Games are good test-beds to evaluate AI methodologies. In recent years, there has been a vast amount of research dealing with real-time computer games other than the traditional board games or card games. This paper illustrates how we create agents by employing FALCON, a self-organizing neural network that performs reinforcement learning, to play a well-known first-person shooter computer game called Unreal Tournament. Rewards used for learning are either obtained from the game environment or estimated using the temporal difference learning scheme. In this way, the agents are able to acquire proper strategies and discover the effectiveness of different weapons without …
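
The delayed-reward aspect mentioned above rests on the standard temporal difference idea: propagate a payoff that arrives late (say, winning a firefight) back to the states that led to it. A textbook TD(0) update for reference, not FALCON itself:

```python
def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """TD(0): nudge V(s) toward the bootstrapped target r + gamma*V(s')."""
    V[s] = V.get(s, 0.0) + alpha * (r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0))
```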


Adaptive Computer-Generated Forces For Simulator-Based Training, Teck-Hou Teng, Ah-Hwee Tan, Loo-Nin Teow Dec 2013

Research Collection School Of Computing and Information Systems

Simulator-based training is in constant pursuit of increasing levels of realism. The transition from doctrine-driven computer-generated forces (CGF) to adaptive CGF represents one such effort. The use of doctrine-driven CGF is fraught with challenges, such as the modeling of complex expert knowledge and adapting to the trainees’ progress in real time. Therefore, this paper reports on how the use of adaptive CGF can overcome these challenges. Using a self-organizing neural network to implement the adaptive CGF, air combat maneuvering strategies are learned incrementally and generalized in real time. The state space and action space are extracted from the same hierarchical doctrine …


Self-Regulating Action Exploration In Reinforcement Learning, Teck-Hou Teng, Ah-Hwee Tan Oct 2012

Research Collection School Of Computing and Information Systems

The basic tenet of a learning process is for an agent to learn only as much and for as long as is necessary. With reinforcement learning, the learning process is divided between exploration and exploitation. Given the complexity of the problem domain and the randomness of the learning process, the exact duration of the reinforcement learning process can never be known with certainty. Using an inaccurate number of training iterations leads either to non-convergence or to over-training of the learning agent. This work addresses such issues by proposing a technique to self-regulate the exploration rate and training duration …
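
One simple way to realise the idea — more exploration while the agent is clearly still learning, less as its estimates settle — is to tie an epsilon-greedy rate to the magnitude of recent TD errors. The thresholds and factors below are invented for illustration; the paper's actual mechanism differs.

```python
import random

def select_action(Q, state, actions, eps):
    """Epsilon-greedy: explore with probability eps, otherwise exploit."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q.get((state, a), 0.0))

def self_regulate(eps, td_error, floor=0.01, up=1.05, down=0.95):
    """Raise exploration while TD errors stay large; decay it otherwise."""
    return min(1.0, eps * up) if abs(td_error) > 0.1 else max(floor, eps * down)
```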


Motivated Learning For The Development Of Autonomous Agents, Janusz A. Starzyk, James T. Graham, Pawel Raif, Ah-Hwee Tan Apr 2012

Research Collection School Of Computing and Information Systems

A new machine learning approach known as motivated learning (ML) is presented in this work. Motivated learning drives a machine to develop abstract motivations and choose its own goals. ML also provides a self-organizing system that controls a machine’s behavior based on competition between dynamically-changing pain signals. This provides an interplay of externally driven and internally generated control signals. It is demonstrated that ML not only yields a more sophisticated learning mechanism and system of values than reinforcement learning (RL), but is also more efficient in learning complex relations and delivers better performance than RL in dynamically changing environments. In …
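
The "competition between dynamically-changing pain signals" can be caricatured in a few lines: each pain grows until its need is satisfied, and the dominant pain selects the current goal. Rates and names are invented; this is only one reading of the abstract, not the authors' system.

```python
def step_pains(pains, satisfied=None, growth=0.02, relief=0.5):
    """Advance all pain signals one step and return the winning goal."""
    for need in pains:
        if need == satisfied:
            pains[need] *= (1.0 - relief)   # satisfying a need reduces its pain
        pains[need] += growth               # unmet needs keep growing
    return max(pains, key=pains.get)        # dominant pain drives behaviour
```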


Self-Regulating Action Exploration In Reinforcement Learning, Teck-Hou Teng, Ah-Hwee Tan, Yuan-Sin Tan Jan 2012

Research Collection School Of Computing and Information Systems

The basic tenet of a learning process is for an agent to learn only as much and for as long as is necessary. With reinforcement learning, the learning process is divided between exploration and exploitation. Given the complexity of the problem domain and the randomness of the learning process, the exact duration of the reinforcement learning process can never be known with certainty. Using an inaccurate number of training iterations leads either to non-convergence or to over-training of the learning agent. This work addresses such issues by proposing a technique to self-regulate the exploration rate and training duration …


Cooperative Reinforcement Learning In Topology-Based Multi-Agent Systems, Dan Xiao, Ah-Hwee Tan Oct 2011

Research Collection School Of Computing and Information Systems

Topology-based multi-agent systems (TMAS), wherein agents interact with one another according to their spatial relationship in a network, are well suited for problems with topological constraints. In a TMAS, however, each agent may have a different state space, which can be rather large. Consequently, traditional approaches to multi-agent cooperative learning may not be able to scale up with the complexity of the network topology. In this paper, we propose a cooperative learning strategy, under which autonomous agents are assembled in a binary tree formation (BTF). By constraining the interaction between agents, we effectively unify the state space of individual …


A Hybrid Agent Architecture Integrating Desire, Intention And Reinforcement Learning, Ah-Hwee Tan, Yew-Soon Ong, Akejariyawong Tapanuj Jul 2011

Research Collection School Of Computing and Information Systems

This paper presents a hybrid agent architecture that integrates the behaviours of BDI agents, specifically desire and intention, with a neural network based reinforcement learner known as Temporal Difference Fusion Architecture for Learning and COgNition (TD-FALCON). With the explicit maintenance of goals, the agent performs reinforcement learning with an awareness of its objectives instead of relying on external reinforcement signals. More importantly, the intention module equips the hybrid architecture with deliberative planning capabilities, enabling the agent to purposefully maintain an agenda of actions to perform and reducing the need to constantly sense the environment. Through reinforcement learning, plans can also be …


A Biologically-Inspired Cognitive Agent Model Integrating Declarative Knowledge And Reinforcement Learning, Ah-Hwee Tan, Gee-Wah Ng Sep 2010

Research Collection School Of Computing and Information Systems

The paper proposes a biologically-inspired cognitive agent model, known as FALCON-X, based on an integration of the Adaptive Control of Thought (ACT-R) architecture and a class of self-organizing neural networks called fusion Adaptive Resonance Theory (fusion ART). By replacing the production system of ACT-R with a fusion ART model, FALCON-X integrates high-level deliberative cognitive behaviors and real-time learning abilities, based on biologically plausible neural pathways. We illustrate how FALCON-X, consisting of a core inference area interacting with the associated intentional, declarative, perceptual, motor and critic memory modules, can be used to build virtual robots for battles in a simulated RoboCode …


A Self-Organizing Neural Architecture Integrating Desire, Intention And Reinforcement Learning, Ah-Hwee Tan, Yu-Hong Feng, Yew-Soon Ong Mar 2010

Research Collection School Of Computing and Information Systems

This paper presents a self-organizing neural architecture that integrates the features of belief, desire, and intention (BDI) systems with reinforcement learning. Based on fusion Adaptive Resonance Theory (fusion ART), the proposed architecture provides a unified treatment for both intentional and reactive cognitive functionalities. Operating with a sense-act-learn paradigm, the low level reactive module is a fusion ART network that learns action and value policies across the sensory, motor, and feedback channels. During performance, the actions executed by the reactive module are tracked by a high level intention module (also a fusion ART network) that learns to associate sequences of actions …


Motivated Learning As An Extension Of Reinforcement Learning, Janusz Starzyk, Pawel Raif, Ah-Hwee Tan Jan 2010

Research Collection School Of Computing and Information Systems

We have developed a unified framework to conduct computational experiments with both learning systems: motivated learning based on a goal creation system, and reinforcement learning using the Q-learning algorithm. Future work includes combining motivated learning, to set abstract motivations and manage goals, with reinforcement learning, to learn the proper actions. This will allow testing of motivated learning on typical reinforcement learning benchmarks with large dimensionality of the state/action spaces.


Self-Organizing Neural Models Integrating Rules And Reinforcement Learning, Teck-Hou Teng, Zhong-Ming Tan, Ah-Hwee Tan Jun 2008

Research Collection School Of Computing and Information Systems

Traditional approaches to integrating knowledge into neural networks are concerned mainly with supervised learning. This paper presents how a family of self-organizing neural models known as fusion architecture for learning, cognition and navigation (FALCON) can incorporate a priori knowledge and perform knowledge refinement and expansion through reinforcement learning. Symbolic rules are formulated based on pre-existing know-how and inserted into FALCON as a priori knowledge. The availability of knowledge enables FALCON to start performing earlier in the initial learning trials. Through a temporal-difference (TD) learning method, the inserted rules can be refined and expanded according to the evaluative feedback signals received …
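
In tabular terms, rule insertion amounts to pre-loading the value function so that rule-recommended actions look attractive from trial one, with TD updates free to refine or overrule them later. A schematic stand-in for FALCON's rule insertion, under invented names:

```python
def insert_rules(rules, strength=1.0):
    """Translate symbolic if-state-then-action rules into initial values.
    `rules` is an iterable of (state, action) pairs."""
    Q = {}
    for state, action in rules:
        Q[(state, action)] = strength   # a priori preference, refined by TD
    return Q
```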


Integrating Temporal Difference Methods And Self-Organizing Neural Networks For Reinforcement Learning With Delayed Evaluative Feedback, Ah-Hwee Tan, Ning Lu, Dan Xiao Feb 2008

Research Collection School Of Computing and Information Systems

This paper presents a neural architecture for learning category nodes encoding mappings across multimodal patterns involving sensory inputs, actions, and rewards. By integrating adaptive resonance theory (ART) and temporal difference (TD) methods, the proposed neural model, called TD fusion architecture for learning, cognition, and navigation (TD-FALCON), enables an autonomous agent to adapt and function in a dynamic environment with immediate as well as delayed evaluative feedback (reinforcement) signals. TD-FALCON learns the value functions of the state-action space estimated through on-policy and off-policy TD learning methods, specifically state-action-reward-state-action (SARSA) and Q-learning. The learned value functions are then used to determine the …
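
The two TD methods named in the abstract differ only in their bootstrap target: SARSA (on-policy) uses the action actually taken next, while Q-learning (off-policy) uses the greedy one. Stated as plain tabular updates (assume Q is a collections.defaultdict(float)); this illustrates the textbook update rules, not TD-FALCON's network:

```python
def sarsa_update(Q, s, a, r, s2, a2, alpha=0.1, gamma=0.9):
    """On-policy TD: bootstrap from the next action actually taken."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s2, a2)] - Q[(s, a)])

def q_learning_update(Q, s, a, r, s2, actions, alpha=0.1, gamma=0.9):
    """Off-policy TD: bootstrap from the greedy next action."""
    best = max(Q[(s2, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])
```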