Databases and Information Systems Commons

Articles 1 - 25 of 25

Full-Text Articles in Databases and Information Systems

A New Cache Replacement Policy In Named Data Network Based On Fib Table Information, Mehran Hosseinzadeh, Neda Moghim, Samira Taheri, Nasrin Gholami Jan 2024

VMASC Publications

Named Data Network (NDN) is proposed for the Internet as an information-centric architecture. Storing content in the router's cache plays a significant role in NDN. When a router's cache becomes full, a cache replacement policy determines which content should be discarded to make room for new content. This paper proposes a new cache replacement policy called Discard of Fast Retrievable Content (DFRC). In DFRC, the retrieval time of content is estimated using the FIB table information, and content with a shorter retrieval time receives a higher discard priority. An impact weight is also used to involve both the grade of retrieval …
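
As a rough illustration of the discard rule described above, the sketch below evicts the cached item whose estimated retrieval time is smallest, on the reasoning that content the FIB says can be re-fetched quickly is the cheapest to lose. The DfrcCache class and its retrieval_time estimator are hypothetical stand-ins; they do not reproduce the paper's exact FIB-based formula or its impact-weight term.

```python
# Minimal sketch of a DFRC-style discard rule (hypothetical simplification).
# Assumption: retrieval_time(name) estimates how quickly a content object
# could be re-fetched, e.g., from FIB next-hop metrics; it is NOT the
# paper's exact formula.

class DfrcCache:
    def __init__(self, capacity, retrieval_time):
        self.capacity = capacity
        self.retrieval_time = retrieval_time  # name -> estimated fetch cost
        self.store = {}                       # name -> content

    def insert(self, name, content):
        if name in self.store:
            self.store[name] = content
            return
        if len(self.store) >= self.capacity:
            # Discard the content that is fastest to retrieve again:
            # losing it is cheapest, since it comes back quickly.
            victim = min(self.store, key=self.retrieval_time)
            del self.store[victim]
        self.store[name] = content

# Usage: content reachable over a short path is evicted before distant content.
cache = DfrcCache(capacity=2, retrieval_time={"a": 1, "b": 5, "c": 3}.get)
for name in ("a", "b", "c"):
    cache.insert(name, name.upper())
print(sorted(cache.store))  # ['b', 'c'] -- 'a' (fastest to re-fetch) was discarded
```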


Multi-View Hypergraph Contrastive Policy Learning For Conversational Recommendation, Sen Zhao, Wei Wei, Xian-Ling Mao, Shuai Yang, Zujie Wen, Dangyang Chen, Feida Zhu Jul 2023

Research Collection School Of Computing and Information Systems

Conversational recommendation systems (CRS) aim to interactively acquire user preferences and accordingly recommend items to users. Accurately learning dynamic user preferences is of crucial importance for CRS. Previous works learn user preferences with pairwise relations from the interactive conversation and item knowledge, while largely ignoring the fact that the factors behind a relationship in CRS are multiplex. Specifically, the user likes/dislikes the items that satisfy certain attributes (Like/Dislike view). Moreover, social influence is another important factor that affects user preference toward an item (Social view), yet it is largely ignored by previous works in CRS. The user preferences from these …


Reinforced Adaptation Network For Partial Domain Adaptation, Keyu Wu, Min Wu, Zhenghua Chen, Ruibing Jin, Wei Cui, Zhiguang Cao, Xiaoli Li May 2023

Research Collection School Of Computing and Information Systems

Domain adaptation enables generalized learning in new environments by transferring knowledge from label-rich source domains to label-scarce target domains. As a more realistic extension, partial domain adaptation (PDA) relaxes the assumption of fully shared label space, and instead deals with the scenario where the target label space is a subset of the source label space. In this paper, we propose a Reinforced Adaptation Network (RAN) to address the challenging PDA problem. Specifically, a deep reinforcement learning model is proposed to learn source data selection policies. Meanwhile, a domain adaptation model is presented to simultaneously determine rewards and learn domain-invariant feature …


End-To-End Hierarchical Reinforcement Learning With Integrated Subgoal Discovery, Shubham Pateria, Budhitama Subagdja, Ah-Hwee Tan, Chai Quek Dec 2022

Research Collection School Of Computing and Information Systems

Hierarchical reinforcement learning (HRL) is a promising approach to performing long-horizon goal-reaching tasks by decomposing the goals into subgoals. In a holistic HRL paradigm, an agent must autonomously discover such subgoals and also learn a hierarchy of policies that uses them to reach the goals. Recently introduced end-to-end HRL methods accomplish this by using the higher-level policy in the hierarchy to directly search for useful subgoals in a continuous subgoal space. However, learning such a policy may be challenging when the subgoal space is large. We propose integrated discovery of salient subgoals (LIDOSS), an end-to-end HRL method with an integrated …
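
For context, the generic end-to-end loop that such methods build on can be sketched as follows: a higher-level policy emits a subgoal in a continuous space every k steps, and a lower-level policy is driven toward it by an intrinsic reward. Both policies below are hand-coded placeholders in a toy 2-D world, assumed purely for illustration; LIDOSS's subgoal-discovery term is not modeled.

```python
# Minimal two-level HRL control loop over a continuous subgoal space.
import numpy as np

rng = np.random.default_rng(0)

def high_level_policy(state, goal):
    # Placeholder: propose a subgoal partway toward the goal, with noise.
    return state + 0.5 * (goal - state) + rng.normal(0.0, 0.1, size=state.shape)

def low_level_policy(state, subgoal):
    # Placeholder: step greedily toward the current subgoal.
    direction = subgoal - state
    norm = np.linalg.norm(direction)
    return direction / norm if norm > 0 else np.zeros_like(state)

state, goal, k = np.zeros(2), np.array([5.0, 5.0]), 5
for t in range(60):
    if t % k == 0:                       # the higher level re-plans every k steps
        subgoal = high_level_policy(state, goal)
    action = low_level_policy(state, subgoal)
    state = state + 0.5 * action         # toy environment transition
    intrinsic_reward = -np.linalg.norm(subgoal - state)  # would train the low level
    extrinsic_reward = -np.linalg.norm(goal - state)     # would train the high level
print("final distance to goal:", round(float(np.linalg.norm(goal - state)), 2))
```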


Cloud Provisioning And Management With Deep Reinforcement Learning, Alexandru Tol Jan 2022

Master's Projects

The first web applications appeared in the early nineteen-nineties. These applications were entirely hosted in house by the companies that developed them. In the mid-2000s the concept of a digital cloud was introduced by the then-CEO of Google, Eric Schmidt. Today most companies will at least partially host their applications on proprietary servers hosted at data centers or commercial clouds like Amazon Web Services (AWS) or Heroku.

This arrangement seems like a straightforward win-win for both parties: the customer gets rid of the hassle of maintaining a live server for their applications and …


Learning Index Policies For Restless Bandits With Application To Maternal Healthcare, Arpita Biswas, Gaurav Aggarwal, Pradeep Varakantham, Milind Tambe May 2021

Research Collection School Of Computing and Information Systems

In many community health settings, it is crucial to have a systematic monitoring and intervention process to ensure that patients adhere to healthcare programs, such as periodic health checks or taking medications. When these interventions are expensive, they can be provided to only a small fixed fraction of the patients at any given time. Hence, it is important to carefully choose which beneficiaries should be provided with interventions and when. We model this scenario as a restless multi-armed bandit (RMAB) problem, where each beneficiary is assumed to transition from one state to another depending on the intervention …
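
One round of budgeted, index-based selection might look like the sketch below: each arm (beneficiary) carries a priority index, the top-k arms under the budget receive the intervention, and all arms then transition in a toy two-state adherence model. The indices and transition probabilities are assumed for illustration; the paper's contribution, learning such indices online, is not shown.

```python
# Sketch of one RMAB intervention round under a budget (toy assumptions).
import numpy as np

rng = np.random.default_rng(1)
n_arms, budget = 10, 3

# Hypothetical learned indices: higher means intervening now helps more.
indices = rng.random(n_arms)
chosen = np.argsort(indices)[-budget:]          # top-k arms under the budget

# Toy two-state adherence dynamics: interventions raise adherence odds.
state = rng.integers(0, 2, n_arms)              # 1 = adhering, 0 = not
p_passive = np.array([0.4, 0.8])                # P(adhere next | state, no action)
p_active = np.array([0.7, 0.95])                # P(adhere next | state, intervene)
act = np.zeros(n_arms, dtype=bool)
act[chosen] = True
p_next = np.where(act, p_active[state], p_passive[state])
state = (rng.random(n_arms) < p_next).astype(int)
print("intervened on:", sorted(chosen.tolist()), "| adhering now:", int(state.sum()))
```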


Approximate Difference Rewards For Scalable Multiagent Reinforcement Learning, Arambam James Singh, Akshat Kumar May 2021

Research Collection School Of Computing and Information Systems

We address the problem of multiagent credit assignment in a large-scale multiagent system. Difference rewards (DRs) are an effective tool to tackle this problem, but their exact computation is known to be challenging even for a small number of agents. We propose a scalable method to compute difference rewards based on aggregate information in a multiagent system with a large number of agents by exploiting the symmetry present in several practical applications. Empirical evaluation on two multiagent domains, air-traffic control and cooperative navigation, shows better solution quality than previous approaches.
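
When agents are interchangeable, the global reward G depends only on aggregate action counts, so each agent's difference reward D_i = G(z) - G(z_{-i}) can be formed by adjusting a counter instead of re-evaluating the whole system per agent. The sketch below assumes a toy congestion-style G; it does not reproduce the paper's domains or its exact aggregation.

```python
# Sketch of count-based difference rewards under agent symmetry.
from collections import Counter

def G(counts):
    # Toy global objective: penalize squared congestion on each action.
    return -sum(c * c for c in counts.values())

def difference_rewards(joint_actions):
    counts = Counter(joint_actions)
    g = G(counts)
    rewards = []
    for a in joint_actions:
        counts[a] -= 1                  # remove this agent from the aggregate
        rewards.append(g - G(counts))   # D_i = G(z) - G(z_{-i})
        counts[a] += 1                  # restore the aggregate
    return rewards

# With counts, each D_i costs O(1) to form instead of re-simulating G per agent.
print(difference_rewards(["left", "left", "right"]))  # [-3, -3, -1]
```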


Using Reinforcement Learning To Minimize The Probability Of Delay Occurrence In Transportation, Zhiguang Cao, Hongliang Guo, Wen Song, Kaizhou Gao, Zhenghua Chen, Le Zhang, Xuexi Zhang Mar 2020

Research Collection School Of Computing and Information Systems

Reducing traffic delay is of crucial importance for the development of sustainable transportation systems, and it is a challenging task in studies of the stochastic shortest path (SSP) problem. Existing methods based on the probability tail model solve the SSP problem by seeking the path that minimizes the probability of delay occurrence, which is equivalent to maximizing the probability of reaching the destination before a deadline (i.e., arriving on time). However, they suffer from low accuracy or high computational cost. Therefore, we design a novel and practical Q-learning approach where the converged Q-values have the practical meaning as the actual …
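
One way to give converged Q-values such a probabilistic reading, assumed here for illustration, is to pair each node with the remaining time budget, use no discounting, and reward 1 only when the destination is reached within the deadline; Q(s, a) then estimates the probability of arriving on time. The toy graph and tabular update below are not the paper's algorithm.

```python
# Sketch: tabular Q-learning whose Q-values converge to on-time probabilities.
import random

random.seed(0)
# Toy road graph: node -> list of (next_node, possible travel times).
edges = {"A": [("B", [1, 2]), ("C", [2, 3])],
         "B": [("D", [1, 5])],
         "C": [("D", [1, 1])],
         "D": []}
dest, deadline, alpha = "D", 5, 0.1
Q = {}   # (node, time_left, next_node) -> estimated P(arrive on time)

def actions(node):
    return [nxt for nxt, _ in edges[node]]

for _ in range(20000):
    node, t_left = "A", deadline
    while node != dest and t_left > 0 and actions(node):
        nxt = random.choice(actions(node))        # exploratory behavior policy
        cost = random.choice(dict(edges[node])[nxt])
        t2 = t_left - cost
        if nxt == dest and t2 >= 0:
            target = 1.0                          # arrived within the deadline
        elif t2 <= 0 or not actions(nxt):
            target = 0.0                          # out of time or stuck
        else:
            target = max(Q.get((nxt, t2, a), 0.0) for a in actions(nxt))
        key = (node, t_left, nxt)
        old = Q.get(key, 0.0)
        Q[key] = old + alpha * (target - old)     # TD step toward the probability
        node, t_left = nxt, t2

# Via C the trip always fits in the deadline; via B it is late half the time.
print(round(Q[("A", 5, "C")], 2), round(Q[("A", 5, "B")], 2))  # ~1.0, ~0.5
```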


Multi-Agent Collaborative Exploration Through Graph-Based Deep Reinforcement Learning, Tianze Luo, Budhitama Subagdja, Ah-Hwee Tan Oct 2019

Research Collection School Of Computing and Information Systems

Autonomous exploration by a single agent or multiple agents in an unknown environment leads to various applications in automation, such as cleaning, search and rescue, etc. Traditional methods normally take frontier locations and segmented regions of the environment into account to efficiently allocate target locations to different agents to visit. They may employ ad hoc solutions to allocate tasks to agents, but the allocation may not be efficient. In the literature, few studies have focused on enhancing traditional methods by applying machine learning models for agent performance improvement. In this paper, we propose a graph-based deep reinforcement learning approach …


Probabilistic Guided Exploration For Reinforcement Learning In Self-Organizing Neural Networks, Peng Wang, Weigui Jair Zhou, Di Wang, Ah-Hwee Tan Jul 2018

Research Collection School Of Computing and Information Systems

Exploration is essential in reinforcement learning, as it expands the search space of potential solutions to a given problem for performance evaluations. Specifically, a carefully designed exploration strategy may help the agent learn faster by taking advantage of what it has learned previously. However, many reinforcement learning mechanisms still adopt simple exploration strategies, which select actions in a purely random manner among all the feasible actions. In this paper, we propose novel mechanisms to improve the existing knowledge-based exploration strategy based on a probabilistic guided approach to select actions. We conduct extensive experiments in a Minefield navigation simulator and the results …
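
The contrast can be made concrete with a small sketch: a uniform explorer ignores everything learned so far, while a guided explorer samples actions in proportion to a softmax over current value estimates, so promising actions are proposed more often without excluding the rest. The softmax form is an assumed stand-in for the paper's knowledge-based guidance.

```python
# Sketch: uniform random exploration vs. probabilistically guided exploration.
import math, random

random.seed(0)

def uniform_explore(q_values):
    # Baseline: every feasible action is equally likely.
    return random.randrange(len(q_values))

def guided_explore(q_values, temperature=0.5):
    # Higher-valued actions are proposed more often, but all stay possible.
    weights = [math.exp(q / temperature) for q in q_values]
    return random.choices(range(len(q_values)), weights=weights)[0]

q = [0.1, 0.7, 0.2]
picks = [guided_explore(q) for _ in range(1000)]
print([picks.count(a) for a in range(3)])  # action 1 dominates, others still tried
```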


Adopt: Combining Parameter Tuning And Adaptive Operator Ordering For Solving A Class Of Orienteering Problems, Aldy Gunawan, Hoong Chuin Lau, Kun Lu Jul 2018

Research Collection School Of Computing and Information Systems

Two fundamental challenges in local search based metaheuristics are how to determine parameter configurations and how to design the underlying Local Search (LS) procedure. In this paper, we propose a framework called ADaptive OPeraTor Ordering (ADOPT) to handle both challenges. The ADOPT framework is applied to two metaheuristics, namely Iterated Local Search (ILS) and a hybridization of Simulated Annealing and ILS (SAILS), for solving two variants of the Orienteering Problem: the Team Dependent Orienteering Problem (TDOP) and the Team Orienteering Problem with Time Windows (TOPTW). This framework consists of two main processes. The Design of Experiment (DOE) …
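
The operator-ordering half of the idea can be sketched simply: each neighborhood operator carries a score, operators are tried in descending score order every iteration, and an operator's score is reinforced whenever it improves the incumbent solution. The toy operators, permutation objective, and scoring rule below are assumptions; ADOPT's DOE-based parameter tuning is not shown.

```python
# Sketch of adaptive operator ordering inside a local search loop.
import random

random.seed(0)

def op_swap(x):
    # Swap two random positions.
    y = x[:]
    i, j = random.sample(range(len(y)), 2)
    y[i], y[j] = y[j], y[i]
    return y

def op_shuffle(x):
    # Shuffle a random contiguous segment.
    y = x[:]
    i, j = sorted(random.sample(range(len(y)), 2))
    seg = y[i:j + 1]
    random.shuffle(seg)
    y[i:j + 1] = seg
    return y

def cost(x):
    # Toy objective: distance from the fully sorted order.
    return sum(abs(v - i) for i, v in enumerate(x))

operators, scores = [op_swap, op_shuffle], [1.0, 1.0]
x = list(range(12))
random.shuffle(x)
for _ in range(400):
    for k in sorted(range(len(operators)), key=lambda k: -scores[k]):
        y = operators[k](x)
        if cost(y) < cost(x):                   # first improving operator wins
            x, scores[k] = y, 0.9 * scores[k] + 1.0
            break
        scores[k] *= 0.9                        # demote operators that failed
print(cost(x), [round(s, 2) for s in scores])
```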


An Integrated Framework For Modeling And Predicting Spatiotemporal Phenomena In Urban Environments, Truc Viet Le Nov 2017

Dissertations and Theses Collection (Open Access)

This thesis proposes a general solution framework that integrates machine learning methods in creative ways to solve a diverse set of problems arising in urban environments. It particularly focuses on modeling spatiotemporal data for the purpose of predicting urban phenomena. Concretely, the framework is applied to solve three specific real-world problems: human mobility prediction, traffic speed prediction and incident prediction. For human mobility prediction, I use visitor trajectories collected at a large theme park in Singapore as a simplified microcosm of an urban area. A trajectory is an ordered sequence of attraction visits and corresponding timestamps produced by a visitor. …


Modeling Trajectories With Recurrent Neural Networks, Hao Wu, Ziyang Chen, Weiwei Sun, Baihua Zheng, Wei Wang Aug 2017

Research Collection School Of Computing and Information Systems

Modeling trajectory data is a building block for many smart-mobility initiatives. Existing approaches apply shallow models such as Markov chains and inverse reinforcement learning to model trajectories, which cannot capture long-term dependencies. On the other hand, deep models such as the Recurrent Neural Network (RNN) have demonstrated their strength in modeling variable-length sequences. However, directly adopting RNNs to model trajectories is not appropriate because of the unique topological constraints faced by trajectories. Motivated by these findings, we design two RNN-based models which can take full advantage of the strength of RNNs to capture variable-length sequences and meanwhile to …
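
One simple way to honor such topological constraints, assumed here for illustration, is to mask the model's output scores so that probability mass can fall only on road segments adjacent to the current one before normalizing. The tiny graph and logits below stand in for a real road network and a trained RNN's outputs.

```python
# Sketch: topology-masked softmax over next road segments.
import numpy as np

adjacency = {0: [1, 2], 1: [3], 2: [3], 3: []}   # hypothetical road graph
logits = np.array([0.2, 1.5, 0.3, 2.0])          # model scores for segments 0..3

def next_segment_distribution(current, logits):
    allowed = np.isin(np.arange(len(logits)), adjacency[current])
    masked = np.where(allowed, logits, -np.inf)   # forbid non-neighbors
    exp = np.exp(masked - masked.max())           # stable softmax over neighbors
    return exp / exp.sum()

print(next_segment_distribution(0, logits))       # mass only on segments 1 and 2
```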


Towards Autonomous Behavior Learning Of Non-Player Characters In Games, Shu Feng, Ah-Hwee Tan Sep 2016

Research Collection School Of Computing and Information Systems

Non-Player Characters (NPCs), as found in computer games, can be modelled as intelligent systems which serve to improve the interactivity and playability of games. Although reinforcement learning (RL) has been a promising approach to creating the behavior models of non-player characters (NPCs), it typically requires an initial stage of exploration and low performance. On the other hand, imitative learning (IL) is an effective approach to pre-building an NPC's behavior model by observing the opponent's actions, but learning by imitation limits the agent's performance to that of its opponents. In view of their complementary strengths, this paper proposes a computational model …


Adaptive Duty Cycling In Sensor Networks With Energy Harvesting Using Continuous-Time Markov Chain And Fluid Models, Ronald Wai Hong Chan, Pengfei Zhang, Ido Nevat, Sai Ganesh Nagarajan, Alvin Cerdena Valera, Hwee Xian Tan Dec 2015

Research Collection School Of Computing and Information Systems

The dynamic and unpredictable nature of energy harvesting sources available for wireless sensor networks, and the time variation in network statistics like packet transmission rates and link qualities, necessitate the use of adaptive duty cycling techniques. Such adaptive control allows sensor nodes to achieve long-run energy neutrality, where energy supply and demand are balanced in a dynamic environment such that the nodes function continuously. In this paper, we develop a new framework enabling an adaptive duty cycling scheme for sensor networks that takes into account the node battery level, ambient energy that can be harvested, and application-level QoS requirements. We …
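
As a rough sketch of the control objective, the snippet below adapts a node's duty cycle with a simple proportional rule that pushes the battery level toward a target, trading activity for energy neutrality. The gain, target level, and toy harvest and consumption figures are assumptions; the paper's continuous-time Markov chain and fluid models are not reproduced.

```python
# Sketch: battery-aware duty-cycle adaptation toward energy neutrality.
def adapt_duty_cycle(duty, battery, target=0.6, gain=0.5, lo=0.05, hi=1.0):
    # battery and duty are fractions in [0, 1]; raise duty when energy is ample.
    duty = duty + gain * (battery - target)
    return max(lo, min(hi, duty))

duty, battery = 0.3, 0.5
for harvest in (0.10, 0.02, 0.08, 0.15):    # hypothetical per-slot harvest
    consumed = 0.12 * duty                   # consumption grows with duty cycle
    battery = max(0.0, min(1.0, battery + harvest - consumed))
    duty = adapt_duty_cycle(duty, battery)
    print(f"battery={battery:.2f} duty={duty:.2f}")
```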


A Comparative Study Between Motivated Learning And Reinforcement Learning, James T. Graham, Janusz A. Starzyk, Zhen Ni, Haibo He, T.-H. Teng, Ah-Hwee Tan Jul 2015

Research Collection School Of Computing and Information Systems

This paper analyzes advanced reinforcement learning techniques and compares some of them to motivated learning. Motivated learning is briefly discussed, indicating its relation to reinforcement learning. A black-box scenario for comparative analysis of learning efficiency in autonomous agents is developed and described. This is used to analyze the selected algorithms. The reported results demonstrate that, in the selected category of problems, motivated learning outperformed all the reinforcement learning algorithms we compared it with.


Integrating Motivated Learning And K-Winner-Take-All To Coordinate Multi-Agent Reinforcement Learning, Teck-Hou Teng, Ah-Hwee Tan, Janusz Starzyk, Yuan-Sin Tan, Loo-Nin Teow Aug 2014

Research Collection School Of Computing and Information Systems

This work addresses the coordination issue in the distributed optimization problem (DOP), where multiple distinct and time-critical tasks are performed to satisfy a global objective function. The performance of these tasks has to be coordinated due to the sharing of consumable resources and the dependency on non-consumable resources. Knowing that it can be sub-optimal to predefine the performance of the tasks for large DOPs, the multi-agent reinforcement learning (MARL) framework is adopted, wherein an agent is used to learn the performance of each distinct task using reinforcement learning. To coordinate MARL, we propose a novel coordination strategy integrating Motivated Learning (ML) …
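
The k-winner-take-all primitive behind such coordination can be sketched very simply: among all agents bidding for a shared consumable resource, only the k strongest bids are granted access in a given cycle. The bid values below are arbitrary; in the paper they would derive from Motivated Learning signals.

```python
# Sketch of k-winner-take-all resource arbitration among agents.
def k_winner_take_all(bids, k):
    ranked = sorted(bids, key=bids.get, reverse=True)   # agents by bid strength
    winners = set(ranked[:k])
    return {agent: (agent in winners) for agent in bids}

bids = {"task_a": 0.9, "task_b": 0.4, "task_c": 0.7, "task_d": 0.2}
print(k_winner_take_all(bids, k=2))  # task_a and task_c granted the resource
```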


Adaptive Computer-Generated Forces For Simulator-Based Training, Teck-Hou Teng, Ah-Hwee Tan, Loo-Nin Teow Dec 2013

Research Collection School Of Computing and Information Systems

Simulator-based training is in constant pursuit of an increasing level of realism. The transition from doctrine-driven computer-generated forces (CGF) to adaptive CGF represents one such effort. The use of doctrine-driven CGF is fraught with challenges such as modeling complex expert knowledge and adapting to the trainees' progress in real time. This paper therefore reports on how the use of adaptive CGF can overcome these challenges. Using a self-organizing neural network to implement the adaptive CGF, air combat maneuvering strategies are learned incrementally and generalized in real time. The state space and action space are extracted from the same hierarchical doctrine …


Motivated Learning For The Development Of Autonomous Agents, Janusz A. Starzyk, James T. Graham, Pawel Raif, Ah-Hwee Tan Apr 2012

Research Collection School Of Computing and Information Systems

A new machine learning approach known as motivated learning (ML) is presented in this work. Motivated learning drives a machine to develop abstract motivations and choose its own goals. ML also provides a self-organizing system that controls a machine's behavior based on competition between dynamically changing pain signals. This provides an interplay of externally driven and internally generated control signals. It is demonstrated that ML not only yields a more sophisticated learning mechanism and system of values than reinforcement learning (RL), but is also more efficient in learning complex relations and delivers better performance than RL in dynamically changing environments. In …


Self‐Regulating Action Exploration In Reinforcement Learning, Teck-Hou Teng, Ah-Hwee Tan, Yuan-Sin Tan Jan 2012

Research Collection School Of Computing and Information Systems

The basic tenet of a learning process is for an agent to learn only as much and for as long as is necessary. With reinforcement learning, the learning process is divided between exploration and exploitation. Given the complexity of the problem domain and the randomness of the learning process, the exact duration of the reinforcement learning process can never be known with certainty. Using an inaccurate number of training iterations leads either to non-convergence or to over-training of the learning agent. This work addresses such issues by proposing a technique to self-regulate the exploration rate and training duration …
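
A minimal version of self-regulated exploration, assumed here purely for illustration, ties the exploration rate to the agent's recent success rate rather than to a fixed decay schedule: the better the agent has recently performed, the less it explores.

```python
# Sketch: exploration rate regulated by a sliding window of outcomes.
from collections import deque

class SelfRegulatingEpsilon:
    def __init__(self, window=50, floor=0.01):
        self.outcomes = deque(maxlen=window)   # 1 = success, 0 = failure
        self.floor = floor

    def record(self, success):
        self.outcomes.append(1 if success else 0)

    @property
    def epsilon(self):
        if not self.outcomes:
            return 1.0                          # explore fully before any evidence
        success_rate = sum(self.outcomes) / len(self.outcomes)
        return max(self.floor, 1.0 - success_rate)

eps = SelfRegulatingEpsilon()
for outcome in [0, 0, 1, 1, 1, 1]:
    eps.record(outcome)
print(round(eps.epsilon, 2))  # 0.33: exploration decays as performance improves
```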


Cooperative Reinforcement Learning In Topology-Based Multi-Agent Systems, Dan Xiao, Ah-Hwee Tan Oct 2011

Research Collection School Of Computing and Information Systems

Topology-based multi-agent systems (TMAS), wherein agents interact with one another according to their spatial relationship in a network, are well suited for problems with topological constraints. In a TMAS, however, each agent may have a different state space, which can be rather large. Consequently, traditional approaches to multi-agent cooperative learning may not be able to scale up with the complexity of the network topology. In this paper, we propose a cooperative learning strategy under which autonomous agents are assembled in a binary tree formation (BTF). By constraining the interaction between agents, we effectively unify the state space of individual …


A Hybrid Agent Architecture Integrating Desire, Intention And Reinforcement Learning, Ah-Hwee Tan, Yew-Soon Ong, Akejariyawong Tapanuj Jul 2011

Research Collection School Of Computing and Information Systems

This paper presents a hybrid agent architecture that integrates the behaviours of BDI agents, specifically desire and intention, with a neural network based reinforcement learner known as Temporal Difference Fusion Architecture for Learning and COgNition (TD-FALCON). With the explicit maintenance of goals, the agent performs reinforcement learning with an awareness of its objectives instead of relying on external reinforcement signals. More importantly, the intention module equips the hybrid architecture with deliberative planning capabilities, enabling the agent to purposefully maintain an agenda of actions to perform and reducing the need to constantly sense the environment. Through reinforcement learning, plans can also be …


A Self-Organizing Neural Architecture Integrating Desire, Intention And Reinforcement Learning, Ah-Hwee Tan, Yu-Hong Feng, Yew-Soon Ong Mar 2010

Research Collection School Of Computing and Information Systems

This paper presents a self-organizing neural architecture that integrates the features of belief, desire, and intention (BDI) systems with reinforcement learning. Based on fusion Adaptive Resonance Theory (fusion ART), the proposed architecture provides a unified treatment for both intentional and reactive cognitive functionalities. Operating with a sense-act-learn paradigm, the low level reactive module is a fusion ART network that learns action and value policies across the sensory, motor, and feedback channels. During performance, the actions executed by the reactive module are tracked by a high level intention module (also a fusion ART network) that learns to associate sequences of actions …


Self-Organizing Neural Models Integrating Rules And Reinforcement Learning, Teck-Hou Teng, Zhong-Ming Tan, Ah-Hwee Tan Jun 2008

Research Collection School Of Computing and Information Systems

Traditional approaches to integrating knowledge into neural networks are concerned mainly with supervised learning. This paper presents how a family of self-organizing neural models known as the fusion architecture for learning, cognition and navigation (FALCON) can incorporate a priori knowledge and perform knowledge refinement and expansion through reinforcement learning. Symbolic rules are formulated based on pre-existing know-how and inserted into FALCON as a priori knowledge. The availability of knowledge enables FALCON to start performing earlier in the initial learning trials. Through a temporal-difference (TD) learning method, the inserted rules can be refined and expanded according to the evaluative feedback signals received …


Integrating Temporal Difference Methods And Self‐Organizing Neural Networks For Reinforcement Learning With Delayed Evaluative Feedback, Ah-Hwee Tan, Ning Lu, Dan Xiao Feb 2008

Research Collection School Of Computing and Information Systems

This paper presents a neural architecture for learning category nodes encoding mappings across multimodal patterns involving sensory inputs, actions, and rewards. By integrating adaptive resonance theory (ART) and temporal difference (TD) methods, the proposed neural model, called TD fusion architecture for learning, cognition, and navigation (TD-FALCON), enables an autonomous agent to adapt and function in a dynamic environment with immediate as well as delayed evaluative feedback (reinforcement) signals. TD-FALCON learns the value functions of the state-action space estimated through on-policy and off-policy TD learning methods, specifically state-action-reward-state-action (SARSA) and Q-learning. The learned value functions are then used to determine the …
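
The two TD targets named above differ only in how they bootstrap, which the sketch below shows on a generic tabular Q-function: SARSA uses the action the behavior policy actually takes next, while Q-learning uses the greedy action. This generic snippet does not reproduce TD-FALCON's ART-based category learning.

```python
# Sketch: SARSA (on-policy) vs. Q-learning (off-policy) TD targets.
def sarsa_target(reward, q_next, a_next, gamma=0.9):
    return reward + gamma * q_next[a_next]          # bootstrap from action taken

def q_learning_target(reward, q_next, gamma=0.9):
    return reward + gamma * max(q_next)             # bootstrap from greedy action

def td_update(q_sa, target, alpha=0.1):
    return q_sa + alpha * (target - q_sa)           # common TD error step

q_next = [0.2, 0.8, 0.5]   # hypothetical Q-values of the successor state
print(round(td_update(0.4, sarsa_target(1.0, q_next, a_next=0)), 3))  # 0.478
print(round(td_update(0.4, q_learning_target(1.0, q_next)), 3))       # 0.532
```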