Engineering Commons

Open Access. Powered by Scholars. Published by Universities.®

Articles 1 - 24 of 24

Full-Text Articles in Engineering

Neural Airport Ground Handling, Yaoxin Wu, Jianan Zhou, Yunwen Xia, Xianli Zhang, Zhiguang Cao, Jie Zhang Dec 2023

Research Collection School Of Computing and Information Systems

Airport ground handling (AGH) provides the operations that flights require during their turnarounds and is of great importance to the efficiency of airport management and the economics of aviation. The problem involves interplay among these operations, which leads to an NP-hard problem with complex constraints. Hence, existing methods for AGH are usually designed with massive domain knowledge but still fail to yield high-quality solutions efficiently. In this paper, we aim to enhance both the solution quality and the computational efficiency of solving AGH. In particular, we first model AGH as a multiple-fleet vehicle routing problem (VRP) with miscellaneous constraints including precedence, time windows, …
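
A minimal sketch of how such an instance might be encoded under the multiple-fleet VRP view with precedence and time-window constraints; all field and function names are illustrative, not from the paper:

    # Hypothetical encoding of an AGH instance as a multiple-fleet VRP
    # with precedence and time-window constraints (names are illustrative).
    from dataclasses import dataclass, field

    @dataclass
    class Operation:
        flight: int                 # flight (node) this operation serves
        fleet: str                  # e.g. "fueling", "catering", "cleaning"
        earliest: float             # time-window start
        latest: float               # time-window end
        duration: float
        predecessors: list = field(default_factory=list)  # must finish first

    def feasible(op, start_time, finished):
        """A tentative start time must respect the window and all precedences."""
        return (op.earliest <= start_time <= op.latest
                and all(p in finished for p in op.predecessors))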


Imitation Improvement Learning For Large-Scale Capacitated Vehicle Routing Problems, The Viet Bui, Tien Mai Jul 2023

Imitation Improvement Learning For Large-Scale Capacitated Vehicle Routing Problems, The Viet Bui, Tien Mai

Research Collection School Of Computing and Information Systems

Recent works using deep reinforcement learning (RL) to solve routing problems such as the capacitated vehicle routing problem (CVRP) have focused on improvement learning-based methods, which iteratively improve a given solution until it becomes near-optimal. Although adequate solutions can be achieved for small problem instances, their efficiency degrades on large-scale ones. In this work, we propose a new improvement learning-based framework built on imitation learning, where classical heuristics serve as experts to encourage the policy model to mimic and produce similar or better solutions. Moreover, to improve scalability, we propose Clockwise Clustering, a novel augmented framework for decomposing large-scale CVRP into …
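
A plausible reading of the decomposition step, sketched under the assumption that "Clockwise Clustering" sweeps customers by polar angle around the depot and cuts the sweep into capacity-feasible clusters; the paper's exact construction may differ:

    import math

    def clockwise_clusters(depot, customers, demands, capacity):
        """customers: list of (x, y) coordinates; demands aligned by index."""
        def angle(p):  # negate atan2 so the sort runs clockwise
            return -math.atan2(p[1] - depot[1], p[0] - depot[0])
        order = sorted(range(len(customers)), key=lambda i: angle(customers[i]))
        clusters, current, load = [], [], 0.0
        for i in order:
            if current and load + demands[i] > capacity:
                clusters.append(current)      # close the capacity-feasible cluster
                current, load = [], 0.0
            current.append(i)
            load += demands[i]
        if current:
            clusters.append(current)
        return clusters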


Sim-To-Real Reinforcement Learning Framework For Autonomous Aerial Leaf Sampling, Ashraful Islam May 2023

Sim-To-Real Reinforcement Learning Framework For Autonomous Aerial Leaf Sampling, Ashraful Islam

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Using unmanned aerial systems (UAS) for leaf sampling contributes to a better understanding of the influence of climate change on plant species and the dynamics of forest ecology by making hard-to-reach tree canopies accessible for study. Currently, multiple skilled operators are required to maneuver the UAS and use the leaf sampling tool, which often limits sampling to the canopy top or periphery. Sim-to-real reinforcement learning (RL) can be leveraged to tackle the challenges of autonomous aerial leaf sampling in the changing environment of a tree canopy. However, transferring an RL controller that is learned in simulation to real UAS …


Continual Optimal Adaptive Tracking Of Uncertain Nonlinear Continuous-Time Systems Using Multilayer Neural Networks, Irfan Ganie, S. (Sarangapani) Jagannathan Jan 2023

Continual Optimal Adaptive Tracking Of Uncertain Nonlinear Continuous-Time Systems Using Multilayer Neural Networks, Irfan Ganie, S. (Sarangapani) Jagannathan

Electrical and Computer Engineering Faculty Research & Creative Works

This study provides a lifelong integral reinforcement learning (LIRL)-based optimal tracking scheme for uncertain nonlinear continuous-time (CT) systems using multilayer neural networks (MNNs). In this LIRL framework, the optimal control policies are generated using both the critic neural network (NN) weights and a single-layer NN identifier. The critic MNN weight tuning is accomplished using an improved singular value decomposition (SVD) of its activation function gradient. The NN identifier, in turn, provides the control coefficient matrix for computing the control policies. An online weight velocity attenuation (WVA)-based consolidation scheme is proposed wherein the significance of weights is derived by …


Step-Wise Deep Learning Models For Solving Routing Problems, Liang Xin, Wen Song, Zhiguang Cao, Jie Zhang Jul 2021

Step-Wise Deep Learning Models For Solving Routing Problems, Liang Xin, Wen Song, Zhiguang Cao, Jie Zhang

Research Collection School Of Computing and Information Systems

Routing problems are very important in intelligent transportation systems. Recently, a number of deep learning-based methods have been proposed to automatically learn construction heuristics for solving routing problems. However, these methods do not completely follow Bellman's Principle of Optimality, since the visited nodes during construction are still included in the following subtasks, resulting in suboptimal policies. In this article, we propose a novel step-wise scheme that explicitly removes the visited nodes at each node-selection step. We apply this scheme to two representative deep models for routing problems, the pointer network and the Transformer attention model (TAM), and significantly improve the performance of …
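
The core of the step-wise scheme can be sketched as follows: rather than merely masking visited nodes in the output layer, each selection step re-encodes only the unvisited nodes, so completed nodes no longer influence the remaining subtask. The encoder and decoder below are hypothetical callables standing in for the learned models:

    def step_wise_decode(encoder, decoder, coords):
        """coords: (n_nodes, d) array/tensor; returns a tour as node indices."""
        remaining = list(range(len(coords)))
        tour = []
        while remaining:
            emb = encoder(coords[remaining])      # embed unvisited nodes only
            idx = int(decoder(emb).argmax())      # greedy pick within the subproblem
            tour.append(remaining.pop(idx))
        return tour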


Approximate Difference Rewards For Scalable Multiagent Reinforcement Learning, Arambam James Singh, Akshat Kumar, Hoong Chuin Lau May 2021

Approximate Difference Rewards For Scalable Multigent Reinforcement Learning, Arambam James Singh, Akshat Kumar, Hoong Chuin Lau

Research Collection School Of Computing and Information Systems

We address the problem of multiagent credit assignment in a large-scale multiagent system. Difference rewards (DRs) are an effective tool to tackle this problem, but their exact computation is known to be challenging even for a small number of agents. We propose a scalable method to compute difference rewards based on aggregate information in a multiagent system with a large number of agents, exploiting the symmetry present in several practical applications. Empirical evaluation on two multiagent domains, air-traffic control and cooperative navigation, shows better solution quality than previous approaches.
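
For reference, the exact difference reward takes the counterfactual form below; the paper's contribution is approximating it from aggregate information, which this sketch does not reproduce:

    # D_i = G(z) - G(z with agent i's action replaced by a default action)
    def difference_reward(global_reward, joint_action, i, default_action):
        counterfactual = list(joint_action)
        counterfactual[i] = default_action   # remove agent i's contribution
        return global_reward(joint_action) - global_reward(counterfactual)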


Deep Reinforcement Learning Approach To Solve Dynamic Vehicle Routing Problem With Stochastic Customers, Waldy Joe, Hoong Chuin Lau Oct 2020

Deep Reinforcement Learning Approach To Solve Dynamic Vehicle Routing Problem With Stochastic Customers, Waldy Joe, Hoong Chuin Lau

Research Collection School Of Computing and Information Systems

In real-world urban logistics operations, changes to routes and tasks occur in response to dynamic events. To ensure customers' demands are met, planners need to make these changes quickly (sometimes instantaneously). This paper formulates a dynamic vehicle routing problem with time windows and both known and stochastic customers as a route-based Markov Decision Process. We propose a solution approach, called DRLSA, that combines Deep Reinforcement Learning (specifically, neural-network-based Temporal Difference learning with experience replay) to approximate the value function with a routing heuristic based on Simulated Annealing. Our approach enables optimized re-routing decisions to be generated …
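
The Simulated Annealing half of DRLSA presumably relies on the classical Metropolis acceptance rule, sketched here; the paper's neighbourhood moves and cooling schedule are not shown:

    import math, random

    def sa_accept(cost_current, cost_candidate, temperature):
        if cost_candidate <= cost_current:
            return True                      # always accept improvements
        # accept a worse route with probability exp(-delta / T)
        delta = cost_candidate - cost_current
        return random.random() < math.exp(-delta / temperature)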


Using Taint Analysis And Reinforcement Learning (Tarl) To Repair Autonomous Robot Software, Damian Lyons, Saba Zahra May 2020

Using Taint Analysis And Reinforcement Learning (Tarl) To Repair Autonomous Robot Software, Damian Lyons, Saba Zahra

Faculty Publications

It is important to be able to establish formal performance bounds for autonomous systems. However, formal verification techniques require a model of the environment in which the system operates, a challenge for autonomous systems, especially those expected to operate over longer timescales. This paper describes work in progress to automate the monitoring and repair of ROS-based autonomous robot software written for an a priori partially known and possibly incorrect environment model. A taint analysis method is used to automatically extract the data-flow sequence from input topic to publish topic and to instrument that code. A unique reinforcement learning approximation of MDP utility …


Hierarchical Multiagent Reinforcement Learning For Maritime Traffic Management, Arambam James Singh, Akshat Kumar, Hoong Chuin Lau May 2020

Hierarchical Multiagent Reinforcement Learning For Maritime Traffic Management, Arambam James Singh, Akshat Kumar, Hoong Chuin Lau

Research Collection School Of Computing and Information Systems

Increasing global maritime traffic, coupled with rapid digitization and automation in shipping, mandates developing next-generation maritime traffic management systems to mitigate congestion, increase safety of navigation, and avoid collisions in busy and geographically constrained ports (such as Singapore's). To achieve these objectives, we model the maritime traffic as a large multiagent system with individual vessels as agents and the VTS (Vessel Traffic Service) authority as a regulatory agent. We develop a hierarchical reinforcement learning approach where vessels first select a high-level action based on the underlying traffic flow, and then select the low-level action that determines their future …
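
The two-level selection described above can be sketched generically; both policies here are hypothetical stand-ins for the learned networks:

    def hierarchical_act(high_policy, low_policy, vessel_state, traffic_flow):
        option = high_policy(traffic_flow)         # high-level action from aggregate flow
        action = low_policy(vessel_state, option)  # low-level action conditioned on it
        return option, action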


Wind Power Forecasting Methods Based On Deep Learning: A Survey, Xing Deng, Haijian Shao, Chunlong Hu, Dengbiao Jiang, Yingtao Jiang Jan 2020

Wind Power Forecasting Methods Based On Deep Learning: A Survey, Xing Deng, Haijian Shao, Chunlong Hu, Dengbiao Jiang, Yingtao Jiang

Electrical & Computer Engineering Faculty Research

Accurate wind power forecasting for wind farms can effectively reduce the enormous impact on grid operation safety when high-penetration intermittent power supplies are connected to the power grid. Aiming to provide reference strategies for relevant researchers as well as practical applications, this paper provides a literature survey and methods analysis of deep learning, reinforcement learning, and transfer learning in wind speed and wind power forecasting. Usually, wind speed and wind power forecasting around a wind farm requires the calculation of the next moment of the definite state, which is usually achieved based on the state of …


Domain Adaptation In Unmanned Aerial Vehicles Landing Using Reinforcement Learning, Pedro Lucas Franca Albuquerque Dec 2019

Domain Adaptation In Unmanned Aerial Vehicles Landing Using Reinforcement Learning, Pedro Lucas Franca Albuquerque

Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Landing an unmanned aerial vehicle (UAV) on a moving platform is a challenging task that often requires exact models of the UAV dynamics, platform characteristics, and environmental conditions. In this thesis, we present and investigate three different machine learning approaches with varying levels of domain knowledge: dynamics randomization, a universal policy with system identification, and reinforcement learning with no parameter variation. We first train the policies in simulation, then perform experiments both in simulation, varying the system dynamics with wind and friction coefficients, and on a real robot system with wind variation. We initially expected that providing …
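
A minimal sketch of the dynamics-randomization approach named above: each training episode samples fresh wind and friction parameters, so the learned policy must cover the whole range rather than one nominal model. The ranges and function names are illustrative:

    import random

    def randomized_training(make_env, train_one_episode, n_episodes):
        for _ in range(n_episodes):
            wind = random.uniform(0.0, 8.0)       # m/s, assumed range
            friction = random.uniform(0.3, 1.0)   # assumed platform friction range
            env = make_env(wind=wind, friction=friction)
            train_one_episode(env)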


Adopt: Combining Parameter Tuning And Adaptive Operator Ordering For Solving A Class Of Orienteering Problems, Aldy Gunawan, Hoong Chuin Lau, Kun Lu Jul 2018

Adopt: Combining Parameter Tuning And Adaptive Operator Ordering For Solving A Class Of Orienteering Problems, Aldy Gunawan, Hoong Chuin Lau, Kun Lu

Research Collection School Of Computing and Information Systems

Two fundamental challenges in local search-based metaheuristics are how to determine parameter configurations and how to design the underlying Local Search (LS) procedure. In this paper, we propose a framework to handle both challenges, called ADaptive OPeraTor Ordering (ADOPT). The ADOPT framework is applied to two metaheuristics, namely Iterated Local Search (ILS) and a hybridization of Simulated Annealing and ILS (SAILS), for solving two variants of the Orienteering Problem: the Team Dependent Orienteering Problem (TDOP) and the Team Orienteering Problem with Time Windows (TOPTW). The framework consists of two main processes. The Design of Experiment (DOE) …
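
One plausible reading of adaptive operator ordering, sketched as a score-driven reordering of local-search operators; this generic adaptive-operator-selection loop is an assumption, not the paper's exact rule:

    def apply_operators(solution, operators, scores, evaluate, decay=0.9, bonus=1.0):
        """Try operators from best recent score to worst (minimisation)."""
        for op in sorted(operators, key=lambda o: scores[o.__name__], reverse=True):
            candidate = op(solution)
            scores[op.__name__] *= decay             # fade old evidence
            if evaluate(candidate) < evaluate(solution):
                scores[op.__name__] += bonus         # reward the successful operator
                return candidate
        return solution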


An Efficient Approach To Model-Based Hierarchical Reinforcement Learning, Zhuoru Li, Akshay Narayan, Tze-Yun Leong Feb 2017

An Efficient Approach To Model-Based Hierarchical Reinforcement Learning, Zhuoru Li, Akshay Narayan, Tze-Yun Leong

Research Collection School Of Computing and Information Systems

We propose a model-based approach to hierarchical reinforcement learning that exploits shared knowledge and selective execution at different levels of abstraction to efficiently solve large, complex problems. Our framework adopts a new transition dynamics learning algorithm that identifies the common action-feature combinations of the subtasks and evaluates the subtask execution choices through simulation. The framework is sample efficient and tolerates uncertain and incomplete problem characterization of the subtasks. We test the framework on common benchmark problems and complex simulated robotic environments. It compares favorably against state-of-the-art algorithms and scales well to very large problems.


Reinforcement Learning Framework For Modeling Spatial Sequential Decisions Under Uncertainty: (Extended Abstract), Truc Viet Le, Siyuan Liu, Hoong Chuin Lau May 2016

Reinforcement Learning Framework For Modeling Spatial Sequential Decisions Under Uncertainty: (Extended Abstract), Truc Viet Le, Siyuan Liu, Hoong Chuin Lau

Research Collection School Of Computing and Information Systems

We consider the problem of trajectory prediction, where a trajectory is an ordered sequence of location visits and corresponding timestamps. The problem arises when an agent makes sequential decisions to visit a set of spatial locations of interest. Each location bears a stochastic utility and the agent has a limited budget to spend. Given the agent's observed partial trajectory, our goal is to predict the remaining trajectory. We propose a solution framework to the problem considering both the uncertainty of utility and the budget constraint. We use reinforcement learning (RL) to model the underlying decision processes and inverse RL to …


Adaptive Duty Cycling In Sensor Networks With Energy Harvesting Using Continuous-Time Markov Chain And Fluid Models, Ronald Wai Hong Chan, Pengfei Zhang, Ido Nevat, Sai Ganesh Nagarajan, Alvin Cerdena Valera, Hwee Xian Tan Dec 2015

Adaptive Duty Cycling In Sensor Networks With Energy Harvesting Using Continuous-Time Markov Chain And Fluid Models, Ronald Wai Hong Chan, Pengfei Zhang, Ido Nevat, Sai Ganesh Nagarajan, Alvin Cerdena Valera, Hwee Xian Tan

Research Collection School Of Computing and Information Systems

The dynamic and unpredictable nature of energy harvesting sources available for wireless sensor networks, and the time variation in network statistics like packet transmission rates and link qualities, necessitate the use of adaptive duty cycling techniques. Such adaptive control allows sensor nodes to achieve long-run energy neutrality, where energy supply and demand are balanced in a dynamic environment such that the nodes function continuously. In this paper, we develop a new framework enabling an adaptive duty cycling scheme for sensor networks that takes into account the node battery level, ambient energy that can be harvested, and application-level QoS requirements. We …
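
The energy-neutrality target described above admits a simple closed form: choose the duty cycle d so that d*P_active + (1-d)*P_sleep matches the expected harvest rate. The battery back-off below is an illustrative heuristic, not the paper's CTMC/fluid-model policy:

    def next_duty_cycle(harvest, p_active, p_sleep, battery_frac,
                        d_min=0.05, d_max=1.0):
        # energy neutrality: d*p_active + (1-d)*p_sleep = harvest
        d = (harvest - p_sleep) / (p_active - p_sleep)
        d *= 0.5 + battery_frac        # illustrative back-off when battery is low
        return min(d_max, max(d_min, d))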


Creating Autonomous Adaptive Agents In A Real-Time First-Person Shooter Computer Game, Di Wang, Ah-Hwee Tan Jul 2014

Creating Autonomous Adaptive Agents In A Real-Time First-Person Shooter Computer Game, Di Wang, Ah-Hwee Tan

Research Collection School Of Computing and Information Systems

Games are good test-beds to evaluate AI methodologies. In recent years, there has been a vast amount of research dealing with real-time computer games other than the traditional board games or card games. This paper illustrates how we create agents by employing FALCON, a self-organizing neural network that performs reinforcement learning, to play a well-known first-person shooter computer game called Unreal Tournament. Rewards used for learning are either obtained from the game environment or estimated using the temporal difference learning scheme. In this way, the agents are able to acquire proper strategies and discover the effectiveness of different weapons without …


Adaptive Computer‐Generated Forces For Simulator‐Based Training, Teck-Hou Teng, Ah-Hwee Tan, Loo-Nin Teow Dec 2013

Adaptive Computer‐Generated Forces For Simulator‐Based Training, Expert Systems With Applications, Teck-Hou Teng, Ah-Hwee Tan, Loo-Nin Teow

Research Collection School Of Computing and Information Systems

Simulator-based training is in constant pursuit of increasing levels of realism. The transition from doctrine-driven computer-generated forces (CGF) to adaptive CGF represents one such effort. The use of doctrine-driven CGF is fraught with challenges such as modeling complex expert knowledge and adapting to the trainees' progress in real time. This paper reports on how the use of adaptive CGF can overcome these challenges. Using a self-organizing neural network to implement the adaptive CGF, air combat maneuvering strategies are learned incrementally and generalized in real time. The state space and action space are extracted from the same hierarchical doctrine …


Self‐Regulating Action Exploration In Reinforcement Learning, Teck-Hou Teng, Ah-Hwee Tan, Yuan-Sin Tan Jan 2012

Self‐Regulating Action Exploration In Reinforcement Learning, Teck-Hou Teng, Ah-Hwee Tan, Yuan-Sin Tan

Research Collection School Of Computing and Information Systems

The basic tenet of a learning process is for an agent to learn only as much, and for only as long, as is necessary. In reinforcement learning, the learning process is divided between exploration and exploitation. Given the complexity of the problem domain and the randomness of the learning process, the exact duration of the reinforcement learning process can never be known with certainty. Using an inaccurate number of training iterations leads either to non-convergence or to over-training of the learning agent. This work addresses these issues by proposing a technique to self-regulate the exploration rate and training duration …
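
A minimal sketch of the self-regulation idea, assuming exploration is epsilon-greedy and competence is tracked as a recent success rate; the thresholds are illustrative, and the paper's exact rule is not reproduced:

    def regulate_epsilon(epsilon, recent_success_rate, floor=0.01, step=0.05):
        if recent_success_rate > 0.9:    # competent: exploit more
            return max(floor, epsilon - step)
        if recent_success_rate < 0.5:    # struggling: keep exploring
            return min(1.0, epsilon + step)
        return epsilon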


A Hybrid Agent Architecture Integrating Desire, Intention And Reinforcement Learning, Ah-Hwee Tan, Yew-Soon Ong, Akejariyawong Tapanuj Jul 2011

A Hybrid Agent Architecture Integrating Desire, Intention And Reinforcement Learning, Ah-Hwee Tan, Yew-Soon Ong, Akejariyawong Tapanuj

Research Collection School Of Computing and Information Systems

This paper presents a hybrid agent architecture that integrates the behaviours of BDI agents, specifically desire and intention, with a neural network-based reinforcement learner known as Temporal Difference-Fusion Architecture for Learning and COgNition (TD-FALCON). With the explicit maintenance of goals, the agent performs reinforcement learning with awareness of its objectives instead of relying on external reinforcement signals. More importantly, the intention module equips the hybrid architecture with deliberative planning capabilities, enabling the agent to purposefully maintain an agenda of actions to perform and reducing the need to constantly sense the environment. Through reinforcement learning, plans can also be …


A Self-Organizing Neural Architecture Integrating Desire, Intention And Reinforcement Learning, Ah-Hwee Tan, Yu-Hong Feng, Yew-Soon Ong Mar 2010

A Self-Organizing Neural Architecture Integrating Desire, Intention And Reinforcement Learning, Ah-Hwee Tan, Yu-Hong Feng, Yew-Soon Ong

Research Collection School Of Computing and Information Systems

This paper presents a self-organizing neural architecture that integrates the features of belief, desire, and intention (BDI) systems with reinforcement learning. Based on fusion Adaptive Resonance Theory (fusion ART), the proposed architecture provides a unified treatment for both intentional and reactive cognitive functionalities. Operating with a sense-act-learn paradigm, the low-level reactive module is a fusion ART network that learns action and value policies across the sensory, motor, and feedback channels. During performance, the actions executed by the reactive module are tracked by a high-level intention module (also a fusion ART network) that learns to associate sequences of actions …


Integrating Temporal Difference Methods And Self‐Organizing Neural Networks For Reinforcement Learning With Delayed Evaluative Feedback, Ah-Hwee Tan, Ning Lu, Dan Xiao Feb 2008

Integrating Temporal Difference Methods And Self‐Organizing Neural Networks For Reinforcement Learning With Delayed Evaluative Feedback, Ah-Hwee Tan, Ning Lu, Dan Xiao

Research Collection School Of Computing and Information Systems

This paper presents a neural architecture for learning category nodes encoding mappings across multimodal patterns involving sensory inputs, actions, and rewards. By integrating adaptive resonance theory (ART) and temporal difference (TD) methods, the proposed neural model, called TD fusion architecture for learning, cognition, and navigation (TD-FALCON), enables an autonomous agent to adapt and function in a dynamic environment with immediate as well as delayed evaluative feedback (reinforcement) signals. TD-FALCON learns the value functions of the state-action space estimated through on-policy and off-policy TD learning methods, specifically state-action-reward-state-action (SARSA) and Q-learning. The learned value functions are then used to determine the …
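
For reference, the two tabular TD updates named above (Q is a dict of dicts here; TD-FALCON itself stores these values in ART category nodes rather than a table):

    def sarsa_update(Q, s, a, r, s2, a2, alpha=0.1, gamma=0.9):
        # on-policy: bootstrap from the action actually taken next
        Q[s][a] += alpha * (r + gamma * Q[s2][a2] - Q[s][a])

    def q_learning_update(Q, s, a, r, s2, alpha=0.1, gamma=0.9):
        # off-policy: bootstrap from the greedy next action
        Q[s][a] += alpha * (r + gamma * max(Q[s2].values()) - Q[s][a])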


Reinforcement Learning Neural-Network-Based Controller For Nonlinear Discrete-Time Systems With Input Constraints, Pingan He, Jagannathan Sarangapani Jan 2007

Reinforcement Learning Neural-Network-Based Controller For Nonlinear Discrete-Time Systems With Input Constraints, Pingan He, Jagannathan Sarangapani

Electrical and Computer Engineering Faculty Research & Creative Works

A novel adaptive-critic-based neural network (NN) controller in discrete time is designed to deliver a desired tracking performance for a class of nonlinear systems in the presence of actuator constraints. The constraints of the actuator are treated in the controller design as the saturation nonlinearity. The adaptive critic NN controller architecture based on state feedback includes two NNs: the critic NN is used to approximate the "strategic" utility function, whereas the action NN is employed to minimize both the strategic utility function and the unknown nonlinear dynamic estimation errors. The critic and action NN weight updates are derived by minimizing …


Reinforcement Learning-Based Output Feedback Control Of Nonlinear Systems With Input Constraints, Pingan He, Jagannathan Sarangapani Feb 2005

Reinforcement Learning-Based Output Feedback Control Of Nonlinear Systems With Input Constraints, Pingan He, Jagannathan Sarangapani

Electrical and Computer Engineering Faculty Research & Creative Works

A novel neural network (NN)-based output feedback controller with magnitude constraints is designed to deliver a desired tracking performance for a class of multi-input-multi-output (MIMO) discrete-time strict-feedback nonlinear systems. Reinforcement learning in discrete time is proposed for the output feedback controller, which uses three NNs: 1) an NN observer to estimate the system states from input-output data; 2) a critic NN to approximate a certain strategic utility function; and 3) an action NN to minimize both the strategic utility function and the unknown dynamics estimation errors. The magnitude constraints are manifested as saturation nonlinearities in the output feedback …


Multiple Stochastic Learning Automata For Vehicle Path Control In An Automated Highway System, Cem Unsal, Pushkin Kachroo, John S. Bay Jan 1999

Multiple Stochastic Learning Automata For Vehicle Path Control In An Automated Highway System, Cem Unsal, Pushkin Kachroo, John S. Bay

Electrical & Computer Engineering Faculty Research

This paper suggests an intelligent controller for an automated vehicle that plans its own trajectory based on sensor and communication data. The intelligent controller is designed using stochastic learning automata theory. Using the data received from on-board sensors, two automata (one for lateral actions, one for longitudinal actions) learn the best possible actions to avoid collisions. The system has the advantage of being able to work in unmodeled stochastic environments, unlike adaptive control methods or expert systems. Simulations of simultaneous lateral and longitudinal control of a vehicle provide encouraging results.
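
The classical linear reward-inaction (L_R-I) update that such automata typically use, sketched for one automaton; whether the paper uses exactly this scheme is not stated in the excerpt:

    def lri_update(probs, chosen, favourable, rate=0.1):
        """On a favourable response, shift mass toward the chosen action;
        on an unfavourable one, leave the distribution unchanged."""
        if not favourable:
            return probs
        return [p + rate * (1.0 - p) if i == chosen else p * (1.0 - rate)
                for i, p in enumerate(probs)]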