Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 17 of 17
Full-Text Articles in Physical Sciences and Mathematics
Neural Airport Ground Handling, Yaoxin Wu, Jianan Zhou, Yunwen Xia, Xianli Zhang, Zhiguang Cao, Jie Zhang
Research Collection School Of Computing and Information Systems
Airport ground handling (AGH) provides the operations that flights require during their turnarounds and is of great importance to the efficiency of airport management and the economics of aviation. The problem involves interplay among these operations, which leads to NP-hard formulations with complex constraints. Hence, existing methods for AGH are usually designed with massive domain knowledge but still fail to yield high-quality solutions efficiently. In this paper, we aim to enhance both the solution quality and the computational efficiency of solving AGH. In particular, we first model AGH as a multiple-fleet vehicle routing problem (VRP) with miscellaneous constraints including precedence, time windows, …
The Basil Technique: Bias Adaptive Statistical Inference Learning Agents For Learning From Human Feedback, Jonathan Indigo Watson
Theses and Dissertations--Computer Science
We introduce a novel approach for learning behaviors using human-provided feedback that is subject to systematic bias. Our method, known as BASIL, models the feedback signal as a combination of a heuristic evaluation of an action's utility and a probabilistically-drawn bias value, characterized by unknown parameters. We present both the general framework for our technique and specific algorithms for biases drawn from a normal distribution. We evaluate our approach across various environments and tasks, comparing it to interactive and non-interactive machine learning methods, including deep learning techniques, using human trainers and a synthetic oracle with feedback distorted to varying degrees. …
Intelligent Adaptive Gossip-Based Broadcast Protocol For Uav-Mec Using Multi-Agent Deep Reinforcement Learning, Zen Ren, Xinghua Li, Yinbin Miao, Zhuowen Li, Zihao Wang, Mengyao Zhu, Ximeng Liu, Robert H. Deng
Research Collection School Of Computing and Information Systems
UAV-assisted mobile edge computing (UAV-MEC) has been proposed to offer computing resources for smart devices and user equipment. A UAV-cluster-aided MEC, rather than a single UAV-aided MEC, is the newest edge computing architecture for the edge pool. Unfortunately, the data packet exchange during edge computing within the UAV cluster has not received enough attention. UAVs need to collaborate for the wide implementation of MEC, relying on the gossip-based broadcast protocol. However, gossip suffers from long propagation delay, and the forwarding probability and the choice of neighbors are two factors that are difficult to balance. Existing works improve gossip along only one factor, …
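The delay-versus-redundancy trade-off the abstract refers to can be illustrated with a minimal gossip simulation. This is a generic sketch, not the protocol proposed in the paper; the ring topology and the `fanout` and `p_forward` parameters are illustrative assumptions:

```python
import random

def gossip_rounds(adj, source, p_forward, fanout, rng, max_rounds=1000):
    """Simulate gossip-based broadcast on a graph.

    Each round, every informed node picks up to `fanout` random neighbors
    and forwards the message to each with probability `p_forward`.
    Returns the number of rounds until every node is informed, or None
    if the broadcast stalls or exceeds `max_rounds`.
    """
    informed = {source}
    for rnd in range(1, max_rounds + 1):
        new = set()
        for node in informed:
            targets = rng.sample(adj[node], min(fanout, len(adj[node])))
            for t in targets:
                if t not in informed and rng.random() < p_forward:
                    new.add(t)
        if not new:
            return None  # message died out before full coverage
        informed |= new
        if len(informed) == len(adj):
            return rnd
    return None

# A ring of 20 UAVs, each linked to its two nearest neighbors on each side.
n = 20
adj = {i: [(i - 2) % n, (i - 1) % n, (i + 1) % n, (i + 2) % n] for i in range(n)}
rounds = gossip_rounds(adj, source=0, p_forward=1.0, fanout=4, rng=random.Random(0))
```

Lowering `p_forward` or `fanout` cuts redundant transmissions but raises both the propagation delay and the risk that the broadcast stalls, which is exactly the balance the paper's adaptive protocol targets.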
The Impact Of Dynamic Difficulty Adjustment On Player Experience In Video Games, Chineng Vang
Scholarly Horizons: University of Minnesota, Morris Undergraduate Journal
Dynamic Difficulty Adjustment (DDA) is a process by which a video game adjusts its level of challenge to match a player’s skill level. Its popularity in the video game industry continues to grow because it can keep players continuously engaged in a game, a state referred to as Flow. However, the influence of DDA on games has received mixed responses: it can enhance player experience as well as hinder it. This paper explores DDA through the Monte Carlo Tree Search algorithm and Reinforcement Learning, gathering feedback from players and seeking to understand what about DDA is …
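The feedback loop underlying DDA can be sketched without the MCTS/RL machinery the paper studies: nudge a difficulty parameter toward a target win rate over a window of recent outcomes. The function name, step size, and target are illustrative assumptions, not the paper's method:

```python
def adjust_difficulty(difficulty, recent_results, target_win_rate=0.5,
                      step=0.1, lo=0.0, hi=1.0):
    """One DDA update: nudge difficulty toward the target win rate.

    recent_results: list of 1 (player won) / 0 (player lost).
    Difficulty rises when the player wins too often and falls otherwise,
    clamped to [lo, hi].
    """
    if not recent_results:
        return difficulty  # no evidence yet; leave difficulty unchanged
    win_rate = sum(recent_results) / len(recent_results)
    difficulty += step * (win_rate - target_win_rate)
    return max(lo, min(hi, difficulty))

d = 0.5
d = adjust_difficulty(d, [1, 1, 1, 1])  # player dominating -> raise difficulty
```

A proportional rule like this keeps the challenge near the player's skill level; the MCTS and RL approaches in the paper replace the hand-tuned rule with learned or search-based opponents.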
Micro Grid Control Optimization With Load And Solar Prediction, Shaju Saha
All Graduate Theses and Dissertations, Spring 1920 to Summer 2023
Using renewable energy can save money and keep the environment cleaner. Installing a solar PV system is a one-time cost, but the system can generate energy for a lifetime, and solar PV produces no carbon emissions while generating power. This thesis evaluates the value of accurate predictions in the use of solar energy. It uses predicted solar power and load for a system with a battery to store energy for future use, and calculates the operating cost or profit under several designed conditions. Various factors like a different place, tuning the capacity of sources, changing buy/sell …
Deep Q Learning Applied To Stock Trading, Agnibh Dasgupta
All Graduate Theses and Dissertations, Spring 1920 to Summer 2023
Developing a strategy for stock trading is a vital task for investors. However, it is challenging to obtain an optimal strategy, given the complex and dynamic nature of the stock market. This thesis explores applications of Reinforcement Learning with the goal of maximizing returns from market investment, keeping in mind the human aspect of trading by utilizing stock prices represented as candlestick graphs. Furthermore, the algorithm studies public interest patterns in the form of graphs extracted from Google Trends to make predictions. Deep Q learning has been used to train an agent based on fused images of stock …
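The thesis trains a deep Q-network on fused images; the update rule it rests on can be sketched in tabular form. This is a generic illustration of Q-learning, not the thesis's network: the coarse trend-label states and the hyperparameters are assumptions for the sketch:

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.99):
    """Standard Q-learning update:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    Q is a mapping from (state, action) pairs to values, defaulting to 0.
    """
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

# Toy episode: states are hypothetical coarse price-trend labels,
# actions are the usual trading moves.
actions = ["buy", "sell", "hold"]
Q = defaultdict(float)
q_update(Q, "uptrend", "buy", r=1.0, s_next="uptrend", actions=actions)
```

In the deep variant, the table `Q` is replaced by a neural network over the candlestick/Trends images, but the temporal-difference target has the same form.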
Reinforcement Learning For Zone Based Multiagent Pathfinding Under Uncertainty, Jiajing Ling, Tarun Gupta, Akshat Kumar
Research Collection School Of Computing and Information Systems
We address the problem of multiple agents finding their paths from respective sources to destination nodes in a graph (also called MAPF). Most existing approaches assume that all agents move at a fixed speed and that a single node accommodates only a single agent. Motivated by emerging applications of autonomous vehicles such as drone traffic management, we present zone-based path finding (or ZBPF), in which agents move among zones and their movements incur uncertain travel times. Furthermore, each zone can accommodate multiple agents, up to its capacity. We also develop a simulator for ZBPF which provides a clean interface from the …
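The zone-capacity constraint described above can be sketched as a simple admission check. These are hypothetical helpers for illustration, not the ZBPF simulator's API:

```python
def can_enter(zone_occupancy, zone_capacity, zone):
    """A zone admits another agent only while it is below its capacity."""
    return zone_occupancy.get(zone, 0) < zone_capacity[zone]

def move_agent(zone_occupancy, zone_capacity, agent_zone, agent, dest):
    """Try to move `agent` into `dest`; the agent waits if the zone is full."""
    if not can_enter(zone_occupancy, zone_capacity, dest):
        return False  # destination at capacity: agent stays put this step
    src = agent_zone[agent]
    zone_occupancy[src] = zone_occupancy.get(src, 1) - 1
    zone_occupancy[dest] = zone_occupancy.get(dest, 0) + 1
    agent_zone[agent] = dest
    return True
```

Unlike classic MAPF's one-agent-per-node rule, the capacity check lets several agents share a zone, which is what makes the zone abstraction coarser and the learning problem more tractable.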
Applying Imitation And Reinforcement Learning To Sparse Reward Environments, Haven Brown
Computer Science and Computer Engineering Undergraduate Honors Theses
The focus of this project was to shorten the time it takes to train reinforcement learning agents to perform better than humans in a sparse reward environment. A general-purpose solution to this problem is essential for creating agents capable of managing large systems or performing a series of tasks before receiving feedback. The goal of this project was to create a transition function between an imitation learning algorithm (also referred to as a behavioral cloning algorithm) and a reinforcement learning algorithm, allowing an agent to first learn to …
Satellite Constellation Deployment And Management, Joseph Ryan Kopacz
Electronic Theses and Dissertations
This paper reviews results and discusses a new method to address the deployment and management of a satellite constellation. The first two chapters explore the use of small satellites and some of the advances in technology that have enabled small spacecraft to meet modern performance requirements in incredibly small packages.
The third chapter addresses the multiple-objective optimization problem for a global persistent-coverage constellation of communications spacecraft in Low Earth Orbit. A genetic algorithm was implemented in MATLAB to explore the design space (288 trillion possibilities) using the Satellite Tool Kit (STK) software development kit. …
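The thesis's genetic algorithm was implemented in MATLAB against STK; as a generic illustration of the approach, a minimal GA over fixed-length binary chromosomes looks like the following. The toy 1-bit-counting fitness stands in for the coverage objective, and every name and parameter here is an assumption for the sketch:

```python
import random

def genetic_search(fitness, n_genes, pop_size=30, generations=60,
                   p_mut=0.05, rng=None):
    """Minimal genetic algorithm: tournament selection, one-point
    crossover, and per-gene bit-flip mutation, with 2-elite carryover."""
    rng = rng or random.Random()
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        next_pop = scored[:2]  # elitism: keep the two best designs
        while len(next_pop) < pop_size:
            # Tournament selection: best of 3 random individuals, twice.
            p1, p2 = (max(rng.sample(pop, 3), key=fitness) for _ in range(2))
            cut = rng.randrange(1, n_genes)       # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [g ^ (rng.random() < p_mut) for g in child]  # mutation
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

# Toy fitness standing in for coverage: prefer chromosomes with more 1-bits.
best = genetic_search(sum, n_genes=20, rng=random.Random(0))
```

In the constellation setting, each chromosome would instead encode orbital design choices (planes, phasing, altitude bins), and the fitness call would be the expensive STK coverage evaluation, which is why a GA that samples the space sparsely is attractive for 288 trillion possibilities.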
Algebraic Neural Architecture Representation, Evolutionary Neural Architecture Search, And Novelty Search In Deep Reinforcement Learning, Ethan C. Jackson
Electronic Thesis and Dissertation Repository
Evolutionary algorithms have recently re-emerged as powerful tools for machine learning and artificial intelligence, especially when combined with advances in deep learning developed over the last decade. In contrast to the use of fixed architectures and rigid learning algorithms, we leveraged the open-endedness of evolutionary algorithms to make both theoretical and methodological contributions to deep reinforcement learning. This thesis explores and develops two major areas at the intersection of evolutionary algorithms and deep reinforcement learning: generative network architectures and behaviour-based optimization. Over three distinct contributions, both theoretical and experimental methods were applied to deliver a novel mathematical framework and experimental …
Modeling Trajectories With Recurrent Neural Networks, Hao Wu, Ziyang Chen, Weiwei Sun, Baihua Zheng, Wei Wang
Research Collection School Of Computing and Information Systems
Modeling trajectory data is a building block for many smart-mobility initiatives. Existing approaches apply shallow models such as Markov chains and inverse reinforcement learning to model trajectories, which cannot capture the long-term dependencies. On the other hand, deep models such as the Recurrent Neural Network (RNN) have demonstrated their strength in modeling variable-length sequences. However, directly adopting RNNs to model trajectories is not appropriate because of the unique topological constraints faced by trajectories. Motivated by these findings, we design two RNN-based models which take full advantage of the strength of RNNs to capture variable-length sequences and meanwhile to …
An Efficient Approach To Model-Based Hierarchical Reinforcement Learning, Zhuoru Li, Akshay Narayan, Tze-Yun Leong
Research Collection School Of Computing and Information Systems
We propose a model-based approach to hierarchical reinforcement learning that exploits shared knowledge and selective execution at different levels of abstraction, to efficiently solve large, complex problems. Our framework adopts a new transition dynamics learning algorithm that identifies the common action-feature combinations of the subtasks, and evaluates the subtask execution choices through simulation. The framework is sample efficient, and tolerates uncertain and incomplete problem characterization of the subtasks. We test the framework on common benchmark problems and complex simulated robotic environments. It compares favorably against the state-of-the-art algorithms, and scales well to very large problems.
Seapot-Rl: Selective Exploration Algorithm For Policy Transfer In Rl, Akshay Narayan, Zhuoru Li, Tze-Yun Leong
Research Collection School Of Computing and Information Systems
We propose a new method for transferring a policy from a source task to a target task in model-based reinforcement learning. Our work is motivated by scenarios where a robotic agent operates in similar but challenging environments, such as hospital wards differentiated by structural arrangements or obstacles, such as furniture. We address problems that require fast responses adapted from incomplete, prior knowledge of the agent in new scenarios. We present an efficient selective exploration strategy that maximally reuses the source task policy. Reuse efficiency is achieved by identifying sub-spaces that differ in the target environment, thus limiting the exploration …
A Comparative Study Between Motivated Learning And Reinforcement Learning, James T. Graham, Janusz A. Starzyk, Zhen Ni, Haibo He, T.-H. Teng, Ah-Hwee Tan
Research Collection School Of Computing and Information Systems
This paper analyzes advanced reinforcement learning techniques and compares some of them to motivated learning. Motivated learning is briefly discussed, indicating its relation to reinforcement learning. A black box scenario for comparative analysis of learning efficiency in autonomous agents is developed and described, and is used to analyze selected algorithms. The reported results demonstrate that in the selected category of problems, motivated learning outperformed all of the reinforcement learning algorithms we compared it with.
Efficient Reinforcement Learning In Multiple-Agent Systems And Its Application In Cognitive Radio Networks, Jing Zhang
Dissertations
The objective of reinforcement learning in multiple-agent systems is to find an efficient learning method for the agents to behave optimally. Finding a Nash equilibrium has become the common learning target for optimality. However, finding a Nash equilibrium is a PPAD (Polynomial Parity Arguments on Directed graphs)-complete problem. Conventional methods can find Nash equilibria only for some special types of Markov games.
This dissertation proposes a new reinforcement learning algorithm to improve the search efficiency and effectiveness for multiple-agent systems. This algorithm is based on the definition of Nash equilibrium and utilizes the greedy and rational features of the agents. When …
Higher-Level Application Of Adaptive Dynamic Programming/Reinforcement Learning – A Next Phase For Controls And System Identification?, George G. Lendaris
Systems Science Friday Noon Seminar Series
Humans have the ability to make use of experience while performing system identification and selecting control actions for changing situations. In contrast to current technological implementations, which slow down as more knowledge is stored, human processing speeds up and becomes more effective as more experience is gained. An emerging experience-based (“higher level”) approach promises to endow our technology with enhanced efficiency and effectiveness.
The notions of context and context discernment are important to understanding this human ability. These are defined as appropriate to controls and system identification. Some general background is given on controls, Dynamic Programming, and the Adaptive Critic, leading to Adaptive Dynamic …
Motivated Learning As An Extension Of Reinforcement Learning, Janusz Starzyk, Pawel Raif, Ah-Hwee Tan
Research Collection School Of Computing and Information Systems
We have developed a unified framework to conduct computational experiments with both learning systems: motivated learning based on a goal creation system, and reinforcement learning using the Q-learning algorithm. Future work includes combining motivated learning, to set abstract motivations and manage goals, with reinforcement learning, to learn proper actions. This will allow testing of motivated learning on typical reinforcement learning benchmarks with large-dimensional state/action spaces.