Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons


Articles 1 - 4 of 4

Full-Text Articles in Engineering

Domain Adaptation In Unmanned Aerial Vehicles Landing Using Reinforcement Learning, Pedro Lucas Franca Albuquerque Dec 2019


Department of Computer Science and Engineering: Dissertations, Theses, and Student Research

Landing an unmanned aerial vehicle (UAV) on a moving platform is a challenging task that often requires exact models of the UAV dynamics, platform characteristics, and environmental conditions. In this thesis, we present and investigate three machine learning approaches with varying levels of domain knowledge: dynamics randomization, a universal policy with system identification, and reinforcement learning with no parameter variation. We first train the policies in simulation, then evaluate them both in simulation, where we vary the system dynamics through wind and friction coefficients, and on a real robot system under wind variation. We initially expected that providing …
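Of the three approaches the abstract names, dynamics randomization is the simplest to illustrate. A minimal, hypothetical sketch (function names and parameter ranges are illustrative, not taken from the thesis): the simulator's wind and friction are resampled at the start of every training episode, so the learned policy must cope with a range of dynamics rather than a single fixed configuration.

```python
import random


def sample_dynamics(wind_range=(-2.0, 2.0), friction_range=(0.1, 1.0)):
    """Resample the simulator's physical parameters for one episode."""
    return {
        "wind": random.uniform(*wind_range),
        "friction": random.uniform(*friction_range),
    }


def train(num_episodes, run_episode):
    """run_episode(dynamics) trains the policy in a simulator configured
    with the sampled dynamics; the policy never sees one fixed world."""
    for _ in range(num_episodes):
        run_episode(sample_dynamics())
```

A policy trained this way tends to transfer better to a real platform whose wind and friction were never measured exactly, which is the sim-to-real motivation behind the thesis.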


A Comparison Of Contextual Bandit Approaches To Human-In-The-Loop Robot Task Completion With Infrequent Feedback, Matt Mcneill, Damian Lyons Nov 2019


Faculty Publications

Artificially intelligent assistive agents are playing an increasing role in our workplaces and homes. In contrast with the currently predominant conversational agents, whose intelligence derives from dialogue trees and external modules, a fully autonomous domestic or workplace robot must carry out more complex reasoning. Such a robot must make good decisions as soon as possible, learn from experience, respond to feedback, and rely on feedback only as much as necessary. In this research, we narrow the focus of a hypothetical robot assistant to a room-tidying task in a simulated domestic environment. Given an item, the robot chooses where to put …
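The contextual-bandit framing in the title can be sketched generically: each item is a context, each storage location an action, and (infrequent) human feedback a reward. A minimal epsilon-greedy version, shown here as an illustration of the general technique rather than the specific approaches the paper compares, only updates its estimates when feedback actually arrives:

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Per-context running mean reward for each action; pick greedily
    most of the time, explore with probability epsilon."""

    def __init__(self, actions, epsilon=0.1):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.counts = defaultdict(int)    # (context, action) -> pulls
        self.values = defaultdict(float)  # (context, action) -> mean reward

    def choose(self, context):
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.values[(context, a)])

    def update(self, context, action, reward):
        # Called only when the human gives feedback, which may be rare.
        key = (context, action)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]
```

With infrequent feedback, the design question the paper studies is how well such an agent keeps making good choices between the sparse `update` calls.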


A Deep Recurrent Q Network Towards Self-Adapting Distributed Microservices Architecture (In Press), Basel Magableh Jan 2019


Articles

One desired aspect of a microservices architecture is the ability to self-adapt its own architecture and behaviour in response to changes in the operational environment. To achieve the desired high levels of self-adaptability, this research implements a distributed microservices architecture model informed by the MAPE-K model. The proposed architecture employs multiple adaptation agents supported by a centralised controller that can observe the environment and execute a suitable adaptation action. Adaptation planning is managed by a deep recurrent Q-network (DRQN). It is argued that such integration between DRQN and MDP agents in a MAPE-K model offers distributed microservice architecture …
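The MAPE-K loop the abstract builds on (Monitor, Analyze, Plan, Execute over a shared Knowledge base) can be sketched as a plain control loop in which the Plan step is any callable; the paper delegates that step to a DRQN, but the surrounding loop, shown here with hypothetical stub names, is simply:

```python
def mape_k_loop(monitor, analyze, plan, execute, knowledge, steps):
    """One adaptation agent's loop: Monitor metrics, Analyze symptoms,
    Plan an adaptation action (the paper uses a DRQN here), Execute it.
    All stages share the Knowledge base."""
    actions = []
    for _ in range(steps):
        metrics = monitor()
        symptoms = analyze(metrics, knowledge)
        action = plan(symptoms, knowledge)
        execute(action)
        actions.append(action)
    return actions
```

For example, a stub planner that scales out whenever observed CPU exceeds a threshold in the knowledge base already exercises the full loop; the DRQN replaces only the `plan` callable.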


A Graph-Based Reinforcement Learning Method With Converged State Exploration And Exploitation, Han Li, Tianding Chen, Hualiang Teng, Yingtao Jiang Jan 2019


Civil and Environmental Engineering and Construction Faculty Research

In any classical value-based reinforcement learning method, an agent, despite its continuous interaction with the environment, is unable to quickly build a complete and independent description of the entire environment, leaving the learning method struggling with the difficult dilemma of choosing between two tasks, namely exploration and exploitation. This problem becomes more pronounced when the agent has to deal with a dynamic environment whose configuration and/or parameters are constantly changing. In this paper, this problem is approached by first mapping a reinforcement learning scheme to a directed graph, and the set that contains all …