Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Articles 1 - 6 of 6

Full-Text Articles in Engineering

Reinforcement-Learning-Based Adaptive Tracking Control For A Space Continuum Robot Based On Reinforcement Learning, Da Jiang, Zhiqin Cai, Zhongzhen Liu, Haijun Peng, Zhigang Wu Oct 2022

Journal of System Simulation

Abstract: Aiming at the tracking control of a three-arm space continuum robot in space active debris removal manipulation, an adaptive sliding mode control algorithm based on deep reinforcement learning is proposed. Through a BP network, a data-driven dynamic model is developed as the predictive model to guide the reinforcement learning in adjusting the sliding mode controller's parameters online, finally realizing real-time tracking control. Simulation results show that the proposed data-driven predictive model can accurately predict the robot's dynamic characteristics, with relative error within ±1% for random trajectories. Compared with the fixed-parameter sliding mode controller, the proposed adaptive controller …
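
The control scheme described above, reduced to one degree of freedom, can be sketched as follows. This is a minimal illustration, not the paper's method: the toy plant, the smoothed sign function, and the simple error-driven gain-update rule (standing in for the RL agent) are all assumptions.

```python
import numpy as np

# Hypothetical 1-DOF sketch of an adaptively tuned sliding mode controller.
# The plant, the smoothed sign function, and the error-driven gain update
# (standing in for the RL policy) are illustrative assumptions.

def sliding_mode_control(x, x_dot, ref, k, lam=2.0):
    """u = -k * tanh(s) with sliding surface s = e_dot + lam * e."""
    e = x - ref
    s = x_dot + lam * e          # ref is constant, so e_dot = x_dot
    return -k * np.tanh(s)

def simulate(k_init=1.0, steps=2000, dt=0.01, ref=1.0):
    x, x_dot, k = 0.0, 0.0, k_init
    for _ in range(steps):
        u = sliding_mode_control(x, x_dot, ref, k)
        x_dot += dt * (u - 0.5 * x_dot)   # toy damped double integrator
        x += dt * x_dot
        # crude stand-in for the RL policy: raise the gain while the
        # tracking error is large, let it decay near the reference
        k = max(0.1, k + dt * (5.0 * abs(x - ref) - 0.1))
    return x

final_position = simulate()    # settles near the reference ref = 1.0
```

The gain adaptation here is a fixed heuristic; in the paper, a learned policy informed by the data-driven predictive model plays that role.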


Application Of Improved Q Learning Algorithm In Job Shop Scheduling Problem, Yejian Zhao, Yanhong Wang, Jun Zhang, Hongxia Yu, Zhongda Tian Jun 2022

Journal of System Simulation

Abstract: Aiming at job shop scheduling in a dynamic environment, a dynamic scheduling algorithm based on an improved Q-learning algorithm and dispatching rules is proposed. The state space of the dynamic scheduling algorithm is described with the concept of "the urgency of remaining tasks," and a reward function following the principle of "the higher the slack, the higher the penalty" is designed. To address the problem that the greedy strategy selects sub-optimal actions in the later stage of learning, the traditional Q-learning algorithm is improved by introducing an action selection strategy based on the …
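
The abstract is cut off before naming the improved selection strategy; a Boltzmann (softmax) rule is one common replacement for pure greedy choice and is sketched below on a toy 5-state chain rather than a job shop. The states, rewards, and temperature schedule are illustrative assumptions, not the paper's design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tabular Q-learning with softmax (Boltzmann) action selection in place
# of a pure greedy rule. Environment: a toy 5-state chain (action 1 moves
# right toward the goal), standing in for the job shop state space.

N_STATES, N_ACTIONS = 5, 2
Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma = 0.1, 0.9

def softmax_action(q_row, temperature):
    z = q_row / max(temperature, 1e-3)
    p = np.exp(z - z.max())        # subtract max for numerical stability
    p /= p.sum()
    return rng.choice(N_ACTIONS, p=p)

for episode in range(500):
    s = 0
    temperature = max(0.05, 1.0 - episode / 500)  # anneal toward greedy
    while s < N_STATES - 1:
        a = softmax_action(Q[s], temperature)
        s_next = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == N_STATES - 1 else 0.0
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

policy = Q.argmax(axis=1)   # learned policy moves right in every state
```

Annealing the temperature gives greedy-like exploitation late in training while still allowing sub-optimal actions to be re-tested early on, which is the kind of trade-off the abstract's improvement targets.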


A Deep Reinforcement Learning Approach With Prioritized Experience Replay And Importance Factor For Makespan Minimization In Manufacturing, Jose Napoleon Martinez Apr 2022

LSU Doctoral Dissertations

In this research, we investigated the application of deep reinforcement learning (DRL) to a common manufacturing scheduling optimization problem: makespan minimization. In this application, tasks are scheduled to undergo processing in identical processing units (for instance, identical machines, machining centers, or cells). The optimization goal is to assign the jobs to be scheduled to units so as to minimize the maximum processing time (i.e., the makespan) on any unit.

Machine learning methods have the potential to "learn" structures in the distribution of job times that could lead to improved optimization performance and time over traditional optimization methods, as well as to adapt …
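
The objective itself can be stated compactly in code. The classical LPT (longest-processing-time-first) greedy rule below is shown only as a traditional baseline for the makespan measure, not as the dissertation's DRL method; the job times and machine count are made-up examples.

```python
import heapq

# Identical-machines scheduling: assign each job to a machine and measure
# the makespan (maximum total load on any machine). LPT sorts jobs by
# decreasing processing time and always gives the next job to the
# machine that would finish earliest.

def lpt_makespan(job_times, n_machines):
    loads = [0.0] * n_machines           # min-heap of machine loads
    heapq.heapify(loads)
    for t in sorted(job_times, reverse=True):
        least = heapq.heappop(loads)     # machine finishing earliest
        heapq.heappush(loads, least + t) # schedule the job there
    return max(loads)

makespan = lpt_makespan([7, 5, 4, 3, 3, 2], 3)  # 9.0
```

A learning-based scheduler would replace the fixed sorting rule with a policy that exploits structure in the job-time distribution, which is the potential the paragraph above describes.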


Research On The Construction Method Of Simulation Evaluation Index Of Operation Effectiveness Operation Concept Traction, Ziwei Zhang, Liang Li, Zhiming Dong, Yifei Wang, Li Duan Mar 2022

Journal of System Simulation

Abstract: Agents are difficult to model and simulate directly due to the complexity of their interaction and learning behaviors. To address common problems in discrete simulation of agents, the event transfer mechanism of the discrete event system specification (DEVS) atomic model is applied to express an agent's interaction and learning. Through the agent's interaction mode, the transfer control of multi-state external events, the port connection mode, and the introduction of a reinforcement learning event transfer representation, a discrete simulation construction method for agents based on the DEVS atomic model …
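
The DEVS atomic model the abstract builds on is defined by four characteristic functions: internal transition, external transition, output, and time advance. A minimal sketch follows, with an illustrative "processor" semantics that is an assumption, not the paper's agent model.

```python
# Minimal DEVS-style atomic model: delta_int (internal transition),
# delta_ext (external transition), output (the lambda function), and
# ta (time advance). The processor semantics here are illustrative.

INFINITY = float("inf")

class AtomicProcessor:
    """Idle until a job arrives; emits it after a fixed service time."""

    def __init__(self, service_time=2.0):
        self.service_time = service_time
        self.phase, self.job = "idle", None

    def ta(self):                      # time advance: how long to stay
        return self.service_time if self.phase == "busy" else INFINITY

    def delta_ext(self, job):          # external event: a job arrives
        if self.phase == "idle":
            self.phase, self.job = "busy", job

    def output(self):                  # emitted just before delta_int
        return self.job

    def delta_int(self):               # internal event: service done
        self.phase, self.job = "idle", None

m = AtomicProcessor()
m.delta_ext("job-1")
done = m.output()      # "job-1", available after ta() == 2.0 time units
m.delta_int()
```

In the paper's setting, the external-transition and output machinery is what carries an agent's interaction and learning events between coupled models.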


Multiagent Routing Problem With Dynamic Target Arrivals Solved Via Approximate Dynamic Programming, Andrew E. Mogan Mar 2022

Theses and Dissertations

This research formulates and solves the multiagent routing problem with dynamic target arrivals (MRP-DTA), a stochastic system wherein a team of autonomous unmanned aerial vehicles (AUAVs) executes a strike coordination and reconnaissance (SCAR) mission against a notional adversary. Dynamic target arrivals that occur during the mission present the team of AUAVs with a sequential decision-making process which we model via a Markov decision process (MDP). To combat the curse of dimensionality, we construct and implement a hybrid approximate dynamic programming (ADP) algorithmic framework that employs a parametric cost function approximation (CFA) which augments a direct lookahead (DLA) model via a …


Team Air Combat Using Model-Based Reinforcement Learning, David A. Mottice Mar 2022

Theses and Dissertations

We formulate the first generalized air combat maneuvering problem (ACMP), called the MvN ACMP, wherein M friendly AUCAVs engage N enemy AUCAVs, and develop a Markov decision process (MDP) model to control the team of M Blue AUCAVs. The MDP model leverages a 5-degree-of-freedom aircraft state transition model and formulates a directed energy weapon capability. Because the resulting MDP is too large to solve exactly, a model-based reinforcement learning approach is adopted wherein an approximate policy iteration algorithmic strategy is implemented to attain high-quality approximate policies relative to a high-performing benchmark policy. The ADP algorithm utilizes a multi-layer neural network for the value function approximation regression mechanism. One-versus-one …
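
The value-function regression step inside approximate policy iteration can be sketched as follows: simulate the current policy to collect (state, observed cost-to-go) samples, then regress a parametric value function on them. A linear-in-features model stands in for the thesis's multi-layer neural network, and the one-dimensional state, dynamics, and stage cost are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Policy-evaluation-by-regression sketch: roll out the fixed policy to
# observe discounted costs-to-go, then fit a parametric value function.
# The 1-D state, dynamics, and stage cost are illustrative assumptions.

def rollout_cost(s, policy, gamma=0.95, horizon=100):
    total = 0.0
    for t in range(horizon):
        a = policy(s)
        cost = (s - a) ** 2                      # toy stage cost
        s = 0.9 * s + 0.1 * a + 0.01 * rng.standard_normal()
        total += (gamma ** t) * cost
    return total

def fit_value_function(policy, n_samples=200):
    states = rng.uniform(-1, 1, n_samples)
    targets = np.array([rollout_cost(s, policy) for s in states])
    # features [1, s, s^2] stand in for a neural network's hidden layers
    X = np.column_stack([np.ones_like(states), states, states ** 2])
    theta, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return theta

theta = fit_value_function(lambda s: 0.0)   # evaluate a "do nothing" policy
```

In approximate policy iteration, this fitted value function would then drive a policy-improvement step, and the evaluate/improve loop repeats until the policy stabilizes.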