Articles 1 - 5 of 5
Full-Text Articles in Engineering
Robot Arm Control Method Based On Deep Reinforcement Learning, Heyu Li, Zhilong Zhao, Gu Lei, Liqin Guo, Zeng Bi, Tingyu Lin
Journal of System Simulation
Abstract: Deep reinforcement learning explores the environment continuously and adjusts the neural network parameters according to a reward function. An actual production line cannot serve as a trial-and-error environment for the algorithm, so sufficient training data are unavailable. To address this, this paper constructs a virtual robot-arm simulation environment containing the robot arm and the object. A Deep Deterministic Policy Gradient (DDPG) agent, for which the state variables and reward function are defined, is trained in the simulation environment to control the robot arm to move the gripper below …
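The abstract mentions defining a reward function for the reach task but does not give its form. A minimal sketch of one plausible shaping, assuming the goal is to bring the gripper to a point just below the object (the offset, tolerance, and bonus are illustrative assumptions, not the paper's actual reward):

```python
import numpy as np

def reach_reward(gripper_pos, object_pos, tol=0.02):
    """Toy reward for a reach task: negative distance from the gripper to a
    point just below the object, plus a bonus on success.
    (Hypothetical shaping; the paper's exact reward is not stated.)"""
    target = object_pos - np.array([0.0, 0.0, 0.05])  # point 5 cm below the object
    dist = np.linalg.norm(gripper_pos - target)
    return -dist + (10.0 if dist < tol else 0.0)
```

A dense negative-distance term like this gives DDPG a gradient signal on every step, which is why such shaping is common in simulated reach tasks.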
Domain Adaptation In Unmanned Aerial Vehicles Landing Using Reinforcement Learning, Pedro Lucas Franca Albuquerque
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Landing an unmanned aerial vehicle (UAV) on a moving platform is a challenging task that often requires exact models of the UAV dynamics, platform characteristics, and environmental conditions. In this thesis, we present and investigate three machine learning approaches with varying levels of domain knowledge: dynamics randomization, a universal policy with system identification, and reinforcement learning with no parameter variation. We first train the policies in simulation, then evaluate them both in simulation, varying the system dynamics through wind and friction coefficients, and on a real robot system with wind variation. We initially expected that providing …
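The dynamics-randomization approach mentioned above amounts to resampling physical parameters at the start of each training episode so the learned policy generalizes to the real system. A minimal sketch, assuming wind and friction are the randomized parameters (the ranges here are illustrative, not the thesis's actual values):

```python
import random

def randomized_dynamics(wind_range=(-2.0, 2.0), friction_range=(0.4, 1.0)):
    """Dynamics randomization: sample episode-level physical parameters so a
    policy trained in simulation is robust to real-world variation.
    (Illustrative ranges; the thesis's actual ranges are assumptions.)"""
    return {
        "wind": random.uniform(*wind_range),        # m/s, sampled per episode
        "friction": random.uniform(*friction_range),  # platform friction coefficient
    }
```

At each episode reset, the simulator would be reconfigured with a fresh sample, exposing the policy to the whole parameter range during training.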
Joint Manufacturing And Onsite Microgrid System Control Using Markov Decision Process And Neural Network Integrated Reinforcement Learning, Wenqing Hu, Zeyi Sun, Y. Zhang, Y. Li
Mathematics and Statistics Faculty Research & Creative Works
Onsite microgrid generation systems with renewable sources are considered a promising complementary energy supply for manufacturing plants, especially during outages when energy from the grid is unavailable. Compared to their widely recognized benefits for resilience when used as a backup energy system, their operation alongside the electricity grid to support manufacturing in non-emergency mode has been less investigated. In this paper, we propose a joint dynamic decision-making model for the optimal control of both the manufacturing system and the onsite generation system. A Markov Decision Process (MDP) is …
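The abstract formulates the joint control problem as a Markov Decision Process. A generic tabular MDP solver (value iteration) illustrates the underlying machinery; this is a textbook sketch, not the paper's neural-network-integrated reinforcement learning method, and the toy transition/reward structure below is an assumption:

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-6):
    """Value iteration on a small tabular MDP.
    P[a] is the |S|x|S| transition matrix under action a;
    R[a] is the length-|S| expected reward for taking a in each state.
    Returns the optimal value function and a greedy policy."""
    n = P[0].shape[0]
    V = np.zeros(n)
    while True:
        # One Bellman backup per action, then take the max over actions.
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(len(P))])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new
```

In the paper's setting, states would encode the joint manufacturing/microgrid status and actions the dispatch decisions; value iteration becomes impractical at that scale, which motivates the neural-network integration the title mentions.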
Dp-Q(Λ): Real-Time Path Planning For Multi-Agent In Large-Scale Web3d Scene, Fengting Yan, Jinyuan Jia
Journal of System Simulation
Abstract: Path planning for multiple agents in an unknown large-scale scene requires an efficient and stable algorithm, must solve the multi-agent collision-avoidance problem, and must run in real time in Web3D. To address these problems, the DP-Q(λ) algorithm is proposed. Direction constraints and high reward-or-punishment weight training are used to adjust the reward or punishment values via a probability p (a random number in [0, 1]). The reward or punishment value determines the agent's next path-planning step: if the next position is free, the agent may move to it. The above strategy …
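DP-Q(λ) builds on Q(λ), i.e. Q-learning with eligibility traces. A minimal sketch of one generic tabular Q(λ) backup; the direction constraints and probabilistic reward weighting specific to DP-Q(λ) are omitted, and the hyperparameter values are assumptions:

```python
import numpy as np

def q_lambda_update(Q, E, s, a, r, s_next, alpha=0.1, gamma=0.95, lam=0.8):
    """One Q(lambda) backup with accumulating eligibility traces E.
    Q and E are |S|x|A| arrays, updated in place and returned.
    (Generic Q(lambda); DP-Q(lambda)'s direction constraints and
    probabilistic reward weighting are not modeled here.)"""
    a_best = np.argmax(Q[s_next])
    delta = r + gamma * Q[s_next, a_best] - Q[s, a]  # one-step TD error
    E[s, a] += 1.0          # accumulate the trace for the visited pair
    Q += alpha * delta * E  # propagate the TD error along all traces
    E *= gamma * lam        # decay every trace
    return Q, E
```

The eligibility traces let a single reward (e.g. reaching a free cell) update the whole recent path at once, which is what makes λ-returns attractive for real-time grid path planning.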
Less Is More: Beating The Market With Recurrent Reinforcement Learning, Louis Kurt Bernhard Steinmeister
Masters Theses
"Multiple recurrent reinforcement learners were implemented to make trading decisions based on real and freely available macro-economic data. The learning algorithm and different reinforcement functions (the Differential Sharpe Ratio, Differential Downside Deviation Ratio and Returns) were revised and the performances were compared while transaction costs were taken into account. (This is important for practical implementations even though many publications ignore this consideration.) It was assumed that the traders make long-short decisions in the S&P500 with complementary 3-month treasury bill investments. Leveraged positions in the S&P500 were disallowed. Notably, the Differential Sharpe Ratio and the Differential Downside Deviation Ratio are risk …