Open Access. Powered by Scholars. Published by Universities.®

Computer Sciences Commons


Articles 1 - 7 of 7

Full-Text Articles in Computer Sciences

Hierarchical Control Of Multi-Agent Reinforcement Learning Team In Real-Time Strategy (RTS) Games, Weigui Jair Zhou, Budhitama Subagdja, Ah-Hwee Tan, Darren Wee Sze Ong Dec 2021


Research Collection School Of Computing and Information Systems

Coordinated control of multi-agent teams is an important task in many real-time strategy (RTS) games. Most prior work uses a micromanagement strategy, whereby individual agents operate independently and make their own combat decisions. At the other extreme, some employ a macromanagement strategy whereby all agents are controlled by a single decision model. In this paper, we propose a hierarchical command and control architecture, consisting of a single high-level and multiple low-level reinforcement learning agents operating in a dynamic environment. This hierarchical model enables the low-level unit agents to make individual decisions while taking commands from the high-level …
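The command flow described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the command names, the action names, the health thresholds, and the hand-written rules, which stand in for the learned reinforcement learning policies of the paper. It only shows one high-level decision being passed to several low-level units that still decide individually.

```python
def high_level_policy(team_health, enemy_health):
    """Hypothetical commander: issues one command for the whole team.

    A fixed rule stands in for the learned high-level agent.
    """
    return "attack" if sum(team_health) >= sum(enemy_health) else "retreat"


def low_level_policy(own_health, command):
    """Hypothetical unit agent: picks its own action, conditioned on the command.

    A fixed rule stands in for the learned low-level agent.
    """
    if command == "attack":
        return "fire" if own_health > 20 else "fall_back"
    # Under "retreat", healthy units cover the others.
    return "cover" if own_health > 80 else "fall_back"


def team_step(team_health, enemy_health):
    """One decision step: a single command, then one action per unit."""
    command = high_level_policy(team_health, enemy_health)
    actions = [low_level_policy(h, command) for h in team_health]
    return command, actions


if __name__ == "__main__":
    print(team_step([100, 10], [50]))
```

The same command produces different actions for different units, which is the property the abstract attributes to the hierarchical model.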


DQN-Based Path Planning Method And Simulation For Submarine And Warship In Naval Battlefield, Xiaodong Huang, Haitao Yuan, Bi Jing, Liu Tao Oct 2021


Journal of System Simulation

Abstract: To realize multi-agent intelligent planning and target tracking in a complex naval battlefield environment, this work focuses on agents (submarines or warships) and proposes a simulation method based on the reinforcement learning algorithm Deep Q Network (DQN). Two neural networks with the same structure but different parameters are designed to update the real and predicted Q values so that the value function converges. An ε-greedy algorithm is used to design the action selection mechanism, and a reward function is designed for the naval battlefield environment to increase the update velocity and generalization ability of Learning with Experience Replay (LER). Simulation results …
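The three DQN ingredients named in this abstract (ε-greedy action selection, a target computed from a second network, and experience replay) can be sketched generically as follows. This is not the authors' implementation: the function names, the discount factor `gamma`, and the buffer interface are assumptions, and the Q-values are passed in as plain lists rather than produced by neural networks.

```python
import random
from collections import deque


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon explore at random, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, done, gamma=0.99):
    """Standard DQN target: r + gamma * max_a' Q_target(s', a').

    next_q_values are assumed to come from the second (target) network,
    which shares the online network's structure but not its parameters.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


class ReplayBuffer:
    """Fixed-size store of transitions, sampled uniformly for training."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest entries drop out first

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng=random):
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)
```

In a full training loop, the online network would be regressed toward `td_target` on sampled batches, and its parameters would be copied to the target network at intervals.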


Research On Experimental Method Of Joint Operation Simulation Based On Human-Machine Hybrid Intelligence, Ma Jun, Jingyu Yang, Wu Xi Oct 2021


Journal of System Simulation

Abstract: Existing joint operation simulation experiment methods mainly guide equipment evaluation and demonstration, which makes it difficult for them to effectively support research on operational problems. To address this, a joint operation simulation experiment method based on human-machine hybrid intelligence is proposed. The classification, generation, and accumulation processes of knowledge in joint operation simulation experiments are clarified. Through detailed descriptions of the experimental interaction process, experimental operation process, experimental driving mode, simulation operation mode, supporting system structure, etc., a joint operation simulation experiment framework based on human-machine hybrid intelligence is constructed. It provides a new method …


Study On Next-Generation Strategic Wargame System, Wu Xi, Xianglin Meng, Jingyu Yang Sep 2021


Journal of System Simulation

Abstract: Strategic wargaming is an important support for strategic decision-making. The research status and challenges of strategic wargaming are analyzed, and the influence of big data and artificial intelligence technology on the strategic wargame system is studied. The prospects and key technologies of the next-generation strategic wargame system are studied, including the construction of event association graphs for strategic topics, the generation of sparse strategic decision samples based on generative adversarial nets, gaming strategy learning with human-in-the-loop hybrid enhancement, and public opinion dissemination modeling based on social networks. The development trend of strategic wargaming is presented.


Self-Learning-Based Multiple Spacecraft Evasion Decision Making Simulation Under Sparse Reward Condition, Zhao Yu, Jifeng Guo, Yan Peng, Chengchao Bai Aug 2021


Journal of System Simulation

Abstract: To improve the ability of a spacecraft formation to evade multiple interceptors, and to address the low success rate of traditional procedural maneuver evasion, a multi-agent cooperative autonomous decision-making algorithm based on deep reinforcement learning is proposed. Based on the actor-critic architecture, a multi-agent reinforcement learning algorithm is designed, in which a weighted linear fitting method is proposed to solve the reliability allocation problem of the self-learning system. To solve the sparse reward problem in the task scenario, a sparse reward reinforcement learning method based on the inverse value method is proposed. According to the task scenario, …


Learning To Assign: Towards Fair Task Assignment In Large-Scale Ride Hailing, Dingyuan Shi, Yongxin Tong, Zimu Zhou, Bingchen Song, Weifeng Lv, Qiang Yang Aug 2021


Research Collection School Of Computing and Information Systems

Ride hailing is a widespread shared mobility application where the central issue is to assign taxi requests to drivers with various objectives. Despite extensive research on task assignment in ride hailing, the fairness of earnings among drivers is largely neglected. Pioneering studies on fair task assignment in ride hailing are ineffective and inefficient due to their myopic optimization perspective and time-consuming assignment techniques. In this work, we propose LAF, an effective and efficient task assignment scheme that optimizes both utility and fairness. We adopt reinforcement learning to make assignments in a holistic manner and propose a set of acceleration techniques …


Step-Wise Deep Learning Models For Solving Routing Problems, Liang Xin, Wen Song, Zhiguang Cao, Jie Zhang Jul 2021


Research Collection School Of Computing and Information Systems

Routing problems are very important in intelligent transportation systems. Recently, a number of deep learning-based methods have been proposed to automatically learn construction heuristics for solving routing problems. However, these methods do not completely follow Bellman's Principle of Optimality, since nodes already visited during construction are still included in the subsequent subtasks, resulting in suboptimal policies. In this article, we propose a novel step-wise scheme which explicitly removes the visited nodes in each node selection step. We apply this scheme to two representative deep models for routing problems, pointer network and transformer attention model (TAM), and significantly improve the performance of …
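The idea of explicitly removing visited nodes at each selection step can be illustrated with a small constructive sketch. It is not the authors' model: `nearest_score` is a hypothetical nearest-neighbor rule standing in for the learned pointer network or TAM policy, and `stepwise_tour` is an assumed name. The point shown is only that, at every step, the scorer is given the current node and the unvisited nodes, so each remaining subproblem is solved without reference to nodes already in the tour.

```python
import math


def nearest_score(current, candidate):
    """Hypothetical stand-in for a learned policy: closer nodes score higher."""
    return -math.dist(current, candidate)


def stepwise_tour(coords, start=0, score_fn=nearest_score):
    """Build a tour node by node, dropping each visited node from the input."""
    remaining = {i: xy for i, xy in enumerate(coords) if i != start}
    tour = [start]
    while remaining:
        current = coords[tour[-1]]
        # The scorer only ever sees nodes that are still unvisited.
        nxt = max(remaining, key=lambda i: score_fn(current, remaining[i]))
        tour.append(nxt)
        del remaining[nxt]  # explicit removal, not just masking
    return tour


if __name__ == "__main__":
    print(stepwise_tour([(0, 0), (3, 0), (1, 0), (2, 0)]))
```

With a learned model, the removal step would mean re-encoding only the remaining nodes before each selection, which costs more computation per step than masking visited nodes in a fixed encoding.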