Open Access. Powered by Scholars. Published by Universities.®

Engineering Commons

Physical Sciences and Mathematics

Research Collection School Of Computing and Information Systems

2021

Multi-agent Planning And Learning

Articles 1 - 2 of 2

Full-Text Articles in Engineering

Learning And Exploiting Shaped Reward Models For Large Scale Multiagent RL, Arambam James Singh, Akshat Kumar, Hoong Chuin Lau, Aug 2021

Research Collection School Of Computing and Information Systems

Many real-world systems involve interaction among a large number of agents to achieve a common goal, for example, air traffic control. Several model-free RL algorithms have been proposed for such settings. A key limitation is that the empirical reward signal in the model-free case is not very effective in addressing the multiagent credit assignment problem, which determines an agent's contribution to the team's success. This results in lower solution quality and high sample complexity. To address this, we contribute (a) an approach to learn a differentiable reward model for both continuous and discrete action settings by exploiting the collective nature of …

