Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Articles 1 - 4 of 4

Full-Text Articles in Physical Sciences and Mathematics

End-To-End Hierarchical Reinforcement Learning With Integrated Subgoal Discovery, Shubham Pateria, Budhitama Subagdja, Ah-Hwee Tan, Chai Quek Dec 2022

Research Collection School Of Computing and Information Systems

Hierarchical reinforcement learning (HRL) is a promising approach to long-horizon goal-reaching tasks that decomposes goals into subgoals. In a holistic HRL paradigm, an agent must autonomously discover such subgoals and also learn a hierarchy of policies that uses them to reach the goals. Recently introduced end-to-end HRL methods accomplish this by using the higher-level policy in the hierarchy to directly search for useful subgoals in a continuous subgoal space. However, learning such a policy may be challenging when the subgoal space is large. We propose integrated discovery of salient subgoals (LIDOSS), an end-to-end HRL method with an integrated …


Interactive Video Corpus Moment Retrieval Using Reinforcement Learning, Zhixin Ma, Chong-Wah Ngo Oct 2022

Research Collection School Of Computing and Information Systems

Known-item video search is effective with a human in the loop who interactively investigates the search results and refines the initial query. Nevertheless, when the first few pages of results are swamped with visually similar items, or the search target is hidden deep in the ranked list, finding the known-item target usually requires a long duration of browsing and result inspection. This paper tackles the problem with reinforcement learning, aiming to reach a search target within a few rounds of interaction through long-term learning from user feedback. Specifically, the system interactively plans a navigation path based on feedback and recommends a potential target that …


Reinforcement Learning-Based Interactive Video Search, Zhixin Ma, Jiaxin Wu, Zhijian Hou, Chong-Wah Ngo Jun 2022

Research Collection School Of Computing and Information Systems

Despite rapid progress in text-to-video search driven by advances in cross-modal representation learning, existing techniques still fall short in helping users rapidly identify search targets. In particular, when a system suggests a long list of similar candidates, the user must painstakingly inspect every search result. The experience is frustrating because similar clips are watched repeatedly and, worse, the search targets may be overlooked due to mental fatigue. This paper explores reinforcement learning (RL)-based search to relieve the user of the burden of brute-force inspection. Specifically, the system maintains a graph …


Heterogeneous Attentions For Solving Pickup And Delivery Problem Via Deep Reinforcement Learning, Jingwen Li, Liang Xin, Zhiguang Cao, Andrew Lim, Wen Song, Jie Zhang Mar 2022

Research Collection School Of Computing and Information Systems

Recently, there has been an emerging trend of applying deep reinforcement learning to the vehicle routing problem (VRP), where a learnt policy governs the selection of the next node to visit. However, existing methods do not handle well the pairing and precedence relationships in the pickup and delivery problem (PDP), a representative variant of VRP. To address this challenging issue, we leverage a novel neural network integrated with a heterogeneous attention mechanism to empower the policy in deep reinforcement learning to automatically select the nodes. In particular, the heterogeneous attention mechanism specifically prescribes attentions for each role of the …