Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Singapore Management University

Research Collection School Of Computing and Information Systems

2023

Deep reinforcement learning

Articles 1 - 4 of 4

Full-Text Articles in Physical Sciences and Mathematics

Generalization Through Diversity: Improving Unsupervised Environment Design, Wenjun Li, Pradeep Varakantham, Dexun Li Aug 2023

Agent decision making using Reinforcement Learning (RL) heavily relies on either a model or simulator of the environment (e.g., moving in an 8x8 maze with three rooms, playing Chess on an 8x8 board). Due to this dependence, small changes in the environment (e.g., positions of obstacles in the maze, size of the board) can severely affect the effectiveness of the policy learned by the agent. To that end, existing work has proposed training RL agents on an adaptive curriculum of environments (generated automatically) to improve performance on out-of-distribution (OOD) test scenarios. Specifically, existing research has employed the potential for the …


Reinforced Adaptation Network For Partial Domain Adaptation, Keyu Wu, Min Wu, Zhenghua Chen, Ruibing Jin, Wei Cui, Zhiguang Cao, Xiaoli Li May 2023

Domain adaptation enables generalized learning in new environments by transferring knowledge from label-rich source domains to label-scarce target domains. As a more realistic extension, partial domain adaptation (PDA) relaxes the assumption of fully shared label space, and instead deals with the scenario where the target label space is a subset of the source label space. In this paper, we propose a Reinforced Adaptation Network (RAN) to address the challenging PDA problem. Specifically, a deep reinforcement learning model is proposed to learn source data selection policies. Meanwhile, a domain adaptation model is presented to simultaneously determine rewards and learn domain-invariant feature …


A Review On Learning To Solve Combinatorial Optimisation Problems In Manufacturing, Cong Zhang, Yaoxin Wu, Yining Ma, Wen Song, Zhang Le, Zhiguang Cao, Jie Zhang Mar 2023

An efficient manufacturing system is key to maintaining a healthy economy today. With the rapid development of science and technology and the progress of human society, the modern manufacturing system is becoming increasingly complex, posing new challenges to both academia and industry. Ever since the beginning of industrialisation, leaps in manufacturing technology have always accompanied technological breakthroughs from other fields, for example, mechanics, physics, and computational science. Recently, machine learning (ML) technology, one of the crucial subjects of artificial intelligence, has made remarkable progress in many areas. This study thoroughly reviews how ML, specifically deep (reinforcement) learning, motivates new ideas …


Flexible Job-Shop Scheduling Via Graph Neural Network And Deep Reinforcement Learning, Wen Song, Xinyang Chen, Qiqiang Li, Zhiguang Cao Feb 2023

Recently, deep reinforcement learning (DRL) has been applied to learn priority dispatching rules (PDRs) for solving complex scheduling problems. However, existing works face challenges in dealing with flexibility, which allows an operation to be scheduled on one of multiple machines and is often required in practice. Such a one-to-many relationship brings additional complexity to both decision making and state representation. This article considers the well-known flexible job-shop scheduling problem and addresses these issues by proposing a novel DRL method to learn high-quality PDRs end to end. The operation selection and the machine assignment are combined as a composite decision. …