Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
- Institution
- Singapore Management University (25)
- China Simulation Federation (19)
- MBZUAI (6)
- San Jose State University (4)
- University of Denver (4)
- Air Force Institute of Technology (3)
- University of Massachusetts Amherst (3)
- Western University (3)
- Bucknell University (2)
- Chapman University (2)
- Georgia Southern University (2)
- University of Kentucky (2)
- University of Nevada, Las Vegas (2)
- Clemson University (1)
- Fordham University (1)
- Gettysburg College (1)
- Loyola University Chicago (1)
- Michigan Technological University (1)
- Minnesota State University, Mankato (1)
- Missouri State University (1)
- Technological University Dublin (1)
- The University of Southern Mississippi (1)
- University of Arkansas, Fayetteville (1)
- University of Louisville (1)
- West Chester University (1)
- Publication
- Research Collection School Of Computing and Information Systems (23)
- Journal of System Simulation (19)
- Electronic Theses and Dissertations (7)
- Machine Learning Faculty Publications (6)
- Master's Projects (4)
- Doctoral Dissertations (3)
- Electronic Thesis and Dissertation Repository (3)
- Dissertations and Theses Collection (Open Access) (2)
- Electrical & Computer Engineering Faculty Research (2)
- Faculty Publications (2)
- Theses and Dissertations (2)
- Theses and Dissertations--Computer Science (2)
- All Dissertations (1)
- All Graduate Theses, Dissertations, and Other Capstone Projects (1)
- Articles (1)
- Biology, Chemistry, and Environmental Sciences Faculty Articles and Research (1)
- Computer Science Faculty Publications (1)
- Computer Science and Computer Engineering Undergraduate Honors Theses (1)
- Computer Science: Faculty Publications and Other Works (1)
- Dissertations, Master's Theses and Master's Reports (1)
- Faculty Conference Papers and Presentations (1)
- Honors Theses (1)
- MSU Graduate Theses (1)
- Master's Theses (1)
- Mathematics, Physics, and Computer Science Faculty Articles and Research (1)
- West Chester University Master’s Theses (1)
Articles 1 - 30 of 89
Full-Text Articles in Physical Sciences and Mathematics
De Novo Drug Design Using Transformer-Based Machine Translation And Reinforcement Learning Of An Adaptive Monte Carlo Tree Search, Dony Ang, Cyril Rakovski, Hagop S. Atamian
Biology, Chemistry, and Environmental Sciences Faculty Articles and Research
The discovery of novel therapeutic compounds through de novo drug design represents a critical challenge in the field of pharmaceutical research. Traditional drug discovery approaches are often resource-intensive and time-consuming, leading researchers to explore innovative methods that harness the power of deep learning and reinforcement learning techniques. Here, we introduce a novel drug design approach called drugAI that leverages the Encoder–Decoder Transformer architecture in tandem with Reinforcement Learning via a Monte Carlo Tree Search (RL-MCTS) to expedite the process of drug discovery while ensuring the production of valid small molecules with drug-like characteristics and strong binding affinities towards …
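The Monte Carlo Tree Search at the heart of approaches like drugAI needs a rule for choosing which branch to expand. The abstract does not specify drugAI's rule, but the classic UCT selection step can be sketched as follows (names and the exploration constant are illustrative, not taken from the paper):

```python
import math

def uct_score(child_value, child_visits, parent_visits, c=1.4):
    """UCB1-style score used by UCT: average value (exploit) plus an exploration bonus."""
    if child_visits == 0:
        return float("inf")  # always try unvisited children first
    exploit = child_value / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore

def select_child(children):
    """children: list of dicts with 'value' (total reward) and 'visits' (visit count)."""
    parent_visits = sum(ch["visits"] for ch in children) or 1
    return max(range(len(children)),
               key=lambda i: uct_score(children[i]["value"],
                                       children[i]["visits"],
                                       parent_visits))
```

The exploration term shrinks as a child is visited more often, so the search gradually concentrates on high-value branches while still sampling neglected ones.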
Reinforcement Learning: Applying Low Discrepancy Action Selection To Deep Deterministic Policy Gradient, Aleksandr Svishchev
Electronic Theses and Dissertations
Reinforcement learning (RL) is a subfield of machine learning concerned with agents learning to behave optimally by interacting with an environment. One of the most important topics in RL is how the agent should explore, that is, how to choose actions in order to assess their impact on long-term reward. For example, a simple baseline strategy might be uniformly random action selection. This thesis investigates the heuristic idea that agents will learn faster if they explore by factoring the environment’s state into their decision and intentionally choosing actions which are as different as possible from what they have previously observed. …
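The heuristic described above, choosing actions as different as possible from those previously observed, can be illustrated for a one-dimensional continuous action space by picking the candidate farthest from the exploration history. This toy sketch is my illustration of the idea, not the thesis's actual algorithm:

```python
def most_novel_action(candidates, history):
    """Pick the candidate action farthest from anything tried before."""
    if not history:
        return candidates[0]
    def novelty(a):
        # Distance to the nearest previously observed action.
        return min(abs(a - h) for h in history)
    return max(candidates, key=novelty)
```

With history `[0.0, 1.0]` and candidates `[0.1, 0.5, 0.9]`, the rule selects `0.5`, the action farthest from everything already tried, whereas uniform random selection would pick each candidate with equal probability.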
Research And Development Of Simulation Training Platform For Multi-Agent Collaborative Decision-Making, Cheng Cheng, Zhijie Chen, Ziming Guo, Ni Li
Journal of System Simulation
Abstract: A reinforcement learning simulation platform can serve as an interactive training environment for reinforcement learning. In order to make the simulation platform compatible with multi-agent reinforcement learning algorithms and meet the needs of simulation in the military field, the processes shared by multi-agent reinforcement learning algorithms are refined and a unified interface is designed, so that different types of deep reinforcement learning algorithms can be embedded and verified on the simulation platform; the back-end service of the simulation platform is optimized to accelerate the training of the algorithm model. The experimental results show that, by unifying the interface, the simulation platform …
Intercell Dynamic Scheduling Method Based On Deep Reinforcement Learning, Jing Ni, Mengke Ma
Journal of System Simulation
Abstract: In order to solve the intercell scheduling problem in which machining tasks arrive dynamically, and to realize adaptive scheduling in the complex and changeable environment of the intelligent factory, a scheduling method based on a deep Q network is proposed. A complex network with cells as nodes and intercell workpiece machining paths as directed edges is constructed, and the degree value is introduced to define a state space with intercell scheduling characteristics. A compound scheduling rule composed of a workpiece layer, unit layer, and machine layer is designed, and hierarchical optimization makes the scheduling scheme more global. Since double deep …
Uav-Enabled Task Offloading Strategy For Vehicular Edge Computing Networks, Feng Hu, Haiyang Gu, Jun Lin
Journal of System Simulation
Abstract: As intelligent vehicles are equipped with more and more sensors, sensor data grows explosively, bringing severe challenges to vehicular communication and computing. In addition, modern roads present a three-dimensional structure, and the system architecture of traditional vehicular networks cannot guarantee full coverage and seamless computing. A task offloading strategy for UAV-assisted, 6G (sixth generation)-enabled vehicular edge computing networks is proposed. Furthermore, a flexible and intelligent vehicular edge computing mode composed of vehicles and UAVs is constructed, which provides three-dimensional edge computing services for delay-sensitive and computation-intensive vehicular tasks and ensures timely processing and …
Imitative Generation Of Optimal Guidance Law Based On Reinforcement Learning, Zhengxuan Jia, Tingyu Lin, Yingying Xiao, Guoqiang Shi, Hao Wang, Bi Zeng, Yiming Ou, Pengpeng Zhao
Journal of System Simulation
Abstract: In the context of high-speed maneuvering target interception, an optimal guidance law generation method for head-on interception, independent of target acceleration estimation, is proposed based on deep reinforcement learning, and its effectiveness is verified through simulation experiments. As the simulation results suggest, the proposed method successfully achieves head-on interception of high-speed maneuvering targets in 3D space, largely reduces the requirement for estimating targets with strong uncertainty, and is more applicable than the optimal control method.
Aircraft Assignment Method For Optimal Utilization Of Maintenance Intervals, Runxia Guo, Yifu Wang
Journal of System Simulation
Abstract: The aircraft assignment problem is studied from a maintenance assurance perspective. To ensure their continued airworthiness, civil aircraft are required to perform maintenance tasks, i.e., scheduled inspections, at specified intervals. The scheduled inspection interval is usually controlled by the number of flight cycles (FC), flight hours (FH), or flight days (FD), whichever comes first. To make balanced use of the inspection interval, an aircraft assignment model for a given fleet size is developed to optimize maintenance interval utilization, and it is solved by a reinforcement learning algorithm to minimize the variance of the …
Reinforcement Learning Approach To Stochastic Vehicle Routing Problem With Correlated Demands, Zangir Iklassov, Ikboljon Sobirov, Ruben Solozabal, Martin Takac
Machine Learning Faculty Publications
We present a novel end-to-end framework for solving the Vehicle Routing Problem with stochastic demands (VRPSD) using Reinforcement Learning (RL). Our formulation incorporates the correlation between stochastic demands through other observable stochastic variables, thereby offering an experimental demonstration of the theoretical premise that non-i.i.d. stochastic demands provide opportunities for improved routing solutions. Our approach bridges the gap in the application of RL to VRPSD and consists of a parameterized stochastic policy optimized using a policy gradient algorithm to generate a sequence of actions that form the solution. Our model outperforms previous state-of-the-art metaheuristics and demonstrates robustness to changes in the …
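The "parameterized stochastic policy optimized using a policy gradient algorithm" mentioned above is standard REINFORCE-style machinery. A minimal sketch of the gradient estimate for a softmax policy over discrete actions follows; all names are illustrative and this is not the authors' code:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of action logits."""
    m = max(logits)
    exp = [math.exp(x - m) for x in logits]
    s = sum(exp)
    return [e / s for e in exp]

def reinforce_grad(logits, action, ret):
    """REINFORCE estimator: gradient of log pi(action) * return w.r.t. the logits.

    Uses d log pi(a) / d logit_i = 1[i == a] - pi(i).
    """
    probs = softmax(logits)
    return [ret * ((1.0 if i == action else 0.0) - p) for i, p in enumerate(probs)]
```

Summing this estimate over the sequence of routing actions in an episode, and stepping the logits in that direction, increases the probability of action sequences that led to high return.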
Transferable Curricula Through Difficulty Conditioned Generators, Sidney Tio, Pradeep Varakantham
Research Collection School Of Computing and Information Systems
Advancements in reinforcement learning (RL) have demonstrated superhuman performance in complex tasks such as StarCraft, Go, and Chess. However, knowledge transfer from artificial "experts" to humans remains a significant challenge. A promising avenue for such transfer is the use of curricula. Recent methods in curriculum generation focus on training RL agents efficiently, yet such methods rely on surrogate measures to track student progress and are not suited for training robots in the real world (or, more ambitiously, humans). In this paper, we introduce a method named Parameterized Environment Response Model (PERM) that shows promising results in training RL agents …
Imitation Improvement Learning For Large-Scale Capacitated Vehicle Routing Problems, The Viet Bui, Tien Mai
Research Collection School Of Computing and Information Systems
Recent works using deep reinforcement learning (RL) to solve routing problems such as the capacitated vehicle routing problem (CVRP) have focused on improvement learning-based methods, which involve improving a given solution until it becomes near-optimal. Although adequate solutions can be achieved for small problem instances, their efficiency degrades for large-scale ones. In this work, we propose a new improvement learning-based framework based on imitation learning where classical heuristics serve as experts to encourage the policy model to mimic and produce similar or better solutions. Moreover, to improve scalability, we propose Clockwise Clustering, a novel augmented framework for decomposing large-scale CVRP into …
Reinforcement Learning For Sequential Decision Making With Constraints, Jiajing Ling
Dissertations and Theses Collection (Open Access)
Reinforcement learning is a widely used approach to tackle problems in sequential decision making where an agent learns from rewards or penalties. However, in decision-making problems that involve safety or limited resources, the agent's exploration is often limited by constraints. To model such problems, constrained Markov decision processes and constrained decentralized partially observable Markov decision processes have been proposed for single-agent and multi-agent settings, respectively. A significant challenge in solving constrained Dec-POMDPs is determining the contribution of each agent to the primary objective and constraint violations. To address this issue, we propose a fictitious play-based method that uses Lagrangian Relaxation …
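Lagrangian Relaxation, named at the end of the abstract, typically folds a cost constraint into the reward and adjusts the multiplier by (sub)gradient ascent on the observed violation. A toy sketch of that update loop, with illustrative step sizes and names rather than anything from the dissertation:

```python
def lagrangian_update(lmbda, avg_cost, cost_limit, lr=0.1):
    """Raise lambda when the constraint is violated, lower it (never below 0) otherwise."""
    return max(0.0, lmbda + lr * (avg_cost - cost_limit))

def penalized_reward(reward, cost, lmbda):
    """The unconstrained objective the policy actually maximizes: reward - lambda * cost."""
    return reward - lmbda * cost
```

With a cost limit of 1.0 and an observed average cost of 3.0, lambda grows over iterations, making costly actions progressively less attractive until the policy satisfies the constraint on average.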
An Investigation Into Machine Learning Techniques For Designing Dynamic Difficulty Agents In Real-Time Games, Ryan Adare Dunagan
Electronic Theses and Dissertations
Video games are an incredibly popular pastime enjoyed by people of all ages worldwide. Many different kinds of games exist, but most feature the player overcoming some challenge, usually through gameplay. These challenges are insurmountable for some people and may turn them off to video games as a pastime. Games can be made more accessible to players with little skill and/or experience through the use of Dynamic Difficulty Adjustment (DDA) systems that adjust the difficulty of the game in response to the player’s performance. This research seeks to establish the effectiveness of machine learning techniques …
Multi-Agent Cooperative Combat Simulation In Naval Battlefield With Reinforcement Learning, Ding Shi, Xuefeng Yan, Lina Gong, Jingxuan Zhang, Donghai Guan, Mingqiang Wei
Journal of System Simulation
Abstract: Due to the rapidly changing situations of future naval battlefields, it is urgent to realize high-quality combat simulation of naval battlefields based on artificial intelligence, so as to comprehensively optimize and improve the combat effectiveness of our army and defeat the enemy. The collaboration of combat units is the key point, and realizing balanced decision-making among multiple agents is the first task. Based on a decoupled prioritized experience replay mechanism and an attention mechanism, a multi-agent reinforcement learning-based cooperative combat simulation (MARL-CCSA) network is proposed. Based on expert experience, a multi-scale reward function is designed, on which a naval …
Research On Unmanned Swarm Combat System Adaptive Evolution Model Simulation, Zhiqiang Li, Yuanlong Li, Laixiang Yin, Xiangping Ma
Journal of System Simulation
Abstract: Given that the intelligent unmanned swarm combat system is mainly composed of large numbers of combat individuals with limited behavioral capabilities, and has limited ability to adapt to changes in the battlefield environment and combat opponents, a learning-evolution method combining a genetic algorithm and reinforcement learning is proposed to construct an individual-based unmanned swarm combat system evolution model. To improve the adaptive evolution efficiency of the swarm combat system, an improved genetic algorithm is proposed to increase the learning and evolution speed of swarm individuals through an individual-specific mutation optimization strategy. Simulation experiment on …
Dqn-Based Joint Scheduling Method Of Heterogeneous Tt&C Resources, Naiyang Xue, Dan Ding, Yutong Jia, Zhiqiang Wang, Yuan Liu
Journal of System Simulation
Abstract: Taking the joint scheduling of heterogeneous TT&C resources as the research object, a deep Q network (DQN) algorithm based on reinforcement learning is proposed. After fully analyzing the characteristics of the joint scheduling problem of heterogeneous TT&C resources and describing the constraints affecting the solution in mathematical language, a resource joint scheduling model is established. From the perspective of applying reinforcement learning, two neural networks with the same structure and action selection strategies based on the ε-greedy algorithm are designed after the Markov decision process description, and a DQN solution framework is established. The simulation results show that the DQN-based heterogeneous …
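The ε-greedy action selection named in the abstract is the standard DQN exploration rule: take a random action with probability ε, otherwise the action with the highest predicted Q-value. A minimal sketch (illustrative, not the paper's code):

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))       # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit
```

In practice ε is annealed from near 1 toward a small floor over training, shifting the agent from exploration to exploitation; the "two neural networks with the same structure" are the usual online/target pair, with the target network periodically synced to stabilize the Q-value estimates.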
Constrained Reinforcement Learning In Hard Exploration Problems, Pankayaraj Pathmanathan, Pradeep Varakantham
Research Collection School Of Computing and Information Systems
One approach to guaranteeing safety in Reinforcement Learning is through cost constraints that are imposed on trajectories. Recent works in constrained RL have developed methods that ensure constraints can be enforced even at learning time while maximizing the overall value of the policy. Unfortunately, as demonstrated in our experimental results, such approaches do not perform well on complex multi-level tasks with longer episode lengths or sparse rewards. To that end, we propose a scalable hierarchical approach for constrained RL problems that employs backward cost value functions in the context of task hierarchy and a novel intrinsic reward function in lower levels …
Reinforcement Learning Enhanced Pichunter For Interactive Search, Zhixin Ma, Jiaxin Wu, Weixiong Loo, Chong-Wah Ngo
Research Collection School Of Computing and Information Systems
With the tremendous increase in video data size, search performance can be impacted significantly. Specifically, in an interactive system, a real-time system allows a user to browse, search, and refine a query. Without a speedy system, the main ingredient in keeping a user focused, an interactive system becomes less effective even with a sophisticated deep learning backbone. This paper addresses this challenge by leveraging approximate search, Bayesian inference, and reinforcement learning. For approximate search, we apply hierarchical navigable small world (HNSW), an efficient approximate nearest neighbor search algorithm. To quickly prune the search scope, we …
The Basil Technique: Bias Adaptive Statistical Inference Learning Agents For Learning From Human Feedback, Jonathan Indigo Watson
Theses and Dissertations--Computer Science
We introduce a novel approach for learning behaviors using human-provided feedback that is subject to systematic bias. Our method, known as BASIL, models the feedback signal as a combination of a heuristic evaluation of an action's utility and a probabilistically-drawn bias value, characterized by unknown parameters. We present both the general framework for our technique and specific algorithms for biases drawn from a normal distribution. We evaluate our approach across various environments and tasks, comparing it to interactive and non-interactive machine learning methods, including deep learning techniques, using human trainers and a synthetic oracle with feedback distorted to varying degrees. …
Navigating Classic Atari Games With Deep Learning, Ayan Abhiranya Singh
Master's Projects
Games for the Atari 2600 console provide great environments for testing reinforcement learning algorithms. In reinforcement learning, an agent typically learns about its environment via the delivery of periodic rewards. Deep Q-Learning, a variant of Q-Learning, uses neural networks to train a Q-function that predicts the highest future reward given an input state and action. Deep Q-Learning has shown great results in training agents to play Atari 2600 games like Space Invaders and Breakout. However, Deep Q-Learning has historically struggled with learning how to play games with greater emphasis on exploration and delayed rewards, like Ms. Pac-Man. In this …
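The Q-function described above is trained toward the standard one-step bootstrap target, Q(s,a) ← Q(s,a) + α(r + γ max_a' Q(s',a') − Q(s,a)). Deep Q-Learning approximates Q with a neural network, but the underlying update is easiest to see in tabular form; this toy sketch (my illustration, not the project's code) shows only the math:

```python
def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: move Q(s,a) toward r + gamma * max_a' Q(s',a').

    Q is a dict keyed by (state, action); missing entries default to 0.0.
    """
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in actions)
    target = r + gamma * best_next
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))
    return Q[(s, a)]
```

The delayed-reward difficulty the paragraph mentions is visible here: if `r` is zero for long stretches (as in Ms. Pac-Man's maze navigation), the target provides almost no learning signal until value slowly propagates backward from rare rewarding states.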
End-To-End Hierarchical Reinforcement Learning With Integrated Subgoal Discovery, Shubham Pateria, Budhitama Subagdja, Ah-Hwee Tan, Chai Quek
Research Collection School Of Computing and Information Systems
Hierarchical reinforcement learning (HRL) is a promising approach to perform long-horizon goal-reaching tasks by decomposing the goals into subgoals. In a holistic HRL paradigm, an agent must autonomously discover such subgoals and also learn a hierarchy of policies that uses them to reach the goals. Recently introduced end-to-end HRL methods accomplish this by using the higher-level policy in the hierarchy to directly search the useful subgoals in a continuous subgoal space. However, learning such a policy may be challenging when the subgoal space is large. We propose integrated discovery of salient subgoals (LIDOSS), an end-to-end HRL method with an integrated …
Reinforcement-Learning-Based Adaptive Tracking Control For A Space Continuum Robot Based On Reinforcement Learning, Da Jiang, Zhiqin Cai, Zhongzhen Liu, Haijun Peng, Zhigang Wu
Journal of System Simulation
Abstract: Aiming at the tracking control of a three-arm space continuum robot for space active debris removal manipulation, an adaptive sliding mode control algorithm based on deep reinforcement learning is proposed. Through a BP network, a data-driven dynamic model is developed as the predictive model to guide the reinforcement learning to adjust the sliding mode controller's parameters online and finally realize real-time tracking control. Simulation results show that the proposed data-driven predictive model can accurately predict the robot's dynamic characteristics, with a relative error within ±1% for random trajectories. Compared with the fixed-parameter sliding mode controller, the proposed adaptive controller …
Interactive Video Corpus Moment Retrieval Using Reinforcement Learning, Zhixin Ma, Chong-Wah Ngo
Research Collection School Of Computing and Information Systems
Known-item video search is effective with a human in the loop to interactively investigate the search result and refine the initial query. Nevertheless, when the first few pages of results are swamped with visually similar items, or the search target is hidden deep in the ranked list, finding the known-item target usually requires a long duration of browsing and result inspection. This paper tackles the problem with reinforcement learning, aiming to reach a search target within a few rounds of interaction by long-term learning from user feedback. Specifically, the system interactively plans a navigation path based on feedback and recommends a potential target that …
Fdrl Approach For Association And Resource Allocation In Multi-Uav Air-To-Ground Iomt Network, Abegaz Mohammed, Aiman Erbad, Hayla Nahom, Abdullatif Albaseer, Mohammed Abdallah, Mohsen Guizani
Machine Learning Faculty Publications
In 6G networks, unmanned aerial vehicles (UAVs) can serve as aerial flying base stations (AFBS) with aerial mobile edge computing (AMEC) server capabilities. AFBS is an increasingly popular solution for delivering time-sensitive applications, extending network coverage, and assisting ground base stations in the healthcare systems for remote areas with limited infrastructure. Furthermore, the UAVs are deployed in the healthcare system to support the Internet of medical things (IoMT) devices in data collection, medical equipment distribution, and providing smart services. However, ensuring the privacy and security of patients’ data with the limited UAV resources is a major challenge. In this paper, …
Sdq: Stochastic Differentiable Quantization With Mixed Precision, Xijie Huang, Zhiqiang Shen, Shichao Li, Zechun Liu, Xianghong Hu, Jeffry Wicaksana, Eric Xing, Kwang Ting Cheng
Machine Learning Faculty Publications
In order to deploy deep models in a computationally efficient manner, model quantization approaches have been frequently used. In addition, as new hardware supporting mixed-bitwidth arithmetic operations emerges, recent research on mixed precision quantization (MPQ) has begun to fully leverage the capacity of representation by searching for optimized bitwidths for different layers and modules in a network. However, previous studies mainly search the MPQ strategy in a costly scheme using reinforcement learning, neural architecture search, etc., or simply utilize partial prior knowledge for bitwidth assignment, which might be biased by locality of information and sub-optimal. In this work, we present …
Application Of Improved Q Learning Algorithm In Job Shop Scheduling Problem, Yejian Zhao, Yanhong Wang, Jun Zhang, Hongxia Yu, Zhongda Tian
Journal of System Simulation
Abstract: Aiming at job shop scheduling in a dynamic environment, a dynamic scheduling algorithm based on an improved Q-learning algorithm and dispatching rules is proposed. The state space of the dynamic scheduling algorithm is described with the concept of "the urgency of remaining tasks", and a reward function with the purpose of "the higher the slack, the higher the penalty" is designed. In view of the problem that the greedy strategy may select sub-optimal actions in the later stages of learning, the traditional Q-learning algorithm is improved by introducing an action selection strategy based on the …
Learning To Generalize Dispatching Rules On The Job Shop Scheduling, Zangir Iklassov, Dmitrii Medvedev, Ruben Solozabal, Martin Takac
Machine Learning Faculty Publications
This paper introduces a Reinforcement Learning approach to better generalize heuristic dispatching rules on the Job-shop Scheduling Problem (JSP). Current models on the JSP do not focus on generalization, although, as we show in this work, this is key to learning better heuristics on the problem. A well-known technique to improve generalization is to learn on increasingly complex instances using Curriculum Learning (CL). However, as many works in the literature indicate, this technique might suffer from catastrophic forgetting when transferring the learned skills between different problem sizes. To address this issue, we introduce a novel Adversarial Curriculum Learning (ACL) strategy, …
Offline Reinforcement Learning With Causal Structured World Models, Zheng-Mao Zhu, Xiong-Hui Chen, Hong-Long Tian, Kun Zhang, Yang Yu
Machine Learning Faculty Publications
Model-based methods have recently shown promise for offline reinforcement learning (RL), which aims to learn good policies from historical data without interacting with the environment. Previous model-based offline RL methods learn fully connected networks as world-models that map states and actions to next-step states. However, it stands to reason that a world-model should adhere to the underlying causal effect, so that it supports learning an effective policy that generalizes well to unseen states. In this paper, we first provide theoretical results that causal world-models can outperform plain world-models for offline RL by incorporating the causal structure into the generalization error …
Reinforcement Learning-Based Interactive Video Search, Zhixin Ma, Jiaxin Wu, Zhijian Hou, Chong-Wah Ngo
Research Collection School Of Computing and Information Systems
Despite the rapid progress in text-to-video search due to the advancement of cross-modal representation learning, existing techniques still fall short in helping users rapidly identify their search targets. In particular, when a system suggests a long list of similar candidates, the user needs to painstakingly inspect every search result. The experience is frustrating, with repeated watching of similar clips, and, more frustratingly, the search targets may be overlooked due to mental tiredness. This paper explores reinforcement learning-based (RL) search to relieve the user of the burden of brute-force inspection. Specifically, the system maintains a graph …
Pervasive Machine Learning For Smart Radio Environments Enabled By Reconfigurable Intelligent Surfaces, George C. Alexandropoulos, Kyriakos Stylianopoulos, Chongwen Huang, Chau Yuen, Mehdi Bennis, Mérouane Debbah
Machine Learning Faculty Publications
The emerging technology of Reconfigurable Intelligent Surfaces (RISs) is provisioned as an enabler of smart wireless environments, offering a highly scalable, low-cost, hardware-efficient, and almost energy-neutral solution for dynamic control of the propagation of electromagnetic signals over the wireless medium, ultimately providing increased environmental intelligence for diverse operation objectives. One of the major challenges with the envisioned dense deployment of RISs in such reconfigurable radio environments is the efficient configuration of multiple metasurfaces with limited, or even the absence of, computing hardware. In this paper, we consider multi-user and multi-RIS-empowered wireless systems, and present a thorough survey of the online …
Research On The Construction Method Of Simulation Evaluation Index Of Operation Effectiveness Operation Concept Traction, Ziwei Zhang, Liang Li, Zhiming Dong, Yifei Wang, Li Duan
Journal of System Simulation
Abstract: Agents are difficult to model and simulate directly due to the complexity of their interaction and learning behaviors. Aiming at common problems in the discrete simulation of agents, the event transfer mechanism of the discrete event system specification (DEVS) atomic model is applied to express the interaction and learning of an agent. Through the agent's interaction mode, the transfer control of multi-state external events, the port connection mode, and the introduction of a reinforcement-learning event transfer representation, a discrete simulation construction method for agents based on the DEVS atomic model …