Open Access. Powered by Scholars. Published by Universities.®
![Digital Commons Network](http://assets.bepress.com/20200205/img/dcn/DCsunburst.png)
Physical Sciences and Mathematics Commons™
- Discipline
- Computer Sciences (88)
- Artificial Intelligence and Robotics (39)
- Engineering (27)
- Databases and Information Systems (23)
- Computer Engineering (11)
- Operations Research, Systems Engineering and Industrial Engineering (11)
- Social and Behavioral Sciences (10)
- Numerical Analysis and Scientific Computing (9)
- OS and Networks (9)
- Public Affairs, Public Policy and Public Administration (9)
- Transportation (9)
- Theory and Algorithms (8)
- Electrical and Computer Engineering (5)
- Computer and Systems Architecture (4)
- Graphics and Human Computer Interfaces (3)
- Information Security (3)
- Life Sciences (3)
- Software Engineering (3)
- Chemistry (2)
- Digital Communications and Networking (2)
- Mathematics (2)
- Other Chemistry (2)
- Programming Languages and Compilers (2)
- Statistics and Probability (2)
- Agriculture (1)
- Analytical, Diagnostic and Therapeutic Techniques and Equipment (1)
- Biological and Chemical Physics (1)
- Business (1)
- Categorical Data Analysis (1)
- Institution
- Singapore Management University (49)
- MBZUAI (7)
- Missouri University of Science and Technology (6)
- Brigham Young University (4)
- Air Force Institute of Technology (3)
- Technological University Dublin (3)
- University of Massachusetts Amherst (3)
- Chapman University (2)
- Edith Cowan University (2)
- University of Nebraska - Lincoln (2)
- University of Nevada, Las Vegas (2)
- Zayed University (2)
- Bucknell University (1)
- City University of New York (CUNY) (1)
- Fordham University (1)
- Georgia Southern University (1)
- Gettysburg College (1)
- Loyola University Chicago (1)
- Old Dominion University (1)
- Western Washington University (1)
- Publication Year
- Publication
- Research Collection School Of Computing and Information Systems (48)
- Faculty Publications (8)
- Machine Learning Faculty Publications (7)
- Articles (3)
- Computer Science Department Faculty Publication Series (3)
- Electrical and Computer Engineering Faculty Research & Creative Works (3)
- All Works (2)
- Computer Science Faculty Research & Creative Works (2)
- Department of Computer Science and Engineering: Dissertations, Theses, and Student Research (2)
- Electrical & Computer Engineering Faculty Research (2)
- Biology, Chemistry, and Environmental Sciences Faculty Articles and Research (1)
- Computer Science Faculty Publications (1)
- Computer Science Graduate and Undergraduate Student Scholarship (1)
- Computer Science: Faculty Publications and Other Works (1)
- Department of Mathematical Sciences Faculty Publications (1)
- Faculty Conference Papers and Presentations (1)
- Mathematics and Statistics Faculty Research & Creative Works (1)
- Mathematics, Physics, and Computer Science Faculty Articles and Research (1)
- Publications and Research (1)
- Research Collection Lee Kong Chian School Of Business (1)
- Research outputs 2011 (1)
- Research outputs 2022 to 2026 (1)
- VMASC Publications (1)
Articles 1 - 30 of 93
Full-Text Articles in Physical Sciences and Mathematics
De Novo Drug Design Using Transformer-Based Machine Translation And Reinforcement Learning Of An Adaptive Monte Carlo Tree Search, Dony Ang, Cyril Rakovski, Hagop S. Atamian
Biology, Chemistry, and Environmental Sciences Faculty Articles and Research
The discovery of novel therapeutic compounds through de novo drug design represents a critical challenge in the field of pharmaceutical research. Traditional drug discovery approaches are often resource-intensive and time-consuming, leading researchers to explore innovative methods that harness the power of deep learning and reinforcement learning techniques. Here, we introduce a novel drug design approach called drugAI that leverages the Encoder–Decoder Transformer architecture in tandem with Reinforcement Learning via a Monte Carlo Tree Search (RL-MCTS) to expedite the process of drug discovery while ensuring the production of valid small molecules with drug-like characteristics and strong binding affinities towards …
Energy Consumption Optimization Of Uav-Assisted Traffic Monitoring Scheme With Tiny Reinforcement Learning, Xiangjie Kong, Chenhao Ni, Gaohui Duan, Guojiang Shen, Yao Yang, Sajal K. Das
Computer Science Faculty Research & Creative Works
Unmanned Aerial Vehicles (UAVs) can capture pictures of road conditions in all directions and from different angles by carrying high-definition cameras, which helps gather relevant road data more effectively. However, due to their limited energy capacity, drones face challenges in performing related tasks for an extended period. Therefore, a crucial concern is how to plan the path of UAVs and minimize energy consumption. To address this problem, we propose a multi-agent deep deterministic policy gradient (MADDPG)-based algorithm for UAV path planning (MAUP). Considering the energy consumption and memory usage of MAUP, we have conducted optimizations to reduce consumption on …
A New Cache Replacement Policy In Named Data Network Based On Fib Table Information, Mehran Hosseinzadeh, Neda Moghim, Samira Taheri, Nasrin Gholami
VMASC Publications
Named Data Network (NDN) is proposed for the Internet as an information-centric architecture. Storing content in the router's cache plays a significant role in NDN. When a router's cache becomes full, a cache replacement policy determines which content should be discarded to store the new content. This paper proposes a new cache replacement policy called Discard of Fast Retrievable Content (DFRC). In DFRC, the retrieval time of the content is evaluated using the FIB table information, and content with a shorter retrieval time receives a higher discard priority. An impact weight is also used to involve both the grade of retrieval …
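The discard rule described in the abstract can be illustrated with a minimal sketch (class and method names are hypothetical, and the impact weight is omitted for brevity): when the cache is full, the entry with the shortest estimated retrieval time is evicted first, on the premise that fast-retrievable content is the cheapest to fetch again.

```python
# Illustrative sketch of a DFRC-style eviction rule (not the paper's exact
# algorithm): discard the cached item that is fastest to re-retrieve.

class FastRetrievableDiscardCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}           # content name -> content bytes
        self.retrieval_time = {}  # content name -> estimated retrieval time
                                  # (in DFRC this estimate comes from the FIB)

    def insert(self, name, content, retrieval_time):
        if name not in self.store and len(self.store) >= self.capacity:
            # Highest discard priority = smallest retrieval time.
            victim = min(self.retrieval_time, key=self.retrieval_time.get)
            del self.store[victim]
            del self.retrieval_time[victim]
        self.store[name] = content
        self.retrieval_time[name] = retrieval_time

    def get(self, name):
        return self.store.get(name)

cache = FastRetrievableDiscardCache(capacity=2)
cache.insert("/video/a", b"A", retrieval_time=5.0)
cache.insert("/video/b", b"B", retrieval_time=1.0)  # fastest to re-fetch
cache.insert("/video/c", b"C", retrieval_time=3.0)  # evicts /video/b
print(sorted(cache.store))  # ['/video/a', '/video/c']
```

The full policy additionally weights this priority, so eviction is not decided by retrieval time alone.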
Neural Airport Ground Handling, Yaoxin Wu, Jianan Zhou, Yunwen Xia, Xianli Zhang, Zhiguang Cao, Jie Zhang
Research Collection School Of Computing and Information Systems
Airport ground handling (AGH) offers necessary operations to flights during their turnarounds and is of great importance to the efficiency of airport management and the economics of aviation. Such a problem involves the interplay among the operations, which leads to NP-hard problems with complex constraints. Hence, existing methods for AGH are usually designed with massive domain knowledge but still fail to yield high-quality solutions efficiently. In this paper, we aim to enhance the solution quality and computation efficiency for solving AGH. Particularly, we first model AGH as a multiple-fleet vehicle routing problem (VRP) with miscellaneous constraints including precedence, time windows, …
Decentralized Multimedia Data Sharing In Iov: A Learning-Based Equilibrium Of Supply And Demand, Jiani Fan, Minrui Xu, Jiale Guo, Lwin Khin Shar, Jiawen Kang, Dusit Niyato, Kwok-Yan Lam
Research Collection School Of Computing and Information Systems
The Internet of Vehicles (IoV) has great potential to transform transportation systems by enhancing road safety, reducing traffic congestion, and improving user experience through onboard infotainment applications. Decentralized data sharing can improve security, privacy, reliability, and facilitate infotainment data sharing in IoVs. However, decentralized data sharing may not achieve the expected efficiency if there are IoV users who only want to consume the shared data but are not willing to contribute their own data to the community, resulting in incomplete information observed by other vehicles and infrastructure, which can introduce additional transmission latency. Therefore, in this paper, by modeling the …
Dynamic Influence Diagram-Based Deep Reinforcement Learning Framework And Application For Decision Support For Operators In Control Rooms, Joseph Mietkiewicz, Ammar N. Abbas, Chidera Winifred Amazu, Anders L. Madsen, Gabriele Baldissone
Articles
In today’s complex industrial environment, operators are often faced with challenging situations that require quick and accurate decision-making. The human-machine interface (HMI) can display too much information, leading to information overload and potentially compromising the operator’s ability to respond effectively. To address this challenge, decision support models are needed to assist operators in identifying and responding to potential safety incidents. In this paper, we present an experiment to evaluate the effectiveness of a recommendation system in addressing the challenge of information overload. The case study focuses on a formaldehyde production simulator and examines the performance of an improved Human-Machine Interface …
Asynchronous Fdrl-Based Low-Latency Computation Offloading For Integrated Terrestrial And Non-Terrestrial Power Iot, Sifeng Li, Sunxuan Zhang, Zhao Wang, Zhenyu Zhou, Xiaoyan Wang, Shahid Mumtaz, Mohsen Guizani, Valerio Frascolla
Machine Learning Faculty Publications
Integrated terrestrial and non-terrestrial power internet of things (IPIoT) has emerged as a paradigm shift to three-dimensional vertical communication networks for power systems in the 6G era. Computation offloading plays a key role in enabling real-time data processing and analysis for electric services. However, computation offloading in IPIoT still faces challenges of coupling between task offloading and computation resource allocation, resource heterogeneity and dynamics, and degraded model training caused by electromagnetic interference (EMI). In this article, we propose an asynchronous federated deep reinforcement learning (AFDRL)-based computation offloading framework for IPIoT, where models are uploaded asynchronously for federated averaging to relieve network …
Reinforcement Learning Approach To Stochastic Vehicle Routing Problem With Correlated Demands, Zangir Iklassov, Ikboljon Sobirov, Ruben Solozabal, Martin Takac
Machine Learning Faculty Publications
We present a novel end-to-end framework for solving the Vehicle Routing Problem with stochastic demands (VRPSD) using Reinforcement Learning (RL). Our formulation incorporates the correlation between stochastic demands through other observable stochastic variables, thereby offering an experimental demonstration of the theoretical premise that non-i.i.d. stochastic demands provide opportunities for improved routing solutions. Our approach bridges the gap in the application of RL to VRPSD and consists of a parameterized stochastic policy optimized using a policy gradient algorithm to generate a sequence of actions that form the solution. Our model outperforms previous state-of-the-art metaheuristics and demonstrates robustness to changes in the …
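As background for the policy-gradient optimization the abstract mentions, the sketch below runs REINFORCE on a toy two-route choice with noisy rewards (illustrative only; the paper's policy is a learned sequence model for the full VRPSD, and all names here are made up). The softmax policy shifts probability toward the route with the higher expected reward.

```python
import math
import random

random.seed(0)
theta = [0.0, 0.0]  # one logit per route

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reward(action):
    # Route 1 has the higher expected reward under stochastic demand.
    return (0.3 if action == 0 else 0.8) + random.gauss(0, 0.1)

lr = 0.1
for _ in range(2000):
    probs = softmax(theta)
    a = random.choices([0, 1], weights=probs)[0]
    r = reward(a)
    # REINFORCE update: grad of log pi(a) w.r.t. theta_i is 1[a=i] - pi(i).
    for i in range(2):
        theta[i] += lr * r * ((1.0 if i == a else 0.0) - probs[i])

print(round(softmax(theta)[1], 2))  # probability of the better route
```

In the paper's setting the policy conditions on observable variables correlated with demand, which is what lets it beat demand-agnostic baselines.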
Transferable Curricula Through Difficulty Conditioned Generators, Sidney Tio, Pradeep Varakantham
Research Collection School Of Computing and Information Systems
Advancements in reinforcement learning (RL) have demonstrated superhuman performance in complex tasks such as StarCraft, Go, and Chess. However, knowledge transfer from Artificial "Experts" to humans remains a significant challenge. A promising avenue for such transfer would be the use of curricula. Recent methods in curricula generation focus on training RL agents efficiently, yet such methods rely on surrogate measures to track student progress, and are not suited for training robots in the real world (or, more ambitiously, humans). In this paper, we introduce a method named Parameterized Environment Response Model (PERM) that shows promising results in training RL agents …
Multi-View Hypergraph Contrastive Policy Learning For Conversational Recommendation, Sen Zhao, Wei Wei, Xian-Ling Mao, Shuai Yang Zhu, Zujie Wen, Dangyang Chen, Feida Zhu
Research Collection School Of Computing and Information Systems
Conversational recommendation systems (CRS) aim to interactively acquire user preferences and accordingly recommend items to users. Accurately learning the dynamic user preferences is of crucial importance for CRS. Previous works learn the user preferences with pairwise relations from the interactive conversation and item knowledge, while largely ignoring the fact that factors for a relationship in CRS are multiplex. Specifically, the user likes/dislikes the items that satisfy some attributes (Like/Dislike view). Moreover, social influence is another important factor that affects user preference towards an item (Social view), yet it is largely ignored by previous works in CRS. The user preferences from these …
Imitation Improvement Learning For Large-Scale Capacitated Vehicle Routing Problems, The Viet Bui, Tien Mai
Research Collection School Of Computing and Information Systems
Recent works using deep reinforcement learning (RL) to solve routing problems such as the capacitated vehicle routing problem (CVRP) have focused on improvement learning-based methods, which involve improving a given solution until it becomes near-optimal. Although adequate solutions can be achieved for small problem instances, their efficiency degrades for large-scale ones. In this work, we propose a new improvement learning-based framework based on imitation learning where classical heuristics serve as experts to encourage the policy model to mimic and produce similar or better solutions. Moreover, to improve scalability, we propose Clockwise Clustering, a novel augmented framework for decomposing large-scale CVRP into …
Dynamic Police Patrol Scheduling With Multi-Agent Reinforcement Learning, Songhan Wong, Waldy Joe, Hoong Chuin Lau
Research Collection School Of Computing and Information Systems
Effective police patrol scheduling is essential in projecting police presence and ensuring readiness in responding to unexpected events in urban environments. However, scheduling patrols can be a challenging task as it requires balancing two conflicting objectives, namely projecting presence (proactive patrol) and incident response (reactive patrol). This task is made even more challenging by the fact that patrol schedules do not remain static, as occurrences of dynamic incidents can disrupt the existing schedules. In this paper, we propose a solution to this problem using Multi-Agent Reinforcement Learning (MARL) to address the Dynamic Bi-objective Police Patrol Dispatching and Rescheduling Problem …
Sim-To-Real Reinforcement Learning Framework For Autonomous Aerial Leaf Sampling, Ashraful Islam
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Using unmanned aerial systems (UAS) for leaf sampling is contributing to a better understanding of the influence of climate change on plant species, and the dynamics of forest ecology by studying hard-to-reach tree canopies. Currently, multiple skilled operators are required for UAS maneuvering and using the leaf sampling tool. This often limits sampling to only the canopy top or periphery. Sim-to-real reinforcement learning (RL) can be leveraged to tackle challenges in the autonomous operation of aerial leaf sampling in the changing environment of a tree canopy. However, transferring an RL controller that is learned in simulation to real UAS …
Reinforced Adaptation Network For Partial Domain Adaptation, Keyu Wu, Min Wu, Zhenghua Chen, Ruibing Jin, Wei Cui, Zhiguang Cao, Xiaoli Li
Research Collection School Of Computing and Information Systems
Domain adaptation enables generalized learning in new environments by transferring knowledge from label-rich source domains to label-scarce target domains. As a more realistic extension, partial domain adaptation (PDA) relaxes the assumption of fully shared label space, and instead deals with the scenario where the target label space is a subset of the source label space. In this paper, we propose a Reinforced Adaptation Network (RAN) to address the challenging PDA problem. Specifically, a deep reinforcement learning model is proposed to learn source data selection policies. Meanwhile, a domain adaptation model is presented to simultaneously determine rewards and learn domain-invariant feature …
A Review On Derivative Hedging Using Reinforcement Learning, Peng Liu
Research Collection Lee Kong Chian School Of Business
Hedging is a common trading activity to manage the risk of engaging in transactions that involve derivatives such as options. Perfect and timely hedging, however, is an impossible task in the real market, which is characterized by discrete-time transactions with costs. Recent years have witnessed the use of reinforcement learning (RL) in formulating optimal hedging strategies. Specifically, different RL algorithms have been applied to learn the optimal offsetting position based on market conditions, offering an automatic risk management solution that proposes optimal hedging strategies while catering to both market dynamics and restrictions. In this article, the author provides a comprehensive review of the use of …
Constrained Reinforcement Learning In Hard Exploration Problems, Pankayaraj Pathmanathan, Pradeep Varakantham
Research Collection School Of Computing and Information Systems
One approach to guaranteeing safety in Reinforcement Learning is through cost constraints that are imposed on trajectories. Recent works in constrained RL have developed methods that ensure constraints can be enforced even at learning time while maximizing the overall value of the policy. Unfortunately, as demonstrated in our experimental results, such approaches do not perform well on complex multi-level tasks, with longer episode lengths or sparse rewards. To that end, we propose a scalable hierarchical approach for constrained RL problems that employs backward cost value functions in the context of task hierarchy and a novel intrinsic reward function in lower levels …
Reinforcement Learning Enhanced Pichunter For Interactive Search, Zhixin Ma, Jiaxin Wu, Weixiong Loo, Chong-Wah Ngo
Research Collection School Of Computing and Information Systems
With the tremendous increase in video data size, search performance can be impacted significantly. Specifically, an interactive system must respond in real time so that a user can browse, search, and refine a query. Without speed, the main ingredient in keeping a user engaged and focused, an interactive system becomes less effective even with a sophisticated deep learning system. This paper addresses this challenge by leveraging approximate search, Bayesian inference, and reinforcement learning. For approximate search, we apply hierarchical navigable small world (HNSW), an efficient approximate nearest neighbor search algorithm. To quickly prune the search scope, we …
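The core idea behind HNSW-style approximate nearest-neighbor search can be sketched as a greedy walk over a neighbor graph. The toy below is single-layer and brute-force-built (real HNSW builds a multi-layer graph incrementally and is typically used via a library such as hnswlib); all names and parameters here are illustrative.

```python
import math
import random

random.seed(1)
points = [(random.random(), random.random()) for _ in range(200)]

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Build a k-nearest-neighbor graph (brute force here, for illustration only).
k = 8
neighbors = {
    i: sorted(range(len(points)), key=lambda j: dist(points[i], points[j]))[1:k + 1]
    for i in range(len(points))
}

def greedy_search(query, start=0):
    # Walk to whichever neighbor is closest to the query; stop at a local
    # minimum, which serves as the approximate nearest neighbor.
    current = start
    while True:
        best = min(neighbors[current] + [current],
                   key=lambda j: dist(points[j], query))
        if best == current:
            return current
        current = best

q = (0.5, 0.5)
approx = greedy_search(q)
print(round(dist(points[approx], q), 3))
```

Each query then inspects only a handful of nodes instead of all 200 points, which is the speed-up an interactive system needs.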
Learning Feature Embedding Refiner For Solving Vehicle Routing Problems, Jingwen Li, Yining Ma, Zhiguang Cao, Yaoxin Wu, Wen Song, Jie Zhang, Yeow Meng Chee
Research Collection School Of Computing and Information Systems
While the encoder–decoder structure is widely used in the recent neural construction methods for learning to solve vehicle routing problems (VRPs), they are less effective in searching solutions due to deterministic feature embeddings and deterministic probability distributions. In this article, we propose the feature embedding refiner (FER) with a novel and generic encoder–refiner–decoder structure to boost the existing encoder–decoder structured deep models. It is model-agnostic in that the encoder and the decoder can come from any pretrained neural construction method. Regarding the introduced refiner network, we design its architecture by combining the standard gated recurrent units (GRU) cell with two new …
Continual Optimal Adaptive Tracking Of Uncertain Nonlinear Continuous-Time Systems Using Multilayer Neural Networks, Irfan Ganie, S. (Sarangapani) Jagannathan
Electrical and Computer Engineering Faculty Research & Creative Works
This study provides a lifelong integral reinforcement learning (LIRL)-based optimal tracking scheme for uncertain nonlinear continuous-time (CT) systems using multilayer neural network (MNN). In this LIRL framework, the optimal control policies are generated by using both the critic neural network (NN) weights and single-layer NN identifier. The critic MNN weight tuning is accomplished using an improved singular value decomposition (SVD) of its activation function gradient. The NN identifier, on the other hand, provides the control coefficient matrix for computing the control policies. An online weight velocity attenuation (WVA)-based consolidation scheme is proposed wherein the significance of weights is derived by …
Intelligent Adaptive Gossip-Based Broadcast Protocol For Uav-Mec Using Multi-Agent Deep Reinforcement Learning, Zen Ren, Xinghua Li, Yinbin Miao, Zhuowen Li, Zihao Wang, Mengyao Zhu, Ximeng Liu, Robert H. Deng
Research Collection School Of Computing and Information Systems
UAV-assisted mobile edge computing (UAV-MEC) has been proposed to offer computing resources for smart devices and user equipment. UAV cluster aided MEC rather than one UAV-aided MEC as edge pool is the newest edge computing architecture. Unfortunately, the data packet exchange during edge computing within the UAV cluster hasn't received enough attention. UAVs need to collaborate for the wide implementation of MEC, relying on the gossip-based broadcast protocol. However, gossip has the problem of long propagation delay, where the forwarding probability and neighbors are two factors that are difficult to balance. The existing works improve gossip along only one factor, …
Malbot-Drl: Malware Botnet Detection Using Deep Reinforcement Learning In Iot Networks, Mohammad Al-Fawa'reh, Jumana Abu-Khalaf, Patryk Szewczyk, James J. Kang
Research outputs 2022 to 2026
In the dynamic landscape of cyber threats, multi-stage malware botnets have surfaced as significant threats of concern. These sophisticated threats can exploit Internet of Things (IoT) devices to undertake an array of cyberattacks, ranging from basic infections to complex operations such as phishing, cryptojacking, and distributed denial of service (DDoS) attacks. Existing machine learning solutions are often constrained by their limited generalizability across various datasets and their inability to adapt to the mutable patterns of malware attacks in real world environments, a challenge known as model drift. This limitation highlights the pressing need for adaptive Intrusion Detection Systems (IDS), capable …
End-To-End Hierarchical Reinforcement Learning With Integrated Subgoal Discovery, Shubham Pateria, Budhitama Subagdja, Ah-Hwee Tan, Chai Quek
Research Collection School Of Computing and Information Systems
Hierarchical reinforcement learning (HRL) is a promising approach to perform long-horizon goal-reaching tasks by decomposing the goals into subgoals. In a holistic HRL paradigm, an agent must autonomously discover such subgoals and also learn a hierarchy of policies that uses them to reach the goals. Recently introduced end-to-end HRL methods accomplish this by using the higher-level policy in the hierarchy to directly search the useful subgoals in a continuous subgoal space. However, learning such a policy may be challenging when the subgoal space is large. We propose integrated discovery of salient subgoals (LIDOSS), an end-to-end HRL method with an integrated …
Interactive Video Corpus Moment Retrieval Using Reinforcement Learning, Zhixin Ma, Chong-Wah Ngo
Research Collection School Of Computing and Information Systems
Known-item video search is effective with a human in the loop to interactively investigate the search result and refine the initial query. Nevertheless, when the first few pages of results are swamped with visually similar items, or the search target is hidden deep in the ranked list, finding the known-item target usually requires a long duration of browsing and result inspection. This paper tackles the problem by reinforcement learning, aiming to reach a search target within a few rounds of interaction by long-term learning from user feedback. Specifically, the system interactively plans a navigation path based on feedback and recommends a potential target that …
Fdrl Approach For Association And Resource Allocation In Multi-Uav Air-To-Ground Iomt Network, Abegaz Mohammed, Aiman Erbad, Hayla Nahom, Abdullatif Albaseer, Mohammed Abdallah, Mohsen Guizani
Machine Learning Faculty Publications
In 6G networks, unmanned aerial vehicles (UAVs) can serve as aerial flying base stations (AFBS) with aerial mobile edge computing (AMEC) server capabilities. AFBS is an increasingly popular solution for delivering time-sensitive applications, extending network coverage, and assisting ground base stations in the healthcare systems for remote areas with limited infrastructure. Furthermore, the UAVs are deployed in the healthcare system to support the Internet of medical things (IoMT) devices in data collection, medical equipment distribution, and providing smart services. However, ensuring the privacy and security of patients’ data with the limited UAV resources is a major challenge. In this paper, …
An Adaptive Multi-Level Quantization-Based Reinforcement Learning Model For Enhancing Uav Landing On Moving Targets, Najmaddin Abo Mosali, Syariful Syafiq Shamsudin, Salama A. Mostafa, Omar Alfandi, Rosli Omar, Najib Al-Fadhali, Mazin Abed Mohammed, R. Q. Malik, Mustafa Musa Jaber, Abdu Saif
All Works
The autonomous landing of an unmanned aerial vehicle (UAV) on a moving platform is an essential functionality in various UAV-based applications. It can be added to a teleoperation UAV system or part of an autonomous UAV control system. Various robust and predictive control systems based on the traditional control theory are used for operating a UAV. Recently, some attempts were made to land a UAV on a moving target using reinforcement learning (RL). Vision is used as a typical way of sensing and detecting the moving target. Mainly, the related works have deployed a deep neural network (DNN) for RL, which …
Sdq: Stochastic Differentiable Quantization With Mixed Precision, Xijie Huang, Zhiqiang Shen, Shichao Li, Zechun Liu, Xianghong Hu, Jeffry Wicaksana, Eric Xing, Kwang Ting Cheng
Machine Learning Faculty Publications
In order to deploy deep models in a computationally efficient manner, model quantization approaches have been frequently used. In addition, as new hardware supports mixed-bitwidth arithmetic operations, recent research on mixed precision quantization (MPQ) has begun to fully leverage the capacity of representation by searching optimized bitwidths for different layers and modules in a network. However, previous studies mainly search the MPQ strategy in a costly scheme using reinforcement learning, neural architecture search, etc., or simply utilize partial prior knowledge for bitwidth assignment, which might be biased by locality of information and is sub-optimal. In this work, we present …
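For context on why per-layer bitwidths matter, the sketch below applies plain uniform symmetric quantization at a few fixed bitwidths (a background toy, not SDQ itself, which learns the bitwidth assignment via differentiable stochastic quantization): the maximum quantization error shrinks as the bitwidth grows, so layers that tolerate larger error can safely use fewer bits.

```python
# Uniform symmetric quantization at a given bitwidth (illustrative toy).

def quantize(weights, bits):
    qmax = 2 ** (bits - 1) - 1            # largest signed integer level
    scale = max(abs(w) for w in weights) / qmax
    # Snap each weight to the nearest integer level, then rescale.
    return [round(w / scale) * scale for w in weights]

weights = [0.31, -0.84, 0.05, 0.66, -0.12]
for bits in (2, 4, 8):
    err = max(abs(w - q) for w, q in zip(weights, quantize(weights, bits)))
    print(bits, round(err, 4))
```

Mixed-precision methods exploit exactly this trade-off by spending the bit budget where the error hurts accuracy most.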
Learning To Generalize Dispatching Rules On The Job Shop Scheduling, Zangir Iklassov, Dmitrii Medvedev, Ruben Solozabal, Martin Takac
Machine Learning Faculty Publications
This paper introduces a Reinforcement Learning approach to better generalize heuristic dispatching rules on the Job-shop Scheduling Problem (JSP). Current models on the JSP do not focus on generalization, although, as we show in this work, this is key to learning better heuristics on the problem. A well-known technique to improve generalization is to learn on increasingly complex instances using Curriculum Learning (CL). However, as many works in the literature indicate, this technique might suffer from catastrophic forgetting when transferring the learned skills between different problem sizes. To address this issue, we introduce a novel Adversarial Curriculum Learning (ACL) strategy, …
Offline Reinforcement Learning With Causal Structured World Models, Zheng-Mao Zhu, Xiong-Hui Chen, Hong-Long Tian, Kun Zhang, Yang Yu
Machine Learning Faculty Publications
Model-based methods have recently shown promise for offline reinforcement learning (RL), aiming to learn good policies from historical data without interacting with the environment. Previous model-based offline RL methods learn fully connected nets as world-models to map the states and actions to the next-step states. However, it is sensible that a world-model should adhere to the underlying causal effect such that it will support learning an effective policy that generalizes well in unseen states. In this paper, we first provide theoretical results that causal world-models can outperform plain world-models for offline RL by incorporating the causal structure into the generalization error …
Reinforcement Learning-Based Interactive Video Search, Zhixin Ma, Jiaxin Wu, Zhijian Hou, Chong-Wah Ngo
Research Collection School Of Computing and Information Systems
Despite the rapid progress in text-to-video search due to the advancement of cross-modal representation learning, the existing techniques still fall short in helping users to rapidly identify the search targets. Particularly, in the situation that a system suggests a long list of similar candidates, the user needs to painstakingly inspect every search result. The experience is frustrating, with repeated watching of similar clips, and more frustratingly, the search targets may be overlooked due to mental fatigue. This paper explores reinforcement learning-based (RL) searching to relieve the user from the burden of brute force inspection. Specifically, the system maintains a graph …
Pervasive Machine Learning For Smart Radio Environments Enabled By Reconfigurable Intelligent Surfaces, George C. Alexandropoulos, Kyriakos Stylianopoulos, Chongwen Huang, Chau Yuen, Mehdi Bennis, Mérouane Debbah
Machine Learning Faculty Publications
The emerging technology of Reconfigurable Intelligent Surfaces (RISs) is provisioned as an enabler of smart wireless environments, offering a highly scalable, low-cost, hardware-efficient, and almost energy-neutral solution for dynamic control of the propagation of electromagnetic signals over the wireless medium, ultimately providing increased environmental intelligence for diverse operation objectives. One of the major challenges with the envisioned dense deployment of RISs in such reconfigurable radio environments is the efficient configuration of multiple metasurfaces with limited, or even the absence of, computing hardware. In this paper, we consider multi-user and multi-RIS-empowered wireless systems, and present a thorough survey of the online …