Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 1 - 30 of 187
Full-Text Articles in Physical Sciences and Mathematics
De Novo Drug Design Using Transformer-Based Machine Translation And Reinforcement Learning Of An Adaptive Monte Carlo Tree Search, Dony Ang, Cyril Rakovski, Hagop S. Atamian
Biology, Chemistry, and Environmental Sciences Faculty Articles and Research
The discovery of novel therapeutic compounds through de novo drug design represents a critical challenge in the field of pharmaceutical research. Traditional drug discovery approaches are often resource intensive and time consuming, leading researchers to explore innovative methods that harness the power of deep learning and reinforcement learning techniques. Here, we introduce a novel drug design approach called drugAI that leverages the Encoder–Decoder Transformer architecture in tandem with Reinforcement Learning via a Monte Carlo Tree Search (RL-MCTS) to expedite the process of drug discovery while ensuring the production of valid small molecules with drug-like characteristics and strong binding affinities towards …
Energy Consumption Optimization Of UAV-Assisted Traffic Monitoring Scheme With Tiny Reinforcement Learning, Xiangjie Kong, Chenhao Ni, Gaohui Duan, Guojiang Shen, Yao Yang, Sajal K. Das
Computer Science Faculty Research & Creative Works
Unmanned Aerial Vehicles (UAVs) can capture pictures of road conditions in all directions and from different angles by carrying high-definition cameras, which helps gather relevant road data more effectively. However, due to their limited energy capacity, drones face challenges in performing related tasks for an extended period. Therefore, a crucial concern is how to plan the path of UAVs and minimize energy consumption. To address this problem, we propose a multi-agent deep deterministic policy gradient based (MADDPG) algorithm for UAV path planning (MAUP). Considering the energy consumption and memory usage of MAUP, we have conducted optimizations to reduce consumption on …
A New Cache Replacement Policy In Named Data Network Based On FIB Table Information, Mehran Hosseinzadeh, Neda Moghim, Samira Taheri, Nasrin Gholami
VMASC Publications
Named Data Network (NDN) is proposed for the Internet as an information-centric architecture. Content storing in the router’s cache plays a significant role in NDN. When a router’s cache becomes full, a cache replacement policy determines which content should be discarded for the new content storage. This paper proposes a new cache replacement policy called Discard of Fast Retrievable Content (DFRC). In DFRC, the retrieval time of the content is evaluated using the FIB table information, and the content with less retrieval time receives more discard priority. An impact weight is also used to involve both the grade of retrieval …
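The retrieval-time rule described in this abstract can be illustrated with a short sketch. It is not the authors' implementation: the cache is modeled as a plain dict, the retrieval times are assumed to have been estimated already (the abstract derives them from FIB table information), the names `choose_victim` and `insert_content` are hypothetical, and the impact weight is left out because its definition is truncated above.

```python
def choose_victim(cache):
    """Return the name of the cached content to discard.

    `cache` maps a content name to its estimated retrieval time
    (hypothetical unit: milliseconds). Content that can be fetched
    again fastest gets the highest discard priority.
    """
    if not cache:
        raise ValueError("cache is empty")
    return min(cache, key=cache.get)


def insert_content(cache, capacity, name, retrieval_time):
    """Store new content, evicting the fastest-to-retrieve entry when full.

    Returns the evicted content name, or None if nothing was evicted.
    """
    evicted = None
    if name not in cache and len(cache) >= capacity:
        evicted = choose_victim(cache)
        del cache[evicted]
    cache[name] = retrieval_time
    return evicted
```

With a full cache of three entries, inserting a fourth evicts whichever entry has the smallest estimated retrieval time, on the reasoning that it is the cheapest to fetch again.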
Reinforcement Learning: Applying Low Discrepancy Action Selection To Deep Deterministic Policy Gradient, Aleksandr Svishchev
Electronic Theses and Dissertations
Reinforcement learning (RL) is a subfield of machine learning concerned with agents learning to behave optimally by interacting with an environment. One of the most important topics in RL is how the agent should explore, that is, how to choose actions in order to rate their impact on long-term reward. For example, a simple baseline strategy might be uniformly random action selection. This thesis investigates the heuristic idea that agents will learn faster if they explore by factoring the environment’s state into their decision and intentionally choose actions which are as different as possible from what they have previously observed. …
Research And Development Of Simulation Training Platform For Multi-Agent Collaborative Decision-Making, Cheng Cheng, Zhijie Chen, Ziming Guo, Ni Li
Journal of System Simulation
Abstract: A reinforcement learning simulation platform can serve as an interactive training environment for reinforcement learning. In order to make the simulation platform compatible with multi-agent reinforcement learning algorithms and meet the needs of simulation in the military field, the similar processes in multi-agent reinforcement learning algorithms are refined and a unified interface is designed to embed and verify different types of deep reinforcement learning algorithms on the simulation platform, and the back-end service of the simulation platform is optimized to accelerate the training process of the algorithm model. The experimental results show that, by unifying the interface, the simulation platform …
Neural Airport Ground Handling, Yaoxin Wu, Jianan Zhou, Yunwen Xia, Xianli Zhang, Zhiguang Cao, Jie Zhang
Research Collection School Of Computing and Information Systems
Airport ground handling (AGH) offers necessary operations to flights during their turnarounds and is of great importance to the efficiency of airport management and the economics of aviation. Such a problem involves the interplay among the operations that leads to NP-hard problems with complex constraints. Hence, existing methods for AGH are usually designed with massive domain knowledge but still fail to yield high-quality solutions efficiently. In this paper, we aim to enhance the solution quality and computation efficiency for solving AGH. Particularly, we first model AGH as a multiple-fleet vehicle routing problem (VRP) with miscellaneous constraints including precedence, time windows, …
Intercell Dynamic Scheduling Method Based On Deep Reinforcement Learning, Jing Ni, Mengke Ma
Journal of System Simulation
Abstract: In order to solve the intercell scheduling problem of dynamic arrival of machining tasks and realize adaptive scheduling in the complex and changeable environment of the intelligent factory, a scheduling method based on a deep Q network is proposed. A complex network with cells as nodes and workpiece intercell machining path as directed edges is constructed, and the degree value is introduced to define the state space with intercell scheduling characteristics. A compound scheduling rule composed of a workpiece layer, unit layer, and machine layer is designed, and hierarchical optimization makes the scheduling scheme more global. Since double deep …
UAV-Enabled Task Offloading Strategy For Vehicular Edge Computing Networks, Feng Hu, Haiyang Gu, Jun Lin
Journal of System Simulation
Abstract: As intelligent vehicles are equipped with more and more sensors, sensor data is growing explosively, which brings severe challenges to vehicular communication and computing. In addition, modern roads present a three-dimensional structure, and the system architecture of traditional vehicular networks cannot guarantee full coverage and seamless computing. A task offloading strategy for UAV-assisted and 6G-enabled (Sixth Generation) vehicular edge computing networks is proposed. Furthermore, a flexible and intelligent vehicular edge computing mode is composed of vehicles and UAVs, which provide three-dimensional edge computing services for delay-sensitive and computation-intensive vehicular tasks, and ensure timely processing and …
Imitative Generation Of Optimal Guidance Law Based On Reinforcement Learning, Zhengxuan Jia, Tingyu Lin, Yingying Xiao, Guoqiang Shi, Hao Wang, Bi Zeng, Yiming Ou, Pengpeng Zhao
Journal of System Simulation
Abstract: Under the background of high-speed maneuvering target interception, an optimal guidance law generation method for head-on interception independent of target acceleration estimation is proposed based on deep reinforcement learning. In addition, its effectiveness is verified through simulation experiments. As the simulation results suggest, the proposed method successfully achieves head-on interception of high-speed maneuvering targets in 3D space and largely reduces the requirement for target estimation with strong uncertainty, and it is more applicable than the optimal control method.
Task Distillation: Transforming Reinforcement Learning Into Supervised Learning, Connor Wilhelm
Theses and Dissertations
Recent work in dataset distillation focuses on distilling supervised classification datasets into smaller, synthetic supervised datasets in order to reduce per-model costs of training, to provide interpretability, and to anonymize data. Distillation and its benefits can be extended to a wider array of tasks. We propose a generalization of dataset distillation, which we call task distillation. Using techniques similar to those used in dataset distillation, any learning task can be distilled into a compressed synthetic task. Task distillation allows for transmodal distillations, where a task of one modality is distilled into a synthetic task of another modality, allowing a more …
Decentralized Multimedia Data Sharing In IoV: A Learning-Based Equilibrium Of Supply And Demand, Jiani Fan, Minrui Xu, Jiale Guo, Lwin Khin Shar, Jiawen Kang, Dusit Niyato, Kwok-Yan Lam
Research Collection School Of Computing and Information Systems
The Internet of Vehicles (IoV) has great potential to transform transportation systems by enhancing road safety, reducing traffic congestion, and improving user experience through onboard infotainment applications. Decentralized data sharing can improve security, privacy, reliability, and facilitate infotainment data sharing in IoVs. However, decentralized data sharing may not achieve the expected efficiency if there are IoV users who only want to consume the shared data but are not willing to contribute their own data to the community, resulting in incomplete information observed by other vehicles and infrastructure, which can introduce additional transmission latency. Therefore, in this paper, by modeling the …
Aircraft Assignment Method For Optimal Utilization Of Maintenance Intervals, Runxia Guo, Yifu Wang
Journal of System Simulation
Abstract: The aircraft assignment problem is studied from a maintenance assurance perspective. In order to ensure their continuous airworthiness, civil aircraft are required to perform maintenance tasks, i.e., scheduled inspections, at specified intervals. The scheduled inspection interval is usually controlled by the number of flight cycles (FC), flight hours (FH), or flight days (FD), whichever comes first. In order to make balanced use of the inspection interval, an aircraft assignment model for a given fleet size is developed to optimize the maintenance interval utilization, and it is solved by a reinforcement learning algorithm to minimize the variance of the …
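The "whichever comes first" rule for the inspection interval can be sketched as below. This is only an illustration of that rule, not the assignment model from the paper: the function names, the dict layout, and the limit values used in the usage note are all assumptions.

```python
def interval_utilization(used, limits):
    """Fraction of the inspection interval consumed so far.

    `used` and `limits` are dicts keyed by "FC" (flight cycles),
    "FH" (flight hours), and "FD" (flight days). The binding limit is
    the one closest to being reached, so the utilization is the
    largest of the three ratios.
    """
    return max(used[k] / limits[k] for k in ("FC", "FH", "FD"))


def inspection_due(used, limits):
    """True once any one of the three limits has been reached."""
    return interval_utilization(used, limits) >= 1.0
```

For example, with hypothetical limits of 600 FC, 1000 FH, and 120 FD, an aircraft that has logged 300 FC, 900 FH, and 60 FD is bound by flight hours, and most of its cycle and calendar allowance would go unused if it were inspected at that point.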
Dynamic Influence Diagram-Based Deep Reinforcement Learning Framework And Application For Decision Support For Operators In Control Rooms, Joseph Mietkiewicz, Ammar N. Abbas, Chidera Winifred Amazu, Anders L. Madsen, Gabriele Baldissone
Articles
In today’s complex industrial environment, operators are often faced with challenging situations that require quick and accurate decision-making. The human-machine interface (HMI) can display too much information, leading to information overload and potentially compromising the operator’s ability to respond effectively. To address this challenge, decision support models are needed to assist operators in identifying and responding to potential safety incidents. In this paper, we present an experiment to evaluate the effectiveness of a recommendation system in addressing the challenge of information overload. The case study focuses on a formaldehyde production simulator and examines the performance of an improved Human-Machine Interface …
Reinforcement Learning Approach To Stochastic Vehicle Routing Problem With Correlated Demands, Zangir Iklassov, Ikboljon Sobirov, Ruben Solozabal, Martin Takac
Machine Learning Faculty Publications
We present a novel end-to-end framework for solving the Vehicle Routing Problem with stochastic demands (VRPSD) using Reinforcement Learning (RL). Our formulation incorporates the correlation between stochastic demands through other observable stochastic variables, thereby offering an experimental demonstration of the theoretical premise that non-i.i.d. stochastic demands provide opportunities for improved routing solutions. Our approach bridges the gap in the application of RL to VRPSD and consists of a parameterized stochastic policy optimized using a policy gradient algorithm to generate a sequence of actions that form the solution. Our model outperforms previous state-of-the-art metaheuristics and demonstrates robustness to changes in the …
Transferable Curricula Through Difficulty Conditioned Generators, Sidney Tio, Pradeep Varakantham
Research Collection School Of Computing and Information Systems
Advancements in reinforcement learning (RL) have demonstrated superhuman performance in complex tasks such as StarCraft, Go, and chess. However, knowledge transfer from artificial "experts" to humans remains a significant challenge. A promising avenue for such transfer would be the use of curricula. Recent methods in curricula generation focus on training RL agents efficiently, yet such methods rely on surrogate measures to track student progress, and are not suited for training robots in the real world (or more ambitiously humans). In this paper, we introduce a method named Parameterized Environment Response Model (PERM) that shows promising results in training RL agents …
A Machine Learning Approach To Constructing Ramsey Graphs Leads To The Trahtenbrot-Zykov Problem., Emily Hawboldt
Electronic Theses and Dissertations
Attempts at approaching the well-known and difficult problem of constructing Ramsey graphs via machine learning lead to another difficult problem posed by Zykov in 1963 (now commonly referred to as the Trahtenbrot-Zykov problem): For which graphs F does there exist some graph G such that the neighborhood of every vertex in G induces a subgraph isomorphic to F? Chapter 1 provides a brief introduction to graph theory. Chapter 2 introduces Ramsey theory for graphs. Chapter 3 details a reinforcement learning implementation for Ramsey graph construction. The implementation is based on board game software, specifically the AlphaZero program and its …
Insights Into The Application Of Deep Reinforcement Learning In Healthcare And Materials Science, Benjamin R. Smith
Doctoral Dissertations
Reinforcement learning (RL) is a type of machine learning designed to optimize sequential decision-making. While controlled environments have served as a foundation for RL research, due to the growth in data volumes and deep learning methods, it is now increasingly being applied to real-world problems. In our work, we explore and attempt to overcome challenges that occur when applying RL to solve problems in healthcare and materials science.
First, we explore how issues in bias and data completeness affect healthcare applications of RL. To understand how bias has already been considered in this area, we survey the literature for existing …
Multi-View Hypergraph Contrastive Policy Learning For Conversational Recommendation, Sen Zhao, Wei Wei, Xian-Ling Mao, Shuai: Yang Zhu, Zujie Wen, Dangyang Chen, Feida Zhu
Research Collection School Of Computing and Information Systems
Conversational recommendation systems (CRS) aim to interactively acquire user preferences and accordingly recommend items to users. Accurately learning the dynamic user preferences is of crucial importance for CRS. Previous works learn the user preferences with pairwise relations from the interactive conversation and item knowledge, while largely ignoring the fact that factors for a relationship in CRS are multiplex. Specifically, the user likes/dislikes the items that satisfy some attributes (Like/Dislike view). Moreover, social influence is another important factor that affects user preference towards the item (Social view), yet it is largely ignored by previous works in CRS. The user preferences from these …
Imitation Improvement Learning For Large-Scale Capacitated Vehicle Routing Problems, The Viet Bui, Tien Mai
Research Collection School Of Computing and Information Systems
Recent works using deep reinforcement learning (RL) to solve routing problems such as the capacitated vehicle routing problem (CVRP) have focused on improvement learning-based methods, which involve improving a given solution until it becomes near-optimal. Although adequate solutions can be achieved for small problem instances, their efficiency degrades for large-scale ones. In this work, we propose a new improvement learning-based framework based on imitation learning where classical heuristics serve as experts to encourage the policy model to mimic and produce similar or better solutions. Moreover, to improve scalability, we propose Clockwise Clustering, a novel augmented framework for decomposing large-scale CVRP into …
Reinforcement Learning For Sequential Decision Making With Constraints, Jiajing Ling
Dissertations and Theses Collection (Open Access)
Reinforcement learning is a widely used approach to tackle problems in sequential decision making where an agent learns from rewards or penalties. However, in decision-making problems that involve safety or limited resources, the agent's exploration is often limited by constraints. To model such problems, constrained Markov decision processes and constrained decentralized partially observable Markov decision processes have been proposed for single-agent and multi-agent settings, respectively. A significant challenge in solving constrained Dec-POMDP is determining the contribution of each agent to the primary objective and constraint violations. To address this issue, we propose a fictitious play-based method that uses Lagrangian Relaxation …
An Investigation Into Machine Learning Techniques For Designing Dynamic Difficulty Agents In Real-Time Games, Ryan Adare Dunagan
Electronic Theses and Dissertations
Video games are an incredibly popular pastime enjoyed by people of all ages worldwide. Many different kinds of games exist, but most games feature some element of the player overcoming a challenge, usually through gameplay. These challenges are insurmountable for some people and may turn them off to video games as a pastime. Games can be made more accessible to players of little skill and/or experience through the use of Dynamic Difficulty Adjustment (DDA) systems that adjust the difficulty of the game in response to the player’s performance. This research seeks to establish the effectiveness of machine learning techniques …
Dynamic Police Patrol Scheduling With Multi-Agent Reinforcement Learning, Songhan Wong, Waldy Joe, Hoong Chuin Lau
Research Collection School Of Computing and Information Systems
Effective police patrol scheduling is essential in projecting police presence and ensuring readiness in responding to unexpected events in urban environments. However, scheduling patrols can be a challenging task as it requires balancing two conflicting objectives, namely projecting presence (proactive patrol) and incident response (reactive patrol). This task is made even more challenging by the fact that patrol schedules do not remain static, as occurrences of dynamic incidents can disrupt the existing schedules. In this paper, we propose a solution to this problem using Multi-Agent Reinforcement Learning (MARL) to address the Dynamic Bi-objective Police Patrol Dispatching and Rescheduling Problem …
Detecting Complex Cyber Attacks Using Decoys With Online Reinforcement Learning, Marcus Gutierrez
Open Access Theses & Dissertations
Most vulnerabilities discovered in cybersecurity can be associated with their own singular piece of software. I investigate complex vulnerabilities, which may require multiple pieces of software to be present. These complex vulnerabilities represent 16.6% of all documented vulnerabilities and are more dangerous on average than their simple vulnerability counterparts. In addition to this, because they often require multiple pieces of software to be present, they are harder to identify overall as specific combinations are needed for the vulnerability to appear.
I consider the motivating scenario where an attacker is repeatedly deploying exploits that use complex vulnerabilities into an Airport Wi-Fi. The network …
Reinforced Adaptation Network For Partial Domain Adaptation, Keyu Wu, Min Wu, Zhenghua Chen, Ruibing Jin, Wei Cui, Zhiguang Cao, Xiaoli Li
Research Collection School Of Computing and Information Systems
Domain adaptation enables generalized learning in new environments by transferring knowledge from label-rich source domains to label-scarce target domains. As a more realistic extension, partial domain adaptation (PDA) relaxes the assumption of fully shared label space, and instead deals with the scenario where the target label space is a subset of the source label space. In this paper, we propose a Reinforced Adaptation Network (RAN) to address the challenging PDA problem. Specifically, a deep reinforcement learning model is proposed to learn source data selection policies. Meanwhile, a domain adaptation model is presented to simultaneously determine rewards and learn domain-invariant feature …
Sim-To-Real Reinforcement Learning Framework For Autonomous Aerial Leaf Sampling, Ashraful Islam
Department of Computer Science and Engineering: Dissertations, Theses, and Student Research
Using unmanned aerial systems (UAS) for leaf sampling is contributing to a better understanding of the influence of climate change on plant species, and the dynamics of forest ecology by studying hard-to-reach tree canopies. Currently, multiple skilled operators are required for UAS maneuvering and using the leaf sampling tool. This often limits sampling to only the canopy top or periphery. Sim-to-real reinforcement learning (RL) can be leveraged to tackle challenges in the autonomous operation of aerial leaf sampling in the changing environment of a tree canopy. However, transferring an RL controller that is learned in simulation to real UAS …
Research On Unmanned Swarm Combat System Adaptive Evolution Model Simulation, Zhiqiang Li, Yuanlong Li, Laixiang Yin, Xiangping Ma
Journal of System Simulation
Abstract: Aiming at the fact that the intelligent unmanned swarm combat system is mainly composed of large-scale combat individuals with limited behavioral capabilities and has limited ability to adapt to the changes of battlefield environment and combat opponents, a learning evolution method combining genetic algorithm and reinforcement learning is proposed to construct an individual-based unmanned bee colony combat system evolution model. To improve the adaptive evolution efficiency of bee colony combat system, an improved genetic algorithm is proposed to improve the learning and evolution speed of bee colony individuals by using individual-specific mutation optimization strategy. Simulation experiment on …
Multi-Agent Cooperative Combat Simulation In Naval Battlefield With Reinforcement Learning, Ding Shi, Xuefeng Yan, Lina Gong, Jingxuan Zhang, Donghai Guan, Mingqiang Wei
Journal of System Simulation
Abstract: Due to the rapidly changing situations of future naval battlefields, it is urgent to realize high-quality combat simulation in naval battlefields based on artificial intelligence to comprehensively optimize and improve the combat effectiveness of our army and defeat the enemy. The collaboration of combat units is the key point, and how to realize balanced decision-making among multiple agents is the first task. Based on a decoupling priority experience replay mechanism and an attention mechanism, a multi-agent reinforcement learning-based cooperative combat simulation (MARL-CCSA) network is proposed. Based on the expert experience, a multi-scale reward function is designed, on which a naval …
A Review On Derivative Hedging Using Reinforcement Learning, Peng Liu
Research Collection Lee Kong Chian School Of Business
Hedging is a common trading activity to manage the risk of engaging in transactions that involve derivatives such as options. Perfect and timely hedging, however, is an impossible task in the real market, which is characterized by discrete-time transactions with costs. Recent years have witnessed the use of reinforcement learning (RL) in formulating optimal hedging strategies. Specifically, different RL algorithms have been applied to learn the optimal offsetting position based on market conditions, offering an automatic risk management solution that proposes optimal hedging strategies while catering to both market dynamics and restrictions. In this article, the author provides a comprehensive review of the use of …
DQN-Based Joint Scheduling Method Of Heterogeneous TT&C Resources, Naiyang Xue, Dan Ding, Yutong Jia, Zhiqiang Wang, Yuan Liu
Journal of System Simulation
Abstract: Taking the joint scheduling of heterogeneous TT&C resources as the research object, a deep Q network (DQN) algorithm based on reinforcement learning is proposed. After the characteristics of the joint scheduling problem of heterogeneous TT&C resources are fully analyzed and the constraints affecting the solution are described in mathematical language, a resource joint scheduling model is established. From the perspective of applying reinforcement learning, two neural networks with the same structure and action selection strategies based on the ε-greedy algorithm are respectively designed after the Markov decision process description, and a DQN solution framework is established. The simulation results show that DQN-based heterogeneous …
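The ε-greedy action selection mentioned in this abstract is a standard technique and can be sketched generically as follows. This is not the paper's scheduler: the Q-values are passed in as a plain list rather than produced by a neural network, and the function name and parameters are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of Q-value estimates.

    With probability `epsilon` a uniformly random action is explored;
    otherwise the action with the highest estimated value is exploited.
    `rng` is any object with `random()` and `randrange()`, which lets
    callers pass a seeded `random.Random` for reproducibility.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

Setting `epsilon` to 0 makes the choice purely greedy, while a value of 1 makes it purely random; training loops commonly start with a high value and decay it over time.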
Constrained Reinforcement Learning In Hard Exploration Problems, Pankayaraj Pathmanathan, Pradeep Varakantham
Research Collection School Of Computing and Information Systems
One approach to guaranteeing safety in Reinforcement Learning is through cost constraints that are imposed on trajectories. Recent works in constrained RL have developed methods that ensure constraints can be enforced even at learning time while maximizing the overall value of the policy. Unfortunately, as demonstrated in our experimental results, such approaches do not perform well on complex multi-level tasks, with longer episode lengths or sparse rewards. To that end, we propose a scalable hierarchical approach for constrained RL problems that employs backward cost value functions in the context of task hierarchy and a novel intrinsic reward function in lower levels …