Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Articles 181 - 189 of 189
Full-Text Articles in Physical Sciences and Mathematics
Reinforcement Learning-Based Output Feedback Control Of Nonlinear Systems With Input Constraints, Pingan He, Jagannathan Sarangapani
Electrical and Computer Engineering Faculty Research & Creative Works
A novel neural network (NN)-based output feedback controller with magnitude constraints is designed to deliver a desired tracking performance for a class of multi-input-multi-output (MIMO) discrete-time strict-feedback nonlinear systems. Reinforcement learning in discrete time is proposed for the output feedback controller, which uses three NNs: 1) an NN observer to estimate the system states from the input-output data; 2) a critic NN to approximate a certain strategic utility function; and 3) an action NN to minimize both the strategic utility function and the unknown dynamics estimation errors. The magnitude constraints are manifested as saturation nonlinearities in the output feedback …
Variable Resolution Discretization In The Joint Space, Christopher K. Monson, Kevin Seppi, David Wingate, Todd S. Peterson
Faculty Publications
We present JoSTLe, an algorithm that performs value iteration on control problems with continuous actions, allowing this useful reinforcement learning technique to be applied to problems where a priori action discretization is inadequate. The algorithm is an extension of a variable resolution technique that works for problems with continuous states and discrete actions. Results are given that indicate that JoSTLe is a promising step toward reinforcement learning in a fully continuous domain.
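The value iteration that JoSTLe builds on can be sketched on a small discrete MDP. The two-state, two-action model below is a hypothetical toy example, not anything from the paper; it only shows the Bellman optimality backup that JoSTLe generalizes to continuous states and actions.

```python
import numpy as np

# Toy MDP (illustrative assumption): P[a, s, s'] transition probabilities,
# R[a, s] expected rewards, discount factor gamma.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.0, 1.0]],   # action 1
])
R = np.array([[0.0, 1.0],
              [0.5, 2.0]])
gamma = 0.9

V = np.zeros(2)
for _ in range(500):
    # Bellman optimality backup: V(s) <- max_a [R(a,s) + gamma * sum_s' P(s'|s,a) V(s')]
    Q = R + gamma * (P @ V)     # shape (actions, states)
    V_new = Q.max(axis=0)
    if np.abs(V_new - V).max() < 1e-10:
        break
    V = V_new
print(V)
```

With continuous actions the inner `max` can no longer be taken over a fixed table, which is exactly the gap JoSTLe's variable-resolution discretization addresses.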
Incremental Policy Learning: An Equilibrium Selection Algorithm For Reinforcement Learning Agents With Common Interests, Nancy Fulda, Dan A. Ventura
Faculty Publications
We present an equilibrium selection algorithm for reinforcement learning agents that incrementally adjusts the probability of executing each action based on the desirability of the outcome obtained in the last time step. The algorithm assumes that at least one coordination equilibrium exists and requires that the agents have a heuristic for determining whether or not the equilibrium was obtained. In deterministic environments with one or more strict coordination equilibria, the algorithm will learn to play an optimal equilibrium as long as the heuristic is accurate. Empirical data demonstrate that the algorithm is also effective in stochastic environments and is able …
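The incremental update described above can be sketched as follows. The step size `alpha`, the two-action setting, and the probability-shifting rule are illustrative assumptions, not the paper's exact formulation; the point is only that mass moves toward the last action when the heuristic judged its outcome desirable.

```python
def update(probs, action, desirable, alpha=0.1):
    """Shift probability toward `action` if its outcome was desirable
    (the agents' heuristic detected the coordination equilibrium),
    away from it otherwise. Hypothetical rule for illustration."""
    new = []
    for a, p in enumerate(probs):
        if desirable:
            new.append(p + alpha * (1 - p) if a == action else p * (1 - alpha))
        else:
            new.append(p * (1 - alpha) if a == action else p)
    total = sum(new)
    return [p / total for p in new]

probs = [0.5, 0.5]
# Repeatedly reinforce action 0, as if it kept producing the desired
# coordinated outcome.
for _ in range(50):
    probs = update(probs, action=0, desirable=True)
print(probs)
```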
Solving Large Mdps Quickly With Partitioned Value Iteration, David Wingate
Theses and Dissertations
Value iteration is not typically considered a viable algorithm for solving large-scale MDPs because it converges too slowly. However, its performance can be dramatically improved by eliminating redundant or useless backups, and by backing up states in the right order. We present several methods designed to help structure value dependency, and present a systematic study of companion prioritization techniques which focus computation in useful regions of the state space. In order to scale to solve ever larger problems, we evaluate all enhancements and methods in the context of parallelizability. Using the enhancements, we discover that in many instances the limiting …
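The two ideas in the abstract, skipping useless backups and backing up states in the right order, can be sketched with a priority queue on a toy chain MDP. The ten-state deterministic chain below is an illustrative stand-in, not the thesis's partitioned scheme.

```python
import heapq
import numpy as np

n, gamma = 10, 0.9
V = np.zeros(n)

def backup_value(s):
    """Bellman backup for a deterministic chain: s moves to s+1,
    with reward 1 on entering the terminal goal state (hypothetical MDP)."""
    if s == n - 1:
        return 0.0                       # terminal state
    reward = 1.0 if s == n - 2 else 0.0
    return reward + gamma * V[s + 1]

heap = [(-1.0, s) for s in range(n)]     # seed every state with equal priority
heapq.heapify(heap)
while heap:
    _, s = heapq.heappop(heap)
    err = abs(backup_value(s) - V[s])
    if err < 1e-12:
        continue                         # useless backup: skip it
    V[s] = backup_value(s)
    if s > 0:
        heapq.heappush(heap, (-err, s - 1))  # predecessor's value is now stale
print(V)
```

Because value changes propagate backward along the chain, each state here is backed up only when its successor has actually changed, which is the dependency-structuring intuition the thesis develops at scale.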
Target Sets: A Tool For Understanding And Predicting The Behavior Of Interacting Q-Learners, Nancy Fulda, Dan A. Ventura
Faculty Publications
Reinforcement learning agents that interact in a common environment frequently affect each other's perceived transition and reward distributions. This can result in convergence of the agents to a sub-optimal equilibrium or even to a solution that is not an equilibrium at all. Several modifications to the Q-learning algorithm have been proposed which enable agents to converge to optimal equilibria under specified conditions. This paper presents the concept of target sets as an aid to understanding why these modifications have been successful and as a tool to assist in the development of new modifications which are applicable in a wider range …
Dynamic Joint Action Perception For Q-Learning Agents, Nancy Fulda, Dan A. Ventura
Faculty Publications
Q-learning is a reinforcement learning algorithm that learns expected utilities for state-action transitions through successive interactions with the environment. The algorithm's simplicity as well as its convergence properties have made it a popular algorithm for study. However, its non-parametric representation of utilities limits its effectiveness in environments with large amounts of perceptual input. For example, in multiagent systems, each agent may need to consider the action selections of its counterparts in order to learn effective behaviors. This creates a joint action space which grows exponentially with the number of agents in the system. In such situations, the Q-learning algorithm quickly …
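The tabular Q-learning update the abstract refers to is the standard one-step rule Q(s,a) ← Q(s,a) + α(r + γ max_a′ Q(s′,a′) − Q(s,a)). The sketch below runs it on a hypothetical one-state, two-action environment; the environment and learning rate are assumptions for illustration only.

```python
import random
from collections import defaultdict

alpha, gamma = 0.5, 0.9
Q = defaultdict(float)           # non-parametric table of state-action utilities

def q_update(s, a, r, s_next, actions):
    # One-step Q-learning backup toward the bootstrapped target.
    target = r + gamma * max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

# Hypothetical environment: in state 0, action 1 yields reward 1 and the
# agent stays in state 0; action 0 yields reward 0.
random.seed(0)
for _ in range(200):
    a = random.choice([0, 1])
    r = 1.0 if a == 1 else 0.0
    q_update(0, a, r, 0, actions=[0, 1])
print(Q[(0, 1)], Q[(0, 0)])
```

The `defaultdict` table makes the representational limit concrete: with k agents each choosing among m actions, a joint-action learner would need a key per (state, joint action), i.e. m^k entries per state.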
Multiple Stochastic Learning Automata For Vehicle Path Control In An Automated Highway System, Cem Unsal, Pushkin Kachroo, John S. Bay
Electrical & Computer Engineering Faculty Research
This paper suggests an intelligent controller for an automated vehicle planning its own trajectory based on sensor and communication data. The intelligent controller is designed using stochastic learning automata theory. Using the data received from on-board sensors, two automata (one for lateral actions, one for longitudinal actions) can learn the best possible action to avoid collisions. The system has the advantage of being able to work in unmodeled stochastic environments, unlike adaptive control methods or expert systems. Simulations for simultaneous lateral and longitudinal control of a vehicle provide encouraging results.
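A single stochastic learning automaton of the kind the paper uses (one each for lateral and longitudinal actions) can be sketched with the classic linear reward-inaction (L_R-I) scheme. The three-action environment below, where only one action avoids a "collision", is a hypothetical stand-in for the vehicle's sensor feedback.

```python
import random

def l_ri(probs, action, rewarded, alpha=0.05):
    """Linear reward-inaction update: reinforce the chosen action on
    reward, leave the probabilities untouched on penalty."""
    if not rewarded:
        return probs
    return [p + alpha * (1 - p) if a == action else p * (1 - alpha)
            for a, p in enumerate(probs)]

random.seed(1)
probs = [1/3, 1/3, 1/3]          # e.g. shift-left / stay / shift-right
for _ in range(2000):
    a = random.choices(range(3), weights=probs)[0]
    rewarded = (a == 1)          # hypothetically, only "stay" avoids collisions
    probs = l_ri(probs, a, rewarded)
print(probs)
```

Because the update needs no model of the environment, only the binary feedback signal, the scheme works in unmodeled stochastic settings, which is the advantage the abstract highlights.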
Scheduling Straight-Line Code Using Reinforcement Learning And Rollouts, Amy Mcgovern, Eliot Moss, Andrew G. Barto
Computer Science Department Faculty Publication Series
The execution order of a block of computer instructions on a pipelined machine can make a difference in its running time by a factor of two or more. In order to achieve the best possible speed, compilers use heuristic schedulers appropriate to each specific architecture implementation. However, these heuristic schedulers are time-consuming and expensive to build. We present empirical results using both rollouts and reinforcement learning to construct heuristics for scheduling basic blocks. In simulation, the rollout scheduler outperformed a commercial scheduler, and the reinforcement learning scheduler performed almost as well as the commercial scheduler.
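The rollout idea can be illustrated on a toy job-sequencing problem rather than real instruction scheduling: each candidate "next job" is evaluated by simulating the remainder of the schedule with a simple base heuristic, and the cheapest completion wins. The (duration, weight) jobs and weighted-completion-time cost below are hypothetical stand-ins for instructions and basic-block running time.

```python
def cost(schedule):
    """Sum of weighted completion times for a list of (duration, weight) jobs."""
    t, total = 0, 0
    for duration, weight in schedule:
        t += duration
        total += weight * t
    return total

def base_policy(remaining):
    return list(remaining)       # naive base heuristic: keep the given order

def rollout_schedule(jobs):
    schedule, remaining = [], list(jobs)
    while remaining:
        # Rollout step: commit to the candidate whose simulated completion
        # under the base heuristic is cheapest (assumes distinct jobs).
        best = min(
            remaining,
            key=lambda j: cost(schedule + [j]
                               + base_policy([x for x in remaining if x != j])))
        schedule.append(best)
        remaining.remove(best)
    return schedule

jobs = [(3, 1), (1, 4), (2, 2)]
sched = rollout_schedule(jobs)
print(sched, cost(sched), cost(jobs))
```

On this instance the rollout schedule costs 16 versus 31 for the base order, mirroring the abstract's finding that rollouts can beat the heuristic they simulate with.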
Linear Least-Squares Algorithms For Temporal Difference Learning, Steven J. Bradtke, Andrew G. Barto
Computer Science Department Faculty Publication Series
We introduce two new temporal difference (TD) algorithms based on the theory of linear least-squares function approximation. We define an algorithm we call Least-Squares TD (LS TD) for which we prove probability-one convergence when it is used with a function approximator linear in the adjustable parameters. We then define a recursive version of this algorithm, Recursive Least-Squares TD (RLS TD). Although these new TD algorithms require more computation per time-step than do Sutton's TD(λ) algorithms, they are more efficient in a statistical sense because they extract more information from training experiences. We describe a simulation experiment showing the substantial improvement …
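The batch form of LSTD can be sketched compactly: accumulate A = Σ φ(s_t)(φ(s_t) − γφ(s_{t+1}))ᵀ and b = Σ r_t φ(s_t) over a trajectory, then solve Aw = b for the value-function weights. The two-state chain and tabular features below are hypothetical illustration data, not from the paper.

```python
import numpy as np

gamma = 0.9
phi = {0: np.array([1.0, 0.0]),          # tabular (one-hot) features,
       1: np.array([0.0, 1.0])}          # so w recovers V exactly

# Hypothetical deterministic trajectory: 0 -> 1 with reward 0, then the
# chain stays in state 1 collecting reward 1 each step.
transitions = [(0, 0.0, 1)] + [(1, 1.0, 1)] * 50

A = np.zeros((2, 2))
b = np.zeros(2)
for s, r, s_next in transitions:
    # LSTD statistics: each sample contributes an outer product to A
    # and a reward-weighted feature vector to b.
    A += np.outer(phi[s], phi[s] - gamma * phi[s_next])
    b += r * phi[s]

w = np.linalg.solve(A, b)
print(w)   # w approximates (V(0), V(1))
```

RLS TD avoids re-solving the linear system each step by maintaining A⁻¹ incrementally (e.g. via the Sherman-Morrison identity), which is where the extra per-step computation relative to TD(λ) comes from.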