Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

University of Massachusetts Amherst

Computer Science Department Faculty Publication Series

Reinforcement learning

Articles 1 - 3 of 3

Full-Text Articles in Physical Sciences and Mathematics

Proto-Transfer Learning In Markov Decision Processes Using Spectral Methods, Kimberly Ferguson, Sridhar Mahadevan Jan 2006

In this paper we introduce proto-transfer learning, a new framework for transfer learning. We explore solutions to transfer learning within reinforcement learning through the use of spectral methods. Proto-value functions (PVFs) are basis functions computed from a spectral analysis of random walks on the state space graph. They naturally lead to the ability to transfer knowledge and representation between related tasks or domains. We investigate task transfer by using the same PVFs in Markov decision processes (MDPs) with different reward functions. Additionally, our experiments in domain transfer explore applying the Nyström method for interpolation of PVFs between MDPs of different …
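As a rough illustration of how basis functions of this kind can be computed, the sketch below builds the graph Laplacian of a small state space and keeps its smoothest eigenvectors. It is a minimal sketch under assumptions not stated in the abstract: the combinatorial Laplacian `D - A` is used, the state space is a hypothetical five-state chain, and the function name `proto_value_functions` is invented for this example.

```python
import numpy as np

def proto_value_functions(adjacency, k):
    """Return the k smallest Laplacian eigenvalues and their eigenvectors.

    Assumes an undirected, connected state-space graph given as a
    symmetric adjacency matrix. Uses the combinatorial Laplacian D - A
    (an assumption; a normalized Laplacian is another common choice).
    """
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    # eigh handles symmetric matrices and returns eigenvalues in ascending order
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvalues[:k], eigenvectors[:, :k]

# Hypothetical state space: a 5-state chain where neighbours are connected
n_states = 5
chain = np.zeros((n_states, n_states))
for s in range(n_states - 1):
    chain[s, s + 1] = chain[s + 1, s] = 1.0

pvf_eigenvalues, pvf_basis = proto_value_functions(chain, k=3)
```

Each column of `pvf_basis` is one basis function over the states. Because the basis depends only on the graph and not on the rewards, the same columns could in principle be reused for tasks that share the state space but differ in their reward functions.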


Scheduling Straight-Line Code Using Reinforcement Learning And Rollouts, Amy McGovern, Eliot Moss, Andrew G. Barto Jan 1999

The execution order of a block of computer instructions on a pipelined machine can make a difference in its running time by a factor of two or more. In order to achieve the best possible speed, compilers use heuristic schedulers appropriate to each specific architecture implementation. However, these heuristic schedulers are time-consuming and expensive to build. We present empirical results using both rollouts and reinforcement learning to construct heuristics for scheduling basic blocks. In simulation, the rollout scheduler outperformed a commercial scheduler, and the reinforcement learning scheduler performed almost as well as the commercial scheduler.


Linear Least-Squares Algorithms For Temporal Difference Learning, Steven J. Bradtke, Andrew G. Barto Jan 1996

We introduce two new temporal difference (TD) algorithms based on the theory of linear least-squares function approximation. We define an algorithm we call Least-Squares TD (LS TD) for which we prove probability-one convergence when it is used with a function approximator linear in the adjustable parameters. We then define a recursive version of this algorithm, Recursive Least-Squares TD (RLS TD). Although these new TD algorithms require more computation per time-step than do Sutton's TD(λ) algorithms, they are more efficient in a statistical sense because they extract more information from training experiences. We describe a simulation experiment showing the substantial improvement …
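The least-squares idea can be sketched in a few lines. The block below follows a common textbook batch formulation of least-squares TD, which may differ in detail from the algorithm defined in the paper. The function name `lstd`, the small `ridge` term, the discount factor, and the two-state example are all assumptions made for illustration, and terminal states are not handled.

```python
import numpy as np

def lstd(transitions, phi, n_features, gamma, ridge=1e-8):
    """Estimate linear value-function weights from (s, r, s_next) samples.

    Accumulates A = sum phi(s) (phi(s) - gamma * phi(s_next))^T and
    b = sum r * phi(s), then solves A w = b. The ridge term is an
    illustrative safeguard against a singular A, not part of the source.
    """
    A = ridge * np.eye(n_features)
    b = np.zeros(n_features)
    for s, r, s_next in transitions:
        f = phi(s)
        f_next = phi(s_next)
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    return np.linalg.solve(A, b)

def one_hot(state, n_states=2):
    """Tabular features: one indicator per state (hypothetical example)."""
    features = np.zeros(n_states)
    features[state] = 1.0
    return features

# Hypothetical two-state cycle: 0 -> 1 yields reward 1, 1 -> 0 yields reward 0
samples = [(0, 1.0, 1), (1, 0.0, 0)] * 10
weights = lstd(samples, one_hot, n_features=2, gamma=0.5)
```

With one-hot features the weights are the state values themselves, so for this example they can be checked by hand against the Bellman equations V(0) = 1 + 0.5 V(1) and V(1) = 0.5 V(0), which give V(0) = 4/3 and V(1) = 2/3. A recursive variant would update the inverse of `A` incrementally instead of solving the system once at the end.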