Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Computer Sciences

1996

Computer Science Department Faculty Publication Series

Markov Decision Problems

Full-Text Articles in Physical Sciences and Mathematics

Linear Least-Squares Algorithms For Temporal Difference Learning, Steven J. Bradtke, Andrew G. Barto Jan 1996

Computer Science Department Faculty Publication Series

We introduce two new temporal difference (TD) algorithms based on the theory of linear least-squares function approximation. We define an algorithm we call Least-Squares TD (LS TD) for which we prove probability-one convergence when it is used with a function approximator linear in the adjustable parameters. We then define a recursive version of this algorithm, Recursive Least-Squares TD (RLS TD). Although these new TD algorithms require more computation per time-step than do Sutton's TD(λ) algorithms, they are more efficient in a statistical sense because they extract more information from training experiences. We describe a simulation experiment showing the substantial improvement …
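
The recursive idea mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the commonly used textbook form of recursive least-squares TD for a linear value function V(s) ≈ θᵀφ(s), in which the inverse of the accumulated matrix Σ φ_t (φ_t − γ φ_{t+1})ᵀ is maintained with the Sherman-Morrison identity. The function name `rls_td`, the regularizer `delta`, and the three-state chain are hypothetical choices made only for illustration.

```python
def rls_td(transitions, n_features, gamma=0.9, delta=1e-3):
    """Sketch of recursive least-squares TD(0) for a linear value function.

    transitions: iterable of (phi, reward, phi_next), feature vectors as lists.
    delta: hypothetical regularizer; the inverse matrix C starts at I / delta.
    Returns the weight vector theta.
    """
    n = n_features
    # C approximates the inverse of A = delta*I + sum_t phi_t (phi_t - gamma*phi_{t+1})^T
    C = [[(1.0 / delta if i == j else 0.0) for j in range(n)] for i in range(n)]
    theta = [0.0] * n
    for phi, reward, phi_next in transitions:
        # d = phi_t - gamma * phi_{t+1}
        d = [p - gamma * q for p, q in zip(phi, phi_next)]
        # C @ phi (column vector) and d^T @ C (row vector)
        c_phi = [sum(C[i][j] * phi[j] for j in range(n)) for i in range(n)]
        d_c = [sum(d[i] * C[i][j] for i in range(n)) for j in range(n)]
        denom = 1.0 + sum(d[i] * c_phi[i] for i in range(n))
        # Residual of the current estimate on this transition
        err = reward - sum(d[i] * theta[i] for i in range(n))
        theta = [theta[i] + c_phi[i] * err / denom for i in range(n)]
        # Sherman-Morrison rank-one update of the inverse
        C = [[C[i][j] - c_phi[i] * d_c[j] / denom for j in range(n)]
             for i in range(n)]
    return theta


# Hypothetical 3-state chain s0 -> s1 -> s2 -> terminal, reward 1 per step,
# one-hot features, and an all-zero feature vector for the terminal state.
e0, e1, e2, end = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]
episode = [(e0, 1.0, e1), (e1, 1.0, e2), (e2, 1.0, end)]
theta = rls_td(episode * 10, n_features=3)
print(theta)
```

With γ = 0.9 the true state values of this chain are 2.71, 1.9 and 1.0, and the estimate lands close to them after a few passes, up to the small bias introduced by `delta`. Each update costs O(n²) in the number of features, which is consistent with the abstract's remark that these methods need more computation per time-step than TD(λ).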