Open Access. Powered by Scholars. Published by Universities.®
Physical Sciences and Mathematics Commons™
Open Access. Powered by Scholars. Published by Universities.®
Articles 1 - 3 of 3
Full-Text Articles in Physical Sciences and Mathematics
Proto-Transfer Learning In Markov Decision Processes Using Spectral Methods, Kimberly Ferguson, Sridhar Mahadevan
Proto-Transfer Learning In Markov Decision Processes Using Spectral Methods, Kimberly Ferguson, Sridhar Mahadevan
Sridhar Mahadevan
In this paper we introduce proto-transfer leaning, a new framework for transfer learning. We explore solutions to transfer learning within reinforcement learning through the use of spectral methods. Proto-value functions (PVFs) are basis functions computed from a spectral analysis of random walks on the state space graph. They naturally lead to the ability to transfer knowledge and representation between related tasks or domains. We investigate task transfer by using the same PVFs in Markov decision processes (MDPs) with different rewards functions. Additionally, our experiments in domain transfer explore applying the Nyström method for interpolation of PVFs between MDPs of different …
Scheduling Straight-Line Code Using Reinforcement Learning And Rollouts, Amy Mcgovern, Eliot Moss, Andrew G. Barto
Scheduling Straight-Line Code Using Reinforcement Learning And Rollouts, Amy Mcgovern, Eliot Moss, Andrew G. Barto
Andrew G. Barto
The execution order of a block of computer instructions on a pipelined machine can make a difference in its running time by a factor of two or more. In order to achieve the best possible speed, compilers use heuristic schedulers appropriate to each specific architecture implementation. However, these heuristic schedulers are time-consuming and expensive to build. We present empirical results using both rollouts and reinforcement learning to construct heuristics for scheduling basic blocks. In simulation, the rollout scheduler outperformed a commercial scheduler, and the reinforcement learning scheduler performed almost as well as the commercial scheduler.
Global Optimization For Value Function Approximation, Marek Petrik, Shlomo Zilberstein
Global Optimization For Value Function Approximation, Marek Petrik, Shlomo Zilberstein
Shlomo Zilberstein
Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bilinear programming formulation of value function approximation, which employs global optimization. The formulation provides strong a priori guarantees on both robust and expected policy loss by minimizing specific norms of the Bellman residual. Solving a bilinear program optimally is NP-hard, but this is unavoidable because the Bellman-residual minimization itself is NP-hard. We describe and analyze both optimal and approximate algorithms for solving bilinear programs. The analysis shows that this algorithm offers a convergent generalization …