Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path, Machine Learning, pp.89-129, 2008.

Regularized Policy Iteration, 2008.

Error Bounds for Approximate Policy Iteration, 2003.

Iterative Methods for Sparse Linear Systems, 2003.

DOI : 10.1137/1.9780898718003

Optimality of Reinforcement Learning Algorithms with Linear Function Approximation, 2002.

Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation, 2009.

Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998.

DOI : 10.1109/TNN.1998.712192

Tight performance bounds on greedy policies based on imperfect value functions, 1993.

New Error Bounds for Approximations from Projected Linear Equations, 2008.