G. Davis, S. Mallat, and M. Avellaneda, « Adaptive greedy approximations », Constructive Approximation, vol. 13, issue 1, pp. 57-98, 1997.
DOI : 10.1007/BF02678430

G. Gordon, « Stable Function Approximation in Dynamic Programming », Proceedings of the International Conference on Machine Learning, 1995.
DOI : 10.1016/B978-1-55860-377-6.50040-2

G. J. Gordon, Approximate Solutions to Markov Decision Processes, PhD thesis, Carnegie Mellon University, 1999.

T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2001.

S. Kakade and J. Langford, « Approximately Optimal Approximate Reinforcement Learning », Proceedings of the 19th International Conference on Machine Learning, 2002.

D. Koller and R. Parr, « Policy Iteration for Factored MDPs », Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence, 2000.

M. Lagoudakis and R. Parr, « Least-Squares Policy Iteration », Journal of Machine Learning Research, vol. 4, pp. 1107-1149, 2003.

R. Munos, « Error Bounds for Approximate Policy Iteration », Proceedings of the 20th International Conference on Machine Learning, 2003.

D. Pollard, Convergence of Stochastic Processes, 1984.
DOI : 10.1007/978-1-4612-5254-2

J. Rust, « Numerical Dynamic Programming in Economics », Handbook of Computational Economics, 1996.

A. Samuel, « Some studies in machine learning using the game of checkers », IBM Journal of Research and Development, pp. 210-229, 1959.

V. Vapnik, Statistical Learning Theory, 1998.

V. Vapnik, S. E. Golowich, and A. Smola, « Support Vector Method for Function Approximation, Regression Estimation and Signal Processing », Advances in Neural Information Processing Systems, pp. 281-287, 1997.

R. Williams and L. Baird, « Tight Performance Bounds on Greedy Policies Based on Imperfect Value Functions », Technical Report, Northeastern University, 1993.