A. Antos, R. Munos, and C. Szepesvári, Fitted Q-iteration in continuous action-space MDPs, Proceedings of NIPS, pp.9-16, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00185311

D. Bertsekas and J. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, 1996.
DOI : 10.1007/0-306-48332-7_333

P. Canbolat and U. Rothblum, (Approximate) iterated successive approximations algorithm for sequential decision processes, Annals of Operations Research, vol.3, issue.3, pp.1-12
DOI : 10.1007/s10479-012-1073-x

D. Ernst, P. Geurts, and L. Wehenkel, Tree-based batch mode reinforcement learning, Journal of Machine Learning Research, vol.6, pp.503-556, 2005.

A. Farahmand, R. Munos, and C. Szepesvári, Error propagation for approximate policy and value iteration, Proceedings of NIPS, pp.568-576, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00830154

A. Fern, S. Yoon, and R. Givan, Approximate Policy Iteration with a Policy Language Bias: Solving Relational Markov Decision Processes, Journal of Artificial Intelligence Research, vol.25, pp.75-118, 2006.

V. Gabillon, A. Lazaric, M. Ghavamzadeh, and B. Scherrer, Classification-based policy iteration with a critic, Proceedings of ICML, pp.1049-1056, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00590972

M. Lagoudakis and R. Parr, Reinforcement Learning as Classification: Leveraging Modern Classifiers, Proceedings of ICML, pp.424-431, 2003.

A. Lazaric, M. Ghavamzadeh, and R. Munos, Analysis of a Classification-based Policy Iteration Algorithm, Proceedings of ICML, pp.607-614, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00482065

R. Munos, Error Bounds for Approximate Policy Iteration, Proceedings of ICML, pp.560-567, 2003.

R. Munos, Performance Bounds in $L_p$-norm for Approximate Value Iteration, SIAM Journal on Control and Optimization, vol.46, issue.2, pp.541-561, 2007.
DOI : 10.1137/040614384

R. Munos and C. Szepesvári, Finite-Time Bounds for Fitted Value Iteration, Journal of Machine Learning Research, vol.9, pp.815-857, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00120882

M. Puterman and M. Shin, Modified Policy Iteration Algorithms for Discounted Markov Decision Problems, Management Science, vol.24, issue.11, 1978.
DOI : 10.1287/mnsc.24.11.1127

B. Scherrer, V. Gabillon, M. Ghavamzadeh, et al., Approximate Modified Policy Iteration, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00758882

C. Szepesvári, Reinforcement Learning Algorithms for MDPs, Wiley Encyclopedia of Operations Research, 2010.
DOI : 10.1002/9780470400531.eorms0714

C. Thiery and B. Scherrer, Performance bound for Approximate Optimistic Policy Iteration, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00480952