Fitted Q-iteration in continuous action-space MDPs, Proceedings of NIPS, pp.9-16, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00185311
Neuro-Dynamic Programming, Athena Scientific, 1996.
DOI : 10.1007/0-306-48332-7_333
(Approximate) iterated successive approximations algorithm for sequential decision processes, Annals of Operations Research, vol.3, issue.3, pp.1-12.
DOI : 10.1007/s10479-012-1073-x
Tree-based batch mode reinforcement learning, Journal of Machine Learning Research, vol.6, pp.503-556, 2005.
Error propagation for approximate policy and value iteration, Proceedings of NIPS, pp.568-576, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00830154
Approximate Policy Iteration with a Policy Language Bias: Solving Relational Markov Decision Processes, Journal of Artificial Intelligence Research, vol.25, pp.75-118, 2006.
Classification-based policy iteration with a critic, Proceedings of ICML, pp.1049-1056, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00590972
Reinforcement Learning as Classification: Leveraging Modern Classifiers, Proceedings of ICML, pp.424-431, 2003.
Analysis of a Classification-based Policy Iteration Algorithm, Proceedings of ICML, pp.607-614, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00482065
Error Bounds for Approximate Policy Iteration, Proceedings of ICML, pp.560-567, 2003.
Performance Bounds in $L_p$-norm for Approximate Value Iteration, SIAM Journal on Control and Optimization, vol.46, issue.2, pp.541-561, 2007.
DOI : 10.1137/040614384
Finite-Time Bounds for Fitted Value Iteration, Journal of Machine Learning Research, vol.9, pp.815-857, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00120882
Modified Policy Iteration Algorithms for Discounted Markov Decision Problems, Management Science, vol.24, issue.11, 1978.
DOI : 10.1287/mnsc.24.11.1127
Approximate Modified Policy Iteration, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00758882
Reinforcement Learning Algorithms for MDPs, Wiley Encyclopedia of Operations Research, 2010.
DOI : 10.1002/9780470400531.eorms0714
Performance bound for Approximate Optimistic Policy Iteration, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00480952