Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path, COLT-19, pp.574-588, 2006.
URL : https://hal.archives-ouvertes.fr/hal-00830201
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path, Machine Learning, 2007.
URL : https://hal.archives-ouvertes.fr/hal-00830201
Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp.330-337, 2007.
DOI : 10.1109/ADPRL.2007.368207
URL : https://hal.archives-ouvertes.fr/inria-00124833
Stochastic Optimal Control (The Discrete Time Case), 1978.
Tree-based batch mode reinforcement learning, Journal of Machine Learning Research, vol.6, pp.503-556, 2005.
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998.
DOI : 10.1109/TNN.1998.712192
An introduction to support vector machines (and other kernel-based learning methods), 2000.
DOI : 10.1017/CBO9780511801389
Generalization in reinforcement learning: Safely approximating the value function, NIPS-7, pp.369-376, 1995.
Fat-Shattering and the Learnability of Real-Valued Functions, Journal of Computer and System Sciences, vol.52, issue.3, pp.434-452, 1996.
DOI : 10.1006/jcss.1996.0033
ε-entropy and ε-capacity of sets in functional space, pp.277-364, 1961.
Finite time bounds for sampling based fitted value iteration, Computer and Automation Research Institute of the Hungarian Academy of Sciences, pp.13-17, 2006.
PEGASUS: A policy search method for large MDPs and POMDPs, Proceedings of the 16th Conference in Uncertainty in Artificial Intelligence, pp.406-415, 2000.
Sample complexity of policy search with known dynamics, NIPS-19, 2007.
Neural Network Learning: Theoretical Foundations, 1999.
DOI : 10.1017/CBO9780511624216
Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method, 16th European Conference on Machine Learning, pp.317-328, 2005.
DOI : 10.1007/11564096_32
Batch reinforcement learning in a complex domain, Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems, AAMAS '07, 2007.
DOI : 10.1145/1329125.1329241