Least-squares policy iteration, Journal of Machine Learning Research, vol.4, pp.1107-1149, 2003. ,
Tree-based batch mode reinforcement learning, Journal of Machine Learning Research, vol.6, pp.503-556, 2005. ,
Finite time bounds for sampling based fitted value iteration, ICML'2005, pp.881-886, 2005. ,
Stochastic Optimal Control (The Discrete Time Case), 1978. ,
Rates of convergence for empirical processes of stationary mixing sequences. The Annals of Probability, pp.94-116, 1994. ,
Nonparametric time series prediction through adaptive model selection, Machine Learning, pp.5-34, 2000. ,
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path, COLT-19, pp.574-588, 2006. ,
URL : https://hal.archives-ouvertes.fr/hal-00830201
Neural Network Learning: Theoretical Foundations, 1999. ,
DOI : 10.1017/CBO9780511624216
Approximate action-value iteration in continuous state spaces: learning with a single trajectory, 2006. ,
Introduction to approximation theory, 1966. ,
A distribution-free theory of nonparametric regression, 2002. ,
DOI : 10.1007/b97848
Sphere packing numbers for subsets of the Boolean n-cube with bounded Vapnik-Chervonenkis dimension, Journal of Combinatorial Theory, Series A, vol.69, issue.2, pp.217-232, 1995. ,
DOI : 10.1016/0097-3165(95)90052-7
Error bounds for approximate policy iteration, ICML'2003, pp.560-567, 2003. ,
Finite time bounds for sampling based fitted value iteration, Journal of Machine Learning Research, 2005. ,
URL : https://hal.archives-ouvertes.fr/inria-00120882
An optimal multigrid algorithm for continuous state discrete time stochastic control, Proceedings of the 27th IEEE Conference on Decision and Control, pp.898-914, 1991. ,
DOI : 10.1109/CDC.1988.194660
Strict Stationarity of Generalized Autoregressive Processes, The Annals of Probability, vol.20, issue.4, pp.1714-1730, 1992. ,
DOI : 10.1214/aop/1176989526