A. Antos, C. Szepesvári, and R. Munos, Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path, Machine Learning, vol.71, issue.1, pp.89-129, 2008.
DOI : 10.1007/s10994-007-5038-2
URL : https://hal.archives-ouvertes.fr/hal-00830201

J. A. Boyan, Technical Update: Least-Squares Temporal Difference Learning, Machine Learning, vol.49, issue.2/3, pp.233-246, 2002.
DOI : 10.1023/A:1017936530646

S. J. Bradtke and A. G. Barto, Linear Least-Squares algorithms for temporal difference learning, Machine Learning, vol.22, issue.1-3, pp.33-57, 1996.

S. S. Chen, D. L. Donoho, and M. A. Saunders, Atomic Decomposition by Basis Pursuit, SIAM Journal on Scientific Computing, vol.20, issue.1, pp.33-61, 1999.
DOI : 10.1137/S1064827596304010

B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, Least Angle Regression, The Annals of Statistics, vol.32, issue.2, pp.407-499, 2004.

A. Farahmand, M. Ghavamzadeh, C. Szepesvári, and S. Mannor, Regularized policy iteration, 22nd Annual Conference on Neural Information Processing Systems (NIPS 21), 2008.

M. Ghavamzadeh, A. Lazaric, R. Munos, and M. Hoffman, Finite-Sample Analysis of Lasso-TD, International Conference on Machine Learning, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00830149

M. W. Hoffman, A. Lazaric, M. Ghavamzadeh, and R. Munos, Regularized Least Squares Temporal Difference Learning with Nested ℓ2 and ℓ1 Penalization, European Workshop on Reinforcement Learning, 2011.
DOI : 10.1007/978-3-642-29946-9_13

J. Johns and S. Mahadevan, Constructing basis functions from directed graphs for value function approximation, Proceedings of the 24th international conference on Machine learning, ICML '07, 2007.
DOI : 10.1145/1273496.1273545

J. Johns, C. Painter-Wakefield, and R. Parr, Linear Complementarity for Regularized Policy Evaluation and Improvement, Advances in Neural Information Processing Systems 23, pp.1009-1017, 2010.

J. Z. Kolter and A. Y. Ng, Regularization and feature selection in least-squares temporal difference learning, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.
DOI : 10.1145/1553374.1553442

M. Loth, M. Davy, and P. Preux, Sparse Temporal Difference Learning Using LASSO, 2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, 2007.
DOI : 10.1109/ADPRL.2007.368210
URL : https://hal.archives-ouvertes.fr/inria-00117075

R. Munos, Error bounds for approximate policy iteration, International Conference on Machine Learning, 2003.

R. Parr, L. Li, G. Taylor, C. Painter-Wakefield, and M. L. Littman, An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning, Proceedings of the 25th international conference on Machine learning, ICML '08, pp.752-759, 2008.
DOI : 10.1145/1390156.1390251

M. Petrik, G. Taylor, R. Parr, and S. Zilberstein, Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes, Proceedings of ICML, 2010.

S. Rosset and J. Zhu, Piecewise linear regularized solution paths, The Annals of Statistics, vol.35, issue.3, pp.1012-1030, 2007.
DOI : 10.1214/009053606000001370
URL : http://arxiv.org/abs/0708.2197

B. Scherrer, Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view, 27th International Conference on Machine Learning (ICML 2010), 2010.
URL : https://hal.archives-ouvertes.fr/inria-00537403

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning), 1998.
DOI : 10.1007/978-1-4615-3618-5

R. S. Sutton, H. R. Maei, D. Precup, S. Bhatnagar, D. Silver et al., Fast gradient-descent methods for temporal-difference learning with linear function approximation, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pp.993-1000, 2009.
DOI : 10.1145/1553374.1553501

C. Szepesvári, Algorithms for Reinforcement Learning, Synthesis Lectures on Artificial Intelligence and Machine Learning, vol.4, issue.1, 2010.
DOI : 10.2200/S00268ED1V01Y201005AIM009

G. Taylor and R. Parr, Kernelized value function approximation for reinforcement learning, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 2009.
DOI : 10.1145/1553374.1553504
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.5900

C. Thiery and B. Scherrer, Building Controllers for Tetris, ICGA Journal, vol.32, issue.1, pp.3-11, 2009.
DOI : 10.3233/ICG-2009-32102
URL : https://hal.archives-ouvertes.fr/inria-00418954

R. Tibshirani, Regression Shrinkage and Selection via the Lasso, Journal of the Royal Statistical Society. Series B (Methodological), vol.58, issue.1, pp.267-288, 1996.
DOI : 10.1111/j.2517-6161.1996.tb02080.x

H. Zou, The Adaptive Lasso and Its Oracle Properties, Journal of the American Statistical Association, vol.101, issue.476, pp.1418-1429, 2006.
DOI : 10.1198/016214506000000735

H. Zou and H. H. Zhang, On the adaptive elastic-net with a diverging number of parameters, The Annals of Statistics, vol.37, issue.4, pp.1733-1751, 2009.
DOI : 10.1214/08-AOS625