B. P. and R. Y. Tsybakov-a, Simultaneous analysis of Lasso and Dantzig selector, The Annals of Statistics, pp.1705-1732, 2009.

B. S. Barto-a, Linear Least-Squares algorithms for temporal difference learning, Machine Learning, pp.33-57, 1996.

C. E. Tao-t, The Dantzig selector: statistical estimation when p is much larger than n, Annals of Statistics, vol.35, issue.6, pp.2313-2351, 2007.

E. B. Hastie-t and J. I. Tibshirani-r, Least Angle Regression, Annals of Statistics, vol.32, issue.2, pp.407-499, 2004.

F. A., G. M., and S. C. Mannor-s, Regularized Policy Iteration, Proc. of NIPS 21, 2008.

F. A. Szepesvári-c, Model selection in reinforcement learning, Machine Learning Journal, vol.85, issue.3, pp.299-332, 2011.

G. M. Pietquin-o, A Brief Survey of Parametric Value Function Approximation, 2010.

G. M. Scherrer-b, ℓ1-penalized projected Bellman residual, Proc. of EWRL 9, 2011.

G. M., L. A., and M. R. Hoffman-m, Finite-Sample Analysis of Lasso-TD, Proc. of ICML, 2011.

H. M., L. A., and G. M. Munos-r, Regularized Least Squares Temporal Difference learning with nested ℓ2 and ℓ1 penalization, Proc. of EWRL 9, 2011.

J. J. Painter- and W. C. Parr-r, Linear Complementarity for Regularized Policy Evaluation and Improvement, Proc. of NIPS 23, pp.1009-1017, 2010.

K. J. Ng-a, Regularization and Feature Selection in Least-Squares Temporal Difference Learning, Proc. of ICML, 2009.

S. R. Barto-a, Reinforcement Learning, 1998.
DOI: 10.1016/B978-012526430-3/50003-9

Y. H. Bertsekas-d, Error Bounds for Approximations from Projected Linear Equations, Mathematics of Operations Research, vol.35, pp.306-329, 2010.