Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path, Machine Learning, vol.22, issue.1, pp.89-129, 2008. ,
DOI : 10.1007/978-1-4612-5254-2
URL : https://hal.archives-ouvertes.fr/hal-00830201
Pattern Recognition and Machine Learning, 2006. ,
Variational relevance vector machines, In: Uncertainty in Artificial Intelligence, pp.46-53, 2000. ,
Variational Inference: A Review for Statisticians, Journal of the American Statistical Association, vol.2, issue.518, 2016. ,
DOI : 10.1016/j.neuroimage.2007.04.054
URL : http://arxiv.org/abs/1601.00670
Technical update: Least-squares temporal difference learning, Machine Learning, vol.49, issue.2/3, pp.233-246, 2002. ,
DOI : 10.1023/A:1017936530646
Linear least-squares algorithms for temporal difference learning, Machine Learning, vol.22, issue.1, pp.33-57, 1996. ,
DOI : 10.1007/978-0-585-33656-5_4
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.143.857
Policy evaluation with temporal differences: A survey and comparison, Journal of Machine Learning Research, vol.15, pp.809-883, 2014. ,
Least angle regression, Annals of Statistics, vol.32, pp.407-499, 2004. ,
Gaussian Process Reinforcement Learning, International Conference on Machine Learning, pp.201-208, 2005. ,
DOI : 10.1007/978-1-4899-7502-7_109-1
Regularized policy iteration, Advances in Neural Information Processing Systems 21, pp.441-448, 2008. ,
???1-Penalized Projected Bellman Residual, Recent Advances in Reinforcement Learning -9th European Workshop, pp.89-101, 2011. ,
DOI : 10.1007/978-3-642-29946-9_12
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.220.6770
A dantzig selector approach to temporal difference learning, International Conference on Machine Learning, pp.1399-1406, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00749480
Incremental least-square temporal difference learning, The Twenty-first National Conference on Artificial Intelligence (AAAI), pp.356-361, 2006. ,
Finite-sample analysis of lasso-td, International Conference on Machine Learning, pp.1177-1184, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00830149
The elements of statistical learning: data mining, inference and prediction, 2009. ,
Regularized Least Squares Temporal Difference Learning with Nested ???2 and ???1 Penalization, Recent Advances in Reinforcement Learning -9th European Workshop, pp.102-114, 2011. ,
DOI : 10.1007/978-3-642-29946-9_13
Linear complementarity for regularized policy evaluation and improvement, Advances in Neural Information Processing Systems 23, pp.1009-1017, 2010. ,
An Introduction to Variational Methods for Graphical Models, Machine Learning, vol.37, issue.2, pp.183-233, 1999. ,
DOI : 10.1007/978-94-011-5014-9_5
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.106.3844
Regularization and feature selection in least-squares temporal difference learning, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pp.521-528, 2009. ,
DOI : 10.1145/1553374.1553442
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.5506
Least-squares policy iteration, The Journal of Machine Learning Research, vol.4, pp.1107-1149, 2003. ,
Finite-sample analysis of LSTD, International Conference on Machine Learning, pp.615-622, 2010. ,
URL : https://hal.archives-ouvertes.fr/inria-00482189
Dantzig selector with an approximately optimal denoising matrix and its application in sparse reinforcement learning, Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, UAI, 2016. ,
Bias and variance in value function estimation, Twenty-first international conference on Machine learning , ICML '04, 2004. ,
DOI : 10.1145/1015330.1015402
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.946
Least squares policy evaluation algorithms with linear function approximation, Discrete Event Dynamic Systems, vol.13, issue.1/2, pp.79-110, 2003. ,
DOI : 10.1023/A:1022192903948
Greedy algorithms for sparse reinforcement learning, International Conference on Machine Learning, 2012. ,
Statistical field theory, Frontiers in Physics, 1988. ,
Statistical analysis of l1-penalized linear estimation with applications, 2011. ,
Markov Decision Processes : Discrete Stochastic Dynamic Programming, 2005. ,
DOI : 10.1002/9780470316887
Reinforcement Learning: An Introduction, IEEE Transactions on Neural Networks, vol.9, issue.5, 1998. ,
DOI : 10.1109/TNN.1998.712192
Fast gradient-descent methods for temporal-difference learning with linear function approximation, Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, pp.993-1000, 2009. ,
DOI : 10.1145/1553374.1553501
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.5674
Sparse bayesian learning and the relevance vector machine, Journal of Machine Learning Research, vol.1, pp.211-244, 2001. ,
Machine Learning for Intelligent Agents, Greece, 2015. ,
Value Function Approximation through Sparse Bayesian Modeling, Recent Advances in Reinforcement Learning -9th European Workshop, pp.128-139, 2011. ,
DOI : 10.1007/978-3-642-29946-9_15
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.231.3634