| inria-00117075, version 1 |
| IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning (2007) |
| We consider the problem of on-line value function estimation in reinforcement learning, concentrating on the choice of function approximator. To try to break the curse of dimensionality, we focus on nonparametric function approximators. We propose to integrate kernels into temporal difference algorithms by using regression via the LASSO. We introduce the equi-gradient descent algorithm (EGD), a direct adaptation of an algorithm recently introduced in the LARS family for solving the LASSO. We argue that the EGD is a judicious choice for these tasks. We present the EGD algorithm in detail, along with some experimental results, and highlight its qualities for reinforcement learning. |
| a – | |
| b – | |
| c – | |
| 1 : | SEQUEL (INRIA Futurs) |
| INRIA – CNRS : UMR8022 – CNRS : UMR8146 – Université des Sciences et Technologies de Lille - Lille I – Université Charles de Gaulle - Lille III – Ecole Centrale de Lille | |
| 2 : | Laboratoire d'Automatique, Génie Informatique et Signal (LAGIS) |
| CNRS : UMR8146 – Université des Sciences et Technologies de Lille - Lille I – Ecole Centrale de Lille |
| Domain | : | Computer Science/Machine Learning |
| http://hal.inria.fr/inria-00117075/fr/ | |
| oai:hal.inria.fr:inria-00117075_v1 | |
| Contributor: Manuel Loth | |
| Submitted on: Thursday 30 November 2006, 13:15:51 | |
| Last modified on: Thursday 25 January 2007, 16:23:22 | |