Recursive Least-Squares Learning with Eligibility Traces

Bruno Scherrer 1 Matthieu Geist 2
1 MAIA - Autonomous intelligent machine
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: In the framework of Markov Decision Processes, we consider the problem of learning a linear approximation of the value function of a fixed policy from a single trajectory, possibly generated by a different policy. We describe a systematic approach for adapting on-policy least-squares learning algorithms from the literature (LSTD, LSPE, FPKF and GPTD/KTD) to off-policy learning with eligibility traces. This leads to two known algorithms, LSTD($\lambda$) and LSPE($\lambda$), and suggests new extensions of FPKF and GPTD/KTD. We describe their recursive implementation, discuss their convergence properties, and illustrate their behavior experimentally. Overall, our study suggests that the state-of-the-art LSTD($\lambda$) remains the best least-squares algorithm.
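To make the idea of a recursive least-squares update with eligibility traces concrete, here is a minimal sketch of one common way to write a recursive off-policy LSTD($\lambda$) update. It is an illustration under stated assumptions, not the authors' implementation: the function name, the transition format (phi, reward, phi_next, rho), the per-step importance weight rho, and the regularization constant eps are all assumptions introduced here. Setting rho = 1 everywhere gives the usual on-policy recursion.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def recursive_lstd_lambda(transitions, dim, gamma=0.9, lam=0.5, eps=1e-3):
    """Recursive LSTD(lambda) sketch (hypothetical helper, not from the paper).

    transitions: iterable of (phi, reward, phi_next, rho), where phi and
    phi_next are feature lists of length dim and rho is an assumed
    importance weight (target / behavior action probability).
    Returns the weight vector theta of the linear value estimate.
    """
    theta = [0.0] * dim
    # M tracks the inverse of (eps * I + sum_i z_i dphi_i^T).
    M = [[1.0 / eps if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    z = [0.0] * dim   # eligibility trace
    rho_prev = 1.0    # importance weight of the previous step
    for phi, r, phi_next, rho in transitions:
        # Trace update: decay by gamma * lambda * previous importance weight.
        z = [gamma * lam * rho_prev * zk + pk for zk, pk in zip(z, phi)]
        # Importance-weighted temporal difference of the features.
        dphi = [p - gamma * rho * pn for p, pn in zip(phi, phi_next)]
        # Sherman-Morrison rank-one update of the inverse.
        # (The denominator is assumed nonzero; no safeguard in this sketch.)
        Mz = [dot(row, z) for row in M]
        dphi_M = [sum(dphi[i] * M[i][j] for i in range(dim)) for j in range(dim)]
        gain = [m / (1.0 + dot(dphi, Mz)) for m in Mz]
        error = rho * r - dot(dphi, theta)
        theta = [t + g * error for t, g in zip(theta, gain)]
        M = [[M[i][j] - gain[i] * dphi_M[j] for j in range(dim)] for i in range(dim)]
        rho_prev = rho
    return theta
```

Starting from theta = 0 and M = I / eps, each step keeps theta equal to the solution of the regularized batch system (eps * I + sum_i z_i dphi_i^T) theta = sum_i z_i rho_i r_i, so the recursion avoids re-solving a linear system after every transition.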
Document type:
Conference paper
European Workshop on Reinforcement Learning (EWRL 11), Sep 2011, Athens, Greece. 2011
Cited literature [19 references]

https://hal.inria.fr/hal-00644511
Contributor: Bruno Scherrer
Submitted on: Thursday, November 24, 2011 - 15:22:08
Last modified on: Thursday, March 29, 2018 - 11:06:04
Document(s) archived on: Saturday, February 25, 2012 - 02:26:38

File

ewrl.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00644511, version 1

Citation

Bruno Scherrer, Matthieu Geist. Recursive Least-Squares Learning with Eligibility Traces. European Workshop on Reinforcement Learning (EWRL 11), Sep 2011, Athens, Greece. 2011. 〈hal-00644511〉

Metrics

Record views

351

File downloads

154