Conference papers

Recursive Least-Squares Learning with Eligibility Traces

Bruno Scherrer 1, Matthieu Geist 2
1 MAIA - Autonomous intelligent machine
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : In the framework of Markov Decision Processes, we consider the problem of learning a linear approximation of the value function of some fixed policy from one trajectory possibly generated by some other policy. We describe a systematic approach for adapting on-policy least-squares learning algorithms from the literature (LSTD, LSPE, FPKF and GPTD/KTD) to off-policy learning with eligibility traces. This leads to two known algorithms, LSTD($\lambda$) and LSPE($\lambda$), and suggests new extensions of FPKF and GPTD/KTD. We describe their recursive implementation, discuss their convergence properties, and illustrate their behavior experimentally. Overall, our study suggests that the state-of-the-art LSTD($\lambda$) remains the best least-squares algorithm.
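To make the idea of a recursive least-squares implementation with eligibility traces concrete, here is a minimal sketch of the textbook on-policy recursive LSTD($\lambda$) update, which maintains the inverse of the accumulated matrix through the Sherman-Morrison formula. It is not taken from the paper: the class name `RecursiveLSTD`, the regularization parameter `epsilon`, and the pure-Python helpers are illustrative assumptions. The off-policy variants studied in the paper additionally require importance-sampling weights, which are deliberately not shown here.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(M, v):
    """Matrix-vector product M v."""
    return [dot(row, v) for row in M]


def vec_mat(v, M):
    """Row-vector-matrix product v^T M."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]


class RecursiveLSTD:
    """On-policy recursive LSTD(lambda) sketch (hypothetical helper class).

    Assumes a single trajectory and a linear value estimate V(s) = phi(s)^T theta.
    """

    def __init__(self, n_features, gamma, lam, epsilon=1e-3):
        self.n = n_features
        self.gamma = gamma
        self.lam = lam
        self.theta = [0.0] * n_features  # weight vector
        self.z = [0.0] * n_features      # eligibility trace
        # C tracks the inverse of A, starting from A_0 = epsilon * I (assumption).
        self.C = [[1.0 / epsilon if i == j else 0.0 for j in range(n_features)]
                  for i in range(n_features)]

    def update(self, phi, reward, phi_next):
        """Process one transition with features phi -> phi_next and its reward."""
        g, lam, n = self.gamma, self.lam, self.n
        # Trace update: z <- gamma * lambda * z + phi
        self.z = [g * lam * zi + pi for zi, pi in zip(self.z, phi)]
        # Feature difference: d = phi - gamma * phi_next
        d = [p - g * pn for p, pn in zip(phi, phi_next)]
        Cz = mat_vec(self.C, self.z)
        dC = vec_mat(d, self.C)
        denom = 1.0 + dot(d, Cz)
        # Weight update uses the pre-update C (Sherman-Morrison gain).
        error = reward - dot(d, self.theta)
        self.theta = [t + cz * error / denom for t, cz in zip(self.theta, Cz)]
        # Rank-one update of the inverse: C <- C - (C z)(d^T C) / denom
        self.C = [[self.C[i][j] - Cz[i] * dC[j] / denom for j in range(n)]
                  for i in range(n)]
        return self.theta


# Usage on a hypothetical two-state chain (0 -> 1 -> 0 ...) with one-hot features.
learner = RecursiveLSTD(n_features=2, gamma=0.5, lam=0.5)
features = [[1.0, 0.0], [0.0, 1.0]]
state = 0
for _ in range(200):
    next_state = 1 - state
    reward = 1.0 if state == 0 else 0.0
    learner.update(features[state], reward, features[next_state])
    state = next_state
```

Each call to `update` costs a number of operations quadratic in the number of features, instead of the cubic cost of re-solving the linear system after every transition. The initial value of `C` acts as a small regularizer whose influence fades as more transitions are processed.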
Document type :
Conference papers

Contributor : Bruno Scherrer
Submitted on : Thursday, November 24, 2011 - 3:22:08 PM
Last modification on : Saturday, June 25, 2022 - 7:45:40 PM
Long-term archiving on : Saturday, February 25, 2012 - 2:26:38 AM


Files produced by the author(s)


  • HAL Id : hal-00644511, version 1



Bruno Scherrer, Matthieu Geist. Recursive Least-Squares Learning with Eligibility Traces. European Workshop on Reinforcement Learning (EWRL 11), Sep 2011, Athens, Greece. ⟨hal-00644511⟩


