Sparse Temporal Difference Learning using LASSO

Manuel Loth (1), Manuel Davy (1, 2), Philippe Preux (1)
(1) SEQUEL (Sequential Learning), LIFL - Laboratoire d'Informatique Fondamentale de Lille / LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal, Inria Lille - Nord Europe
(2) LAGIS-SI, LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal
Abstract: We consider the problem of on-line value function estimation in reinforcement learning, and concentrate on the choice of function approximator. To mitigate the curse of dimensionality, we focus on non-parametric function approximators. We propose to incorporate kernels into temporal-difference algorithms through regression with the LASSO. We introduce the equi-gradient descent (EGD) algorithm, a direct adaptation of the recently introduced LARS family of algorithms for solving the LASSO, and argue that EGD is a judicious choice for these tasks. We present the EGD algorithm in detail, together with experimental results, and emphasize its qualities for reinforcement learning.
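For illustration only, the sketch below (Python, not part of the paper) shows one simplified way to combine the ingredients named in the abstract: kernel features, bootstrapped temporal-difference targets, and an L1 (LASSO) penalty. It uses plain cyclic coordinate descent rather than the equi-gradient descent algorithm presented in the paper, and the kernel centers, bandwidth, regularization weight, and toy chain task are assumptions made for the example.

# Minimal sketch: batch TD targets regressed on Gaussian-kernel features
# with an L1 (LASSO) penalty. NOT the authors' equi-gradient descent;
# all hyperparameters and the toy task below are illustrative assumptions.
import numpy as np

def kernel_features(states, centers, bandwidth=0.5):
    """Gaussian kernel features phi_i(s) = exp(-(s - c_i)^2 / (2*bandwidth^2))."""
    d = states[:, None] - centers[None, :]
    return np.exp(-0.5 * (d / bandwidth) ** 2)

def lasso_coordinate_descent(X, y, alpha=0.01, n_iter=100):
    """Minimize (1/2n)||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            residual = y - X @ w + w[j] * X[:, j]      # partial residual w.r.t. feature j
            rho = X[:, j] @ residual / n
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]  # soft-threshold
    return w

def lasso_td(transitions, centers, gamma=0.95, sweeps=10, alpha=0.01):
    """Batch TD: repeatedly regress r + gamma*V(s') on kernel features with LASSO."""
    s, r, s_next = (np.array(x) for x in zip(*transitions))
    phi, phi_next = kernel_features(s, centers), kernel_features(s_next, centers)
    w = np.zeros(len(centers))
    for _ in range(sweeps):
        targets = r + gamma * (phi_next @ w)            # bootstrapped TD targets
        w = lasso_coordinate_descent(phi, targets, alpha=alpha)
    return w                                            # many weights are exactly zero

# Toy 1-D chain: the agent drifts to the right and is rewarded near s = 1.
rng = np.random.default_rng(0)
s = rng.uniform(0, 1, size=500)
s_next = np.clip(s + rng.normal(0.05, 0.02, size=500), 0, 1)
r = (s_next >= 0.95).astype(float)
w = lasso_td(list(zip(s, r, s_next)), centers=np.linspace(0, 1, 25))
print("non-zero kernel weights:", int(np.sum(np.abs(w) > 1e-8)), "of", len(w))

The L1 penalty drives most kernel weights exactly to zero, which is the sparsity in the value function representation that the title refers to.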
Document type: Conference papers

Cited literature: 17 references

https://hal.inria.fr/inria-00117075
Contributor: Manuel Loth
Submitted on : Thursday, November 30, 2006 - 1:15:51 PM
Last modification on : Thursday, February 21, 2019 - 10:52:49 AM
Long-term archiving on : Tuesday, April 6, 2010 - 11:39:31 PM

File: lassoTd.pdf

Identifiers

  • HAL Id: inria-00117075, version 1

Citation

Manuel Loth, Manuel Davy, Philippe Preux. Sparse Temporal Difference Learning using LASSO. IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning, Apr 2007, Hawaii, United States. ⟨inria-00117075⟩
