High-Accuracy Value-Function Approximation with Neural Networks Applied to the Acrobot

Rémi Coulom 1
1 CORTEX - Neuromimetic intelligence
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: Several reinforcement-learning techniques have already been applied to the Acrobot control problem, using linear function approximators to estimate the value function. In this paper, we present experimental results obtained by using a feedforward neural network instead. The learning algorithm used was model-based continuous TD(λ). It generated an efficient controller, producing a high-accuracy state-value function. A striking feature of this value function is a very sharp 4-dimensional ridge that is extremely hard to evaluate with linear parametric approximators. From a broader point of view, this experimental success demonstrates some of the qualities of feedforward neural networks in comparison with linear approximators in reinforcement learning.
Document type:
Conference paper
Michel Verleysen. 12th European Symposium on Artificial Neural Networks - ESANN'2004, 2004, Bruges, Belgium, d-side, pp.7-12, 2004

https://hal.inria.fr/inria-00107776
Contributor: Publications Loria
Submitted on: Thursday, October 19, 2006 - 09:08:54
Last modified on: Thursday, January 11, 2018 - 06:19:48
Document(s) archived on: Wednesday, March 29, 2017 - 13:04:28

Identifiers

  • HAL Id: inria-00107776, version 1

Citation

Rémi Coulom. High-Accuracy Value-Function Approximation with Neural Networks Applied to the Acrobot. Michel Verleysen. 12th European Symposium on Artificial Neural Networks - ESANN'2004, 2004, Bruges, Belgium, d-side, pp.7-12, 2004. 〈inria-00107776〉
