A Model-Based Actor-Critic Algorithm in Continuous Time and Space

Rémi Coulom 1
1 CORTEX - Neuromimetic intelligence
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: This paper presents a model-based actor-critic algorithm in continuous time and space. Two function approximators are used: one learns the policy (the actor) and the other learns the state-value function (the critic). The critic learns with the TD(λ) algorithm, and the actor learns by gradient ascent on the Hamiltonian. Doya proposed a similar algorithm; the one presented here is more general. The algorithm was successfully applied to teaching simulated articulated robots to swim.
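The abstract names the two learning rules but gives no formulas. The following is a minimal sketch of how such a scheme might look on a toy problem, not the paper's implementation. It assumes a common continuous-time formulation: the Hamiltonian H = r - s_gamma * V + dV/dx * f(x, u) acts as the TD error, the critic uses an exponentially decaying eligibility trace, and the actor follows the gradient of H with respect to its parameter. The one-dimensional system dx/dt = u, the reward -(x^2 + u^2), the quadratic critic, the linear actor, and every name and parameter value (`s_gamma`, `s_lambda`, `eta_critic`, `eta_actor`, `train`, `optimal_gain`) are assumptions chosen for illustration.

```python
import math


def train(episodes=300, steps=500, dt=0.01,
          s_gamma=0.1, s_lambda=1.0, eta_critic=0.5, eta_actor=0.5):
    """Toy actor-critic on dx/dt = u with reward -(x^2 + u^2).

    All modelling choices here are illustrative assumptions.
    """
    w = 0.0      # critic parameter: V_w(x) = w * x^2
    theta = 0.0  # actor parameter:  pi_theta(x) = theta * x
    for _ in range(episodes):
        x, trace = 1.0, 0.0  # reset the state and the eligibility trace
        for _ in range(steps):
            u = theta * x
            reward = -(x * x + u * u)
            value = w * x * x
            dV_dx = 2.0 * w * x
            x_dot = u  # known model f(x, u) = u (the "model-based" part)

            # Hamiltonian, used here as the continuous-time TD error.
            hamiltonian = reward - s_gamma * value + dV_dx * x_dot

            # Critic: Euler step of the trace (dV/dw = x^2), then of w.
            trace += dt * (-(s_gamma + s_lambda) * trace + x * x)
            w += dt * eta_critic * hamiltonian * trace

            # Actor: gradient ascent on H w.r.t. theta via the chain rule.
            dH_du = -2.0 * u + dV_dx             # dr/du + dV/dx * df/du
            theta += dt * eta_actor * dH_du * x  # du/dtheta = x

            x += dt * x_dot  # Euler step of the simulated system
    return w, theta


def optimal_gain(s_gamma=0.1):
    """Negative root of k^2 - s_gamma * k - 1 = 0 (HJB equation of the toy problem)."""
    return (s_gamma - math.sqrt(s_gamma ** 2 + 4.0)) / 2.0


if __name__ == "__main__":
    w, theta = train()
    print("critic w:", w, "actor theta:", theta, "optimal:", optimal_gain())
```

In this toy problem the optimal value function is k * x^2 and the optimal policy is u = k * x for the same constant k, so both learned parameters can be compared against `optimal_gain()`. Because the Hamiltonian is computed from the known model rather than from sampled transitions, no exploration noise is used in this sketch.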
Document type:
Conference paper
Sixth European Workshop on Reinforcement Learning - EWRL6, Sep 2003, Nancy, France, 2 p, 2003

https://hal.inria.fr/inria-00107659
Contributor: Publications Loria
Submitted on: Thursday, October 19, 2006 - 09:04:11
Last modified on: Thursday, January 11, 2018 - 06:19:48
Document(s) archived on: Wednesday, March 29, 2017 - 13:07:10

Identifiers

  • HAL Id: inria-00107659, version 1

Citation

Rémi Coulom. A Model-Based Actor-Critic Algorithm in Continuous Time and Space. Sixth European Workshop on Reinforcement Learning - EWRL6, Sep 2003, Nancy, France, 2 p, 2003. 〈inria-00107659〉
