A Model-Based Actor-Critic Algorithm in Continuous Time and Space

Rémi Coulom

Conference paper, Year: 2003

Role: Author
PersonId: 836044

Neuromimetic intelligence

Abstract

This paper presents a model-based actor-critic algorithm in continuous time and space. It uses two function approximators: one learns the policy (the actor) and the other learns the state-value function (the critic). The critic learns with the TD(λ) algorithm, and the actor learns by gradient ascent on the Hamiltonian. Doya proposed a similar algorithm, but the one presented here is more general. The algorithm was successfully applied to teach simulated articulated robots to swim.
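To make the two update rules in the abstract concrete, here is a minimal sketch on a toy problem. It is not the paper's implementation. Everything specific is an assumption chosen for illustration: the one-dimensional plant dx/dt = u, the quadratic reward rate, the one-parameter critic V(x) = v·x², the linear actor u = k·x, the constants `S_GAMMA`, `S_LAMBDA` and `DT`, the learning rates, and the exact form of the eligibility trace. The Hamiltonian is taken in the usual discounted form H = r − s_γ·V + (∂V/∂x)·f, and the known model f is what lets the actor differentiate H with respect to the action.

```python
import math

S_GAMMA = 0.1    # discount rate (assumed value)
S_LAMBDA = 1.0   # extra decay rate of the eligibility trace (assumed value)
DT = 0.01        # Euler integration step (assumed value)


def dynamics(x, u):
    """Known model of the toy plant (assumption): dx/dt = u."""
    return u


def reward(x, u):
    """Assumed quadratic reward rate."""
    return -(x * x + u * u)


def hamiltonian(x, u, v):
    """H = r(x, u) - s_gamma * V(x) + dV/dx * f(x, u), with V(x) = v * x^2."""
    value = v * x * x
    dvalue_dx = 2.0 * v * x
    return reward(x, u) - S_GAMMA * value + dvalue_dx * dynamics(x, u)


def train(episodes=300, steps=500, eta_critic=0.5, eta_actor=0.5):
    """Run Euler-discretized critic and actor updates; return (v, k)."""
    v = 0.0  # critic weight, V(x) = v * x^2
    k = 0.0  # actor weight, u = k * x
    for episode in range(episodes):
        x = 1.0 if episode % 2 == 0 else -1.0  # alternate start states
        trace = 0.0
        for _ in range(steps):
            u = k * x
            h = hamiltonian(x, u, v)  # plays the role of the TD error
            # Critic: trace of dV/dv = x^2, then a TD(lambda)-style step.
            trace += DT * (-(S_GAMMA + S_LAMBDA) * trace + x * x)
            v += DT * eta_critic * h * trace
            # Actor: gradient ascent on H, using dH/du = dr/du + dV/dx * df/du
            # and du/dk = x.
            dh_du = -2.0 * u + 2.0 * v * x
            k += DT * eta_actor * dh_du * x
            # Advance the simulated plant with the action that was applied.
            x += DT * dynamics(x, u)
    return v, k


def analytic_weight():
    """Stable root of v^2 - s_gamma * v - 1 = 0 for this toy problem."""
    return (S_GAMMA - math.sqrt(S_GAMMA ** 2 + 4.0)) / 2.0


if __name__ == "__main__":
    v, k = train()
    print("critic weight:", v, "actor gain:", k, "analytic:", analytic_weight())
```

For this toy problem the Hamilton-Jacobi-Bellman equation can be solved by hand, which gives a reference value to compare the learned weights against: at the optimum the critic weight and the actor gain coincide and satisfy v² − s_γ·v − 1 = 0.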

Keywords

dynamic programming, reinforcement learning, optimal control

Domains

Other [cs.OH]

Main file

A03-R-125.pdf (76.16 KB)

https://inria.hal.science/inria-00107659

Submitted on: Thursday, October 19, 2006, 09:04:11

Last modified on: Thursday, February 15, 2024, 03:30:59

Long-term archiving on: Wednesday, March 29, 2017, 13:07:10

Dates and versions

inria-00107659 , version 1 (19-10-2006)

Identifiers

HAL Id : inria-00107659 , version 1

Cite

Rémi Coulom. A Model-Based Actor-Critic Algorithm in Continuous Time and Space. Sixth European Workshop on Reinforcement Learning - EWRL6, Sep 2003, Nancy, France, 2 p. ⟨inria-00107659⟩

Collections

UNIV-RENNES1 CNRS INRIA IRISA UNIV-LORRAINE INRIA2 LORIA UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

129 views

68 downloads
