A Model-Based Actor-Critic Algorithm in Continuous Time and Space
Abstract
This paper presents a model-based actor-critic algorithm in continuous time and space. Two function approximators are used: one learns the policy (the actor) and the other learns the state-value function (the critic). The critic learns with the TD(λ) algorithm, and the actor by gradient ascent on the Hamiltonian. A similar algorithm was proposed by Doya, but the one presented here is more general. The algorithm is applied successfully to teach simulated articulated robots to swim.
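As a rough illustration of the ingredients named above, the following is a minimal sketch of a continuous-time actor-critic on a hypothetical one-dimensional toy problem (dynamics dx/dt = u, reward r = -x²). All names, features, and learning rates are assumptions for illustration, not the paper's actual setup: the critic uses a Doya-style continuous-time TD error with an eligibility trace (TD(λ)), and the actor ascends the Hamiltonian H = r + (dV/dx)·f(x, u).

```python
import numpy as np

# Hypothetical toy problem: state x, action u, dynamics dx/dt = u,
# reward r(x) = -x^2. Linear function approximators for both
# the critic V(x) = w . phi(x) and the actor u(x) = theta . phi(x).
rng = np.random.default_rng(0)
dt, tau, lam = 0.01, 1.0, 0.9      # time step, discount time constant, trace decay
alpha_v, alpha_p = 0.1, 0.01       # critic and actor learning rates (assumed)

def features(x):
    # Simple polynomial features; the paper's approximators may differ.
    return np.array([1.0, x, x * x])

w = np.zeros(3)      # critic weights
theta = np.zeros(3)  # actor weights
e = np.zeros(3)      # eligibility trace for TD(lambda)

x = 1.0
for step in range(5000):
    phi = features(x)
    u = theta @ phi + 0.1 * rng.standard_normal()  # exploratory action
    x_next = x + u * dt                            # Euler step of dx/dt = u
    r = -x * x

    # Critic: continuous-time TD error, delta = r - V/tau + dV/dt,
    # with a TD(lambda) eligibility trace.
    v, v_next = w @ phi, w @ features(x_next)
    delta = r - v / tau + (v_next - v) / dt
    e = (1.0 - dt / tau) * lam * e + phi
    w += alpha_v * delta * dt * e

    # Actor: gradient ascent on the Hamiltonian H = r + (dV/dx) * f(x, u).
    # Here f = u, so dH/du = dV/dx (the model of the dynamics is used here,
    # which is what makes the approach model-based).
    dVdx = w @ np.array([0.0, 1.0, 2.0 * x])
    theta += alpha_p * dVdx * phi * dt

    x = x_next
```

This is only a sketch of the general scheme under the stated assumptions; the paper's robots, function approximators, and exact update rules are more elaborate.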