Momentum in Reinforcement Learning

Nino Vieillard; Bruno Scherrer; Olivier Pietquin; Matthieu Geist

Conference paper, Year: 2020

Nino Vieillard

Role: Author

Google Brain, Paris

Bruno Scherrer

Role: Author
PersonId : 1406
IdHAL : bruno-scherrer
IdRef : 073360708

Biology, genetics and statistics

Institut Élie Cartan de Lorraine

Olivier Pietquin

Role: Author
PersonId : 4024
IdHAL : olivier-pietquin
ORCID : 0000-0002-5386-465X
IdRef : 142821861

Google Brain, Paris

Matthieu Geist

Role: Author
PersonId : 6945
IdHAL : matthieu-geist

Google Brain, Paris

Abstract

We adapt the concept of momentum from optimization to reinforcement learning. Viewing state-action value functions as analogous to gradients in optimization, we interpret momentum as an average of consecutive q-functions. We derive Momentum Value Iteration (MoVI), a variation of Value Iteration that incorporates this momentum idea. Our analysis shows that this allows MoVI to average errors over successive iterations. We show that the proposed approach can be readily extended to deep learning. Specifically, we propose a simple improvement on DQN based on MoVI and evaluate it on Atari games.
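The idea in the abstract can be illustrated with a minimal tabular sketch. It is not taken from the paper: the function name `momentum_value_iteration`, the array layout of `P` and `r`, the averaging weight `beta = (k - 1) / k`, and the toy two-state MDP are all assumptions made for illustration. The sketch keeps a running average `h` of the successive q-functions, acts greedily with respect to `h`, and updates `q` with a one-step evaluation of that greedy policy.

```python
import numpy as np


def momentum_value_iteration(P, r, gamma=0.9, n_iters=200):
    """Tabular sketch of value iteration with averaged q-functions.

    P: transition probabilities, shape (S, A, S) (assumed layout).
    r: rewards, shape (S, A) (assumed layout).
    Returns the averaged q-function h, the last q, and the greedy policy.
    """
    n_states, n_actions = r.shape
    q = np.zeros((n_states, n_actions))
    h = np.zeros((n_states, n_actions))  # running average of q-functions

    for k in range(1, n_iters + 1):
        # Greedy policy with respect to the averaged q-function.
        pi = h.argmax(axis=1)
        # One-step evaluation of pi:
        # q(s, a) = r(s, a) + gamma * sum_s' P(s' | s, a) * q(s', pi(s')).
        q_under_pi = q[np.arange(n_states), pi]
        q = r + gamma * (P @ q_under_pi)
        # Assumed weight: beta = (k - 1) / k makes h the plain average
        # of q_1, ..., q_k.
        beta = (k - 1) / k
        h = beta * h + (1.0 - beta) * q

    return h, q, h.argmax(axis=1)


# Hypothetical two-state MDP: action 0 stays, action 1 switches state,
# and a reward of 1 is given for landing in state 1.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = P[0, 1, 1] = P[1, 0, 1] = P[1, 1, 0] = 1.0
r = np.array([[0.0, 1.0], [1.0, 0.0]])

h, q, policy = momentum_value_iteration(P, r)
print(policy)
```

Averaging with `beta = (k - 1) / k` is only one possible choice; a constant `beta` would give an exponential moving average instead, which weights recent q-functions more heavily.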

Domains

Machine Learning [cs.LG]

Main file

vieillard20a-supp.pdf (5.38 MB)

Origin: Files produced by the author(s)

Contributor: Bruno Scherrer

https://inria.hal.science/hal-03137343

Submitted on: Wednesday, February 10, 2021, 13:37:57

Last modified on: Thursday, February 1, 2024, 10:06:00

Long-term archiving on: Tuesday, May 11, 2021, 18:36:26

Dates and versions

hal-03137343 , version 1 (10-02-2021)

Identifiers

HAL Id: hal-03137343, version 1

Cite

Nino Vieillard, Bruno Scherrer, Olivier Pietquin, Matthieu Geist. Momentum in Reinforcement Learning. AISTATS 2020 - 23rd International Conference on Artificial Intelligence and Statistics, Aug 2020, Palermo / Virtual, Italy. ⟨hal-03137343⟩

Collections

UNIV-RENNES1 CNRS INRIA IRISA IECN UNIV-LORRAINE INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC IECLPS UNIV-RENNES UR1-MATH-NUM

54 views

97 downloads
