MERL: Multi-Head Reinforcement Learning

Yannis Flet-Berliac; Philippe Preux

Conference paper. Year: 2019


Yannis Flet-Berliac

Role: Author
PersonId: 174111
IdHAL: yannis-flet-berliac
ORCID: 0000-0002-1191-0048

Sequential Learning

Philippe Preux

Role: Author
PersonId: 5488
IdHAL: preux-philippe
IdRef: 059896353

Sequential Learning

Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189

Abstract

A common challenge in reinforcement learning is how to convert the agent's interactions with an environment into fast and robust learning. For instance, earlier work makes use of domain knowledge to improve existing reinforcement learning algorithms in complex tasks. While promising, previously acquired knowledge is often costly and challenging to scale up. Instead, we consider problem knowledge in the form of signals from quantities relevant to solving any task, e.g., self-performance assessment and accurate expectations. $\mathcal{V}^{ex}$ is such a quantity: it is the fraction of variance explained by the value function $V$ and measures the discrepancy between $V$ and the returns. Taking advantage of $\mathcal{V}^{ex}$, we propose MERL, a general framework for structuring reinforcement learning by injecting problem knowledge into policy gradient updates. As a result, the agent is not only optimized for a reward but also learns from problem-focused quantities provided by MERL, applicable out-of-the-box to any task. In this paper: (a) we introduce and define MERL, the multi-head reinforcement learning framework we use throughout this work; (b) we conduct experiments across a variety of standard benchmark environments, including 9 continuous control tasks, where results show improved performance; (c) we demonstrate that MERL also improves transfer learning on a set of challenging pixel-based tasks; (d) we discuss how MERL tackles the problem of reward sparsity and better conditions the feature space of reinforcement learning agents.
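As a rough illustration of the quantity $\mathcal{V}^{ex}$ described in the abstract, the sketch below computes a fraction of variance explained from value predictions and observed returns. It assumes the usual coefficient-of-determination form, one minus the ratio of squared residuals to the total squared deviation of the returns. The abstract does not spell out the formula, so this form, the function name `explained_variance`, and the choice to return NaN for constant returns are assumptions made here, not the authors' implementation.

```python
from statistics import fmean


def explained_variance(values, returns):
    """Fraction of the variance of `returns` explained by `values`.

    Assumed form (not taken from the paper): 1 - SS_res / SS_tot, where
    SS_res = sum of (return - value)^2 and
    SS_tot = sum of (return - mean(returns))^2.
    """
    if len(values) != len(returns):
        raise ValueError("values and returns must have the same length")
    if not returns:
        raise ValueError("at least one sample is required")

    mean_return = fmean(returns)
    ss_tot = sum((r - mean_return) ** 2 for r in returns)
    if ss_tot == 0.0:
        # Constant returns: the ratio is undefined (illustrative choice).
        return float("nan")

    ss_res = sum((r - v) ** 2 for r, v in zip(returns, values))
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    returns = [1.0, 2.0, 3.0]
    print(explained_variance([1.0, 2.0, 3.0], returns))  # perfect predictions
    print(explained_variance([2.0, 2.0, 2.0], returns))  # predicts the mean return
    print(explained_variance([3.0, 2.0, 1.0], returns))  # worse than the mean
```

Under this assumed form, a value of 1 means the predictions match the returns exactly, 0 means they do no better than predicting the mean return, and negative values mean they do worse than that baseline.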

Keywords

policy gradient; reinforcement learning; auxiliary tasks

Domains

Machine Learning [cs.LG]; Machine Learning [stat.ML]; Artificial Intelligence [cs.AI]

Main file

main.pdf (1.44 MB)

images/ablation/HalfCheetah.pdf (48.34 KB)

images/ablation/Swimmer.pdf (47.82 KB)

images/ablation/Walker2d.pdf (57.77 KB)

images/atari/1.png (352.7 KB)

images/atari/2.png (449.25 KB)

images/atari/3.png (649.92 KB)

images/atari/4.png (464.28 KB)

images/atari/5.png (1.18 MB)

images/atari/6.png (401.85 KB)

images/overview.pdf (89.83 KB)

images/ppo/Ant-v2.pdf (30.89 KB)

images/ppo/HalfCheetah-v2.pdf (30.19 KB)

images/ppo/Hopper-v2.pdf (33.12 KB)

images/ppo/Humanoid-v2.pdf (33.34 KB)

images/ppo/InvertedDoublePendulum-v2.pdf (36.65 KB)

images/ppo/InvertedPendulum-v2.pdf (34.6 KB)

images/ppo/Reacher-v2.pdf (30.27 KB)

images/ppo/Swimmer-v2.pdf (29.59 KB)

images/ppo/Walker2d-v2.pdf (33.59 KB)

images/transfer/Asterix-Enduro.pdf (36.94 KB)

images/transfer/Asterix-MsPacman.pdf (48.91 KB)

images/transfer/BeamRider-Enduro.pdf (38.74 KB)

images/transfer/BeamRider-MsPacman.pdf (52.07 KB)

images/transfer/CrazyClimber-Asterix.pdf (47.17 KB)

images/transfer/CrazyClimber-BeamRider.pdf (48.4 KB)

images/transfer/CrazyClimber-Enduro.pdf (43.69 KB)

images/transfer/CrazyClimber-MsPacman.pdf (51.58 KB)

images/transfer/CrazyClimber-VideoPinball.pdf (49.68 KB)

images/transfer/Enduro-MsPacman.pdf (51.14 KB)

images/transfer/MsPacman-Asterix.pdf (47.21 KB)

images/transfer/MsPacman-BeamRider.pdf (48.47 KB)

images/transfer/MsPacman-CrazyClimber.pdf (47.82 KB)

images/transfer/MsPacman-Enduro.pdf (40.1 KB)

images/transfer/MsPacman-VideoPinball.pdf (50.08 KB)

images/transfer/VideoPinball-Asterix.pdf (45.09 KB)

images/transfer/VideoPinball-BeamRider.pdf (48.1 KB)

images/transfer/VideoPinball-CrazyClimber.pdf (47.27 KB)

images/transfer/VideoPinball-Enduro.pdf (40.72 KB)

images/transfer/VideoPinball-MsPacman.pdf (51.31 KB)

Origin: Files produced by the author(s)

Contributor: Yannis Flet-Berliac

https://inria.hal.science/hal-02305105

Submitted on: Friday, November 29, 2019, 15:28:14

Last modified on: Wednesday, January 24, 2024, 09:54:23

Dates and versions

hal-02305105, version 1 (03-10-2019)

hal-02305105, version 2 (13-10-2019)

hal-02305105, version 3 (29-11-2019)

Identifiers

HAL Id: hal-02305105, version 3
arXiv: 1909.11939

Cite

Yannis Flet-Berliac, Philippe Preux. MERL: Multi-Head Reinforcement Learning. Deep Reinforcement Learning Workshop, NeurIPS, Dec 2019, Vancouver, Canada. ⟨hal-02305105v3⟩


Collections

CNRS INRIA CRISTAL INRIA2 CRISTAL-SEQUEL UNIV-LILLE

278 views

1089 downloads
