Residual Reinforcement Learning from Demonstrations

Minttu Alakuijala; Gabriel Dulac-Arnold; Julien Mairal; Jean Ponce; Cordelia Schmid

Preprint, Working Paper. Year: 2021

Minttu Alakuijala

Role: Author
PersonId : 1090798
IdHAL : minttu-alakuijala

Models of visual object recognition and scene understanding

Apprentissage de modèles à partir de données massives

Google Research [Paris]

Gabriel Dulac-Arnold

Role: Author

Google Research [Paris]

Julien Mairal

Role: Author
PersonId : 1034832
ORCID : 0000-0001-6991-2110
IdRef : 152125256

Apprentissage de modèles à partir de données massives

Jean Ponce

Role: Author

Models of visual object recognition and scene understanding

Cordelia Schmid

Role: Author

Google Research [Paris]

Abstract

Residual reinforcement learning (RL) has been proposed as a way to solve challenging robotic tasks by adapting control actions from a conventional feedback controller to maximize a reward signal. We extend the residual formulation to learn from visual inputs and sparse rewards using demonstrations. Learning from images, proprioceptive inputs and a sparse task-completion reward relaxes the requirement of accessing full state features, such as object and target positions. In addition, replacing the base controller with a policy learned from demonstrations removes the dependency on a hand-engineered controller in favour of a dataset of demonstrations, which can be provided by non-experts. Our experimental evaluation on simulated manipulation tasks on a 6-DoF UR5 arm and a 28-DoF dexterous hand demonstrates that residual RL from demonstrations is able to generalize to unseen environment conditions more flexibly than either behavioral cloning or RL fine-tuning, and is capable of solving high-dimensional, sparse-reward tasks out of reach for RL from scratch.
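
The residual idea described in the abstract can be illustrated with a minimal sketch. It assumes the additive form commonly used in residual RL, where the executed action is the base policy's action plus a learned correction; the abstract itself does not state how the two outputs are combined. The names `make_residual_policy`, `toy_base` and `toy_residual`, the toy policies, and the clipping range are hypothetical and chosen only for illustration. In the paper's setting the base policy would be learned from demonstrations and the correction trained with RL on the sparse reward.

```python
from typing import Callable, List, Sequence

Observation = Sequence[float]
Action = List[float]
Policy = Callable[[Observation], Action]


def make_residual_policy(base_policy: Policy, residual_policy: Policy,
                         low: float = -1.0, high: float = 1.0) -> Policy:
    """Compose a base policy with an additive correction (assumed form)."""

    def act(obs: Observation) -> Action:
        base_action = base_policy(obs)     # e.g. a policy cloned from demonstrations
        correction = residual_policy(obs)  # e.g. a network trained with RL
        if len(base_action) != len(correction):
            raise ValueError("base and residual actions must have the same size")
        # Sum the two actions, then clip to the assumed actuator limits.
        return [min(high, max(low, b + r))
                for b, r in zip(base_action, correction)]

    return act


# Toy stand-ins for the two policies (hypothetical, for illustration only).
def toy_base(obs: Observation) -> Action:
    return [0.5 * x for x in obs]


def toy_residual(obs: Observation) -> Action:
    return [0.25 for _ in obs]


policy = make_residual_policy(toy_base, toy_residual)
action = policy([1.0, -1.0, 4.0])
print(action)  # [0.75, -0.25, 1.0]
```

The third component is clipped because 2.0 + 0.25 exceeds the assumed upper limit of 1.0. With a correction of zero the composed policy reproduces the base policy, so training the correction can start from the behavior that the demonstrations already provide.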

Domains

Machine Learning [cs.LG]

Main file

RRLfD.pdf (2.28 MB)

Origin: Files produced by the author(s)

Contributor: Minttu Alakuijala

https://inria.hal.science/hal-03260683

Submitted on: Tuesday, June 15, 2021, 10:18:21

Last modified on: Friday, April 19, 2024, 16:18:58

Long-term archiving on: Thursday, September 16, 2021, 18:22:46

Dates and versions

hal-03260683 , version 1 (15-06-2021)

Identifiers

HAL Id : hal-03260683 , version 1
ARXIV : 2106.08050

Cite

Minttu Alakuijala, Gabriel Dulac-Arnold, Julien Mairal, Jean Ponce, Cordelia Schmid. Residual Reinforcement Learning from Demonstrations. 2021. ⟨hal-03260683⟩

Collections

ENS-PARIS UNIV-RENNES1 UGA CNRS INRIA IRISA LJK LJK_GI INRIA2 LJK-GI-THOTH PSL UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES MIAI ANR PRAIRIE-IA UR1-MATH-NUM

129 views

276 downloads
