Residual Reinforcement Learning from Demonstrations

Minttu Alakuijala (1, 2, 3), Gabriel Dulac-Arnold (3), Julien Mairal (2), Jean Ponce (1), Cordelia Schmid (3)
(1) WILLOW - Models of visual object recognition and scene understanding
DI-ENS - Département d'informatique - ENS Paris, Inria de Paris
(2) Thoth - Apprentissage de modèles à partir de données massives
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann
Abstract : Residual reinforcement learning (RL) has been proposed as a way to solve challenging robotic tasks by adapting control actions from a conventional feedback controller to maximize a reward signal. We extend the residual formulation to learn from visual inputs and sparse rewards using demonstrations. Learning from images, proprioceptive inputs and a sparse task-completion reward relaxes the requirement of accessing full state features, such as object and target positions. In addition, replacing the base controller with a policy learned from demonstrations removes the dependency on a hand-engineered controller in favour of a dataset of demonstrations, which can be provided by non-experts. Our experimental evaluation on simulated manipulation tasks on a 6-DoF UR5 arm and a 28-DoF dexterous hand demonstrates that residual RL from demonstrations is able to generalize to unseen environment conditions more flexibly than either behavioral cloning or RL fine-tuning, and is capable of solving high-dimensional, sparse-reward tasks out of reach for RL from scratch.
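The residual formulation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the additive composition common in the residual RL literature, where the executed action is the base policy's action plus a learned correction, clipped to the actuator limits. The abstract itself does not spell out the composition, so the additive form, the clipping bounds, the `residual_scale` parameter and all function names below are illustrative assumptions, not the authors' implementation. In the paper's setting the base policy would be learned from demonstrations and the residual trained with RL on a sparse reward; here both are hypothetical stand-in functions.

```python
from typing import Callable, List, Sequence

# Hypothetical type aliases: a policy maps an observation to an action vector.
Action = List[float]
Policy = Callable[[Sequence[float]], Action]


def residual_action(
    base_policy: Policy,
    residual_policy: Policy,
    observation: Sequence[float],
    low: float = -1.0,
    high: float = 1.0,
    residual_scale: float = 1.0,
) -> Action:
    """Combine a base action with a learned correction (assumed additive form)."""
    base = base_policy(observation)
    residual = residual_policy(observation)
    if len(base) != len(residual):
        raise ValueError("base and residual actions must have the same dimension")
    # Sum the two actions per dimension, then clip to the assumed action bounds.
    return [
        min(high, max(low, b + residual_scale * r))
        for b, r in zip(base, residual)
    ]


# Stand-in policies for illustration only (not from the paper).
def base_policy(obs: Sequence[float]) -> Action:
    # Pretend this was obtained by behavioral cloning on demonstrations.
    return [0.5 * x for x in obs]


def zero_residual(obs: Sequence[float]) -> Action:
    # An untrained residual that leaves the base action unchanged.
    return [0.0 for _ in obs]


def small_residual(obs: Sequence[float]) -> Action:
    # A constant correction, standing in for an RL-trained residual.
    return [0.1 for _ in obs]


if __name__ == "__main__":
    obs = [0.2, -0.4]
    print(residual_action(base_policy, zero_residual, obs))
    print(residual_action(base_policy, small_residual, obs))
```

With a residual that outputs zeros, the combined policy reproduces the base policy exactly. This is one reason the formulation is attractive: learning can start from the demonstrated behavior rather than from scratch.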
Document type :
Preprints, Working Papers, ...

https://hal.inria.fr/hal-03260683
Contributor : Minttu Alakuijala
Submitted on : Tuesday, June 15, 2021 - 10:18:21 AM
Last modification on : Wednesday, June 8, 2022 - 12:50:06 PM
Long-term archiving on : Thursday, September 16, 2021 - 6:22:46 PM

Files

RRLfD.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-03260683, version 1
  • ARXIV : 2106.08050

Citation

Minttu Alakuijala, Gabriel Dulac-Arnold, Julien Mairal, Jean Ponce, Cordelia Schmid. Residual Reinforcement Learning from Demonstrations. 2021. ⟨hal-03260683⟩
