Deep Reinforcement Learning for Audio-Visual Gaze Control

Stéphane Lathuilière 1 Benoit Massé 1 Pablo Mesejo 1 Radu Horaud 1
1 PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract: We address the problem of audiovisual gaze control in the specific context of human-robot interaction, namely how controlled robot motions are combined with visual and acoustic observations in order to direct the robot head towards targets of interest. The paper makes the following contributions: (i) a novel audiovisual fusion framework that is well suited for controlling the gaze of a robotic head; (ii) a reinforcement learning (RL) formulation for the gaze control problem, using a reward function based on the available temporal sequence of camera and microphone observations; and (iii) several deep architectures that make it possible to experiment with early and late fusion of audio and visual data. We introduce a simulated environment that enables us to learn the proposed deep RL model without the need to spend hours on tedious interaction. By thoroughly experimenting on a publicly available dataset and on a real robot, we provide empirical evidence that our method achieves state-of-the-art performance.
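The distinction between early and late fusion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the feature sizes, the single tanh layer, the random weights, the use of per-action scores with a greedy choice, and all names (`early_fusion_scores`, `late_fusion_scores`, `N_ACTIONS`) are assumptions made purely for illustration.

```python
import numpy as np

N_ACTIONS = 5  # hypothetical number of discrete gaze actions (assumption)


def early_fusion_scores(visual, audio, w_hidden, w_out):
    """Early fusion: concatenate raw features, then process them jointly."""
    joint = np.concatenate([visual, audio])
    hidden = np.tanh(w_hidden @ joint)
    return w_out @ hidden


def late_fusion_scores(visual, audio, w_visual, w_audio, w_out):
    """Late fusion: process each modality in its own branch, then merge."""
    h_visual = np.tanh(w_visual @ visual)
    h_audio = np.tanh(w_audio @ audio)
    return w_out @ np.concatenate([h_visual, h_audio])


# Toy inputs and random weights; all shapes are illustrative assumptions.
rng = np.random.default_rng(0)
visual = rng.normal(size=8)  # stand-in for a visual feature vector
audio = rng.normal(size=4)   # stand-in for an audio feature vector

scores_early = early_fusion_scores(
    visual, audio,
    rng.normal(size=(16, 12)), rng.normal(size=(N_ACTIONS, 16)),
)
scores_late = late_fusion_scores(
    visual, audio,
    rng.normal(size=(8, 8)), rng.normal(size=(8, 4)),
    rng.normal(size=(N_ACTIONS, 16)),
)

# Pick the action with the highest score under the early-fusion model.
greedy_action = int(np.argmax(scores_early))
```

In this sketch, the only difference between the two variants is where the modalities meet: before any learned transformation (early) or after each has passed through its own branch (late).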
Document type:
Conference paper
IROS 2018 - IEEE/RSJ International Conference on Intelligent Robots and Systems, Oct 2018, Madrid, Spain. IEEE, pp.1555-1562, 〈10.1109/IROS.2018.8594327〉

https://hal.inria.fr/hal-01851738
Contributor: Team Perception
Submitted on: Monday, July 30, 2018 - 17:44:41
Last modified on: Thursday, February 7, 2019 - 16:22:00
Document(s) archived on: Wednesday, October 31, 2018 - 14:32:09

File

main.pdf
Files produced by the author(s)

Citation

Stéphane Lathuilière, Benoit Massé, Pablo Mesejo, Radu Horaud. Deep Reinforcement Learning for Audio-Visual Gaze Control. IROS 2018 - IEEE/RSJ International Conference on Intelligent Robots and Systems, Oct 2018, Madrid, Spain. IEEE, pp.1555-1562, 〈10.1109/IROS.2018.8594327〉. 〈hal-01851738〉

Metrics

Record views: 304

File downloads: 129