Neural Network Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction

Stéphane Lathuilière¹, Benoît Massé¹, Pablo Mesejo¹, Radu Horaud¹
¹ PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract: This paper introduces a novel neural network-based reinforcement learning approach for robot gaze control. Our approach enables a robot to learn and adapt its gaze control strategy for human-robot interaction without the use of external sensors or human supervision. The robot learns to focus its attention on groups of people from its own audio-visual experiences, independently of the number of people in the environment, their positions, and their physical appearance. In particular, we use recurrent neural networks and Q-learning to find an optimal action-selection policy, and we pretrain on a synthetic environment that simulates sound sources and moving participants, which avoids the need to interact with people for hours. Our experimental evaluation suggests that the proposed method is robust to the parameter configuration (i.e., the choice of parameter values does not have a decisive impact on performance). The best results are obtained when audio and video information are used jointly and when a late fusion strategy is employed (i.e., when the two sources of information are processed separately and then fused). Successful experiments in a real environment with the Nao robot indicate that our framework is a step forward towards the autonomous learning of a perceivable and socially acceptable gaze behavior.
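As a rough illustration of the Q-learning ingredient named in the abstract, the following minimal sketch shows epsilon-greedy action selection and the standard one-step Q-learning target. It is not the authors' implementation: the action names, the function names, and the use of a plain list of Q-values in place of the recurrent network's output are all assumptions made for this example.

```python
import random

# Hypothetical discrete gaze actions; the abstract does not list the action set.
ACTIONS = ["stay", "left", "right", "up", "down"]


def epsilon_greedy(q_values, epsilon, rng):
    """Return a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, gamma, terminal=False):
    """Standard Q-learning target: r + gamma * max_a' Q(s', a')."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_values)


def q_update(q_values, action, target, lr):
    """Move Q(s, action) a fraction lr toward the target; returns a new list."""
    updated = list(q_values)
    updated[action] += lr * (target - updated[action])
    return updated


if __name__ == "__main__":
    rng = random.Random(0)
    q_s = [0.0] * len(ACTIONS)          # stand-in for the network's Q(s, .)
    q_next = [0.2, 0.5, 0.1, 0.0, 0.3]  # stand-in for Q(s', .)
    a = epsilon_greedy(q_s, epsilon=0.1, rng=rng)
    target = td_target(reward=1.0, next_q_values=q_next, gamma=0.9)
    q_s = q_update(q_s, a, target, lr=0.5)
    print(ACTIONS[a], q_s)
```

In a neural-network setting, the list returned by `q_update` would be replaced by a gradient step that reduces the squared difference between the network's prediction and `target`.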
Contributor: Team Perception
Submitted on: Tuesday, November 21, 2017 - 16:15:47
Last modified on: Tuesday, April 17, 2018 - 09:05:23


Files produced by the author(s)


  • HAL Id: hal-01643775, version 1
  • arXiv: 1711.06834


Stéphane Lathuilière, Benoît Massé, Pablo Mesejo, Radu Horaud. Neural Network Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction. 14 pages. 2017. 〈hal-01643775〉


