Inria - Institut national de recherche en sciences et technologies du numérique
Preprint, Working Paper. Year: 2017

Neural Network Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction

Abstract

This paper introduces a novel neural network-based reinforcement learning approach for robot gaze control. Our approach enables a robot to learn and adapt its gaze control strategy for human-robot interaction without the use of external sensors or human supervision. The robot learns to focus its attention on groups of people from its own audio-visual experiences, independently of the number of people in the environment, their positions, and their physical appearance. In particular, we use recurrent neural networks and Q-learning to find an optimal action-selection policy, and we pretrain on a synthetic environment that simulates sound sources and moving participants, which avoids the need to interact with people for hours. Our experimental evaluation suggests that the proposed method is robust to parameter configuration (i.e. the choice of parameter values does not have a decisive impact on performance). The best results are obtained when audio and video information are used jointly and a late fusion strategy is employed (i.e. both sources of information are processed separately and then fused). Successful experiments in a real environment with the Nao robot indicate that our framework is a step towards the autonomous learning of a perceivable and socially acceptable gaze behavior.
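The Q-learning component mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper approximates the action-value function with recurrent neural networks, whereas the sketch below uses a plain lookup table for brevity. The action names, state labels, and the values of `alpha`, `gamma`, and `epsilon` are assumptions chosen only for illustration.

```python
import random

# Hypothetical discrete gaze actions (illustrative, not taken from the paper).
ACTIONS = ["stay", "left", "right", "up", "down"]


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the updated value.

    Target: reward + gamma * max over a' of Q(next_state, a').
    """
    target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]


# Toy table with two hypothetical states, one value per action.
q_table = {
    "s0": [0.0] * len(ACTIONS),
    "s1": [0.0, 1.0, 0.0, 0.0, 0.0],
}

# One update: in state s0 the robot looks right, gets reward 1.0, lands in s1.
new_value = q_update(q_table, "s0", ACTIONS.index("right"), 1.0, "s1")

# With epsilon = 0 the policy is purely greedy.
greedy_action = ACTIONS[epsilon_greedy(q_table["s1"], epsilon=0.0)]
```

In a neural variant, the table lookup would be replaced by a network that maps the observation history to one value per action, and the same target would serve as the regression label.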
Main file
Lathuiliere-arxiv-v1.pdf (3.28 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01643775, version 1 (21-11-2017)
hal-01643775, version 2 (25-04-2018)

Cite

Stéphane Lathuilière, Benoît Massé, Pablo Mesejo, Radu Horaud. Neural Network Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction. 2017. ⟨hal-01643775v1⟩
682 views
391 downloads
