Online Multimodal Speaker Detection for Humanoid Robots

Abstract: In this paper we address the problem of audio-visual speaker detection. We introduce an online system running on the humanoid robot NAO. The scene is perceived with two cameras and two microphones. A multimodal Gaussian mixture model (mGMM) fuses the information extracted from the auditory and visual sensors and detects the most probable audio-visual object, e.g., a person emitting a sound, in 3D space. The system is implemented on top of a platform-independent middleware and is able to process the information online (17 Hz). A detailed description of the system and its implementation is provided, with special emphasis on the online processing issues and the proposed solutions. Experimental validation, performed with five different scenarios, shows that the proposed method opens the door to robust human-robot interaction scenarios.
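The abstract describes fusing auditory and visual observations with a Gaussian mixture and selecting the most probable audio-visual object in 3D space. The sketch below is a hypothetical illustration of that idea, not the paper's actual implementation: candidate 3D positions (e.g., detected faces) are scored by a weighted sum of isotropic Gaussian likelihoods over visual and audio observations, and the highest-scoring candidate is returned. All function names, weights, and variances are illustrative assumptions.

```python
import math

# Hypothetical sketch (not the paper's code): fuse audio and visual
# 3D observations with a weighted Gaussian mixture and pick the
# candidate position with the highest combined likelihood.

def gauss3d(x, mu, sigma):
    """Isotropic 3D Gaussian density evaluated at x, centered at mu."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, mu))
    norm = (2.0 * math.pi * sigma ** 2) ** 1.5
    return math.exp(-d2 / (2.0 * sigma ** 2)) / norm

def av_score(candidate, visual_obs, audio_obs,
             sigma_v=0.1, sigma_a=0.3, w_v=0.5, w_a=0.5):
    """Mixture score: weighted sum of per-observation likelihoods.

    Audio localization is assumed noisier than vision, hence the
    larger sigma_a (illustrative values, not from the paper)."""
    s_v = sum(gauss3d(candidate, o, sigma_v) for o in visual_obs)
    s_a = sum(gauss3d(candidate, o, sigma_a) for o in audio_obs)
    return w_v * s_v + w_a * s_a

def most_probable_speaker(candidates, visual_obs, audio_obs):
    """Return the candidate 3D position with the highest AV score."""
    return max(candidates, key=lambda c: av_score(c, visual_obs, audio_obs))

# Example: two detected faces; the sound source lies near the second one,
# so the fused score picks the second face as the active speaker.
faces = [(0.0, 0.0, 1.5), (1.0, 0.0, 1.5)]
sounds = [(0.9, 0.1, 1.4)]
print(most_probable_speaker(faces, faces, sounds))  # -> (1.0, 0.0, 1.5)
```

In an online setting such as the one described, this scoring step would run once per synchronized audio-visual frame, which is consistent with the reported 17 Hz processing rate.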
Document type:
Conference paper
Humanoids 2012 - IEEE International Conference on Humanoid Robotics, Nov 2012, Osaka, Japan. IEEE, pp.126-133, 2012, 〈10.1109/HUMANOIDS.2012.6651509〉

Cited literature [17 references]

https://hal.inria.fr/hal-00768764
Contributor: Team Perception
Submitted on: Sunday, December 23, 2012 - 19:44:32
Last modified on: Wednesday, April 11, 2018 - 01:59:12
Long-term archiving on: Sunday, March 24, 2013 - 03:51:06

File

Sanchez-Humanoids2012.pdf
Files produced by the author(s)

Citation

Jordi Sanchez-Riera, Xavier Alameda-Pineda, Johannes Wienke, Antoine Deleforge, Soraya Arias, et al.. Online Multimodal Speaker Detection for Humanoid Robots. Humanoids 2012 - IEEE International Conference on Humanoid Robotics, Nov 2012, Osaka, Japan. IEEE, pp.126-133, 2012, 〈10.1109/HUMANOIDS.2012.6651509〉. 〈hal-00768764〉

Metrics

Record views: 672
File downloads: 308