Conference papers

Online Multimodal Speaker Detection for Humanoid Robots

Jordi Sanchez-Riera 1 Xavier Alameda-Pineda 1, * Johannes Wienke 2 Antoine Deleforge 1 Soraya Arias 3 Jan Cech 1 Sebastian Wrede 4 Radu Horaud 1 
* Corresponding author
1 PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology
Abstract : In this paper we address the problem of audio-visual speaker detection. We introduce an online system working on the humanoid robot NAO. The scene is perceived with two cameras and two microphones. A multimodal Gaussian mixture model (mGMM) fuses the information extracted from the auditory and visual sensors and detects the most probable audio-visual object, e.g., a person emitting a sound, in 3D space. The system is implemented on top of a platform-independent middleware and is able to process the information online (17 Hz). A detailed description of the system and its implementation is provided, with special emphasis on the online processing issues and the proposed solutions. Experimental validation, performed with five different scenarios, shows that the proposed method opens the door to robust human-robot interaction scenarios.

Cited literature [17 references]
Contributor : Perception team
Submitted on : Sunday, December 23, 2012 - 7:44:32 PM
Last modification on : Thursday, May 5, 2022 - 3:11:27 AM
Long-term archiving on : Sunday, March 24, 2013 - 3:51:06 AM






Jordi Sanchez-Riera, Xavier Alameda-Pineda, Johannes Wienke, Antoine Deleforge, Soraya Arias, et al. Online Multimodal Speaker Detection for Humanoid Robots. Humanoids 2012 - IEEE International Conference on Humanoid Robotics, Nov 2012, Osaka, Japan. pp.126-133, ⟨10.1109/HUMANOIDS.2012.6651509⟩. ⟨hal-00768764⟩


