
Towards Audio-Visual On-line Diarization Of Participants In Group Meetings

Abstract: We propose a fully automated, unsupervised, and non-intrusive method of identifying the current speaker audio-visually in a group conversation. This is achieved without specialized hardware, user interaction, or prior assignment of microphones to participants. Speakers are identified acoustically using a novel on-line speaker diarization approach. The output is then used to find the corresponding person in a four-camera video stream by approximating individual activity with computationally efficient features. We present results showing the robustness of the association on over 4.5 hours of non-scripted audio-visual meeting data.
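The paper itself does not publish code, but the association step it describes — matching acoustically diarized speakers to people in the video by comparing activity patterns — can be sketched as follows. This is an illustrative assumption, not the authors' implementation: here both the speaking activity (from diarization) and the per-participant visual activity are reduced to toy per-frame time series, and each speaker is assigned the participant whose visual activity correlates best.

```python
# Hedged sketch (not the authors' code): associate diarized speakers with
# participants in the video by correlating speaking activity with visual
# activity. Function names and the toy data are illustrative assumptions.

def associate(speaker_activity, visual_activity):
    """For each speaker, pick the participant whose visual-activity time
    series has the highest Pearson correlation with the speaker's
    speaking/non-speaking pattern."""
    def correlation(a, b):
        n = len(a)
        mean_a, mean_b = sum(a) / n, sum(b) / n
        cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
        var_a = sum((x - mean_a) ** 2 for x in a)
        var_b = sum((y - mean_b) ** 2 for y in b)
        denom = (var_a * var_b) ** 0.5
        return cov / denom if denom else 0.0

    return {
        spk: max(visual_activity, key=lambda p: correlation(sa, visual_activity[p]))
        for spk, sa in speaker_activity.items()
    }

# Toy example: two diarized speakers, two camera regions,
# binary activity per frame.
speakers = {"spk0": [1, 1, 0, 0, 1, 0], "spk1": [0, 0, 1, 1, 0, 1]}
videos   = {"camA": [1, 1, 0, 0, 1, 0], "camB": [0, 0, 1, 1, 0, 1]}
print(associate(speakers, videos))  # {'spk0': 'camA', 'spk1': 'camB'}
```

In the paper's setting the visual side would come from "computationally efficient features" per participant (e.g., motion-based activity measures), but the matching principle is the same: speaker-to-person assignment by agreement of activity over time.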
Document type: Conference papers

Cited literature: 20 references

Contributor: Peter Sturm
Submitted on: Sunday, October 5, 2008 - 1:47:28 PM
Last modification on: Thursday, February 7, 2019 - 5:55:47 PM
Long-term archiving on: Thursday, June 3, 2010 - 10:20:29 PM




HAL Id: inria-00326746, version 1



Hayley Hung, Gerald Friedland. Towards Audio-Visual On-line Diarization Of Participants In Group Meetings. Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications - M2SFA2 2008, Andrea Cavallaro and Hamid Aghajan, Oct 2008, Marseille, France. ⟨inria-00326746⟩


