
Towards Audio-Visual On-line Diarization Of Participants In Group Meetings

Abstract: We propose a fully automated, unsupervised, and non-intrusive method of identifying the current speaker audio-visually in a group conversation. This is achieved without specialized hardware, user interaction, or prior assignment of microphones to participants. Speakers are identified acoustically using a novel on-line speaker diarization approach. The output is then used to find the corresponding person in a four-camera video stream by approximating individual activity with computationally efficient features. We present results showing the robustness of the association on over 4.5 hours of non-scripted audio-visual meeting data.
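The abstract describes a two-stage pipeline: an on-line acoustic diarizer segments the audio into per-speaker speaking times, and each acoustic speaker is then matched to the video participant whose visual activity best tracks those speaking times. A minimal sketch of the association step, assuming per-frame speaking/silence timelines from the diarizer and per-camera motion-energy traces as the "computationally efficient features" (all function names and the toy data are illustrative assumptions, not the authors' implementation):

```python
# Hedged sketch: associate each acoustically-detected speaker with the
# video participant whose visual-activity trace correlates best with
# that speaker's speaking timeline. Pure-Python, no dependencies.

def correlate(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / (va * vb) ** 0.5

def associate(speech, activity):
    """Map each speaker id to the participant (camera view) whose
    visual-activity trace is most correlated with the speaker's
    speaking/silence timeline."""
    return {
        spk: max(activity, key=lambda p: correlate(timeline, activity[p]))
        for spk, timeline in speech.items()
    }

# Toy example: two speakers, two camera views, eight time frames.
speech = {
    "spk0": [1, 1, 0, 0, 1, 1, 0, 0],   # 1 = speaking, 0 = silent
    "spk1": [0, 0, 1, 1, 0, 0, 1, 1],
}
activity = {
    "cam_A": [0.9, 0.8, 0.1, 0.2, 0.7, 0.9, 0.1, 0.0],  # motion energy
    "cam_B": [0.1, 0.2, 0.8, 0.9, 0.0, 0.1, 0.9, 0.8],
}
print(associate(speech, activity))  # {'spk0': 'cam_A', 'spk1': 'cam_B'}
```

In an on-line setting the correlations would be computed over a sliding window rather than the whole recording, so the mapping can adapt as the meeting progresses.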
Document type :
Conference papers

Cited literature: [20 references]
Contributor: Peter Sturm
Submitted on: Sunday, October 5, 2008 - 1:47:28 PM
Last modification on: Thursday, February 7, 2019 - 5:55:47 PM
Long-term archiving on: Thursday, June 3, 2010 - 10:20:29 PM


Files produced by the author(s)


  • HAL Id: inria-00326746, version 1



Hayley Hung, Gerald Friedland. Towards Audio-Visual On-line Diarization Of Participants In Group Meetings. Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications - M2SFA2 2008, Andrea Cavallaro and Hamid Aghajan, Oct 2008, Marseille, France. ⟨inria-00326746⟩


