Towards Audio-Visual On-line Diarization Of Participants In Group Meetings

Abstract: We propose a fully automated, unsupervised, and non-intrusive method of identifying the current speaker audio-visually in a group conversation. This is achieved without specialized hardware, user interaction, or prior assignment of microphones to participants. Speakers are identified acoustically using a novel on-line speaker diarization approach. The output is then used to find the corresponding person in a four-camera video stream by approximating individual activity with computationally efficient features. We present results showing the robustness of the association on over 4.5 hours of non-scripted audio-visual meeting data.
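
The abstract describes associating acoustically diarized speaker turns with the person visible in one of four camera views, using computationally cheap visual-activity features. As an illustration only, and not the authors' exact algorithm, the following Python sketch shows one way such an association could be set up, assuming the diarization output is available as speaker-labeled frame intervals and that a per-camera motion-energy signal has already been computed; the function name and its inputs are hypothetical.

    import numpy as np

    def associate_speakers_with_cameras(diarization, activity):
        """Assign each acoustic speaker cluster to the camera whose visual
        activity is highest while that speaker is talking.

        diarization : list of (speaker_id, start_frame, end_frame) tuples,
                      e.g. from an on-line diarization front end (assumed format).
        activity    : array of shape (num_cameras, num_frames) holding a
                      precomputed per-frame motion-energy value per camera.
        Returns a dict mapping speaker_id -> camera index.
        """
        num_cameras, num_frames = activity.shape

        # Build a boolean speaking mask per speaker over the frame axis.
        masks = {}
        for speaker, start, end in diarization:
            mask = masks.setdefault(speaker, np.zeros(num_frames, dtype=bool))
            mask[start:end] = True

        assignment = {}
        for speaker, mask in masks.items():
            # Mean visual activity of each camera during this speaker's turns;
            # the camera with the highest mean is taken as the match.
            scores = activity[:, mask].mean(axis=1)
            assignment[speaker] = int(np.argmax(scores))
        return assignment

    if __name__ == "__main__":
        # Toy example: 2 cameras, 10 frames; speaker "A" talks in frames 0-4,
        # during which camera 1 shows the most motion.
        activity = np.array([[0.1] * 10,
                             [0.9] * 5 + [0.1] * 5])
        diarization = [("A", 0, 5)]
        print(associate_speakers_with_cameras(diarization, activity))  # {'A': 1}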
Document type:
Conference paper
Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications - M2SFA2 2008, Oct 2008, Marseille, France. 2008

Cited literature: 20 references

https://hal.inria.fr/inria-00326746
Contributor: Peter Sturm
Submitted on: Sunday, October 5, 2008 - 13:47:28
Last modified on: Monday, October 6, 2008 - 09:30:56
Document(s) archived on: Thursday, June 3, 2010 - 22:20:29

File

1569140064.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00326746, version 1

Citation

Hayley Hung, Gerald Friedland. Towards Audio-Visual On-line Diarization Of Participants In Group Meetings. Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications - M2SFA2 2008, Oct 2008, Marseille, France. 2008. 〈inria-00326746〉

Metrics

Record views: 114
File downloads: 130