Conference paper, Year: 2013

Unsupervised naming of speakers in broadcast TV: using written names, pronounced names or both?

Abstract

Identifying persons in TV broadcast video is a valuable tool for indexing that video. However, biometric models are not a very sustainable option without a priori knowledge of the people present in the videos. Pronounced names (PN) or names written on the screen (WN) can provide hypothesized names for speakers. We propose an experimental comparison of the potential of these two modalities (pronounced or written names) to extract the true names of the speakers. Pronounced names offer many instances of citation, but transcription and named-entity detection errors halve the potential of this modality. In contrast, written name detection benefits from improved video quality and is nowadays rather robust and efficient for naming speakers. Oracle experiments on the mapping between written names and speakers also show the complementarity of the PN and WN modalities.
Main file
POIGNANT--INTERSPEECH--2013.pdf (1.06 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-00953088, version 1 (03-03-2014)

Identifiers

  • HAL Id: hal-00953088, version 1

Cite

Johann Poignant, Laurent Besacier, Viet Bac Le, Sophie Rosset, Georges Quénot. Unsupervised naming of speakers in broadcast TV: using written names, pronounced names or both? 14th Annual Conference of the International Speech Communication Association, INTERSPEECH, 2013, Lyon, France. ⟨hal-00953088⟩
282 views
139 downloads