Unsupervised naming of speakers in broadcast TV: using written names, pronounced names or both?

Johann Poignant; Laurent Besacier; Viet Bac Le; Sophie Rosset; Georges Quénot

Conference paper. Year: 2013


Johann Poignant

Role: Author

Modélisation et Recherche d’Information Multimédia [Grenoble]

Laurent Besacier

Role: Author
PersonId : 1521
IdHAL : laurent-besacier
ORCID : 0000-0001-7411-9125
IdRef : 079377017

Groupe d’Étude en Traduction Automatique/Traitement Automatisé des Langues et de la Parole

Viet Bac Le

Role: Author

Vocapia Research [Orsay]

Sophie Rosset

Role: Author

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Georges Quénot

Role: Author
PersonId : 3114
IdHAL : georges-quenot
ORCID : 0000-0003-2117-247X
IdRef : 034104518

Modélisation et Recherche d’Information Multimédia [Grenoble]

Abstract

Person identification in TV broadcast video is a valuable tool for indexing such video. However, biometric models are not a very sustainable option without a priori knowledge of the people present in the videos. Pronounced names (PN) or names written on screen (WN) can provide hypothesis names for speakers. We propose an experimental comparison of the potential of these two modalities (pronounced or written names) to extract the true names of the speakers. Pronounced names offer many instances of citation, but transcription and named-entity detection errors halve the potential of this modality. In contrast, written-name detection benefits from improvements in video quality and is nowadays rather robust and efficient for naming speakers. Oracle experiments on the mapping between written names and speakers also show the complementarity of the PN and WN modalities.

Domains

Information retrieval [cs.IR]

Main file

POIGNANT--INTERSPEECH--2013.pdf (1.06 MB)

Origin: Files produced by the author(s)

Contributor: Marie-Christine Fauvet

https://inria.hal.science/hal-00953088

Submitted on: Monday, 3 March 2014, 16:34:16

Last modified on: Friday, 5 April 2024, 03:24:14

Long-term archiving on: Saturday, 31 May 2014, 10:45:40

Dates and versions

hal-00953088, version 1 (03-03-2014)

Identifiers

HAL Id: hal-00953088, version 1

Cite

Johann Poignant, Laurent Besacier, Viet Bac Le, Sophie Rosset, Georges Quénot. Unsupervised naming of speakers in broadcast TV: using written names, pronounced names or both? 14th Annual Conference of the International Speech Communication Association, INTERSPEECH, 2013, Lyon, France. ⟨hal-00953088⟩


Collections

UGA CNRS INRIA LIG LIMSI LIG_TDCGE LIG_TDCGE_GETALP LIG_TDCGE_MRIM UNIV-PARIS-SACLAY SORBONNE-UNIVERSITE POLYTECH-GRENOBLE LISN GS-SPORT-HUMAN-MOVEMENT LIG_SIDCH

282 views

139 downloads
