Using semantic relations to improve the topic segmentation of television documents

Camille Guinaudeau (1), Guillaume Gravier (2), Pascale Sébillot (1)
(1) TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
(2) METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: Topic segmentation methods based on a measure of lexical cohesion can be applied as is to automatic transcripts of TV programs. However, they are less effective in this context because they take into account neither the specificities of TV content nor those of automatic transcripts. This paper studies the use of semantic relations to make segmentation techniques more robust. We propose a method to account for semantic relations in a measure of lexical cohesion. We show that such relations increase the F1-measure by 1.97 and 11.83 on two data sets consisting of, respectively, 40 h of news and 40 h of longer reports on current affairs. These results demonstrate that semantic relations can make segmentation methods less sensitive to transcription errors and to the lack of repetitions in some television programs.
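To make the idea in the abstract concrete, here is a minimal sketch of how semantic relations might be folded into a lexical cohesion score. It is an illustration only and not the measure used in the paper: it assumes a simple cosine-style similarity between two bags of words, and the `related` table, its strengths, and the `weight` parameter are hypothetical names introduced for this example.

```python
from collections import Counter
from math import sqrt


def cohesion(block_a, block_b, related=None, weight=0.5):
    """Cosine-style cohesion between two lists of words.

    `related` is a hypothetical table mapping frozenset({w1, w2}) to a
    relation strength in [0, 1]; `weight` scales the contribution of
    related (non-identical) word pairs. Both are assumptions of this sketch.
    """
    related = related or {}
    counts_a, counts_b = Counter(block_a), Counter(block_b)
    dot = 0.0
    for word_a, freq_a in counts_a.items():
        for word_b, freq_b in counts_b.items():
            if word_a == word_b:
                # Plain lexical repetition contributes fully.
                dot += freq_a * freq_b
            else:
                # Semantically related words contribute partially.
                strength = related.get(frozenset((word_a, word_b)), 0.0)
                dot += weight * strength * freq_a * freq_b
    norm_a = sqrt(sum(f * f for f in counts_a.values()))
    norm_b = sqrt(sum(f * f for f in counts_b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


# Two blocks that share no words: repetition alone finds no cohesion.
a = ["election", "vote"]
b = ["ballot", "poll"]
relations = {frozenset(("vote", "ballot")): 0.8}  # hypothetical strength

print(cohesion(a, b))             # no shared vocabulary
print(cohesion(a, b, relations))  # the related pair adds some cohesion
```

In this toy case the two blocks have no word in common, so a repetition-only score sees them as unrelated, while the relation between "vote" and "ballot" yields a non-zero score. This mirrors the situation the abstract describes, where programs contain few repetitions or the transcript contains errors.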
Document type: Conference papers

https://hal.inria.fr/inria-00533389
Contributor: Patrick Gros
Submitted on: Friday, November 5, 2010 - 10:31:01 PM
Last modification on: Friday, November 16, 2018 - 1:24:47 AM
Archived on: Friday, October 26, 2012 - 3:03:03 PM

File

guinaudeau_taln2010.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00533389, version 1

Citation

Camille Guinaudeau, Guillaume Gravier, Pascale Sébillot. Utilisation de relations sémantiques pour améliorer la segmentation thématique de documents télévisuels. Traitement automatique des langues naturelles, TALN 2010, Jul 2010, Montréal, Canada. ⟨inria-00533389⟩
