Indices utiles à la cohésion lexicale pour la segmentation thématique de documents oraux

Camille Guinaudeau 1, Guillaume Gravier 2, *, Pascale Sébillot 1
* Corresponding author
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
2 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : The increasing quantity of TV material requires methods to help users navigate such data streams. Topic segmentation of TV broadcasts is a first step toward structuring tasks. The goal of this article is to determine to what extent confidence measures and semantics can compensate for errors in automatic transcripts for topic segmentation. To this end, we introduce confidence measures and semantic relations into a topic segmentation method. We show that our F1-measure improves by +1.5 and +1.9 when integrating confidence measures and semantic relations, respectively. This improvement demonstrates that simple cues can counteract errors in automatic transcripts for topic segmentation.
Document type :
Conference papers

https://hal.inria.fr/inria-00533388
Contributor : Patrick Gros
Submitted on : Friday, November 5, 2010 - 10:18:01 PM
Last modification on : Friday, November 16, 2018 - 1:24:30 AM
Document(s) archived on : Friday, October 26, 2012 - 3:02:57 PM

File

guinaudeau_jep2010.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00533388, version 1

Citation

Camille Guinaudeau, Guillaume Gravier, Pascale Sébillot. Indices utiles à la cohésion lexicale pour la segmentation thématique de documents oraux. XXVIIIe journées d'études de la parole, May 2010, Mons, Belgique. ⟨inria-00533388⟩
