Experiment Analysis in Newspaper Topic Detection - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper, Year: 2000

Experiment Analysis in Newspaper Topic Detection

Abstract

This paper presents several methods for topic detection in newspaper articles, using either a general vocabulary or topic-specific vocabularies. The specific vocabularies are determined either manually or statistically; in both cases, the aim is to find the words most representative of a topic. Several methods were tested. The first is based on perplexity and achieves a 100% topic identification rate on large test corpora when the first two proposed topics are taken into account. The other methods are based on statistical counts and achieve a 94% identification rate on smaller test corpora. The main challenge of this work is to identify topics from only a few words so that, during speech recognition, the most appropriate language model can be selected.
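As a rough illustration of perplexity-based topic identification, the sketch below trains one language model per topic and ranks topics by the perplexity each model assigns to an article, keeping the two best candidates. It is not the authors' implementation: the abstract does not specify the models, so the add-one smoothed unigram models, the function names (`train_unigram`, `perplexity`, `rank_topics`), the topic labels, and the toy training data are all assumptions made for this example.

```python
import math
from collections import Counter


def train_unigram(tokens, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary (assumed model type)."""
    counts = Counter(t for t in tokens if t in vocab)
    total = sum(counts.values()) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def perplexity(model, tokens):
    """Perplexity of the in-vocabulary tokens under a unigram model."""
    known = [t for t in tokens if t in model]
    if not known:
        return float("inf")  # nothing to score: treat as worst case
    log_prob = sum(math.log(model[t]) for t in known)
    return math.exp(-log_prob / len(known))


def rank_topics(models, tokens):
    """Topic names sorted from lowest to highest perplexity."""
    return sorted(models, key=lambda topic: perplexity(models[topic], tokens))


# Toy training data, invented for illustration only.
training = {
    "sports": "match team goal player team score match".split(),
    "finance": "market stock bank price market rate bank".split(),
    "politics": "vote party minister law vote election party".split(),
}
vocab = {w for toks in training.values() for w in toks}
models = {topic: train_unigram(toks, vocab) for topic, toks in training.items()}

article = "the team won the match with a late goal".split()
ranking = rank_topics(models, article)
top_two = ranking[:2]  # the first two proposed topics
```

Keeping the two lowest-perplexity topics mirrors the evaluation described in the abstract, where an identification counts as correct if the true topic is among the first two proposals.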
Main file
SPIRE00.pdf (91.81 KB)
Origin: Files produced by the author(s)

Dates and versions

inria-00099394, version 1 (21-11-2017)

Identifiers

  • HAL Id: inria-00099394, version 1

Cite

Armelle Brun, Kamel Smaïli, Jean-Paul Haton. Experiment Analysis in Newspaper Topic Detection. SPIRE 2000 - String Processing & Information Retrieval, 2000, A Coruna, Spain. pp. 55-64. ⟨inria-00099394⟩
121 views
188 downloads
