Experiment Analysis in Newspaper Topic Detection

Armelle Brun (1), Kamel Smaïli (1), Jean-Paul Haton (1)
(1) PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: This paper presents several methods for topic detection in newspaper articles, using either a general vocabulary or topic-specific vocabularies. Topic-specific vocabularies are built either manually or statistically; in both cases, the aim is to find the words most representative of a topic. Several methods were tested. The first is based on perplexity and achieves a 100% topic identification rate on large test corpora when the two best hypotheses are taken into account. The other methods are based on statistical counts and achieve a 94% identification rate on smaller test corpora. The main challenge of this work is to identify topics from only a few words so that, during speech recognition, the most suitable language model can be selected.
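The abstract does not specify the language models behind the perplexity-based method. As a minimal sketch of the general idea only, the Python below assumes one add-one-smoothed unigram model per topic over a shared vocabulary, scores a text by its perplexity under each model, and keeps the two lowest-perplexity topics. The function names, the smoothing scheme, and the toy corpora are illustrative assumptions, not the authors' setup.

```python
import math
from collections import Counter


def train_unigram(tokens, vocab):
    """Add-one smoothed unigram probabilities over a fixed vocabulary (assumed model)."""
    counts = Counter(t for t in tokens if t in vocab)
    total = sum(counts.values()) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def perplexity(model, tokens):
    """Perplexity of the in-vocabulary tokens under a unigram model."""
    known = [t for t in tokens if t in model]
    if not known:
        return float("inf")  # nothing to score: treat as worst case
    log_prob = sum(math.log(model[t]) for t in known)
    return math.exp(-log_prob / len(known))


def rank_topics(models, tokens, n_best=2):
    """Return the n_best topic labels, lowest perplexity first."""
    ranked = sorted(models, key=lambda topic: perplexity(models[topic], tokens))
    return ranked[:n_best]


# Toy training corpora (hypothetical, for illustration only).
train = {
    "sports": "match team goal player team score match".split(),
    "finance": "market bank stock price bank rate market".split(),
}
vocab = {w for toks in train.values() for w in toks}
models = {topic: train_unigram(toks, vocab) for topic, toks in train.items()}

# Only a few words of the test text are in the vocabulary.
ranking = rank_topics(models, "the team won the match".split())
```

Returning the two best topics rather than one mirrors the evaluation described in the abstract, where an identification counts as correct if the true topic is among the two best hypotheses.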
Document type: Conference paper
SPIRE 2000 - String Processing & Information Retrieval, 2000, A Coruna, Spain. IEEE Computer Society, pp.55 - 64, 2000

Cited literature: 12 references

https://hal.inria.fr/inria-00099394
Contributor: Publications Loria
Submitted on: Tuesday, 21 November 2017 - 10:44:44
Last modified on: Thursday, 11 January 2018 - 06:19:57

File

SPIRE00.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00099394, version 1

Citation

Armelle Brun, Kamel Smaïli, Jean-Paul Haton. Experiment Analysis in Newspaper Topic Detection. SPIRE 2000 - String Processing & Information Retrieval, 2000, A Coruna, Spain. IEEE Computer Society, pp.55 - 64, 2000. 〈inria-00099394〉

Metrics

Record views: 180

File downloads: 5