Conference paper, Year: 2001

Dynamic Topic Identification: Towards Combination of Methods

Abstract

This paper presents several statistical methods for topic identification (TID): topic unigrams, cache model, TFIDF classifier, topic perplexity, and weighted model. Our work aims to improve these methods by testing them on very different data and by measuring their potential complementarity and their TID performance with simple combinations. Statistical topic identification methods depend not only on the corpus but also on its type. This study puts forward the cache model, which achieves a TID performance of 82%. Our best linear combination increases this performance to 82.3%.
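The abstract does not spell out how the linear combination is computed. As a minimal sketch, assuming each method returns a non-negative score per topic and that the combination is a weighted sum of normalized scores, the idea could look like the following. The function names, topic labels, scores, and weights are hypothetical illustrations, not the authors' implementation or data.

```python
from typing import Dict, List


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale non-negative per-topic scores so they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        # No evidence from this method: fall back to a uniform distribution.
        return {topic: 1.0 / len(scores) for topic in scores}
    return {topic: value / total for topic, value in scores.items()}


def combine_linear(method_scores: List[Dict[str, float]],
                   weights: List[float]) -> Dict[str, float]:
    """Weighted sum of normalized per-topic scores, one dict per method."""
    if len(method_scores) != len(weights):
        raise ValueError("one weight per method is required")
    combined: Dict[str, float] = {}
    for scores, weight in zip(method_scores, weights):
        for topic, value in normalize(scores).items():
            combined[topic] = combined.get(topic, 0.0) + weight * value
    return combined


def identify_topic(method_scores: List[Dict[str, float]],
                   weights: List[float]) -> str:
    """Return the topic with the highest combined score."""
    combined = combine_linear(method_scores, weights)
    return max(combined, key=combined.get)


# Hypothetical scores from two methods for one document (assumed values).
cache_scores = {"sports": 0.2, "politics": 0.5, "economy": 0.3}
tfidf_scores = {"sports": 0.6, "politics": 0.3, "economy": 0.1}

# Hypothetical weights; in practice they would be tuned on held-out data.
best_topic = identify_topic([cache_scores, tfidf_scores], [0.7, 0.3])
print(best_topic)
```

Normalizing each method's scores before summing keeps one method's scale from dominating the combination, which matters when mixing quantities as different as probabilities and TFIDF similarities.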

Domains

Other [cs.OH]
No file deposited

Dates and versions

inria-00100481, version 1 (26-09-2006)

Identifiers

  • HAL Id: inria-00100481, version 1

Cite

Brigitte Bigi, Armelle Brun, Jean-Paul Haton, Kamel Smaïli, Imed Zitouni. Dynamic Topic Identification: Towards Combination of Methods. Recent Advances in Natural Language Processing - RANLP'2001, Galia Angelova, Kalina Bontcheva, Ruslan Mitkov, Nicolas Nicolov, Nikolai Nikolov, 2001, Tzigov Chark, Bulgaria, pp.255-257. ⟨inria-00100481⟩