Topic Identification Challenge Based on Short Word History - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper. Year: 2000

Topic Identification Challenge Based on Short Word History

Abstract

This paper presents several methods for topic detection on newspaper articles, based on either a general vocabulary or a set of topic-specific vocabularies. These methods are intended for use in a speech recognition framework. The originality and the difficulty of our work lie in the fact that both the training and the test corpora contain few words (fewer than 200 words for the test corpora). The test corpora are very small because our objective is to identify the topic and adapt the language model after only a few words have been uttered. Experiments show that with fewer than 60 words, the topic detection methods are not reliable. From 80 words onward, the topic detection rate reaches 82% when the two best hypotheses are considered, which is promising given the conditions of our experiments.
No file deposited
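As a rough illustration of the task described in the abstract, the sketch below ranks topics for a short word history and keeps the two best hypotheses. It is not the authors' method: it assumes a simple unigram model per topic with add-alpha smoothing, and the names `train_topic_models`, `rank_topics`, the topics, and the toy training words are all hypothetical.

```python
import math
from collections import Counter


def train_topic_models(topic_docs):
    """Build one word-count table per topic from tokenized training texts."""
    return {
        topic: Counter(word for doc in docs for word in doc)
        for topic, docs in topic_docs.items()
    }


def rank_topics(history, models, alpha=1.0):
    """Rank topics by smoothed unigram log-likelihood of the word history.

    Assumption: add-alpha smoothing over the union of all topic
    vocabularies, with one extra slot reserved for unseen words.
    """
    vocab = set()
    for counts in models.values():
        vocab.update(counts)
    vocab_size = len(vocab) + 1  # +1 reserves mass for out-of-vocabulary words

    scores = {}
    for topic, counts in models.items():
        total = sum(counts.values())
        scores[topic] = sum(
            math.log((counts[word] + alpha) / (total + alpha * vocab_size))
            for word in history
        )
    # Best-scoring topic first.
    return sorted(scores, key=scores.get, reverse=True)


# Toy, made-up training data (hypothetical topics and words).
topic_docs = {
    "sports": [["match", "team", "goal", "coach"], ["team", "season", "goal"]],
    "economy": [["market", "bank", "rates", "growth"], ["bank", "market", "trade"]],
    "politics": [["vote", "election", "party", "minister"], ["party", "vote", "law"]],
}
models = train_topic_models(topic_docs)

# A short word history, as would be available after only a few uttered words.
history = ["the", "team", "scored", "a", "goal", "before", "the", "market", "closed"]
top_two = rank_topics(history, models)[:2]
print(top_two)
```

Keeping the two best hypotheses rather than a single decision mirrors the evaluation reported in the abstract; with so few words, the second-ranked topic is often a plausible fallback.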

Dates and versions

inria-00099124, version 1 (26-09-2006)

Identifiers

  • HAL Id: inria-00099124, version 1

Cite

Armelle Brun, Kamel Smaïli, Jean-Paul Haton. Topic Identification Challenge Based on Short Word History. Traitement Automatique du Langage Naturel - TALN'00, 2000, Lausanne, Switzerland, pp. 383-392. ⟨inria-00099124⟩