Conference papers

Topic Identification Challenge Based on Short Word History

Armelle Brun 1 Kamel Smaïli 1 Jean-Paul Haton 1
1 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : This paper presents several methods for topic detection in newspaper articles, based on either a general vocabulary or a set of topic vocabularies. Our topic detection methods are intended for use within a speech recognition framework. The originality and the difficulty of our work lie in the fact that both the training and test corpora contain few words (fewer than 200 words for the test corpora). The test corpora are very small because our objective is to identify the topic and adapt the language model after only a few words have been uttered. Experiments show that below 60 words, topic detection methods are not reliable. From 80 words onward, the topic detection rate reaches 82% when the top two hypotheses are considered, which is promising given the conditions of our experiments.
Document type :
Conference papers
Contributor : Publications Loria
Submitted on : Tuesday, September 26, 2006 - 8:51:07 AM
Last modification on : Friday, February 26, 2021 - 3:28:06 PM


  • HAL Id : inria-00099124, version 1



Armelle Brun, Kamel Smaïli, Jean-Paul Haton. Topic Identification Challenge Based on Short Word History. Traitement Automatique du Langage Naturel - TALN'00, 2000, Lausanne, Suisse, pp.383-392. ⟨inria-00099124⟩


