Topic Identification Challenge Based on Short Word History

Armelle Brun [1], Kamel Smaïli [1], Jean-Paul Haton [1]
[1] PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: This paper presents several methods for topic detection in newspaper articles, based on either a general vocabulary or a set of topic vocabularies. These methods are intended for use within a speech recognition framework. The originality and the difficulty of the work lie in the fact that both the training and the test corpora contain few words (fewer than 200 words for the test corpora). The test corpora are deliberately small because the objective is to identify the topic, and adapt the language model, after only a few words have been uttered. Experiments show that below 60 words the topic detection methods are not reliable. From 80 words onward, the topic detection rate reaches 82% when the two best hypotheses are considered, which is promising given the experimental conditions.
Document type:
Conference paper
Traitement Automatique du Langage Naturel - TALN'00, 2000, Lausanne, Switzerland, pp. 383-392

https://hal.inria.fr/inria-00099124
Contributor: Publications Loria
Submitted on: Tuesday, 26 September 2006 - 08:51:07
Last modified on: Thursday, 11 January 2018 - 06:19:57

Identifiers

  • HAL Id : inria-00099124, version 1

Citation

Armelle Brun, Kamel Smaïli, Jean-Paul Haton. Topic Identification Challenge Based on Short Word History. Traitement Automatique du Langage Naturel - TALN'00, 2000, Lausanne, Switzerland, pp. 383-392. 〈inria-00099124〉
