
Experiment Analysis in Newspaper Topic Detection

Armelle Brun 1 Kamel Smaïli 1 Jean-Paul Haton 1 
1 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : This paper presents several methods for topic detection in newspaper articles, using either a general vocabulary or topic-specific vocabularies. The specific vocabularies are determined either manually or statistically; in both cases, the aim is to find the words most representative of a topic. Several methods were tested. The first is based on perplexity and achieves a 100% topic identification rate on large test corpora when the two top-ranked candidate topics are taken into account. The other methods are based on statistical counts and achieve a 94% identification rate on smaller test corpora. The main challenge of this work is to identify topics from only a few words so that, during speech recognition, the most suitable language model can be selected.
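The perplexity-based approach summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the language models used, so the add-one smoothed unigram models, the function names (`train_unigram`, `perplexity`, `rank_topics`), and the toy corpora below are all assumptions made for illustration, not the authors' implementation. The sketch ranks topics by the perplexity their model assigns to an article and returns the two best candidates.

```python
import math
from collections import Counter


def train_unigram(tokens, vocabulary):
    """Return add-one smoothed unigram probabilities over `vocabulary`."""
    counts = Counter(t for t in tokens if t in vocabulary)
    total = sum(counts.values()) + len(vocabulary)
    return {w: (counts[w] + 1) / total for w in vocabulary}


def perplexity(model, tokens):
    """Perplexity of `tokens` under `model`; out-of-vocabulary words are skipped."""
    known = [t for t in tokens if t in model]
    if not known:
        return float("inf")
    log_sum = sum(math.log(model[t]) for t in known)
    return math.exp(-log_sum / len(known))


def rank_topics(models, tokens, n_best=2):
    """Return the `n_best` topic names with the lowest perplexity."""
    ranked = sorted(models, key=lambda topic: perplexity(models[topic], tokens))
    return ranked[:n_best]


# Toy corpora, invented for illustration only (hypothetical data).
corpora = {
    "sport": "match team goal player team score match".split(),
    "economy": "market bank price stock market rate bank".split(),
    "politics": "vote party law minister vote election party".split(),
}
vocabulary = {w for toks in corpora.values() for w in toks}
models = {topic: train_unigram(toks, vocabulary) for topic, toks in corpora.items()}

article = "the team won the match with a late goal".split()
print(rank_topics(models, article))
```

Returning two candidates rather than one mirrors the evaluation described in the abstract, where an article counts as correctly identified if its true topic is among the two top-ranked candidates.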
Document type : Conference papers
Contributor : Publications Loria
Submitted on : Tuesday, November 21, 2017 - 10:44:44 AM
Last modification on : Saturday, June 25, 2022 - 7:43:20 PM


  • HAL Id : inria-00099394, version 1



Armelle Brun, Kamel Smaïli, Jean-Paul Haton. Experiment Analysis in Newspaper Topic Detection. SPIRE 2000 - String Processing & Information Retrieval, 2000, A Coruna, Spain. pp.55 - 64. ⟨inria-00099394⟩


