Conference papers

Improving language models by using distant information

Armelle Brun 1 David Langlois 1 Kamel Smaïli 1
1 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract : This study examines an original way of exploiting distant information in statistical language models. We show that n-gram models can be used with histories different from those used during training. These models are called crossing context models. Our study covers both classical and distant n-gram models. A mixture of four models is proposed and evaluated. A bigram linear mixture improves perplexity by 14%. Moreover, the trigram mixture outperforms the standard trigram by 5.6%. These improvements were obtained without increasing the complexity of standard n-gram models. The resulting mixture language model was integrated into a speech recognition system. In evaluation, it yields a slight improvement in word error rate on the data used for the francophone evaluation campaign ESTER. Finally, the impact of the proposed crossing context language models on performance is reported for various speakers.
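As an illustration of the linear mixture and perplexity measure mentioned in the abstract, the following is a minimal sketch that interpolates a classical bigram with a distance-2 bigram. It is not the authors' implementation and does not reproduce their crossing context models or their four-model mixture. The function names, the add-one smoothing, the toy corpus, and the equal weights are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_bigram(tokens, distance=1):
    """Count pairs (w[i-distance], w[i]); distance=1 is the classical bigram."""
    pair_counts = Counter(zip(tokens, tokens[distance:]))
    ctx_counts = Counter(tokens[:-distance])
    return pair_counts, ctx_counts


def prob(model, ctx, word, vocab_size, alpha=1.0):
    """Additively smoothed P(word | ctx); the smoothing is an assumption."""
    pair_counts, ctx_counts = model
    return (pair_counts[(ctx, word)] + alpha) / (ctx_counts[ctx] + alpha * vocab_size)


def mixture_prob(models, weights, history, word, vocab_size):
    """Linear interpolation: sum_k weight_k * P_k(word | history[-distance_k])."""
    return sum(
        w * prob(m, history[-d], word, vocab_size)
        for w, (m, d) in zip(weights, models)
    )


def perplexity(models, weights, tokens, vocab_size):
    """Perplexity of the mixture, skipping positions without enough history."""
    start = max(d for _, d in models)
    log_sum = 0.0
    for i in range(start, len(tokens)):
        log_sum += math.log(mixture_prob(models, weights, tokens[:i], tokens[i], vocab_size))
    return math.exp(-log_sum / (len(tokens) - start))


# Toy corpus (hypothetical), used for both training and evaluation.
corpus = "the cat sat on the mat the cat ate".split()
vocab_size = len(set(corpus))
models = [(train_bigram(corpus, 1), 1), (train_bigram(corpus, 2), 2)]
weights = [0.5, 0.5]  # assumed; in practice tuned on held-out data
ppl = perplexity(models, weights, corpus, vocab_size)
```

Because the weights sum to one and each component is a proper distribution, the mixture is itself a proper distribution over the vocabulary.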
Document type : Conference papers

Cited literature [14 references]

https://hal.inria.fr/inria-00187084
Contributor : Armelle Brun
Submitted on : Tuesday, November 13, 2007 - 3:34:25 PM
Last modification on : Thursday, January 11, 2018 - 6:19:56 AM
Document(s) archived on : Monday, April 12, 2010 - 2:04:36 AM

File

ISSAP2007.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00187084, version 1

Citation

Armelle Brun, David Langlois, Kamel Smaïli. Improving language models by using distant information. International Symposium on Signal Processing and its Applications - ISSPA 2007, Feb 2007, Sharjah, United Arab Emirates. ⟨inria-00187084⟩
