Improving language models by using distant information

Armelle Brun (1), David Langlois (1), Kamel Smaïli (1)
(1) PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: This study examines an original way to take advantage of distant information in statistical language models. We show that it is possible to use n-gram models with histories different from those used during training. These models are called crossing context models. Our study deals with classical and distant n-gram models. A mixture of four models is proposed and evaluated. A bigram linear mixture achieves a 14% improvement in perplexity. Moreover, the trigram mixture outperforms the standard trigram by 5.6%. These improvements have been obtained without increasing the complexity of standard n-gram models. The resulting mixture language model has been integrated into a speech recognition system. Its evaluation shows a slight improvement in word error rate on the data used for the French-language evaluation campaign ESTER. Finally, the impact of the proposed crossing context language models on performance is presented for various speakers.
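As a rough illustration of the kind of linear mixture and perplexity measurement the abstract refers to, the sketch below interpolates a classical bigram with a distant bigram (history two words back) on toy data. It is not the authors' implementation and does not reproduce the crossing context models. The function names, the add-alpha smoothing, the toy corpus and the weights 0.7 / 0.3 are all assumptions made for this example.

```python
import math
from collections import Counter


def train_distant_bigram(tokens, distance, vocab, alpha=1.0):
    """Return P(word | word seen `distance` positions earlier).

    Add-alpha smoothing is an assumption of this sketch, not the paper's choice.
    distance=1 gives a classical bigram, distance=2 a distant bigram.
    """
    pair_counts = Counter(zip(tokens[:-distance], tokens[distance:]))
    hist_counts = Counter(tokens[:-distance])
    vocab_size = len(vocab)

    def prob(word, hist):
        return (pair_counts[(hist, word)] + alpha) / (hist_counts[hist] + alpha * vocab_size)

    return prob


def mixture_perplexity(tokens, models, weights):
    """Perplexity of a linear mixture; `models` is a list of (prob, distance) pairs."""
    assert abs(sum(weights) - 1.0) < 1e-9, "mixture weights must sum to 1"
    start = max(distance for _, distance in models)  # first position with full history
    log_sum = 0.0
    for i in range(start, len(tokens)):
        p = sum(w * prob(tokens[i], tokens[i - d]) for w, (prob, d) in zip(weights, models))
        log_sum += math.log(p)
    return math.exp(-log_sum / (len(tokens) - start))


# Hypothetical toy corpus, only to make the sketch runnable.
train = "the cat sat on the mat the dog sat on the rug".split()
test = "the cat sat on the rug".split()
vocab = set(train)

bigram = train_distant_bigram(train, 1, vocab)
distant = train_distant_bigram(train, 2, vocab)
models = [(bigram, 1), (distant, 2)]

ppl_bigram = mixture_perplexity(test, models, [1.0, 0.0])  # classical bigram alone
ppl_mixture = mixture_perplexity(test, models, [0.7, 0.3])  # assumed weights
print(ppl_bigram, ppl_mixture)
```

In practice the mixture weights would be estimated on held-out data rather than fixed by hand, and a corpus this small says nothing about the size of any perplexity gain.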
Document type: Conference paper
International Symposium on Signal Processing and its Applications - ISSPA 2007, Feb 2007, Sharjah, United Arab Emirates. 2007

https://hal.inria.fr/inria-00187084
Contributor: Armelle Brun
Submitted on: Tuesday, 13 November 2007 - 15:34:25
Last modified on: Thursday, 11 January 2018 - 06:19:56
Document(s) archived on: Monday, 12 April 2010 - 02:04:36

File

ISSAP2007.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00187084, version 1

Citation

Armelle Brun, David Langlois, Kamel Smaïli. Improving language models by using distant information. International Symposium on Signal Processing and its Applications - ISSPA 2007, Feb 2007, Sharjah, United Arab Emirates. 2007. 〈inria-00187084〉
