Efficient Language Models Combination: Application to Phrase Finding

David Langlois (1), Kamel Smaïli (1), Jean-Paul Haton (1)
(1) PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: In this paper, we propose a new approach that combines several language models more efficiently than classical linear interpolation. The approach is referred to as the Selected History Principle: for each history, the perplexity measure is used to select the best language model. The method is tested with two language models, a bigram and a distant bigram, and improves perplexity by 6 points compared with linear interpolation. We also use the Selected History Principle to retrieve a set of useful variable-length phrases, 10,000 of which have been selected and integrated into the vocabulary. We then build a phrase-based bigram model that achieves an improvement of 18% over a baseline bigram.
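The abstract does not give the selection procedure in detail, so the following is only a minimal sketch of the general idea, under stated assumptions: a history is taken to be the pair of the two preceding words, each model is a callable returning P(word | history), and the model with the lowest perplexity on held-out events is chosen for each history. All names (`select_models`, `selected_prob`, `MODELS`, the toy probability tables and the floor value) are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict

def perplexity(log_probs):
    """Perplexity computed from a list of natural-log probabilities."""
    return math.exp(-sum(log_probs) / len(log_probs))

def select_models(models, dev_events):
    """Map each history seen in dev_events to the model name with the
    lowest perplexity on the events that follow that history.

    models:     dict name -> callable(history, word) returning a probability > 0
    dev_events: iterable of (history, word) pairs from held-out data
    """
    log_probs = defaultdict(lambda: defaultdict(list))
    for history, word in dev_events:
        for name, model in models.items():
            log_probs[history][name].append(math.log(model(history, word)))
    return {
        history: min(per_model, key=lambda name: perplexity(per_model[name]))
        for history, per_model in log_probs.items()
    }

def selected_prob(models, selection, default, history, word):
    """Probability from the model selected for this history; histories
    unseen during selection fall back to the default model (an assumption)."""
    return models[selection.get(history, default)](history, word)

# Toy tables, illustration only: they are not normalized distributions.
FLOOR = 0.01  # assumed floor probability for events missing from a table
BIGRAM = {("new", "york"): 0.9, ("the", "falls"): 0.02}    # (w[i-1], w[i])
DISTANT = {("in", "york"): 0.1, ("rain", "falls"): 0.4}    # (w[i-2], w[i])

MODELS = {
    "bigram": lambda h, w: BIGRAM.get((h[1], w), FLOOR),
    "distant": lambda h, w: DISTANT.get((h[0], w), FLOOR),
}

# Held-out events: history = (w[i-2], w[i-1]), followed by word w[i].
DEV_EVENTS = [
    (("in", "new"), "york"),
    (("in", "new"), "york"),
    (("rain", "the"), "falls"),
]

selection = select_models(MODELS, DEV_EVENTS)
# The bigram is selected after ("in", "new"), the distant bigram after ("rain", "the").
```

Because exactly one model is used for a given history, the combined distribution stays normalized whenever each component model is normalized, which is one way such a selection differs from a linear interpolation that mixes all models with fixed weights.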
Document type:
Conference paper
Proceedings of the International Workshop "Speech and Computer" - SPECOM 2001, Moscow, Russia, 2001, 4 p.
https://hal.inria.fr/inria-00100650
Contributor: Publications Loria
Submitted on: Tuesday, 26 September 2006 - 14:48:39
Last modified on: Thursday, 11 January 2018 - 06:19:55

Identifiers

  • HAL Id : inria-00100650, version 1

Citation

David Langlois, Kamel Smaïli, Jean-Paul Haton. Efficient Language Models Combination: Application to Phrase Finding. Proceedings of the International Workshop "Speech and Computer" - SPECOM 2001, Moscow, Russia, 2001, 4 p. 〈inria-00100650〉
