Efficient Language Models Combination: Application to Phrase Finding

David Langlois; Kamel Smaïli; Jean-Paul Haton

Conference paper, Year: 2001

David Langlois

Role: Author
PersonId: 298
IdHAL: david-langlois
IdRef: 070239509

Analysis, perception and recognition of speech

Kamel Smaïli

Role: Author
PersonId: 2521
IdHAL: kamel-smaili
IdRef: 034429700

Analysis, perception and recognition of speech

Jean-Paul Haton

Role: Author
PersonId: 830987

Analysis, perception and recognition of speech

Abstract

In this paper, we propose a new approach that combines several language models more efficiently than classical linear interpolation. We refer to this approach as the Selected History Principle: for each history, the perplexity measure is used to select the best language model. The method is tested with two language models, a bigram and a distant bigram, and improves perplexity by 6 points compared with linear interpolation. We also use the Selected History Principle to retrieve a set of useful variable-length phrases, 10,000 of which are selected and integrated into the vocabulary. We then build a phrase-based bigram model that achieves an improvement of 18% over a baseline bigram.
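The abstract gives only the outline of the selection step, so the following Python sketch is one possible reading of it, not the authors' implementation. It assumes that a history is the pair of the two preceding words, that the bigram conditions on the nearest word and the distant bigram on the word before it, and that selection is done on held-out data. The function names, the toy probability tables, the floor probability and the fallback model are all hypothetical.

```python
import math
from collections import defaultdict


def make_model(table, position, floor=1e-3):
    """Toy model (hypothetical): looks up P(word | history[position]).

    position=1 conditions on the previous word (bigram);
    position=0 conditions on the word before it (distant bigram).
    Unseen pairs get a small floor probability (an assumption).
    """
    def prob(word, history):
        return table.get((history[position], word), floor)
    return prob


def select_model_per_history(dev_events, models):
    """For each history in the held-out events, keep the model
    with the lowest perplexity on the words following that history."""
    log_sum = defaultdict(lambda: defaultdict(float))
    count = defaultdict(int)
    for history, word in dev_events:
        count[history] += 1
        for name, prob in models.items():
            log_sum[history][name] += math.log(prob(word, history))
    selection = {}
    for history, n in count.items():
        # Perplexity = exp(-mean log-probability) over this history's events.
        ppl = {name: math.exp(-log_sum[history][name] / n) for name in models}
        selection[history] = min(ppl, key=ppl.get)
    return selection


def selected_prob(word, history, models, selection, default):
    """Probability from the model selected for this history.

    Histories never seen during selection fall back to `default`
    (an assumption; the abstract does not say how they are handled).
    """
    return models[selection.get(history, default)](word, history)


# Toy data: history = (word two positions back, previous word).
models = {
    "bigram": make_model({("the", "cat"): 0.5, ("big", "cat"): 0.1}, position=1),
    "distant": make_model({("the", "cat"): 0.2, ("a", "cat"): 0.6}, position=0),
}
dev_events = [
    (("a", "big"), "cat"),
    (("a", "big"), "cat"),
    (("saw", "the"), "cat"),
]
selection = select_model_per_history(dev_events, models)
```

Unlike linear interpolation, which mixes all models with fixed weights, this scheme uses exactly one model per history. Provided each component model is a proper conditional distribution, the combined model remains normalized for every history.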

Keywords

distant relationship; combination; statistical language modelling; phrase; sequence

Domains

Other [cs.OH]

https://inria.hal.science/inria-00100650

Submitted on: Tuesday, 26 September 2006, 14:48:39

Last modified on: Friday, 24 March 2023, 14:52:48

Dates and versions

inria-00100650, version 1 (26-09-2006)

Identifiers

HAL Id: inria-00100650, version 1

Cite

David Langlois, Kamel Smaïli, Jean-Paul Haton. Efficient Language Models Combination: Application to Phrase Finding. Proceedings of the International Workshop "Speech and Computer" - SPECOM 2001, 2001, Moscow, Russia, 4 p. ⟨inria-00100650⟩

Collections

CNRS INRIA UNIV-LORRAINE INRIA2 LORIA
