Statistical Feature Language Model - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper. Year: 2004

Statistical Feature Language Model

Abstract

Statistical language models are widely used in automatic speech recognition to constrain the decoding of a sentence. Most of these models derive from the classical n-gram paradigm. However, the production of a word depends on a large set of linguistic features: lexical, syntactic, semantic, etc. Moreover, in some natural languages the gender and number of the left context affect the production of the next word. It therefore seems attractive to design a language model based on a variety of word features. In this paper we present a new statistical language model based on this idea, called the Statistical Feature Language Model (SFLM). In SFLM a word is considered as an array of linguistic features, and the model is defined in a way similar to the n-gram model. Experiments carried out for French show an improvement in terms of perplexity and predicted words.
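To make the idea of "a word as an array of linguistic features" concrete, here is a minimal sketch of a bigram model estimated over feature tuples rather than bare word forms. It is not the estimation procedure of the paper, which the abstract does not detail. The feature inventory (surface form, part of speech, gender, number), the toy lexicon, the add-k smoothing, and all names such as `FeatureBigramModel` are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical feature arrays: (surface form, part of speech, gender, number).
# The choice of features is an assumption; the abstract only mentions
# lexical, syntactic and semantic features, plus gender and number.
LEXICON = {
    "le": ("le", "DET", "masc", "sing"),
    "la": ("la", "DET", "fem", "sing"),
    "chat": ("chat", "NOUN", "masc", "sing"),
    "table": ("table", "NOUN", "fem", "sing"),
    "dort": ("dort", "VERB", "-", "sing"),
}

# Sentence-start marker, represented as a feature tuple like any other unit.
BOS = ("<s>", "BOS", "-", "-")


def to_features(sentence):
    """Map a whitespace-tokenized sentence to a list of feature tuples."""
    return [LEXICON[word] for word in sentence.split()]


class FeatureBigramModel:
    """Bigram model whose units are feature tuples (add-k smoothing assumed)."""

    def __init__(self, sentences, k=0.1):
        self.k = k
        self.bigrams = Counter()
        self.contexts = Counter()
        self.vocab = set()
        for sentence in sentences:
            seq = [BOS] + to_features(sentence)
            self.vocab.update(seq[1:])
            for prev, cur in zip(seq, seq[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def prob(self, prev, cur):
        """Smoothed estimate of P(cur | prev) over feature tuples."""
        numerator = self.bigrams[(prev, cur)] + self.k
        denominator = self.contexts[prev] + self.k * len(self.vocab)
        return numerator / denominator

    def perplexity(self, sentence):
        """Per-word perplexity of one sentence under the model."""
        seq = [BOS] + to_features(sentence)
        log_sum = sum(math.log2(self.prob(p, c)) for p, c in zip(seq, seq[1:]))
        return 2 ** (-log_sum / (len(seq) - 1))


model = FeatureBigramModel(["le chat dort", "la table dort"])
agreeing = model.perplexity("le chat dort")
disagreeing = model.perplexity("le table dort")  # gender mismatch, unseen bigram
```

In this toy setting the sentence with a gender mismatch receives a higher perplexity than the agreeing one. A fuller model could back off from the complete tuple to a subset of features (for example part of speech, gender, and number only) when a bigram is unseen; that back-off scheme is likewise an assumption and is not taken from the abstract.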
Main file
salma1.pdf (51.28 KB)
Origin: Files produced by the author(s)

Dates and versions

inria-00100021, version 1 (21-11-2017)

Identifiers

  • HAL Id: inria-00100021, version 1

Cite

Kamel Smaïli, Salma Jamoussi, David Langlois, Jean-Paul Haton. Statistical Feature Language Model. 8th International Conference on Spoken Language Processing - ICSLP' 2004, 2004, Jeju, South Korea. 4 p. ⟨inria-00100021⟩