inria-00100021, version 1
Statistical Feature Language Model
8th International Conference on Spoken Language Processing - ICSLP' 2004 (2004) 4 p
Abstract: Statistical language models are widely used in automatic speech recognition to constrain the decoding of a sentence. Most of these models derive from the classical n-gram paradigm. However, the production of a word depends on a large set of linguistic features: lexical, syntactic, semantic, etc. Moreover, in some natural languages the gender and number of the left context affect the production of the next word. It therefore seems attractive to design a language model based on a variety of word features. In this paper we present a new statistical language model based on this idea, called the Statistical Feature Language Model (SFLM). In SFLM a word is considered as an array of linguistic features, and the model is defined in a way similar to the n-gram model. Experiments carried out for French show an improvement in terms of perplexity and predicted words.
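The idea of treating a word as an array of features inside an n-gram-style model can be illustrated with a minimal sketch. This is not the SFLM estimation procedure from the paper, which this record does not detail. It assumes a bigram over feature tuples with add-one smoothing, and the feature layout (surface form, part of speech, gender, number), the class name `FeatureBigramModel`, and the toy corpus are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple

# Hypothetical feature layout: (surface form, part of speech, gender, number).
FeatureWord = Tuple[str, ...]

# Sentence-start marker, itself represented as a feature tuple.
BOS: FeatureWord = ("<s>", "BOS", "-", "-")


class FeatureBigramModel:
    """Bigram model whose vocabulary items are feature tuples, not bare words."""

    def __init__(self, sentences: Iterable[Sequence[FeatureWord]]) -> None:
        self.bigrams: Counter = Counter()   # counts of (previous, current) pairs
        self.contexts: Counter = Counter()  # counts of each previous item
        self.vocab: set = set()
        for sentence in sentences:
            padded = [BOS] + list(sentence)
            self.vocab.update(sentence)
            for prev, cur in zip(padded, padded[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def prob(self, prev: FeatureWord, cur: FeatureWord) -> float:
        """Add-one smoothed estimate of P(cur | prev). Smoothing choice is an assumption."""
        numerator = self.bigrams[(prev, cur)] + 1
        denominator = self.contexts[prev] + len(self.vocab)
        return numerator / denominator


# Toy French corpus: gender agreement between determiner and noun.
la = ("la", "DET", "fem", "sg")
le = ("le", "DET", "masc", "sg")
maison = ("maison", "NOUN", "fem", "sg")
jardin = ("jardin", "NOUN", "masc", "sg")

model = FeatureBigramModel([[la, maison], [le, jardin]])
p_agree = model.prob(la, maison)   # feminine determiner, feminine noun (seen)
p_clash = model.prob(la, jardin)   # feminine determiner, masculine noun (unseen)
print(p_agree > p_clash)
```

Because each history item carries gender and number, the context can separate agreeing from non-agreeing continuations, which is the motivation the abstract gives for feature-based modeling.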
- a – UNIVERSITE NANCY 2
- b – UNIVERSITE HENRI POINCARE
- c – IUFM DE LORRAINE
- 1 :
- INRIA – CNRS : UMR7503 – Université Henri Poincaré - Nancy I – Université Nancy II – Institut National Polytechnique de Lorraine (INPL)
- Domain: Computer Science/Other
- Keywords: statistical language modeling – automatic speech recognition || modélisation statistique du langage – reconnaissance automatique de la parole
- Internal reference: A04-R-253 || smaili04a
- Comment: International conference with proceedings and peer review.
- http://hal.inria.fr/inria-00100021
- oai:hal.inria.fr:inria-00100021
- Contributor:
- Submitted on: Tuesday 26 September 2006, 10:13:24
- Last modified on: Thursday 28 September 2006, 15:22:46

