Multilingual Non-Native Speech Recognition using Phonetic Confusion-Based Acoustic Model Modification and Graphemic Constraints

Ghazi Bouselmi 1, Dominique Fohr 1, Irina Illina 1, Jean-Paul Haton 1
1 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: In this paper we present an automated approach to non-native speech recognition. We introduce a new phonetic confusion concept that associates sequences of native language (NL) phones with spoken language (SL) phones. Phonetic confusion rules are extracted automatically from a non-native speech database for a given NL and SL, using the ASR systems of both the NL and the SL. These rules are then used to modify the acoustic models (HMMs) of the SL ASR system by adding the acoustic models of the NL phones that the rules specify. Because the pronunciation errors that non-native speakers produce depend on how words are written, we also use graphemic constraints in the confusion extraction process: in the lexicon, each phone in a word's pronunciation is linked to the corresponding graphemes (characters) of the word, so that the phonetic confusion is established between pairs of (SL phone, grapheme) and sequences of NL phones. We evaluated our approach on French, Italian, Spanish and Greek non-native speech databases, with English as the spoken language. The modified ASR system achieved significant improvements ranging from 20.3% to 43.2% (relative) in sentence error rate and from 26.6% to 50.0% in word error rate (WER).
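The rule-extraction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the alignment between the SL and NL recognizer outputs has already been computed and is given as pairs of ((SL phone, grapheme), NL phone sequence). The function name `extract_confusion_rules`, the `min_prob` threshold, the selection-by-relative-frequency criterion, and the toy phone symbols are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def extract_confusion_rules(aligned, min_prob=0.2):
    """Estimate confusion rules from pre-aligned observations.

    aligned: iterable of ((sl_phone, grapheme), nl_phone_sequence) pairs.
    min_prob: hypothetical pruning threshold on relative frequency
              (an assumption of this sketch, not taken from the paper).
    Returns {(sl_phone, grapheme): {nl_phone_tuple: relative_frequency}}.
    """
    counts = defaultdict(Counter)
    for key, nl_seq in aligned:
        counts[key][tuple(nl_seq)] += 1

    rules = {}
    for key, seq_counts in counts.items():
        total = sum(seq_counts.values())
        # Keep only NL phone sequences observed often enough for this key.
        kept = {
            seq: n / total
            for seq, n in seq_counts.items()
            if n / total >= min_prob
        }
        if kept:
            rules[key] = kept
    return rules


# Toy observations with invented phone symbols (illustration only).
aligned = [
    (("th", "th"), ["s"]),
    (("th", "th"), ["s"]),
    (("th", "th"), ["z"]),
    (("th", "th"), ["t", "s"]),
    (("ih", "i"), ["i"]),
]
rules = extract_confusion_rules(aligned, min_prob=0.3)
```

In a system like the one described, each retained rule would then indicate which NL acoustic models to add alongside the HMM of the corresponding SL phone. How that addition is carried out is not specified in the abstract and is left out of this sketch.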
Document type:
Conference paper
The Ninth International Conference on Spoken Language Processing - ICSLP 2006, Sep 2006, Pittsburgh, PA/USA, 2006

https://hal.inria.fr/inria-00110496
Contributor: Bouselmi Ghazi
Submitted on: Saturday, 9 December 2006 - 20:06:29
Last modified on: Thursday, 11 January 2018 - 06:19:56
Document(s) archived on: Tuesday, 6 April 2010 - 21:16:10

File

interspeech2006.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00110496, version 1

Citation

Ghazi Bouselmi, Dominique Fohr, Irina Illina, Jean-Paul Haton. Multilingual Non-Native Speech Recognition using Phonetic Confusion-Based Acoustic Model Modification and Graphemic Constraints. The Ninth International Conference on Spoken Language Processing - ICSLP 2006, Sep 2006, Pittsburgh, PA/USA, 2006. 〈inria-00110496〉
