Multilingual Non-Native Speech Recognition using Phonetic Confusion-Based Acoustic Model Modification and Graphemic Constraints

Ghazi Bouselmi; Dominique Fohr; Irina Illina; Jean-Paul Haton

Conference paper, Year: 2006

Ghazi Bouselmi

Role: Author
PersonId : 836336

Analysis, perception and recognition of speech

Dominique Fohr

Role: Author
PersonId : 15652
IdHAL : dominique-fohr
IdRef : 031092942

Analysis, perception and recognition of speech

Irina Illina

Role: Author
PersonId : 15663
IdHAL : irina-illina
IdRef : 120731746

Analysis, perception and recognition of speech

Jean-Paul Haton

Role: Author
PersonId : 830987

Analysis, perception and recognition of speech

Abstract

In this paper we present an automated approach to non-native speech recognition. We introduce a new phonetic confusion concept that associates sequences of native language (NL) phones with spoken language (SL) phones. Phonetic confusion rules are automatically extracted from a non-native speech database for a given NL and SL, using the ASR systems of both languages. These rules are then used to modify the acoustic models (HMMs) of the SL ASR system by adding the acoustic models of the NL phones that the rules specify. Because the pronunciation errors that non-native speakers produce depend on how words are written, we also use graphemic constraints in the phonetic confusion extraction process. In the lexicon, the phones in each word's pronunciation are linked to the corresponding graphemes (characters) of the word. In this way, phonetic confusion is established between pairs of (SL phone, grapheme) and sequences of NL phones. We evaluated our approach on French, Italian, Spanish and Greek non-native speech databases, with English as the spoken language. The modified ASR system achieved significant improvements ranging from 20.3% to 43.2% (relative) in sentence error rate and from 26.6% to 50.0% in word error rate (WER).
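The graphemically constrained rule extraction described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the extraction procedure, so everything here is an assumption made for illustration: the function name `extract_confusion_rules`, the relative-frequency threshold `min_prob`, the input format (pre-aligned observations), and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def extract_confusion_rules(aligned, min_prob=0.25):
    """Count how often each (SL phone, grapheme) pair is realised as a given
    sequence of NL phones, and keep the sequences whose relative frequency
    reaches min_prob (a hypothetical pruning threshold)."""
    counts = defaultdict(Counter)
    for key, nl_seq in aligned:
        counts[key][tuple(nl_seq)] += 1

    rules = {}
    for key, seq_counts in counts.items():
        total = sum(seq_counts.values())
        kept = {
            seq: n / total
            for seq, n in seq_counts.items()
            if n / total >= min_prob
        }
        if kept:
            rules[key] = kept
    return rules


# Invented observations: each item pairs an (SL phone, grapheme) key with the
# NL phone sequence assumed to have been recognised over the same segment.
observations = [
    (("T", "th"), ["s"]),
    (("T", "th"), ["s"]),
    (("T", "th"), ["s"]),
    (("T", "th"), ["t"]),
    (("T", "th"), ["f"]),
    (("tS", "ch"), ["S"]),
    (("tS", "ch"), ["S"]),
    (("tS", "ch"), ["t", "S"]),
    (("tS", "ch"), ["t", "S"]),
]

rules = extract_confusion_rules(observations)
for key, alternatives in rules.items():
    print(key, alternatives)
```

Keying the counts on the (SL phone, grapheme) pair rather than on the SL phone alone is what lets the same phone receive different confusion rules depending on its spelling. In a system of the kind the abstract describes, each retained NL phone sequence would then be added as an alternative acoustic path for the corresponding SL phone model; that step is not sketched here.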

Keywords

non-native speech recognition; pronunciation modelling; multilingual; HMM modification

Domains

Computation and Language [cs.CL]

Main file

interspeech2006.pdf (148.18 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/inria-00110496

Submitted on: Saturday, December 9, 2006, 20:06:29

Last modified on: Friday, March 24, 2023, 14:52:48

Long-term archiving on: Tuesday, April 6, 2010, 21:16:10

Dates and versions

inria-00110496 , version 1 (09-12-2006)

Identifiers

HAL Id: inria-00110496, version 1

Cite

Ghazi Bouselmi, Dominique Fohr, Irina Illina, Jean-Paul Haton. Multilingual Non-Native Speech Recognition using Phonetic Confusion-Based Acoustic Model Modification and Graphemic Constraints. The Ninth International Conference on Spoken Language Processing - ICSLP 2006, Sep 2006, Pittsburgh, PA/USA. ⟨inria-00110496⟩

Collections

CNRS INRIA UNIV-LORRAINE INRIA2 LORIA

259 Views

247 Downloads
