Building a Pronunciation Lexicon for a Speech Transcription System from Wiktionary Pronunciations only

Denis Jouvet (1), Dominique Fohr (1), Irina Illina (1)
(1) PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: This paper shows that web pronunciations, such as those available in Wiktionary, can be used effectively to build a pronunciation lexicon for a speech transcription system. The pronunciations of words and expressions extracted from Wiktionary were used either directly or as training data for data-driven grapheme-to-phoneme conversion systems, which made it possible to develop pronunciation lexicons. As an example, pronunciations were extracted from the French Wiktionary. The derived pronunciation lexicons were then used to train the acoustic models and to evaluate speech recognition performance on the French broadcast news data from the ESTER2 campaign. Moreover, the results show that combining the pronunciations delivered by two different grapheme-to-phoneme conversion systems yields a significant performance improvement.
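The abstract does not say how the outputs of the two grapheme-to-phoneme systems are combined. As a minimal sketch, assuming the combination is a simple union of pronunciation variants per word, the idea could look like the following. The function name `combine_lexicons`, the sample words, and the phoneme strings are all hypothetical illustrations, not data or code from the paper.

```python
def combine_lexicons(lex_a, lex_b):
    """Merge two pronunciation lexicons (word -> list of variants).

    Assumption: combination is a per-word union of variants, keeping
    first-seen order and dropping exact duplicates.
    """
    combined = {}
    for lexicon in (lex_a, lex_b):
        for word, variants in lexicon.items():
            merged = combined.setdefault(word, [])
            for pron in variants:
                if pron not in merged:
                    merged.append(pron)
    return combined


# Hypothetical outputs of two G2P systems (phonemes separated by spaces).
g2p_a = {"les": ["l e", "l e z"], "petit": ["p ə t i"]}
g2p_b = {"les": ["l e"], "petit": ["p ə t i", "p t i"]}

lexicon = combine_lexicons(g2p_a, g2p_b)
for word, variants in lexicon.items():
    print(word, variants)
```

A union keeps every variant that either system proposes, which enlarges the lexicon; whether the paper filters or weights variants is not stated here.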
Document type:
Conference paper
XIV International Conference "Speech and Computer" (SPECOM'2011), Sep 2011, Kazan, Russia. 2011

https://hal.inria.fr/inria-00616330
Contributor: Denis Jouvet
Submitted on: Monday, 22 August 2011 - 10:37:48
Last modified on: Thursday, 11 January 2018 - 06:19:56

Identifiers

  • HAL Id: inria-00616330, version 1

Citation

Denis Jouvet, Dominique Fohr, Irina Illina. Building a Pronunciation Lexicon for a Speech Transcription System from Wiktionary Pronunciations only. XIV International Conference "Speech and Computer" (SPECOM'2011), Sep 2011, Kazan, Russia. 2011. 〈inria-00616330〉
