Building a Pronunciation Lexicon for a Speech Transcription System from Wiktionary Pronunciations only

Denis Jouvet; Dominique Fohr; Irina Illina

Conference paper, Year: 2011

Denis Jouvet

Role: Author
PersonId : 15904
IdHAL : denis-jouvet
IdRef : 029418666

Analysis, perception and recognition of speech

Dominique Fohr

Role: Author
PersonId : 15652
IdHAL : dominique-fohr
IdRef : 031092942

Analysis, perception and recognition of speech

Irina Illina

Role: Author
PersonId : 15663
IdHAL : irina-illina
IdRef : 120731746

Analysis, perception and recognition of speech

Abstract

This paper shows that pronunciations available on the web, such as those in the Wiktionary, can be used efficiently to build a pronunciation lexicon for a speech transcription system. Pronunciations of words and expressions extracted from the Wiktionary were used either directly or as training data for data-driven grapheme-to-phoneme conversion systems, making it possible to develop pronunciation lexicons. As an example, French Wiktionary pronunciations were extracted. The derived pronunciation lexicons were then used for training the acoustic models and for evaluating speech recognition performance on the French broadcast news data from the ESTER2 campaign. Moreover, the results show that combining the pronunciations delivered by two different grapheme-to-phoneme conversion systems yields a significant performance improvement.
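The abstract does not say how the pronunciations from the two grapheme-to-phoneme systems were combined. As a minimal sketch, assuming the combination is a simple per-word union of pronunciation variants, the idea could look like the following. The function name `combine_lexicons`, the two input dictionaries, and the SAMPA-like phoneme strings are all hypothetical and for illustration only; they are not taken from the paper.

```python
def combine_lexicons(lex_a, lex_b):
    """Merge two pronunciation lexicons (word -> list of phoneme strings).

    Assumption: combination is a plain union of variants per word,
    with duplicates removed and first-seen order preserved.
    """
    combined = {}
    for lexicon in (lex_a, lex_b):
        for word, variants in lexicon.items():
            merged = combined.setdefault(word, [])
            for pron in variants:
                if pron not in merged:
                    merged.append(pron)
    return combined


# Hypothetical outputs of two G2P systems (illustrative data only).
g2p_a = {"chat": ["S a"], "les": ["l e"]}
g2p_b = {"chat": ["S a"], "les": ["l E"], "petit": ["p @ t i"]}

lexicon = combine_lexicons(g2p_a, g2p_b)
for word, variants in sorted(lexicon.items()):
    print(word, variants)
```

Under this assumption, a word on which both systems agree keeps a single variant, while a word on which they disagree receives both variants in the merged lexicon.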

Domains

Signal and Image Processing [eess.SP]

https://inria.hal.science/inria-00616330

Submitted on: Monday, 22 August 2011, 10:37:48

Last modified on: Friday, 24 March 2023, 14:52:54

Dates and versions

inria-00616330 , version 1 (22-08-2011)

Identifiers

HAL Id : inria-00616330 , version 1

Cite

Denis Jouvet, Dominique Fohr, Irina Illina. Building a Pronunciation Lexicon for a Speech Transcription System from Wiktionary Pronunciations only. XIV International Conference "Speech and Computer" (SPECOM'2011), Sep 2011, Kazan, Russia. ⟨inria-00616330⟩

Collections

CNRS INRIA UNIV-LORRAINE INRIA2 LORIA
