Experiments on the Construction of a Phonetically Balanced Corpus from the Web

Abstract: The construction of a speech recognition system requires a recorded set of phrases from which to compute the relevant acoustic models. This set of phrases must be phonetically rich and balanced in order to obtain a robust recognizer. Traditionally, this set is defined manually, which requires considerable human effort. In this paper we propose an automated method for assembling a phonetically balanced corpus (set of phrases) from the Web. The proposed method was used to construct a phonetically balanced corpus for Mexican Spanish.
Document type:
Conference paper
Conference on Intelligent Text Processing and Computational Linguistics CICLing-2004, Feb 2004, Seoul, South Korea. Springer-Verlag, 2945, 4 p., 2004, Lecture Notes in Computer Science

Cited literature [4 references]

https://hal.inria.fr/inria-00326519
Contributor: Dominique Vaufreydaz
Submitted on: Friday, 3 October 2008 - 12:08:13
Last modified on: Thursday, 12 April 2018 - 01:51:16
Document(s) archived on: Friday, 4 June 2010 - 12:10:15

File

Villasenor04.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: inria-00326519, version 1

Citation

Luis Villaseñor-Pineda, Manuel Montes-Y-Gómez, Dominique Vaufreydaz, Jean-François Serignat. Experiments on the Construction of a Phonetically Balanced Corpus from the Web. Conference on Intelligent Text Processing and Computational Linguistics CICLing-2004, Feb 2004, Seoul, South Korea. Springer-Verlag, 2945, 4 p., 2004, Lecture Notes in Computer Science. 〈inria-00326519〉
