Experiments on the Construction of a Phonetically Balanced Corpus from the Web
Résumé
The construction of a speech recognition system requires a recorded set of phrases to compute the pertinent acoustic models. This set of phrases must be phonetically rich and balanced in order to obtain a robust recognizer. By tradition, this set is defined manually implicating a great human effort. In this paper we propose an automated method for assembling a phonetically balanced corpus (set of phrases) from the Web. The proposed method was used to construct a phonetically balanced corpus for the Mexican Spanish language.
Domaines
Informatique et langage [cs.CL]
Origine : Fichiers produits par l'(les) auteur(s)
Loading...