Do not build your TTS training corpus randomly

Jonathan Chevelu 1 Damien Lolive 1
1 EXPRESSION - Expressiveness in Human Centered Data/Media
UBS - Université de Bretagne Sud, IRISA-D6 - MEDIA ET INTERACTIONS
Abstract: TTS voice building generally relies on a script extracted from a large text corpus by optimizing the coverage of linguistic and phonological events presumed to be related to the acoustic quality of the voice. Previous work has shown differences between carefully reduced and random corpora on objective measures, but not in subjective evaluations. We argue that this is not because corpus reduction is useless but because those evaluations smooth out the differences. In this article, we bring out the differences in a subjective test by clustering test corpora according to a distance between signals, so that the test focuses on stimuli that are synthesized differently. The results show that covering appropriate features has a real impact on perceived quality.
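The abstract does not say which selection algorithm is used to optimize coverage. As a minimal sketch of the general idea only, the example below assumes each candidate sentence has already been reduced to a set of unit labels (the diphone-like strings and the name `greedy_cover` are hypothetical) and applies a plain greedy set-cover pass, which is one common way to reduce a corpus and not necessarily the authors' method.

```python
def greedy_cover(sentences):
    """Pick sentences until every unit seen in the corpus is covered.

    sentences: dict mapping a sentence id to the set of unit labels
    (for example diphones) that the sentence contains. Illustrative only.
    """
    # Every unit that appears anywhere in the corpus must end up covered.
    uncovered = set().union(*sentences.values())
    selected = []
    while uncovered:
        # Choose the sentence that covers the most still-uncovered units.
        # Ties go to the sentence that comes first in the dict.
        best = max(sentences, key=lambda s: len(sentences[s] & uncovered))
        selected.append(best)
        uncovered -= sentences[best]
    return selected


# Toy corpus with made-up unit labels.
corpus = {
    "s1": {"a-b", "b-c"},
    "s2": {"b-c", "c-d", "d-e"},
    "s3": {"a-b"},
    "s4": {"d-e", "e-f"},
}
script = greedy_cover(corpus)
```

On this toy corpus the greedy pass keeps three of the four sentences while still covering every unit. A random selection of the same size carries no such guarantee, which is the contrast the abstract is concerned with.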
Document type:
Conference paper
Proceedings of the European Signal Processing Conference (EUSIPCO), Aug 2015, Nice, France

https://hal.inria.fr/hal-01199083
Contributor: Damien Lolive
Submitted on: Monday, 14 September 2015 - 21:09:33
Last modified on: Thursday, 5 April 2018 - 12:30:23

Identifiers

  • HAL Id : hal-01199083, version 1

Citation

Jonathan Chevelu, Damien Lolive. Do not build your TTS training corpus randomly. Proceedings of the European Signal Processing Conference (EUSIPCO), Aug 2015, Nice, France. 〈hal-01199083〉

