Statistical Pronunciation Adaptation for Spontaneous Speech Synthesis

Raheel Qader 1, Gwénolé Lecorvé 1, Damien Lolive 1, Marie Tahon 1, Pascale Sébillot 2
1 EXPRESSION - Expressiveness in Human Centered Data/Media
UBS - Université de Bretagne Sud, IRISA-D6 - MEDIA ET INTERACTIONS
2 LinkMedia - Creating and exploiting explicit links between multimedia fragments
Inria Rennes – Bretagne Atlantique, IRISA_D6 - MEDIA ET INTERACTIONS
Abstract: To bring more expressiveness into text-to-speech systems, this paper presents a new method for generating pronunciation variants, which adapts standard, i.e., dictionary-based, pronunciations to a spontaneous style. Its strength and originality lie in exploiting a wide range of linguistic, articulatory and prosodic features, and in using a probabilistic machine learning framework, namely conditional random fields and phoneme-based n-gram models. Extensive experiments on the Buckeye corpus of English conversational speech demonstrate the effectiveness of the approach through objective and perceptual evaluations.
Document type:
Conference paper
Text, Speech and Dialogue (TSD), Aug 2017, Prague, Czech Republic


https://hal.inria.fr/hal-01532035
Contributor: Gwénolé Lecorvé
Submitted on: Friday, June 2, 2017 - 12:15:00
Last modified on: Thursday, June 15, 2017 - 09:08:53

File

pronunciation_adaptation_rahee...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01532035, version 1

Citation

Raheel Qader, Gwénolé Lecorvé, Damien Lolive, Marie Tahon, Pascale Sébillot. Statistical Pronunciation Adaptation for Spontaneous Speech Synthesis. Text, Speech and Dialogue (TSD), Aug 2017, Prague, Czech Republic. <hal-01532035>
