Probabilistic Speaker Pronunciation Adaptation for Spontaneous Speech Synthesis Using Linguistic Features

Raheel Qader 1, Gwénolé Lecorvé 1, Damien Lolive 1, Pascale Sébillot 2
1 EXPRESSION - Expressiveness in Human Centered Data/Media
UBS - Université de Bretagne Sud, IRISA-D6 - MEDIA ET INTERACTIONS
2 LinkMedia - Creating and exploiting explicit links between multimedia fragments
Inria Rennes – Bretagne Atlantique , IRISA-D6 - MEDIA ET INTERACTIONS
Abstract: Pronunciation adaptation consists of predicting pronunciation variants of words and utterances from their standard pronunciation and a target style. This is a key issue in text-to-speech, as those variants bring expressiveness to synthetic speech, especially in a spontaneous style. This paper presents a new pronunciation adaptation method that adapts standard pronunciations to the style of individual speakers in the context of spontaneous speech. Its originality and strength lie in relying solely on linguistic features and in using a probabilistic machine learning framework, namely conditional random fields, to produce the adapted pronunciations. Features are first selected in a series of experiments, then combined to produce the final adaptation method. Backend experiments on the Buckeye conversational English speech corpus show that adapted pronunciations reflect spontaneous speech significantly better than standard ones, and that even better results could be achieved by considering alternative predictions.
Document type:
Conference paper
International Conference on Statistical Language and Speech Processing (SLSP), Nov 2015, Budapest, Hungary. Proceedings of Statistical Language and Speech Processing, pp.229-241

https://hal.inria.fr/hal-01181192
Contributor: Gwénolé Lecorvé
Submitted on: Friday, 16 October 2015 - 10:23:07
Last modified on: Wednesday, 2 August 2017 - 10:06:56
Document(s) archived on: Thursday, 27 April 2017 - 00:17:36

File

94490238.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01181192, version 1

Citation

Raheel Qader, Gwénolé Lecorvé, Damien Lolive, Pascale Sébillot. Probabilistic Speaker Pronunciation Adaptation for Spontaneous Speech Synthesis Using Linguistic Features. International Conference on Statistical Language and Speech Processing (SLSP), Nov 2015, Budapest, Hungary. Proceedings of Statistical Language and Speech Processing, pp.229-241. 〈hal-01181192〉
