Statistical Pronunciation Adaptation for Spontaneous Speech Synthesis

Raheel Qader (1), Gwénolé Lecorvé (1), Damien Lolive (1), Marie Tahon (1), Pascale Sébillot (2)
(1) EXPRESSION - Expressiveness in Human Centered Data/Media
UBS - Université de Bretagne Sud, IRISA-D6 - MEDIA ET INTERACTIONS
(2) LinkMedia - Creating and exploiting explicit links between multimedia fragments
Inria Rennes – Bretagne Atlantique, IRISA-D6 - MEDIA ET INTERACTIONS
Abstract : To bring more expressiveness into text-to-speech systems, this paper presents a new pronunciation variant generation method which works by adapting standard, i.e., dictionary-based, pronunciations to a spontaneous style. Its strength and originality lie in exploiting a wide range of linguistic, articulatory and prosodic features, and in using a probabilistic machine learning framework, namely conditional random fields and phoneme-based n-gram models. Extensive experiments on the Buckeye corpus of English conversational speech demonstrate the effectiveness of the approach through objective and perceptual evaluations.

Cited literature [23 references]

https://hal.inria.fr/hal-01532035
Contributor : Gwénolé Lecorvé
Submitted on : Friday, June 2, 2017 - 12:15:00 PM
Last modification on : Friday, September 13, 2019 - 9:48:07 AM
Long-term archiving on : Wednesday, December 13, 2017 - 7:23:39 AM

Identifiers

  • HAL Id : hal-01532035, version 1

Citation

Raheel Qader, Gwénolé Lecorvé, Damien Lolive, Marie Tahon, Pascale Sébillot. Statistical Pronunciation Adaptation for Spontaneous Speech Synthesis. Text, Speech and Dialogue (TSD), Aug 2017, Prague, Czech Republic. ⟨hal-01532035⟩
