Une pénalité floue fondée phonologiquement pour améliorer la Sélection d'Unité [A phonologically grounded fuzzy penalty to improve unit selection]

David Guennec (1), Damien Lolive (1)
(1) EXPRESSION - Expressiveness in Human Centered Data/Media, UBS - Université de Bretagne Sud, IRISA-D6 - MEDIA ET INTERACTIONS
Abstract : Unit selection speech synthesis systems rely, with rare exceptions, on target and concatenation costs to select the best unit sequence. The role of the concatenation cost is to ensure that joining two speech segments does not introduce an acoustic artefact. Acoustic distances (MFCC, F0) are typically used for this task, but in many cases they are not sufficient. In this paper, we introduce a penalty in the concatenation cost, inherited from the field of corpus covering, that blocks some concatenations based on their phonological class. We also propose a derived fuzzy version that relaxes the penalty according to the concatenation quality with respect to the cost distribution. An objective evaluation shows that the penalty is effective at ranking candidate unit sequences better during selection. The subjective evaluation we conducted reveals superior performance for the fuzzy approach.
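As a rough illustration of the idea described in the abstract, the sketch below adds a class-dependent penalty to an acoustic concatenation cost and, in the fuzzy variant, scales that penalty by how poor the join is relative to two thresholds taken from the cost distribution. Everything specific here is an assumption made for illustration and is not taken from the paper: the class names, the penalty values in `CLASS_PENALTY`, the linear ramp in `fuzzy_weight`, and the `low` and `high` thresholds are all hypothetical.

```python
# Illustrative sketch only. The classes, penalty values, ramp shape and
# thresholds are hypothetical and are not taken from the paper.

# Hypothetical penalty for the phonological class of the phoneme at the join.
CLASS_PENALTY = {
    "vowel": 1.0,      # assumed hard to concatenate
    "liquid": 0.5,     # assumed intermediate
    "plosive": 0.0,    # assumed safe to concatenate
}


def fuzzy_weight(cost: float, low: float, high: float) -> float:
    """Linear ramp: 0 at or below `low`, 1 at or above `high`."""
    if high <= low:
        raise ValueError("high must be greater than low")
    if cost <= low:
        return 0.0
    if cost >= high:
        return 1.0
    return (cost - low) / (high - low)


def concatenation_cost(acoustic_cost: float, phon_class: str,
                       low: float, high: float, fuzzy: bool = True) -> float:
    """Acoustic cost plus a class penalty, optionally relaxed for good joins."""
    penalty = CLASS_PENALTY[phon_class]
    # Crisp variant: always apply the full penalty.
    # Fuzzy variant: apply it in proportion to how bad the join looks.
    weight = fuzzy_weight(acoustic_cost, low, high) if fuzzy else 1.0
    return acoustic_cost + weight * penalty
```

In this sketch, `low` and `high` could be set from statistics of the observed acoustic costs (for example a mean and a mean plus one standard deviation), so that a join with a very low acoustic cost escapes the penalty even in a penalized class. This is one possible reading of "with respect to the cost distribution", not a description of the authors' implementation.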

https://hal.inria.fr/hal-01338948
Contributor : Damien Lolive
Submitted on : Wednesday, June 29, 2016 - 1:35:23 PM
Last modification on : Friday, January 11, 2019 - 2:27:03 PM

Identifiers

  • HAL Id : hal-01338948, version 1

Citation

David Guennec, Damien Lolive. Une pénalité floue fondée phonologiquement pour améliorer la Sélection d'Unité. Journées d'Études sur la Parole, Jul 2016, Paris, France. ⟨hal-01338948⟩
