Conference paper, Year: 2018

Disfluency Insertion for Spontaneous TTS: Formalization and Proof of Concept

Abstract

This paper presents exploratory work on automatically inserting disfluencies in text-to-speech (TTS) systems. The objective is to make TTS more spontaneous and expressive. To achieve this, we focus on the linguistic level of speech through the insertion of pauses, repetitions and revisions. We formalize the problem as a theoretical process in which transformations are iteratively composed. This is a novel contribution, since most previous work either focuses on detecting or cleaning linguistic disfluencies in speech transcripts, or concentrates solely on acoustic phenomena in TTS, especially pauses. We present a first implementation of the proposed process using conditional random fields and language models. The objective and perceptual evaluations conducted on an English corpus of spontaneous speech show that our proposal is effective at generating disfluencies and highlight perspectives for future improvements.
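As a rough illustration of the idea of iteratively composed transformations, the following Python sketch applies a pause, a repetition and a revision to a token sequence one after another. It is only a toy under stated assumptions: the function names, the filler word "uh", the hand-picked positions and the simplified notion of a revision are all hypothetical and not taken from the paper, and the sketch does not use conditional random fields or language models, which the abstract names as the basis of the actual implementation.

```python
from typing import Callable, List, Sequence, Tuple

Tokens = List[str]
# A transformation maps (tokens, position) to a new token list.
Transformation = Callable[[Tokens, int], Tokens]


def insert_pause(tokens: Tokens, position: int) -> Tokens:
    """Insert a filled pause ("uh", an assumed filler) before `position`."""
    return tokens[:position] + ["uh"] + tokens[position:]


def insert_repetition(tokens: Tokens, position: int) -> Tokens:
    """Duplicate the token found at `position`."""
    return tokens[: position + 1] + [tokens[position]] + tokens[position + 1 :]


def make_revision(abandoned_word: str) -> Transformation:
    """Build a simplified revision: an abandoned word is uttered
    just before the token at `position`, which acts as its repair."""

    def insert_revision(tokens: Tokens, position: int) -> Tokens:
        return tokens[:position] + [abandoned_word] + tokens[position:]

    return insert_revision


def compose(tokens: Sequence[str],
            steps: Sequence[Tuple[Transformation, int]]) -> Tokens:
    """Apply each transformation in turn; every position refers to the
    sequence as it stands after the previous steps."""
    current = list(tokens)  # copy so the caller's input is left untouched
    for transform, position in steps:
        current = transform(current, position)
    return current


def demo() -> str:
    fluent = "i want the blue car".split()
    # Positions are hand-picked here; a real system would predict them.
    steps = [
        (make_revision("red"), 3),  # i want the red blue car
        (insert_repetition, 0),     # i i want the red blue car
        (insert_pause, 2),          # i i uh want the red blue car
    ]
    return " ".join(compose(fluent, steps))
```

Calling `demo()` returns `"i i uh want the red blue car"`. Because each step operates on the output of the previous one, the order of the steps changes which token a given position refers to; the order used here is arbitrary.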
Main file
disfluencies_slsp_camera_ready.pdf (335.56 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01840798, version 1 (16-07-2018)

Cite

Raheel Qader, Gwénolé Lecorvé, Damien Lolive, Pascale Sébillot. Disfluency Insertion for Spontaneous TTS: Formalization and Proof of Concept. SLSP 2018 - 6th International Conference on Statistical Language and Speech Processing, Oct 2018, Mons, Belgium. pp. 1-12, ⟨10.1007/978-3-030-00810-9_4⟩. ⟨hal-01840798⟩
253 Views
457 Downloads
