Text-informed speech inpainting via voice conversion

Pierre Prablanc; Alexey Ozerov; Ngoc Q. K. Duong; Patrick Pérez

Conference paper, Year: 2016

Pierre Prablanc

Role: Author

Technicolor R & I [Cesson Sévigné]

Alexey Ozerov

Role: Author

Technicolor R & I [Cesson Sévigné]

Ngoc Q. K. Duong

Role: Author

Technicolor R & I [Cesson Sévigné]

Patrick Pérez

Role: Author
PersonId: 1022281

Technicolor R & I [Cesson Sévigné]

Abstract

The problem of speech inpainting consists in recovering parts of a speech signal that are missing for some reason. To the best of our knowledge, none of the existing methods allows satisfactory inpainting of large missing parts, such as one second or longer. In this work we address this challenging scenario. Since entire words can be lost in such long missing parts, we assume that the full text uttered in the speech signal is known. This leads to a new concept of text-informed speech inpainting. To solve this problem we propose a method based on synthesizing the missing speech with a speech synthesizer, modifying its vocal characteristics via a voice conversion method, and filling in the missing part with the resulting converted speech sample. We carried out subjective listening tests to compare the proposed approach with two baseline methods.
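
The three steps named in the abstract (synthesize, convert, fill in) can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the authors' implementation: the function name `inpaint_with_text`, its parameters, and the two toy stand-ins are hypothetical, and the sketch assumes the synthesized sample is already exactly as long as the gap, whereas a real system would have to time-align the synthetic speech with the surrounding signal.

```python
from typing import Callable, List, Sequence


def inpaint_with_text(
    signal: Sequence[float],
    gap_start: int,
    gap_end: int,
    text: str,
    synthesize: Callable[[str, int], List[float]],
    convert_voice: Callable[[List[float]], List[float]],
) -> List[float]:
    """Fill signal[gap_start:gap_end] with converted synthetic speech.

    `synthesize` and `convert_voice` are caller-supplied placeholders
    (assumptions of this sketch) for a speech synthesizer and a voice
    conversion method.
    """
    if not 0 <= gap_start <= gap_end <= len(signal):
        raise ValueError("gap must lie inside the signal")
    gap_len = gap_end - gap_start

    # Step 1: synthesize the missing speech from the known text.
    synthetic = synthesize(text, gap_len)
    # Step 2: modify its vocal characteristics via voice conversion.
    converted = convert_voice(synthetic)
    if len(converted) != gap_len:
        raise ValueError("converted sample must match the gap length")
    # Step 3: fill in the missing part with the converted sample.
    return list(signal[:gap_start]) + list(converted) + list(signal[gap_end:])


# Toy stand-ins (hypothetical): a constant "synthesizer" and a gain-only
# "voice conversion". They only exercise the data flow.
def toy_synthesize(text: str, n_samples: int) -> List[float]:
    return [1.0] * n_samples


def toy_convert(samples: List[float]) -> List[float]:
    return [0.5 * s for s in samples]


damaged = [0.1, 0.2, 0.0, 0.0, 0.5]  # samples 2 and 3 are missing
restored = inpaint_with_text(damaged, 2, 4, "hi", toy_synthesize, toy_convert)
```

Passing the synthesizer and the converter as callables keeps the sketch independent of any particular speech library; only the splice at the gap boundaries is fixed here.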

Keywords

Gaussian mixture model; voice conversion; audio inpainting; speech inpainting; speech synthesis

Domains

Signal and image processing [eess.SP]

Main file

eusipco16a.pdf (268.87 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01271257

Submitted on: Tuesday, November 22, 2016, 16:49:33

Last modified on: Tuesday, December 8, 2020, 09:52:23

Long-term archiving on: Monday, March 20, 2017, 23:54:34

Dates and versions

hal-01271257, version 1 (08-02-2016)

hal-01271257, version 2 (22-11-2016)

Identifiers

HAL Id: hal-01271257, version 2

Cite

Pierre Prablanc, Alexey Ozerov, Ngoc Q. K. Duong, Patrick Pérez. Text-informed speech inpainting via voice conversion. 24th European Signal Processing Conference (EUSIPCO 2016), Aug 2016, Budapest, Hungary. ⟨hal-01271257v2⟩

Collections

ANR

236 views

411 downloads
