Voice Cloning Applied to Voice Disorders: a Study of Extreme Phonetic Content in Speaker Embeddings

Organic dysphonia can lead to vocal impairments. Recording patients' impaired voices could allow them to use voice cloning systems. In speech synthesis, voice cloning is the process of producing speech that matches a target speaker's voice, given textual input and an audio sample from that speaker. It can achieve high-quality speech with only a small amount of data from the target speaker. However, dysphonic patients may only be able to produce speech with specific or limited phonetic content. To our knowledge, the impact of such constraints on a voice cloning system remains to be studied. This article presents the results of preliminary experiments on the matter, along with details of the models and datasets used.

Keywords

voice cloning; speaker encoder; Text-to-Speech; x-vector; voice disorders

Domains

Artificial Intelligence [cs.AI]

Contributor: Nelly Barbot

https://inria.hal.science/hal-03697484

Submitted on: Thursday, 16 June 2022, 23:12:08

Last modified on: Tuesday, 3 October 2023, 09:49:08

Dates and versions

hal-03697484, version 1 (16-06-2022)

Identifiers

HAL Id: hal-03697484, version 1

Cite

Lily Wadoux, Nelly Barbot, Damien Lolive, Jonathan Chevelu. Voice Cloning Applied to Voice Disorders: a Study of Extreme Phonetic Content in Speaker Embeddings. 35th Canadian Conference on Artificial Intelligence, May 2022, Toronto, Canada. ⟨hal-03697484⟩

Collections

UNIV-RENNES1 CNRS INRIA INSA-RENNES ENSSAT IRISA CENTRALESUPELEC UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM
