Pronunciation adaptation for spontaneous speech synthesis using linguistic information (Adaptation de la prononciation pour la synthèse de la parole spontanée en utilisant des informations linguistiques)

Raheel Qader; Gwénolé Lecorvé; Damien Lolive; Pascale Sébillot

Conference paper. Year: 2016



Raheel Qader

Role: Author
PersonId : 778121
IdRef : 224293559

Expressiveness in Human Centered Data/Media

Gwénolé Lecorvé

Role: Author
PersonId : 20677
IdHAL : gwenole-lecorve
ORCID : 0000-0002-4271-2087
IdRef : 150245254

Expressiveness in Human Centered Data/Media

Damien Lolive

Role: Author
PersonId : 5088
IdHAL : damien-lolive
ORCID : 0000-0002-1110-5444
IdRef : 13017498X

Expressiveness in Human Centered Data/Media

Pascale Sébillot

Role: Author
PersonId : 21840
IdHAL : pascale-sebillot
ORCID : 0000-0002-5429-4302
IdRef : 075988453

Creating and exploiting explicit links between multimedia fragments

Abstract

This paper presents a new pronunciation adaptation method that adapts canonical pronunciations to a spontaneous style. This is a key task in text-to-speech because these pronunciation variants bring expressiveness to synthetic speech, thus enabling new potential applications. The strength of the method is that it relies solely on linguistic features and uses a probabilistic machine learning framework, namely conditional random fields, to produce the adapted pronunciations. Features are selected in a first series of experiments, then combined in the final experiments. Results on the Buckeye conversational English speech corpus show that adapted pronunciations reflect spontaneous speech significantly better than canonical ones.

This article presents a new pronunciation adaptation method whose goal is to reproduce the spontaneous style. This is a key task in speech synthesis because it brings expressiveness to the generated signals, thereby opening the way to new applications. The strength of the proposed method is that it relies only on linguistic information and does so within a probabilistic framework, namely conditional random fields. In this article, we first study the relevance of a set of information sources for adaptation, then combine the most relevant ones in final experiments. Evaluations of the method on a corpus of conversational English speech show that the adapted pronunciations reflect a spontaneous style significantly better than the canonical pronunciations.
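The abstract frames adaptation as a sequence labeling problem over canonical phonemes, driven by linguistic features and solved with conditional random fields. As a rough illustration only, the sketch below builds one feature dictionary per canonical phoneme, which is the kind of input a CRF toolkit typically consumes. Everything in it is an assumption made for the example: the feature names (`phoneme`, `prev`, `next`, `word`, `pos`, `position`), the `Word` input format, the phoneme symbols, and the sample utterance are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: (spelling, part-of-speech tag, canonical phonemes).
Word = Tuple[str, str, List[str]]


def position_in_word(index: int, length: int) -> str:
    """Coarse position of a phoneme inside its word."""
    if length == 1:
        return "single"
    if index == 0:
        return "initial"
    if index == length - 1:
        return "final"
    return "medial"


def phoneme_features(utterance: List[Word]) -> List[Dict[str, str]]:
    """Return one feature dict per canonical phoneme of the utterance."""
    # Flatten the utterance into (phoneme, word, POS tag, position) rows.
    rows = []
    for spelling, pos_tag, phonemes in utterance:
        for i, phoneme in enumerate(phonemes):
            rows.append((phoneme, spelling.lower(), pos_tag,
                         position_in_word(i, len(phonemes))))

    # Add the neighbouring phonemes, crossing word boundaries;
    # "<s>" and "</s>" mark the utterance edges.
    features = []
    for k, (phoneme, word, pos_tag, position) in enumerate(rows):
        features.append({
            "phoneme": phoneme,
            "prev": rows[k - 1][0] if k > 0 else "<s>",
            "next": rows[k + 1][0] if k + 1 < len(rows) else "</s>",
            "word": word,
            "pos": pos_tag,
            "position": position,
        })
    return features


# Invented two-word utterance, used only to exercise the function.
utterance = [("And", "CC", ["ae", "n", "d"]), ("then", "RB", ["dh", "eh", "n"])]
feats = phoneme_features(utterance)
```

In a setup like this, each feature dictionary would be paired with a target label, for instance the phoneme actually realized in the spontaneous recording or a deletion marker, and the CRF would be trained to predict the label sequence from the feature sequence. The feature set actually retained by the authors is described in the paper itself, not here.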

Keywords


Pronunciation adaptation; spontaneous speech; speech synthesis

Domains

Artificial Intelligence [cs.AI]

Main file

pronunciation_adaptation.pdf (292.85 KB)

Origin: Files produced by the author(s)

Contributor: Gwénolé Lecorvé

https://inria.hal.science/hal-01321361

Submitted on: Wednesday, May 25, 2016, 15:02:35

Last modified on: Tuesday, October 3, 2023, 09:49:52

Long-term archiving on: Friday, August 26, 2016, 10:50:41

Dates and versions

hal-01321361, version 1 (25-05-2016)

Identifiers

HAL Id: hal-01321361, version 1

Cite

Raheel Qader, Gwénolé Lecorvé, Damien Lolive, Pascale Sébillot. Adaptation de la prononciation pour la synthèse de la parole spontanée en utilisant des informations linguistiques. Journées d'Études sur la Parole, Jul 2016, Paris, France. ⟨hal-01321361⟩


Collections

INSTITUT-TELECOM UNIV-RENNES1 CNRS INRIA INSA-RENNES ENSSAT IRISA IRISA-INSA-R CENTRALESUPELEC IRISA-D6 INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

476 views

90 downloads
