Phonology Modelling for Expressive Speech Synthesis: a Review

Raheel Qader; Gwénolé Lecorvé; Damien Lolive; Pascale Sébillot

Report (Research Report), Year: 2014

Raheel Qader

Role: Author
PersonId: 958276

Expressiveness in Human Centered Data/Media

Gwénolé Lecorvé

Role: Author
PersonId: 20677
IdHAL: gwenole-lecorve
ORCID: 0000-0002-4271-2087
IdRef: 150245254

Expressiveness in Human Centered Data/Media

Damien Lolive

Role: Author
PersonId: 5088
IdHAL: damien-lolive
ORCID: 0000-0002-1110-5444
IdRef: 13017498X

Expressiveness in Human Centered Data/Media

Pascale Sébillot

Role: Author
PersonId: 21840
IdHAL: pascale-sebillot
ORCID: 0000-0002-5429-4302
IdRef: 075988453

Multimedia content-based indexing

Abstract

Expressive speech processing is an important scientific problem because expressivity introduces a lot of variability into speech, and this variability degrades the performance of speech applications. The variations are reflected in the linguistic, phonological and acoustic sides of speech. Our main interest, however, is in phonology, more precisely in the study of pronunciation and of disfluencies, both of which have a major impact on speech. This report is a bibliographical review of the state of the art in expressivity and phonology modelling. Although the main focus is on speech synthesis, we also discuss work on automatic speech recognition, because expressivity modelling in phonology is a cross-domain problem.

Expressivity introduces a great deal of variability into speech. This variability affects linguistic, phonological and acoustic aspects alike, and generally degrades the performance of speech processing applications. Processing expressive speech is therefore an important problem. Our main interest lies in the study of phonology, more precisely of pronunciation and of disfluencies, two fields that each play a considerable role in speech. This report is a bibliographical review of work related to expressivity and to phonology modelling. The study is set mainly in the context of speech synthesis. Nevertheless, since the phonological modelling of expressivity is a cross-domain problem, we also discuss work from the field of automatic speech recognition.

Keywords

Phonology; pronunciation; disfluencies; expressivity; speech synthesis; automatic speech recognition; natural language processing

Domains

Artificial intelligence [cs.AI]

Main file

research_report.pdf (479.86 KB)

Origin: Files produced by the author(s)

Contributor: Gwénolé Lecorvé

https://inria.hal.science/hal-01021911

Submitted on: Wednesday 9 July 2014, 18:45:30

Last modified on: Tuesday 3 October 2023, 09:49:44

Long-term archiving on: Thursday 9 October 2014, 12:52:55

Dates and versions

hal-01021911, version 1 (09-07-2014)

Identifiers

HAL Id: hal-01021911, version 1

Cite

Raheel Qader, Gwénolé Lecorvé, Damien Lolive, Pascale Sébillot. Phonology Modelling for Expressive Speech Synthesis: a Review. [Research Report] PI-2020, IRISA, équipe EXPRESSION. 2014, 18 p., 1 column. ⟨hal-01021911⟩


Collections

INSTITUT-TELECOM EC-PARIS UNIV-RENNES1 CNRS INRIA INSA-RENNES ENSSAT IRISA IRISA-INSA-R IRISA-D6 INRIA2 LARA UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE UR1-MATH-NUM

470 views

536 downloads
