On the quality of an expressive audiovisual corpus: a case study of acted speech

Slim Ouni 1, 2, Sara Dahmani 2, Vincent Colotte 2
2 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: In the context of developing an expressive audiovisual speech synthesis system, the quality of the audiovisual corpus from which the 3D visual data will be extracted is important. In this paper, we present a perceptual case study on the quality of the expressiveness of a set of emotions acted by a semi-professional actor. We analyzed this actor's production of a set of sentences uttered with acted emotions, using a human emotion-recognition task. We examined different modalities: audio, real video, and 3D-extracted data, as unimodal presentations and as bimodal presentations (with audio). The results of this study show the necessity of such perceptual evaluation prior to further exploitation of the data for the synthesis system. The comparison of the modalities clearly shows which emotions need to be improved during production, and how strongly the audio and visual components influence each other in emotional perception.
Document type:
Conference paper
Slim Ouni; Chris Davis; Alexandra Jesse; Jonas Beskow. The 14th International Conference on Auditory-Visual Speech Processing, Aug 2017, Stockholm, Sweden. 2017, 〈http://avsp2017.loria.fr〉
Cited literature [11 references]

https://hal.inria.fr/hal-01596614
Contributor: Slim Ouni
Submitted on: Wednesday, September 27, 2017 - 20:08:13
Last modified on: Thursday, January 11, 2018 - 06:27:31
Document(s) archived on: Thursday, December 28, 2017 - 14:20:41

File

AVSP2017_paper_22.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01596614, version 1

Citation

Slim Ouni, Sara Dahmani, Vincent Colotte. On the quality of an expressive audiovisual corpus: a case study of acted speech. Slim Ouni; Chris Davis; Alexandra Jesse; Jonas Beskow. The 14th International Conference on Auditory-Visual Speech Processing, Aug 2017, Stockholm, Sweden. 2017, 〈http://avsp2017.loria.fr〉. 〈hal-01596614〉

Metrics

Record views

269

File downloads

41