Audio-visual emotion recognition: A dynamic, multimodal approach

Jérémie Nicolle; Vincent Rapp; Kevin Bailly; Lionel Prevost; Mohamed Chetouani

Poster Communications, Year : 2014

Jérémie Nicolle

Function : Author

Institut des Systèmes Intelligents et de Robotique

Vincent Rapp

Function : Author
PersonId : 961979

Institut des Systèmes Intelligents et de Robotique

Kevin Bailly

Function : Author
PersonId : 181765
IdHAL : kevin-bailly
ORCID : 0000-0001-7802-3673
IdRef : 178678244

Institut des Systèmes Intelligents et de Robotique

Lionel Prevost

Function : Author
PersonId : 967227

Laboratoire de Mathématiques Informatique et Applications

Mohamed Chetouani

Function : Author
PersonId : 179528
IdHAL : mohamed-chetouani
ORCID : 0000-0002-2920-4539
IdRef : 089021916

Institut des Systèmes Intelligents et de Robotique

Abstract

Designing systems able to interact with students in a natural manner is a complex and far from solved problem. A key aspect of natural interaction is the ability to understand and appropriately respond to human emotions. This paper details our response to the continuous Audio/Visual Emotion Challenge (AVEC'12), whose goal is to predict four affective signals describing human emotions. The proposed method uses Fourier spectra to extract multi-scale dynamic descriptions of signals characterizing face appearance, head movements and voice. We perform a kernel regression with very few representative samples, selected via a supervised weighted-distance-based clustering, which leads to high generalization power. We also propose a particularly fast regressor-level fusion framework to merge systems based on different modalities. Experiments demonstrate the efficiency of each key point of the proposed method, and our results on the challenge data were the highest among 10 international research teams.
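
As a rough illustration of two ideas named in the abstract, the sketch below computes Fourier magnitude spectra over windows of several lengths and feeds them to a kernel regression over a handful of representative samples. It is not the authors' implementation. The function names, window sizes, Gaussian kernel and bandwidth are all assumptions made for the example. The supervised weighted-distance-based clustering that selects the representatives and the regressor-level fusion are not reproduced; the representatives are simply passed in by hand.

```python
import cmath
import math


def dft_magnitudes(window):
    """Magnitudes of the non-redundant DFT bins of a real-valued window."""
    n = len(window)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(window))
        mags.append(abs(coeff))
    return mags


def multiscale_spectra(signal, end, window_sizes=(8, 16)):
    """Concatenate spectra of windows of several lengths ending at `end`.

    The window sizes are illustrative (assumed), not taken from the paper.
    """
    features = []
    for size in window_sizes:
        if end < size:
            raise ValueError("not enough samples before index %d" % end)
        features.extend(dft_magnitudes(signal[end - size:end]))
    return features


def kernel_regression(query, representatives, labels, bandwidth=1.0):
    """Nadaraya-Watson estimate with a Gaussian kernel (assumed kernel).

    `representatives` stands in for the few samples a clustering step
    would select; here the caller supplies them directly.
    """
    weights = []
    for rep in representatives:
        dist2 = sum((q - r) ** 2 for q, r in zip(query, rep))
        weights.append(math.exp(-dist2 / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Every kernel weight underflowed: fall back to the mean label.
        return sum(labels) / len(labels)
    return sum(w * y for w, y in zip(weights, labels)) / total


# Usage: describe a synthetic signal at two time steps, then predict
# a label for the second one from two hand-picked representatives.
signal = [math.sin(2 * math.pi * t / 8) for t in range(32)]
rep_a = multiscale_spectra(signal, 16)
rep_b = [0.0] * len(rep_a)
prediction = kernel_regression(multiscale_spectra(signal, 24),
                               [rep_a, rep_b], [1.0, 0.0], bandwidth=2.0)
```

Using windows of different lengths lets the same signal be described at several time scales, which is one plausible reading of "multi-scale dynamic descriptions". Keeping the representative set small keeps prediction cost proportional to the number of representatives rather than to the size of the training set.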

Keywords

Multimodal fusion; Facial expressions; Feature selection; Dynamic features; Affective computing

Domains

Human-Computer Interaction [cs.HC]

Main file

p44-nicole.pdf (668.54 KB)

Origin : Files produced by the author(s)

https://hal.science/hal-01089628

Submitted on : Tuesday, December 2, 2014 - 9:26:24 AM

Last modification on : Thursday, August 17, 2023 - 1:30:46 PM

Long-term archiving on : Tuesday, March 3, 2015 - 10:35:57 AM

Dates and versions

hal-01089628, version 1 (02-12-2014)

Identifiers

HAL Id : hal-01089628, version 1

Cite

Jérémie Nicolle, Vincent Rapp, Kevin Bailly, Lionel Prevost, Mohamed Chetouani. Audio-visual emotion recognition: A dynamic, multimodal approach. IHM'14, 26e conférence francophone sur l'Interaction Homme-Machine, Oct 2014, Lille, France. pp.44-51, 2014. ⟨hal-01089628⟩

Collections

UPMC UNIV-AG CNRS ISIR IHM-2014 LAMIA SORBONNE-UNIVERSITE SU-SCIENCES ISIR_PIROS
