
Speaker-Dependent Emotion Recognition For Audio Document Indexing

Abstract: Emotion is currently of great interest in speech processing as well as in the human-machine interaction domain. In recent years, a growing body of work on emotion synthesis and emotion recognition has been developed for various purposes, each approach using its own methods and its own parameters measured on the speech signal. In this paper, we propose using a short-time parameter, the Mel-Frequency Cepstrum Coefficients (MFCC), and a simple but efficient classification method, Vector Quantization (VQ), for speaker-dependent emotion recognition. Many other features (energy, pitch, zero-crossing rate, phonetic rate, LPC) and their derivatives are also tested and combined with the MFCC in order to find the best combination. Other models, GMM and HMM (discrete and continuous Hidden Markov Models), are studied as well, in the hope that the use of continuous distributions and of the temporal behaviour of this feature set will improve the quality of emotion recognition. The maximum accuracy in recognizing five different emotions exceeds 88% using only MFCC coefficients with the VQ model. This is a simple but efficient approach, and the result is even much better than that obtained on the same database by human evaluation, where listeners judged sentences without the possibility of replaying or comparing them [8]; this result also compares favourably with other approaches.
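The VQ scheme the abstract describes can be sketched as follows: train one codebook of MFCC centroids per emotion, then label an utterance with the emotion whose codebook quantizes its frames with the lowest average distortion. This is a minimal illustration, not the authors' implementation; the toy Gaussian "MFCC frames", the codebook size, and all function names are assumptions for the example.

```python
import numpy as np

def train_codebook(frames, k=4, iters=20, seed=0):
    """Train a VQ codebook (k centroids) on MFCC-like frames via plain k-means."""
    rng = np.random.default_rng(seed)
    centroids = frames[rng.choice(len(frames), k, replace=False)]
    for _ in range(iters):
        # Assign each frame to its nearest centroid (Euclidean distance).
        d = np.linalg.norm(frames[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = frames[labels == j].mean(axis=0)
    return centroids

def distortion(frames, codebook):
    """Average distance from each frame to its nearest codeword."""
    d = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
    return d.min(axis=1).mean()

def classify(frames, codebooks):
    """Pick the emotion whose codebook quantizes the utterance best."""
    return min(codebooks, key=lambda e: distortion(frames, codebooks[e]))

# Toy stand-in data: each "emotion" occupies a different region of a
# 12-dimensional feature space (real MFCC vectors would come from audio).
rng = np.random.default_rng(1)
train = {
    "anger":   rng.normal(0.0, 0.3, (200, 12)),
    "sadness": rng.normal(2.0, 0.3, (200, 12)),
}
codebooks = {e: f_ for e, f_ in ((e, train_codebook(f)) for e, f in train.items())}
test_utterance = rng.normal(2.0, 0.3, (50, 12))
print(classify(test_utterance, codebooks))
```

Because classification only compares average quantization distortions, no probabilistic model is needed, which is what makes VQ the simplest of the classifiers the paper compares (against GMM and discrete/continuous HMM).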
Keyword: Emotion Recognition
Document type: Conference papers
Contributor: Marie-Christine Fauvet
Submitted on: Friday, February 28, 2014 - 4:06:42 PM
Last modification on: Sunday, June 26, 2022 - 9:35:04 AM


  • HAL Id: hal-00953924, version 1



Xuan Hung Le, Georges Quénot, Eric Castelli. Speaker-Dependent Emotion Recognition For Audio Document Indexing. International Conference on Electronics, Information, 2004, Unknown. ⟨hal-00953924⟩


