Finding Speaker Face Region by Audiovisual Correlation

Abstract: The ability to find the speaker face region in a video is important in various application areas. In this work, we develop a novel technique that finds this region robustly under different views and complex backgrounds, using only grayscale images. The main thrust of the technique is to integrate audiovisual correlation analysis into an image segmentation framework to extract the speaker face region. We first analyze the video in a time window and evaluate the audiovisual correlation locally at each pixel position, using a novel statistical measure based on Quadratic Mutual Information. Because only local visual information is used at this stage, the analysis is robust to changes in the view of the face. The resulting correlation is then incorporated into Graph Cut-based image segmentation, which optimizes an energy function defined over multiple video frames. Because this process finds the globally optimal segmentation while balancing the correlation with image information, it extracts a reliable region aligned with real visual boundaries. Experimental results demonstrate the effectiveness and robustness of our method.
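As a rough illustration of the first stage described in the abstract (scoring each pixel position by how strongly its intensity over a time window follows the audio), here is a minimal sketch. It is not the authors' method: plain Pearson correlation stands in for the Quadratic Mutual Information measure, the Graph Cut segmentation stage is omitted, and the names `pearson`, `correlation_map`, `frames`, and `audio_energy` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlation_map(frames, audio_energy):
    """Score every pixel position over a time window.

    frames: list of T grayscale images, each a list of rows of intensities.
    audio_energy: list of T audio feature values, one per frame (assumed feature).
    Returns an H x W map of |correlation| between each pixel's intensity
    series and the audio feature. Pearson is a stand-in for the paper's measure.
    """
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [abs(pearson([f[r][c] for f in frames], audio_energy)) for c in range(width)]
        for r in range(height)
    ]


# Toy window of four 1x2 frames: pixel (0, 0) follows the audio, pixel (0, 1) is static.
audio = [0.0, 1.0, 0.0, 1.0]
frames = [[[10 + 5 * a, 7]] for a in audio]
cmap = correlation_map(frames, audio)
```

In this toy window the pixel that follows the audio receives the highest score and the static pixel receives zero. In the approach the abstract describes, such a per-pixel map would then feed the segmentation stage rather than being thresholded directly.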
Document type:
Conference paper
Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications - M2SFA2 2008, Oct 2008, Marseille, France. 2008

Cited literature: 17 references

https://hal.inria.fr/inria-00326761
Contributor: Peter Sturm
Submitted on: Sunday, October 5, 2008 - 14:10:19
Last modified on: Monday, October 6, 2008 - 09:26:27
Document(s) archived on: Monday, October 8, 2012 - 14:02:36

File

1569139970.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00326761, version 1

Citation

Yuyu Liu, Yoichi Sato. Finding Speaker Face Region by Audiovisual Correlation. Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications - M2SFA2 2008, Oct 2008, Marseille, France. 2008. 〈inria-00326761〉

Metrics

Record views

150

File downloads

146