Conference papers

Finding Speaker Face Region by Audiovisual Correlation

Abstract : Finding the speaker's face region in a video is important in various application areas. In this work, we develop a novel technique that finds this region robustly across different views and complex backgrounds using only grayscale images. The core idea is to integrate audiovisual correlation analysis into an image segmentation framework to extract the speaker's face region. We first analyze the video within a time window and evaluate the audiovisual correlation locally at each pixel position using a novel statistical measure based on Quadratic Mutual Information. Because only local visual information is used at this stage, the analysis is robust to changes in the view of the face. The resulting correlation is then incorporated into Graph Cut-based image segmentation, which optimizes an energy function defined over multiple video frames. Because this process finds the globally optimal segmentation while balancing the correlation evidence against image information, it extracts a reliable region aligned with real visual boundaries. Experimental results demonstrate the effectiveness and robustness of our method.
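The per-pixel correlation stage can be illustrated with a minimal sketch. The abstract does not spell out the paper's statistical measure, so the code below assumes the standard Euclidean-distance Quadratic Mutual Information estimator with Gaussian Parzen windows as a stand-in. The function names (`gaussian_gram`, `ed_qmi`, `correlation_map`), the kernel width `sigma`, the choice of scalar audio and visual features, and the synthetic data are all hypothetical and are not taken from the paper. The Graph Cut segmentation stage is not shown.

```python
import numpy as np

def gaussian_gram(v, sigma):
    """Pairwise Gaussian kernel values G(v_i - v_j) with variance 2 * sigma**2."""
    diff = v[:, None] - v[None, :]
    var = 2.0 * sigma ** 2
    return np.exp(-diff ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def ed_qmi(x, y, sigma=0.5):
    """Euclidean-distance QMI between two 1-D sample sequences (assumed stand-in)."""
    kx = gaussian_gram(x, sigma)
    ky = gaussian_gram(y, sigma)
    v_joint = np.mean(kx * ky)                            # joint term
    v_marginal = np.mean(kx) * np.mean(ky)                # product-of-marginals term
    v_cross = np.mean(kx.mean(axis=1) * ky.mean(axis=1))  # cross term
    return v_joint - 2.0 * v_cross + v_marginal

def standardize(v):
    """Zero-mean, unit-variance scaling; constant sequences map to zeros."""
    std = v.std()
    return (v - v.mean()) / std if std > 0 else np.zeros_like(v)

def correlation_map(visual, audio, sigma=0.5):
    """Score every pixel of a (T, H, W) visual feature volume against a (T,) audio feature."""
    _, height, width = visual.shape
    a = standardize(np.asarray(audio, dtype=float))
    score = np.zeros((height, width))
    for r in range(height):
        for c in range(width):
            v = standardize(visual[:, r, c].astype(float))
            score[r, c] = ed_qmi(v, a, sigma)
    return score

# Synthetic time window: one pixel follows the audio feature, the rest are noise.
rng = np.random.default_rng(0)
audio = rng.normal(size=200)
visual = rng.normal(size=(200, 4, 4))
visual[:, 1, 2] = audio + 0.1 * rng.normal(size=200)

score = correlation_map(visual, audio)
print(np.unravel_index(score.argmax(), score.shape))
```

Each pixel is scored from its own time series only, which mirrors the local analysis described in the abstract. In a full pipeline, a map like `score` would feed the data term of the segmentation energy, but how the paper defines that energy is not stated here.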

Contributor : Peter Sturm
Submitted on : Sunday, October 5, 2008 - 2:10:19 PM
Last modification on : Monday, May 17, 2021 - 12:00:04 PM
Long-term archiving on : Monday, October 8, 2012 - 2:02:36 PM


Files produced by the author(s)


  • HAL Id : inria-00326761, version 1



Yuyu Liu, Yoichi Sato. Finding Speaker Face Region by Audiovisual Correlation. Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications - M2SFA2 2008, Andrea Cavallaro and Hamid Aghajan, Oct 2008, Marseille, France. ⟨inria-00326761⟩


