Blind Audiovisual Source Separation Based on Redundant Representations

Anna Llagostera Casanovas 1, Gianluca Monaci 1, Pierre Vandergheynst 1, Rémi Gribonval 2
2 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: In this work we present a method that performs complete audiovisual source separation without prior information. The method rests on the assumption that sounds are caused by moving structures. An efficient representation of the audio and video sequences therefore makes it possible to build relationships between synchronous structures in the two modalities. A robust clustering algorithm groups video structures that exhibit strong correlations with the audio, so that the sources are counted and located in the image. Using this information and exploiting audio-video correlation, the activity of each audio source is determined. Next, spectral GMMs are learnt in time slots where only one source is active, so that the sources can be separated when they are mixed in the audio. Audio source separation performance is rigorously evaluated, showing that the proposed algorithm performs efficiently and robustly.
Document type:
Conference paper
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on, Apr 2008, Las Vegas, Nevada, United States. pp. 1841-1844, 2008, 〈10.1109/ICASSP.2008.4517991〉

https://hal.inria.fr/inria-00544971
Contributor: Rémi Gribonval
Submitted on: Thursday, January 27, 2011 - 22:39:22
Last modified on: Wednesday, May 16, 2018 - 11:23:03
Document(s) archived on: Thursday, April 28, 2011 - 02:33:19

File

2008_ICASSP_LlagosterasEtAl_BA...
Files produced by the author(s)

Citation

Anna Llagostera Casanovas, Gianluca Monaci, Pierre Vandergheynst, Rémi Gribonval. Blind Audiovisual Source Separation Based on Redundant Representations. Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on, Apr 2008, Las Vegas, Nevada, United States. pp. 1841-1844, 2008, 〈10.1109/ICASSP.2008.4517991〉. 〈inria-00544971〉
