Learning Multi-Modal Dictionaries: Application to Audiovisual Data

Gianluca Monaci 1, Philippe Jost 1, Pierre Vandergheynst 1, Boris Mailhé 2, Sylvain Lesage 2, Rémi Gribonval 2
2 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: This paper presents a methodology for extracting meaningful synchronous structures from multi-modal signals. Processing multi-modal data jointly can reveal information that is unavailable when the sources are handled separately. However, in natural high-dimensional data, the statistical dependencies between modalities are often not obvious. Learning fundamental multi-modal patterns is an alternative to classical statistical methods. Recurrent patterns are typically shift-invariant, so the learning should seek the best-matching filters. We present a new algorithm for iteratively learning multi-modal generating functions that can be shifted to any position in the signal. The proposed algorithm is applied to audiovisual sequences and is shown to discover underlying structures in the data.
Document type:
Conference paper
Proc. of International Workshop on Multimedia Content Representation, Classification and Security (MCRCS'06), Sep 2006, Istanbul, Turkey. Springer-Verlag, 4105, pp.538--545, 2006, LNCS. 〈10.1007/11848035_71〉

Cited literature: 13 references

https://hal.inria.fr/inria-00544773
Contributor: Rémi Gribonval
Submitted on: Tuesday, February 8, 2011 - 22:35:08
Last modified on: Wednesday, May 16, 2018 - 11:23:03
Document(s) archived on: Monday, May 9, 2011 - 02:48:27

File

Monaci2006_1502.pdf
Files produced by the author(s)

Citation

Gianluca Monaci, Philippe Jost, Pierre Vandergheynst, Boris Mailhé, Sylvain Lesage, et al. Learning Multi-Modal Dictionaries: Application to Audiovisual Data. Proc. of International Workshop on Multimedia Content Representation, Classification and Security (MCRCS'06), Sep 2006, Istanbul, Turkey. Springer-Verlag, 4105, pp.538--545, 2006, LNCS. 〈10.1007/11848035_71〉. 〈inria-00544773〉
