Conjugate Mixture Models for Clustering Multimodal Data

Vasil Khalidov; Florence Forbes; Radu Horaud

doi:10.1162/NECO_a_00074

Journal article, Neural Computation, 2011

Vasil Khalidov

Modelling and Inference of Complex and Structured Stochastic Systems

Florence Forbes

PersonId : 16305
IdHAL : florence-forbes
ORCID : 0000-0003-3639-0226
IdRef : 12469781X

Modelling and Inference of Complex and Structured Stochastic Systems

Radu Horaud

PersonId : 16183
IdHAL : radu-horaud
ORCID : 0000-0001-5232-024X
IdRef : 032302495

Interpretation and Modelling of Images and Videos

Abstract

The problem of multimodal clustering arises whenever data are gathered with several physically different sensors. Observations from different modalities are not necessarily aligned, in the sense that there is no obvious way to associate or compare them in a common space. One solution is to run a separate clustering task for each modality. The main difficulty with this approach is guaranteeing that the unimodal clusterings are mutually consistent. In this paper we show that multimodal clustering can be addressed within a novel framework, namely conjugate mixture models. These models exploit the explicit transformations that are often available between an unobserved parameter space (objects) and each of the observation spaces (sensors). We formulate the problem as a likelihood maximization task and derive the associated conjugate expectation-maximization algorithm. We thoroughly investigate the convergence properties of the algorithm and propose several local/global optimization techniques to increase its convergence speed. We also propose and compare two initialization strategies, and we propose a consistent model-selection criterion. The algorithm and its variants are tested and evaluated on the task of 3D localization of several speakers using both auditory and visual data.
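The idea in the abstract, mixtures in two observation spaces whose component means are tied to shared object parameters through known transformations, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a 1D object space, two hypothetical maps `F` and `G`, fixed variances, equal priors, and a grid search in the M-step. It omits the initialization strategies, optimization techniques, and model selection that the abstract mentions. All names and values are illustrative.

```python
import numpy as np

# Hypothetical, known maps from the object space to each observation space.
def F(s):
    return 2.0 * s      # modality 1 (linear, assumed)

def G(s):
    return s ** 2       # modality 2 (nonlinear, assumed)

def responsibilities(x, means, sigma):
    """Posterior cluster probabilities for 1D Gaussians with equal priors."""
    log_p = -0.5 * ((x[:, None] - means[None, :]) / sigma) ** 2
    log_p -= log_p.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

def weighted_cost(x, resp, mapped_grid, sigma):
    """cost[j, n]: weighted squared error of cluster n if its mean is mapped_grid[j]."""
    sq = (x[None, :, None] - mapped_grid[:, None, None]) ** 2
    return (resp[None, :, :] * sq).sum(axis=1) / sigma ** 2

def conjugate_em(f, g, s_init, sigma_f=0.5, sigma_g=0.5, n_iter=30):
    grid = np.linspace(0.0, 4.0, 401)   # candidate object parameters (assumed range)
    s = np.asarray(s_init, dtype=float)
    for _ in range(n_iter):
        # E-step: each modality is clustered on its own data, but the means
        # F(s_n) and G(s_n) are tied through the shared object parameters s_n.
        a = responsibilities(f, F(s), sigma_f)
        b = responsibilities(g, G(s), sigma_g)
        # M-step: per cluster, pick the grid value minimizing the joint cost
        # (equivalent to maximizing the expected log-likelihood in s here).
        cost = (weighted_cost(f, a, F(grid), sigma_f)
                + weighted_cost(g, b, G(grid), sigma_g))
        s = grid[np.argmin(cost, axis=0)]
    return s

# Synthetic data: two objects, observed a different number of times per modality.
rng = np.random.default_rng(0)
s_true = np.array([1.0, 3.0])
f = np.concatenate([F(c) + 0.5 * rng.standard_normal(100) for c in s_true])
g = np.concatenate([G(c) + 0.5 * rng.standard_normal(60) for c in s_true])

s_hat = conjugate_em(f, g, s_init=[0.5, 3.5])
print(s_hat)
```

The two data sets are never paired sample by sample. Consistency between the two clusterings comes only from the shared parameters `s`, which is the point the abstract makes about unaligned modalities.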

Domains

Graphics [cs.GR]

Main file

KhalidovForbesHoraud_NECO2011.pdf (1.23 MB)

audiovisual1.jpg (242.35 KB)

audiovisual2.jpg (267.35 KB)

cover_large.jpg (147.46 KB)

Origin : Publisher files allowed on an open archive

Format : Figure, Image

https://inria.hal.science/inria-00590267

Submitted on : Tuesday, May 3, 2011, 9:53:02 AM

Last modification on : Saturday, April 27, 2024, 3:16:08 AM

Long-term archiving on : Thursday, August 4, 2011, 3:09:22 AM

Dates and versions

inria-00590267 , version 1 (03-05-2011)

Identifiers

HAL Id : inria-00590267 , version 1
DOI : 10.1162/NECO_a_00074

Cite

Vasil Khalidov, Florence Forbes, Radu Horaud. Conjugate Mixture Models for Clustering Multimodal Data. Neural Computation, 2011, 23 (2), pp.517-557. ⟨10.1162/NECO_a_00074⟩. ⟨inria-00590267⟩

Collections

UNIV-RENNES1 UGA CNRS INRIA IRISA INSMI LJK LJK_GI LJK_PS LJK_GI_PERCEPTION LJK_PS_MISTIS INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM
