Alignment of Binocular-Binaural Data Using a Moving Audio-Visual Target

Vasil Khalidov; Florence Forbes; Radu Horaud

doi:10.1109/MMSP.2013.6659295

Conference Papers, Year : 2013

Vasil Khalidov

Function : Author

IDIAP Research Institute

Florence Forbes

Function : Author
PersonId : 16305
IdHAL : florence-forbes
ORCID : 0000-0003-3639-0226
IdRef : 12469781X

Modelling and Inference of Complex and Structured Stochastic Systems

Radu Horaud

Function : Author
PersonId : 16183
IdHAL : radu-horaud
ORCID : 0000-0001-5232-024X
IdRef : 032302495

Interpretation and Modelling of Images and Videos

Abstract

In this paper we address the problem of aligning visual (V) and auditory (A) data using a sensor composed of a camera pair and a microphone pair. The original contribution of the paper is a method that aligns AV data by estimating the 3D positions of the microphones in the visual-centred coordinate frame defined by the stereo camera pair. We exploit the fact that these two distinct data sets are conditioned by a common set of parameters, namely the (unknown) 3D trajectory of an AV object, and derive an EM-like algorithm that alternates between estimating the microphone-pair position and estimating the AV object trajectory. The proposed algorithm has a number of built-in features: it can deal with A and V observations that are misaligned in time, it estimates the reliability of the data, it is robust to outliers in both modalities, and its convergence is proven theoretically. We report experiments with both simulated and real data.
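The dependence of the auditory data on the microphone positions and on the target trajectory can be illustrated with a small sketch. It is not the authors' algorithm. It assumes that the auditory observation is a time difference of arrival between the two microphones (the abstract does not specify the auditory feature), that the target positions are already known in the camera frame, and that sound travels at 343 m/s. The names `tdoa` and `alignment_cost` are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the paper


def tdoa(source, mic_a, mic_b, c=SPEED_OF_SOUND):
    """Time difference of arrival (seconds) between two microphones.

    All points are (x, y, z) tuples in metres, expressed in the
    camera-centred frame. Assumed observation model, for illustration only.
    """
    return (math.dist(source, mic_a) - math.dist(source, mic_b)) / c


def alignment_cost(trajectory, observed, mic_a, mic_b):
    """Sum of squared differences between observed and predicted delays."""
    return sum(
        (obs - tdoa(point, mic_a, mic_b)) ** 2
        for point, obs in zip(trajectory, observed)
    )


# Synthetic example: a target moving in front of the sensor.
true_a, true_b = (-0.1, 0.0, 0.0), (0.1, 0.0, 0.0)
trajectory = [(-1.0, 0.2, 2.0), (0.0, 0.0, 1.5), (0.8, -0.3, 2.5)]
observed = [tdoa(point, true_a, true_b) for point in trajectory]

# The cost vanishes at the true microphone positions and grows when
# one microphone is displaced.
cost_true = alignment_cost(trajectory, observed, true_a, true_b)
cost_shifted = alignment_cost(trajectory, observed, (-0.1, 0.05, 0.0), true_b)
```

Minimising such a cost over the microphone positions for a fixed trajectory would correspond loosely to one half of the alternation described in the abstract; the other half would update the trajectory for fixed microphone positions. The sketch leaves out the robustness, reliability and time-misalignment handling that the abstract mentions.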

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

Khalidov-MMSP13.pdf (2.38 MB)

Khalidov-MMSP13.jpg (148.86 KB)

bestpaperaward-MMSP2013.jpg (952.87 KB)

poster_2013_MMSP.pdf (2.49 MB)

Origin : Files produced by the author(s)

Format : Figure, Image

Format : Other

Contributor : Perception team

https://inria.hal.science/hal-00861482

Submitted on : Friday, October 4, 2013, 5:14:26 PM

Last modification on : Thursday, April 4, 2024, 6:17:53 PM

Long-term archiving on : Friday, April 7, 2017, 6:53:11 AM

Dates and versions

hal-00861482 , version 1 (12-09-2013)

hal-00861482 , version 2 (04-10-2013)

hal-00861482 , version 3 (04-10-2013)

Identifiers

HAL Id : hal-00861482 , version 3
DOI : 10.1109/MMSP.2013.6659295

Cite

Vasil Khalidov, Florence Forbes, Radu Horaud. Alignment of Binocular-Binaural Data Using a Moving Audio-Visual Target. MMSP 2013 - IEEE International Workshop on Multimedia Signal Processing, IEEE Signal Processing Society, Sep 2013, Pula (Sardinia), Italy. pp.242-247, ⟨10.1109/MMSP.2013.6659295⟩. ⟨hal-00861482v3⟩

Collections

UNIV-RENNES1 UGA CNRS INRIA IRISA LJK LJK_GI LJK_PS LJK_GI_PERCEPTION LJK_PS_MISTIS INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM
