Calibration of A Binocular-Binaural Sensor Using a Moving Audio-Visual Target

Vasil Khalidov; Florence Forbes; Radu Horaud

Report (Research Report) Year: 2012


Vasil Khalidov

Role: Author

IDIAP Research Institute

Florence Forbes

Role: Author
PersonId : 16305
IdHAL : florence-forbes
ORCID : 0000-0003-3639-0226
IdRef : 12469781X

Modelling and Inference of Complex and Structured Stochastic Systems

Radu Horaud

Role: Author
PersonId : 16183
IdHAL : radu-horaud
ORCID : 0000-0001-5232-024X
IdRef : 032302495

Interpretation and Modelling of Images and Videos

Abstract

In this paper we address the problem of aligning visual (V) and auditory (A) data using a sensor that is composed of a camera-pair and a microphone-pair. The original contribution of the paper is a method for estimating the 3D positions of the microphones in the visual-centred coordinate frame defined by the stereo camera-pair. Assuming that the latter is calibrated, the problem is twofold: estimate the trajectory of an audio-visual (AV) object that freely moves in the scene and estimate the locations of the two microphones. We explore the geometric and physical properties of the two sensory modalities within two generative models. These models are then used to project the AV object onto both the visual and auditory observation spaces. We exploit the fact that these two distinct data sets are conditioned by a common set of parameters, namely the (unknown) 3D trajectory of the AV object. We derive an EM-like algorithm that alternates between the estimation of the microphone-pair position and the estimation of the AV object trajectory. The proposed algorithm has a number of built-in features: it can deal with A and V observations that are misaligned in time, it estimates the reliability of the data, it is robust to outliers in both modalities, and it has proven theoretical convergence. We report experiments with both simulated and real data.
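As a rough illustration of what "projecting the AV object onto both observation spaces" could look like, the following Python sketch maps a single 3D point to a visual and an auditory observation. It is not the report's model: the rectified pinhole stereo geometry, the use of a time difference of arrival as the auditory observation, the speed-of-sound constant, and the names `project_visual` and `project_auditory` are all assumptions made for this example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at about 20 °C


def project_visual(point, focal=1.0, baseline=0.1):
    """Map a 3D point (x, y, z), expressed in the left-camera frame, to
    (u, v, disparity) for an assumed rectified pinhole stereo pair."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the cameras (z > 0)")
    return (focal * x / z, focal * y / z, focal * baseline / z)


def project_auditory(point, mic_left, mic_right, c=SPEED_OF_SOUND):
    """Map a 3D point to an assumed auditory observation: the difference
    in arrival time (seconds) between the left and right microphones.
    Negative values mean the sound reaches the left microphone first."""
    return (math.dist(point, mic_left) - math.dist(point, mic_right)) / c


# Hypothetical microphone positions, in metres, in the camera frame.
mic_l = (-0.1, 0.0, 0.0)
mic_r = (0.1, 0.0, 0.0)

# One point of a hypothetical trajectory, observed in both modalities.
source = (-1.0, 0.0, 1.0)
visual_obs = project_visual(source)
auditory_obs = project_auditory(source, mic_l, mic_r)
```

Both functions depend on the same 3D point, which is the shared quantity the abstract refers to; in a calibration setting the microphone positions passed to `project_auditory` would be the unknowns to estimate.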

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

RR-7865.pdf (4.68 MB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-00662306

Submitted on: Thursday, February 2, 2012, 10:40:31

Last modified on: Saturday, April 27, 2024, 03:13:18

Long-term archiving on: Monday, November 19, 2012, 15:40:17

Dates and versions

hal-00662306 , version 1 (02-02-2012)

Identifiers

HAL Id : hal-00662306 , version 1

Cite

Vasil Khalidov, Florence Forbes, Radu Horaud. Calibration of A Binocular-Binaural Sensor Using a Moving Audio-Visual Target. [Research Report] RR-7865, INRIA. 2012, pp.27. ⟨hal-00662306⟩


Collections

UNIV-RENNES1 UGA CNRS INRIA IRISA INRIA-RRRT INSMI LJK LJK_GI LJK_PS LJK_GI_PERCEPTION LJK_PS_MISTIS INRIA2 LARA UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

207 Views

114 Downloads
