Calibration of A Binocular-Binaural Sensor Using a Moving Audio-Visual Target

Vasil Khalidov (1), Florence Forbes (2), Radu Horaud (3)
(2) MISTIS - Modelling and Inference of Complex and Structured Stochastic Systems
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
(3) PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
Abstract : In this paper we address the problem of aligning visual (V) and auditory (A) data using a sensor that is composed of a camera-pair and a microphone-pair. The original contribution of the paper is a method for estimating the 3D positions of the microphones in the visual-centred coordinate frame defined by the stereo camera-pair. Assuming that the latter is calibrated, the problem is twofold: estimate the trajectory of an audio-visual (AV) object that freely moves in the scene and estimate the locations of the two microphones. We explore the geometric and physical properties of the two sensorial modalities within two generative models. These models are then used to project the AV object onto both the visual and auditory observation spaces. We exploit the fact that these two distinct data sets are conditioned by a common set of parameters, namely the (unknown) 3D trajectory of the AV object. We derive an EM-like algorithm that alternates between the estimation of the microphone-pair position and the estimation of the AV object trajectory. The proposed algorithm has a number of built-in features: it can deal with A and V observations that are misaligned in time, it estimates the reliability of the data, it is robust to outliers in both modalities, and it has proven theoretical convergence. We report experiments with both simulated and real data.

https://hal.inria.fr/hal-00662306
Contributor : Team Perception
Submitted on : Thursday, February 2, 2012 - 10:40:31 AM
Last modification on : Thursday, February 7, 2019 - 5:55:47 PM
Long-term archiving on : Monday, November 19, 2012 - 3:40:17 PM

File

RR-7865.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00662306, version 1

Citation

Vasil Khalidov, Florence Forbes, Radu Horaud. Calibration of A Binocular-Binaural Sensor Using a Moving Audio-Visual Target. [Research Report] RR-7865, INRIA. 2012, pp.27. ⟨hal-00662306⟩
