Conference paper, Year: 2008

A Comparative Error Analysis of Audio-Visual Source Localization

Abstract

This paper examines the accuracy of audio-video localization using multiple cameras and multiple microphones. Covariance mapping theory is used to determine the accuracy of audio-based and video-based localization. The two modalities are compared in terms of their ability to provide accurate location estimates of a moving audio-visual source. Video is found to be significantly more accurate than audio. The problem of audio-video fusion is also examined. The fusion of audio and video location estimates is applied in the audio domain, the video domain and the positional domain. The accuracy of these three fusion strategies for 3D localization is examined from a theoretical basis. The best localization performance is found when fusion is applied in the positional domain. Fusing audio and video data in the video domain is found to exhibit the worst localization performance. This analysis is confirmed by measuring the accuracy of each fusion strategy in localizing a moving audio-visual source.
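The abstract reports that fusion in the positional domain performs best, but this record does not give the paper's formulation. As a rough illustration of what combining two 3D position estimates with known error covariances can look like, the sketch below uses the standard inverse-covariance (information-weighted) rule for two independent Gaussian estimates. This rule is an assumption made for illustration and is not claimed to be the authors' method. The function name `fuse_positions` and the helper functions are hypothetical.

```python
# Illustrative sketch only: inverse-covariance fusion of two independent
# 3D position estimates. This is a textbook rule assumed for illustration;
# it is not taken from the paper.

def mat_inv(m):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    # Augment the matrix with the identity: [m | I].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Pick the row with the largest pivot magnitude for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]


def mat_add(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_vec(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def fuse_positions(x_audio, cov_audio, x_video, cov_video):
    """Fuse two position estimates, weighting each by its inverse covariance.

    Assumes the two estimates have independent errors (hypothetical setup).
    Returns the fused position and its covariance.
    """
    info_a = mat_inv(cov_audio)   # information matrix of the audio estimate
    info_v = mat_inv(cov_video)   # information matrix of the video estimate
    cov_fused = mat_inv(mat_add(info_a, info_v))
    weighted = [a + v for a, v in zip(mat_vec(info_a, x_audio),
                                      mat_vec(info_v, x_video))]
    return mat_vec(cov_fused, weighted), cov_fused
```

Under this rule the estimate with the smaller covariance dominates the fused position, so a less accurate audio estimate would shift the result only slightly away from a more accurate video estimate.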
Main file
1569140110.pdf (752.56 KB)
Origin: Files produced by the author(s)

Dates and versions

inria-00326742, version 1 (05-10-2008)

Identifiers

  • HAL Id: inria-00326742, version 1

Cite

Damien Kelly, François Pitié, Anil Kokaram, Frank Boland. A Comparative Error Analysis of Audio-Visual Source Localization. Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications - M2SFA2 2008, Andrea Cavallaro and Hamid Aghajan, Oct 2008, Marseille, France. ⟨inria-00326742⟩

Collections

M2SFA2