Conference papers

Accounting for Room Acoustics in Audio-Visual Multi-Speaker Tracking

Yutong Ban (1), Xiaofei Li (1), Xavier Alameda-Pineda (1), Laurent Girin (2, 1), Radu Horaud (1)
1 PERCEPTION - Interpretation and Modelling of Images and Videos, Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology
Abstract: Multiple-speaker tracking is a crucial task for many applications. In real-world scenarios, exploiting the complementarity between auditory and visual data makes it possible to track people outside the visual field of view. However, practical methods must be robust to changes in acoustic conditions, e.g. reverberation. We investigate how to combine state-of-the-art audio-source localization techniques with Bayesian multi-person tracking. Our experiments demonstrate that the performance of the proposed system is not affected by changes in the acoustic environment.
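
The paper describes a full audio-visual multi-speaker tracker; as a purely illustrative sketch of the general idea of audio-visual fusion in a Bayesian tracker, the snippet below fuses visual detections and audio-derived position estimates for a single speaker with a constant-velocity Kalman filter. All names, noise levels, and the assumption that audio localization yields a 2D position estimate are hypothetical and not taken from the paper; they only illustrate why audio (coarser, but available outside the camera's field of view) complements vision (accurate, but limited to the field of view).

```python
# Illustrative sketch only (not the authors' algorithm): a constant-velocity
# Kalman filter fusing visual detections and audio-derived position estimates
# for a single speaker. Noise levels and frame rate are assumed values.
import numpy as np

dt = 1.0 / 25.0  # video frame period (assumed 25 fps)

# State: [x, y, vx, vy]; constant-velocity dynamics.
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
Q = 1e-2 * np.eye(4)                       # process noise (assumed)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)  # both modalities observe position only

R_vis = 0.05**2 * np.eye(2)  # visual detections: accurate, limited field of view
R_aud = 0.30**2 * np.eye(2)  # audio localization: coarser, works outside the FOV

def predict(m, P):
    """Propagate the state estimate one frame ahead."""
    return F @ m, F @ P @ F.T + Q

def update(m, P, z, R):
    """Standard Kalman update with a 2D position observation z and noise R."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.solve(S, np.eye(2))
    m = m + K @ (z - H @ m)
    P = (np.eye(4) - K @ H) @ P
    return m, P

# Usage: fuse whichever observations are available at each frame.
m, P = np.zeros(4), np.eye(4)
frames = [(np.array([0.10, 0.20]), np.array([0.15, 0.10])),
          (None,                   np.array([0.20, 0.10]))]  # speaker leaves the FOV
for z_vis, z_aud in frames:
    m, P = predict(m, P)
    if z_vis is not None:
        m, P = update(m, P, z_vis, R_vis)
    if z_aud is not None:
        m, P = update(m, P, z_aud, R_aud)
    print(m[:2])
```

Because the audio observation carries a larger noise covariance, the filter keeps producing (more uncertain) position estimates when the visual detection is missing, which mirrors the abstract's point about tracking people outside the visual field of view.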

Cited literature: 20 references

https://hal.inria.fr/hal-01718114
Contributor: Team Perception
Submitted on: Tuesday, February 27, 2018 - 10:34:57 AM
Last modification on: Thursday, July 30, 2020 - 3:49:20 AM
Document(s) archived on: Monday, May 28, 2018 - 5:05:09 PM

File

Ban-ICASSP18.pdf
Files produced by the author(s)

Identifiers

HAL Id: hal-01718114
DOI: 10.1109/ICASSP.2018.8462100

Citation

Yutong Ban, Xiaofei Li, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud. Accounting for Room Acoustics in Audio-Visual Multi-Speaker Tracking. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018), Apr 2018, Calgary, Alberta, Canada. pp.6553-6557, ⟨10.1109/ICASSP.2018.8462100⟩. ⟨hal-01718114⟩
