Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments

Xiaofei Li 1 Yutong Ban 1 Laurent Girin 2, 1 Xavier Alameda-Pineda 1 Radu Horaud 1
1 PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, LJK - Laboratoire Jean Kuntzmann, INPG - Institut National Polytechnique de Grenoble
2 GIPSA-CRISSP - CRISSP
GIPSA-DPC - Département Parole et Cognition
Abstract: We address the problem of online localization and tracking of multiple moving speakers in reverberant environments. The paper makes the following contributions. We use the direct-path relative transfer function (DP-RTF), an inter-channel feature that encodes acoustic information robust against reverberation, and we propose an online algorithm well suited for estimating DP-RTFs associated with moving audio sources. Another crucial ingredient of the proposed method is its ability to properly assign DP-RTFs to audio-source directions. Towards this goal, we adopt a maximum-likelihood formulation and we propose to use exponentiated gradient (EG) to efficiently update source-direction estimates starting from their currently available values. The problem of multiple speaker tracking is computationally intractable because the number of possible associations between observed source directions and physical speakers grows exponentially with time. We adopt a Bayesian framework and we propose a variational approximation of the posterior filtering distribution associated with multiple speaker tracking, as well as an efficient variational expectation maximization (VEM) solver. The proposed online localization and tracking method is thoroughly evaluated using two datasets that contain recordings made in real environments.
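The abstract mentions exponentiated gradient (EG) as the way source-direction estimates are updated from their current values. As a rough illustration of what a generic EG update looks like, the sketch below applies it to weights constrained to the probability simplex, using a toy mixture negative log-likelihood. It is not the authors' algorithm: the function names (`eg_step`, `neg_log_lik_grad`), the step size, the toy likelihood table `lik`, and the idea of one weight per candidate direction are all assumptions made for illustration only.

```python
import numpy as np

def eg_step(weights, grad, step_size=0.5):
    """One exponentiated-gradient step for minimization on the simplex.

    Hypothetical helper: multiplies each weight by exp(-step_size * grad)
    and renormalizes, so weights stay positive and sum to one.
    """
    logits = np.log(weights) - step_size * grad
    logits -= logits.max()          # shift for numerical stability
    new_weights = np.exp(logits)
    return new_weights / new_weights.sum()

def neg_log_lik_grad(weights, lik):
    """Gradient of -mean_t log(sum_d weights[d] * lik[t, d]) w.r.t. weights.

    `lik` is an assumed (T, D) table of per-observation likelihoods under
    each of D candidate directions.
    """
    mix = lik @ weights                       # shape (T,)
    return -(lik / mix[:, None]).mean(axis=0)

# Toy data (assumption): candidate 2 explains the observations best.
rng = np.random.default_rng(0)
lik = rng.uniform(0.01, 0.1, size=(50, 5))
lik[:, 2] += 1.0

# Start from uniform weights and refine them with a few EG steps.
w = np.full(5, 0.2)
for _ in range(50):
    w = eg_step(w, neg_log_lik_grad(w, lik))
```

Because the update is multiplicative and followed by normalization, the simplex constraint is preserved without an explicit projection, and each step starts from the currently available weights, which is what makes this kind of update attractive in an online setting.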

https://hal.inria.fr/hal-01851985
Contributor: Team Perception
Submitted on: Friday, March 1, 2019 - 10:03:59 AM
Last modification on: Monday, May 6, 2019 - 10:35:56 AM
Long-term archiving on: Thursday, May 30, 2019 - 1:00:06 PM

File

SSLT_JSTSP_R2.pdf
Files produced by the author(s)

Citation

Xiaofei Li, Yutong Ban, Laurent Girin, Xavier Alameda-Pineda, Radu Horaud. Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments. IEEE Journal of Selected Topics in Signal Processing, IEEE, 2019, 13 (1), pp.88-103. ⟨10.1109/JSTSP.2019.2903472⟩. ⟨hal-01851985v2⟩
