Conference papers

Online Localization of Multiple Moving Speakers in Reverberant Environments

Xiaofei Li 1 Bastien Mourgue 1 Laurent Girin 2 Sharon Gannot 3 Radu Horaud 1
1 PERCEPTION - Interpretation and Modelling of Images and Videos
Inria Grenoble - Rhône-Alpes, Grenoble INP - Institut polytechnique de Grenoble - Grenoble Institute of Technology, LJK - Laboratoire Jean Kuntzmann
Abstract : This paper addresses the problem of online localization of multiple moving speakers in reverberant environments. The direct-path relative transfer function (DP-RTF), defined as the ratio between the first taps of the convolutive transfer functions (CTFs) of two microphones, encodes the inter-channel direct-path information and is therefore used as a localization feature that is robust against reverberation. CTF estimation is based on the cross-relation method. In this work, the recursive least-squares method is proposed to solve the cross-relation problem because of its relatively low computational cost and good convergence rate. The DP-RTF feature estimated at each time-frequency bin is assumed to correspond to a single speaker. A complex Gaussian mixture model is used to assign each observed feature to one of several speakers, and the recursive expectation-maximization algorithm is adopted to update the model parameters online. The method is evaluated on a new dataset containing multiple moving speakers, in which the ground-truth speaker trajectories are recorded with a motion capture system.
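As an illustration of the estimation step described in the abstract, the following is a minimal sketch of a complex recursive least-squares (RLS) update applied to a noise-free cross-relation between two microphone signals at a single frequency. It is not the authors' implementation: the function names (`rls_update`, `ctf_filter`, `estimate_dp_rtf`), the CTF length `Q`, the forgetting factor `lam`, the initialization constant `delta`, and the synthetic filters are all assumptions made for the example. The sketch works directly on the signals and normalizes by the first tap of the first channel, so that the first entry of the estimated vector plays the role of the DP-RTF.

```python
import random

def rls_update(g, P, z, d, lam=0.99):
    """One complex RLS step for the linear model d ~ sum_i z[i] * g[i]."""
    n = len(g)
    u = [zi.conjugate() for zi in z]
    Pu = [sum(P[i][j] * u[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(z[i] * Pu[i] for i in range(n))
    k = [v / denom for v in Pu]                       # gain vector
    err = d - sum(z[i] * g[i] for i in range(n))      # a priori error
    g = [g[i] + k[i] * err for i in range(n)]
    zP = [sum(z[i] * P[i][j] for i in range(n)) for j in range(n)]
    P = [[(P[i][j] - k[i] * zP[j]) / lam for j in range(n)] for i in range(n)]
    return g, P

def ctf_filter(s, a):
    """Causal convolution of the frame sequence s with the CTF taps a."""
    return [sum(a[q] * s[p - q] for q in range(len(a)) if p - q >= 0)
            for p in range(len(s))]

def estimate_dp_rtf(x1, x2, Q=2, lam=0.99, delta=1e-3):
    """Track g in x2[p] = sum_q g2[q] x1[p-q] - sum_{q>=1} g1[q] x2[p-q].

    With the taps normalized by a1[0], g[0] equals a2[0] / a1[0].
    """
    n = 2 * Q - 1
    g = [0j] * n
    P = [[(1.0 / delta if i == j else 0.0) + 0j for j in range(n)]
         for i in range(n)]
    for p in range(Q - 1, len(x1)):
        z = [x1[p - q] for q in range(Q)] + [-x2[p - q] for q in range(1, Q)]
        g, P = rls_update(g, P, z, x2[p], lam)
    return g[0]

# Synthetic single-frequency example (hypothetical filters, no noise).
rng = random.Random(0)
s = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(400)]
a1 = [1.0 + 0j, 0.3 + 0.1j]
a2 = [0.5 - 0.2j, 0.2j]
x1, x2 = ctf_filter(s, a1), ctf_filter(s, a2)
dp_rtf = estimate_dp_rtf(x1, x2)
true_dp_rtf = a2[0] / a1[0]
```

In this noise-free setting the estimate `dp_rtf` should approach `true_dp_rtf` after a few hundred frames. The forgetting factor `lam` trades tracking speed against estimation variance, which is the reason a recursive scheme suits moving speakers.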
Contributor : Team Perception
Submitted on : Friday, May 18, 2018 - 2:45:56 PM
Last modification on : Wednesday, November 3, 2021 - 7:49:58 AM
Long-term archiving on : Tuesday, September 25, 2018 - 1:13:49 PM


Files produced by the author(s)



Xiaofei Li, Bastien Mourgue, Laurent Girin, Sharon Gannot, Radu Horaud. Online Localization of Multiple Moving Speakers in Reverberant Environments. SAM 2018 - 10th IEEE Workshop on Sensor Array and Multichannel Signal Processing, Jul 2018, Sheffield, United Kingdom. pp.405-409, ⟨10.1109/SAM.2018.8448423⟩. ⟨hal-01795462⟩


