Journal articles

Multi-source TDOA estimation in reverberant audio using angular spectra and clustering

Charles Blandin 1 Alexey Ozerov 1 Emmanuel Vincent 1 
1 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : We consider the problem of estimating the time differences of arrival (TDOAs) of multiple sources from a two-channel reverberant audio signal. While several clustering-based or angular spectrum-based methods have been proposed in the literature, only relatively small-scale experimental evaluations restricted to either category of methods have been carried out so far. We design and conduct the first large-scale experimental evaluation of these methods and investigate a two-step procedure combining angular spectra and clustering. In addition, we introduce and evaluate five new TDOA estimation methods inspired by signal-to-noise-ratio (SNR) weighting and probabilistic multi-source modeling techniques that have been successful for anechoic TDOA estimation and audio source separation. The results show that clustering-based methods do not improve upon angular spectrum-based methods. For 5 cm microphone spacing, the best TDOA estimation performance is achieved by one of the proposed SNR-based angular spectrum methods. For larger spacing, a variant of the generalized cross-correlation with phase transform (GCC-PHAT) method performs best.
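For readers unfamiliar with the GCC-PHAT method named in the abstract, the following is a minimal sketch of its textbook single-source form, not the multi-source variant evaluated in the article. The function name `gcc_phat_tdoa` and its parameters are illustrative assumptions, and the estimate is limited to integer-sample resolution.

```python
import numpy as np

def gcc_phat_tdoa(x1, x2, fs, max_tau):
    """Estimate the delay of x2 relative to x1, in seconds (single source).

    Hypothetical helper for illustration: textbook GCC-PHAT only.
    """
    n = len(x1) + len(x2)                  # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)               # cross-power spectrum
    cross /= np.maximum(np.abs(cross), 1e-12)  # phase transform: keep phase only
    cc = np.fft.irfft(cross, n=n)          # generalized cross-correlation
    max_shift = max(1, int(round(max_tau * fs)))
    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift   # peak location in samples
    return lag / fs
```

In practice `max_tau` would be bounded by the microphone spacing divided by the speed of sound; for the 5 cm spacing mentioned in the abstract and an assumed sound speed of 343 m/s, that is roughly 0.15 ms.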

Cited literature [30 references]
Contributor : Emmanuel Vincent
Submitted on : Tuesday, March 6, 2012 - 11:50:36 AM
Last modification on : Friday, May 6, 2022 - 4:26:02 PM
Long-term archiving on : Thursday, June 7, 2012 - 2:21:12 AM


Files produced by the author(s)



Charles Blandin, Alexey Ozerov, Emmanuel Vincent. Multi-source TDOA estimation in reverberant audio using angular spectra and clustering. Signal Processing, 2012, 92, pp.1950-1960. ⟨10.1016/j.sigpro.2011.10.032⟩. ⟨inria-00630994v2⟩


