
Multi-source TDOA estimation in reverberant audio using angular spectra and clustering

Charles Blandin (1), Alexey Ozerov (1), Emmanuel Vincent (1)
(1) METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : In this article, we consider the problem of estimating the time differences of arrival (TDOAs) of multiple sources from two-channel reverberant audio mixtures. This is commonly achieved using clustering or angular spectrum-based methods. These methods are limited in that they typically assign the same weight to the spatial information provided by all time-frequency bins and rely on a binary activation model of the sources. Moreover, few experimental comparisons of different methods have been carried out so far. We introduce two new groups of TDOA estimation methods. First, we propose a time-frequency weighting procedure based on a form of signal-to-noise ratio (SNR) that was shown to be effective for instantaneous mixtures. Second, we introduce new clustering algorithms based on the assumption that all sources can be active in each time-frequency bin. We also study a two-step procedure combining angular spectra and clustering, and we conduct a large-scale experimental evaluation of the proposed and existing methods. The best average localization performance is achieved by a variant of the generalized cross-correlation with phase transform (GCC-PHAT) method without subsequent clustering. Moreover, one of the SNR-based methods we propose outperforms this method for small microphone spacing.
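For readers unfamiliar with GCC-PHAT, the following is a minimal sketch of the textbook single-source form of the method for one pair of channels. It is not the variant evaluated in the report, and it does not handle multiple sources or the proposed SNR weighting. The function name `gcc_phat_tdoa`, its parameters, and the small regularization constant are assumptions made for illustration.

```python
import numpy as np


def gcc_phat_tdoa(x1, x2, fs, max_tdoa):
    """Estimate the delay (in seconds) of x2 relative to x1.

    Textbook GCC-PHAT sketch (hypothetical helper, not the report's code):
    the cross-spectrum is normalized to unit magnitude so that only phase
    information contributes to the cross-correlation peak.
    """
    n = len(x1) + len(x2)              # zero-pad to limit circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)

    cross = X2 * np.conj(X1)           # cross-spectrum; peak lag > 0 if x2 lags x1
    cross /= np.abs(cross) + 1e-12     # PHAT weighting (1e-12 avoids division by zero)

    cc = np.fft.irfft(cross, n=n)      # generalized cross-correlation over lags

    # Keep only physically plausible lags, ordered from -max_lag to +max_lag.
    max_lag = max(1, int(round(max_tdoa * fs)))
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))

    lag = int(np.argmax(cc)) - max_lag
    return lag / fs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs = 16000
    delay = 5                          # delay of channel 2, in samples
    x1 = rng.standard_normal(4096)
    x2 = np.concatenate((np.zeros(delay), x1[:-delay]))
    print(gcc_phat_tdoa(x1, x2, fs, max_tdoa=1e-3))
```

In this sketch the search range `max_tdoa` would normally be set from the microphone spacing divided by the speed of sound, since no physical source can produce a larger delay.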
Contributor : Alexey Ozerov
Submitted on : Tuesday, October 11, 2011 - 2:34:40 PM
Last modification on : Friday, July 10, 2020 - 4:07:55 PM
Long-term archiving on : Thursday, January 12, 2012 - 2:31:07 AM




  • HAL Id : inria-00576297, version 3


Charles Blandin, Alexey Ozerov, Emmanuel Vincent. Multi-source TDOA estimation in reverberant audio using angular spectra and clustering. [Research Report] RR-7566, 2011, pp.22. ⟨inria-00576297v3⟩


