A Robust Method to Count and Locate Audio Sources in a Stereophonic Linear Anechoic Mixture

Simon Arberet (1), Rémi Gribonval (1), Frédéric Bimbot (1)
(1) METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: We propose a new method, called DEMIX Anechoic, to estimate the mixing conditions, i.e. the number of audio sources and the attenuation and time delay of each source, in an underdetermined anechoic mixture. The method relies on the assumption that in the neighborhood of some time-frequency points, only one source contributes to the mixture. Such time-frequency points, located with a local confidence measure, provide estimates of the attenuation, as well as the phase difference at some frequency, of the corresponding source. The time delay parameters are estimated, by a method similar to GCC-PHAT, on points having close attenuations. As opposed to DUET-like methods, our method can estimate time delays larger than one sample. Experiments show that DEMIX Anechoic correctly estimates the number of directions in more than 65% of the cases for mixtures of up to 6 sources, and outperforms DUET in estimation accuracy by a factor of 10.
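To make the anechoic mixing parameters concrete, the sketch below illustrates generic GCC-PHAT delay estimation on a synthetic stereo signal containing a single source, where the second channel is an attenuated copy of the first delayed by several samples. It is not the DEMIX Anechoic algorithm: it processes the whole signal rather than selected time-frequency points, uses no confidence measure, and handles one source only. The function name `gcc_phat_delay`, the parameter `max_delay`, and all signal values are assumptions chosen for illustration.

```python
import numpy as np

def gcc_phat_delay(x1, x2, max_delay):
    """Estimate the integer delay (in samples) of x2 relative to x1 with GCC-PHAT.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(x1) + len(x2)            # zero-pad so the correlation is linear, not circular
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)         # cross-power spectrum
    cross /= np.abs(cross) + 1e-12   # PHAT weighting: discard magnitude, keep phase
    cc = np.fft.irfft(cross, n=n)    # generalized cross-correlation
    # Reorder so that the lags run from -max_delay to +max_delay.
    cc = np.concatenate((cc[-max_delay:], cc[:max_delay + 1]))
    return int(np.argmax(cc)) - max_delay

# Synthetic single-source anechoic mixture (assumed values):
# x2(t) = attenuation * x1(t - true_delay)
rng = np.random.default_rng(0)
source = rng.standard_normal(4096)
true_delay = 5                       # deliberately larger than one sample
attenuation = 0.6

x1 = np.concatenate((source, np.zeros(true_delay)))
x2 = np.concatenate((np.zeros(true_delay), attenuation * source))

est_delay = gcc_phat_delay(x1, x2, max_delay=16)
# With a single source, the attenuation is the ratio of the channel energies.
est_attenuation = float(np.sqrt(np.sum(x2 ** 2) / np.sum(x1 ** 2)))
```

Because the delay is read from the peak of a cross-correlation rather than from a single phase difference, it is not restricted to one sample; this is the general property the abstract contrasts with DUET-like methods.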
Simon Arberet, Rémi Gribonval, Frédéric Bimbot. A Robust Method to Count and Locate Audio Sources in a Stereophonic Linear Anechoic Mixture. Proc. IEEE Intl. Conf. Acoust. Speech Signal Process (ICASSP'07), Apr 2007, Honolulu, Hawaii, United States. pp.III-745 - III-748, ⟨10.1109/ICASSP.2007.366787⟩. ⟨inria-00544778⟩



