
Regression versus classification for neural network based audio source localization

Abstract : We compare the performance of regression and classification neural networks for single-source direction-of-arrival estimation. Since the output space is continuous and structured, regression seems more appropriate. However, classification on a discrete spherical grid is widely believed to perform better and is predominantly used in the literature. For regression, we propose two ways to account for the spherical geometry of the output space based either on the angular distance between spherical coordinates or on the mean squared error between Cartesian coordinates. For classification, we propose two alternatives to the classical one-hot encoding framework: we derive a Gibbs distribution from the squared angular distance between grid points and use the corresponding probabilities either as soft targets or as cross-entropy weights that retain a clear probabilistic interpretation. We show that regression on Cartesian coordinates is generally more accurate, except when localized interference is present, in which case classification appears to be more robust.
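
To make the quantities named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It shows the two regression criteria (angular distance and mean squared error on Cartesian unit vectors) and one plausible way to turn squared angular distances to grid points into Gibbs soft targets. The function names, the `temperature` parameter, the form exp(-d^2 / T), and the coarse azimuth/elevation grid are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def sph_to_cart(azimuth, elevation):
    """Unit vector(s) for azimuth/elevation angles given in radians."""
    return np.stack([
        np.cos(elevation) * np.cos(azimuth),
        np.cos(elevation) * np.sin(azimuth),
        np.sin(elevation),
    ], axis=-1)

def angular_distance(u, v):
    """Great-circle angle (radians) between unit vectors u and v."""
    dot = np.clip(np.sum(u * v, axis=-1), -1.0, 1.0)  # guard against rounding
    return np.arccos(dot)

def cartesian_mse(u, v):
    """Mean squared error between Cartesian coordinates."""
    return np.mean((u - v) ** 2, axis=-1)

def gibbs_soft_targets(true_dir, grid, temperature=0.1):
    """Probabilities over grid points proportional to exp(-d^2 / temperature).

    `temperature` is a hypothetical smoothing parameter, not a value
    taken from the paper.
    """
    d2 = angular_distance(grid, true_dir) ** 2
    logits = -d2 / temperature
    logits = logits - logits.max()  # numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Hypothetical coarse grid: 12 azimuths x 3 elevations (not quasi-uniform).
az = np.deg2rad(np.arange(0, 360, 30))
el = np.deg2rad(np.array([-30, 0, 30]))
grid = sph_to_cart(*np.meshgrid(az, el, indexing="ij")).reshape(-1, 3)

true_dir = sph_to_cart(np.deg2rad(30.0), np.deg2rad(0.0))
targets = gibbs_soft_targets(true_dir, grid)
```

Here `targets` can serve either as a soft label vector for a cross-entropy loss or as per-class weights, which are the two uses the abstract mentions. A lower `temperature` concentrates the mass on the nearest grid point and approaches one-hot encoding.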

Cited literature : 33 references
Contributor : Lauréline Perotin
Submitted on : Friday, May 10, 2019 - 5:39:00 PM
Last modification on : Thursday, September 17, 2020 - 12:29:04 PM


Files produced by the author(s)


  • HAL Id : hal-02125985, version 1


Lauréline Perotin, Alexandre Défossez, Emmanuel Vincent, Romain Serizel, Alexandre Guérin. Regression versus classification for neural network based audio source localization. 2019. ⟨hal-02125985v1⟩


