CRNN-based multiple DoA estimation using acoustic intensity features for Ambisonics recordings

Lauréline Perotin (1, 2); Romain Serizel (1); Emmanuel Vincent (1); Alexandre Guérin (2)
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: Localizing audio sources is challenging in real reverberant environments, especially when several sources are active. We propose a neural network built from stacked convolutional and recurrent layers to estimate the directions of arrival of multiple sources from a first-order Ambisonics recording. The network returns the directions of arrival, over a discrete grid, of a known number of sources. As inputs, we propose features derived from the acoustic intensity vector. We analyze the behavior of the network with a visualization technique called layerwise relevance propagation, which highlights the parts of the input signal that are relevant in a given situation. We also evaluate the performance of our system in various environments, from simulated rooms to real recordings, with one or two speech sources. The results show that the proposed features significantly improve performance compared with raw Ambisonics inputs.
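To make the idea of intensity-derived inputs concrete, the following is a minimal sketch, not the authors' implementation. It assumes a first-order Ambisonics signal with channels ordered W, X, Y, Z, computes a short-time Fourier transform of each channel, and forms the real (active) and imaginary (reactive) parts of conj(W) times [X, Y, Z]. The energy normalization, the frame and hop sizes, and the function names `stft` and `intensity_features` are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Minimal STFT: Hann-windowed frames, one-sided FFT. Returns (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def intensity_features(foa, n_fft=512, hop=256, eps=1e-12):
    """Intensity-like features from a first-order Ambisonics signal.

    foa: array of shape (4, samples), channels assumed ordered W, X, Y, Z.
    Returns an array of shape (frames, bins, 6): three active (real) and
    three reactive (imaginary) components, normalized by a per-bin energy
    term (the normalization is an assumption made for this sketch).
    """
    W, X, Y, Z = (stft(ch, n_fft, hop) for ch in foa)
    xyz = np.stack([X, Y, Z], axis=-1)                 # (frames, bins, 3)
    cross = np.conj(W)[..., None] * xyz                # conj(W) * [X, Y, Z]
    energy = np.abs(W) ** 2 + np.sum(np.abs(xyz) ** 2, axis=-1) / 3.0
    norm = energy[..., None] + eps
    return np.concatenate([cross.real / norm, cross.imag / norm], axis=-1)

# Usage: a single anechoic plane wave encoded at a known direction.
rng = np.random.default_rng(0)
s = rng.standard_normal(4096)
az, el = np.deg2rad(40.0), np.deg2rad(10.0)
foa = np.stack([
    s,
    s * np.cos(az) * np.cos(el),
    s * np.sin(az) * np.cos(el),
    s * np.sin(el),
])
feats = intensity_features(foa)
# Summing the active part over time and frequency gives a vector that
# points toward the encoded direction in this idealized case.
v = feats[..., :3].sum(axis=(0, 1))
est_az = np.arctan2(v[1], v[0])
```

In this idealized single-source, anechoic case the active part alone encodes the direction and the reactive part vanishes. With reverberation or several simultaneous sources that no longer holds, which is the setting in which the abstract proposes a learned CRNN mapping from such features to a discrete grid of directions.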

https://hal.inria.fr/hal-01839883
Contributor: Lauréline Perotin
Submitted on: Tuesday, February 26, 2019 - 12:03:27 PM
Last modification on: Saturday, July 13, 2019 - 3:42:21 PM
Long-term archiving on: Monday, May 27, 2019 - 2:00:47 PM

Citation

Lauréline Perotin, Romain Serizel, Emmanuel Vincent, Alexandre Guérin. CRNN-based multiple DoA estimation using acoustic intensity features for Ambisonics recordings. IEEE Journal of Selected Topics in Signal Processing, IEEE, 2019, Special Issue on Acoustic Source Localization and Tracking in Dynamic Real-life Scenes, 13 (1), pp.22 - 33. ⟨10.1109/jstsp.2019.2900164⟩. ⟨hal-01839883v2⟩
