Conference papers

Training Sound Event Detection On A Heterogeneous Dataset

Nicolas Turpault 1, Romain Serizel 1
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : Training a sound event detection algorithm on a heterogeneous dataset that includes both recorded and synthetic soundscapes, with varying labeling granularity, is a non-trivial task that can lead to systems requiring several technical choices. These technical choices are often passed from one system to another without being questioned. We propose a detailed analysis of the DCASE 2020 task 4 sound event detection baseline with regard to several aspects, such as the type of data used for training, the parameters of the mean-teacher, or the transformations applied while generating the synthetic soundscapes. Some of the parameters that are usually used as defaults are shown to be sub-optimal.

Contributor : Romain Serizel
Submitted on : Thursday, September 10, 2020 - 2:36:49 PM
Last modification on : Wednesday, November 3, 2021 - 7:57:02 AM


Files produced by the author(s)


  • HAL Id : hal-02891665, version 2
  • ARXIV : 2007.03931


Nicolas Turpault, Romain Serizel. Training Sound Event Detection On A Heterogeneous Dataset. DCASE Workshop, Nov 2020, Tokyo, Japan. ⟨hal-02891665v2⟩


