Conference papers

Training Sound Event Detection On A Heterogeneous Dataset

Nicolas Turpault 1 Romain Serizel 1
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : Training a sound event detection algorithm on a heterogeneous dataset, one that includes both recorded and synthetic soundscapes with varying labeling granularity, is a non-trivial task that can lead to systems requiring several technical choices. These choices are often passed from one system to another without being questioned. We propose a detailed analysis of the DCASE 2020 task 4 sound event detection baseline with regard to several aspects, such as the type of data used for training, the parameters of the mean-teacher, and the transformations applied while generating the synthetic soundscapes. Some of the parameters that are usually used as default are shown to be sub-optimal.

Contributor : Romain Serizel
Submitted on : Thursday, September 10, 2020 - 2:36:49 PM
Last modification on : Monday, December 14, 2020 - 5:52:12 PM




  • HAL Id : hal-02891665, version 2
  • ARXIV : 2007.03931


Nicolas Turpault, Romain Serizel. Training Sound Event Detection On A Heterogeneous Dataset. DCASE Workshop, Nov 2020, Tokyo, Japan. ⟨hal-02891665v2⟩


