
Training Sound Event Detection On A Heterogeneous Dataset

Nicolas Turpault 1, Romain Serizel 1
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : Training a sound event detection algorithm on a heterogeneous dataset, including both recorded and synthetic soundscapes with varying labeling granularity, is a non-trivial task that can lead to systems requiring several technical choices. These technical choices are often passed from one system to another without being questioned. We propose a detailed analysis of the DCASE 2020 task 4 sound event detection baseline with regard to several aspects, such as the type of data used for training, the parameters of the mean-teacher, or the transformations applied while generating the synthetic soundscapes. Some of the parameters that are usually used as defaults are shown to be sub-optimal.

Contributor : Romain Serizel
Submitted on : Tuesday, July 7, 2020 - 9:25:47 AM
Last modification on : Wednesday, September 16, 2020 - 3:35:44 AM
Long-term archiving on : Friday, November 27, 2020 - 12:12:58 PM




  • HAL Id : hal-02891665, version 1
  • ARXIV : 2007.03931


Nicolas Turpault, Romain Serizel. Training Sound Event Detection On A Heterogeneous Dataset. 2020. ⟨hal-02891665v1⟩


