Conference papers

Training Sound Event Detection On A Heterogeneous Dataset

Nicolas Turpault 1 Romain Serizel 1
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : Training a sound event detection algorithm on a heterogeneous dataset that includes both recorded and synthetic soundscapes with varying labeling granularity is a non-trivial task that can lead to systems requiring several technical choices. These technical choices are often passed from one system to another without being questioned. We propose a detailed analysis of the DCASE 2020 task 4 sound event detection baseline with regard to several aspects, such as the type of data used for training, the parameters of the mean-teacher, or the transformations applied while generating the synthetic soundscapes. Some of the parameters that are usually used as defaults are shown to be sub-optimal.

Cited literature [26 references]

https://hal.inria.fr/hal-02891665
Contributor : Romain Serizel
Submitted on : Thursday, September 10, 2020 - 2:36:49 PM
Last modification on : Wednesday, September 16, 2020 - 3:35:44 AM

Files

main_SED.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02891665, version 2
  • ARXIV : 2007.03931

Citation

Nicolas Turpault, Romain Serizel. Training Sound Event Detection On A Heterogeneous Dataset. DCASE Workshop, Nov 2020, Tokyo, Japan. ⟨hal-02891665v2⟩

Metrics

  • Record views : 29
  • File downloads : 85