A categorization of robust speech processing datasets

Jonathan Le Roux 1 Emmanuel Vincent 2
2 PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : Speech and audio signal processing research is a tale of data collection efforts and evaluation campaigns. While large datasets for automatic speech recognition (ASR) in clean environments with various speaking styles are available, the landscape is not as picture-perfect when it comes to robust ASR in realistic environments, much less so for evaluation of source separation and speech enhancement methods. Many data collection efforts have been conducted, moving along towards more and more realistic conditions, each making different compromises between mostly antagonistic factors: financial and human cost; amount of collected data; availability and quality of annotations and ground truth; naturalness of mixing conditions; naturalness of speech content and speaking style; naturalness of the background noise; etc. In order to better understand what directions need to be explored to build datasets that best support the development and evaluation of algorithms for recognition, separation or localization that can be used in real-world applications, we present here a study of existing datasets in terms of their key attributes.

Cited literature [40 references]

https://hal.inria.fr/hal-01063805
Contributor : Emmanuel Vincent
Submitted on : Saturday, September 13, 2014 - 12:38:57 PM
Last modification on : Saturday, March 30, 2019 - 1:26:19 AM
Long-term archiving on : Sunday, December 14, 2014 - 10:23:23 AM

File

TR2014-116.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01063805, version 1

Citation

Jonathan Le Roux, Emmanuel Vincent. A categorization of robust speech processing datasets. [Technical Report] Mitsubishi Electric Research Labs TR2014-116, 2014. ⟨hal-01063805⟩
