The second 'CHiME' Speech Separation and Recognition Challenge: Datasets, tasks and baselines

Emmanuel Vincent 1, 2, Jon Barker 3, Shinji Watanabe 4, Jonathan Le Roux 4, Francesco Nesta 5, Marco Matassoni 5
1 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
2 PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: Distant-microphone automatic speech recognition (ASR) remains a challenging goal in everyday environments involving multiple background sources and reverberation. This paper is intended to be a reference on the 2nd 'CHiME' Challenge, an initiative designed to analyze and evaluate the performance of ASR systems in a real-world domestic environment. Two separate tracks have been proposed: a small-vocabulary task with small speaker movements and a medium-vocabulary task without speaker movements. We discuss the rationale for the challenge and provide a detailed description of the datasets, tasks and baseline performance results for each track.


https://hal.inria.fr/hal-00796625
Contributor: Emmanuel Vincent
Submitted on: Monday, March 4, 2013 - 3:57:17 PM
Last modification on: Wednesday, April 3, 2019 - 1:23:02 AM
Document(s) archived on: Wednesday, June 5, 2013 - 3:57:13 AM

File

vincent_ICASSP13.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00796625, version 1

Citation

Emmanuel Vincent, Jon Barker, Shinji Watanabe, Jonathan Le Roux, Francesco Nesta, et al. The second 'CHiME' Speech Separation and Recognition Challenge: Datasets, tasks and baselines. ICASSP - 38th International Conference on Acoustics, Speech, and Signal Processing - 2013, May 2013, Vancouver, Canada. pp.126-130. ⟨hal-00796625⟩
