The PASCAL CHiME Speech Separation and Recognition Challenge

Jon Barker 1 Emmanuel Vincent 2, 3 Ning Ma 1 Heidi Christensen 1 Phil Green 1
2 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
3 PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: Distant microphone speech recognition systems that operate with humanlike robustness remain a distant goal. The key difficulty is that operating in everyday listening conditions entails processing a speech signal that is reverberantly mixed into a noise background composed of multiple competing sound sources. This paper describes a recent speech recognition evaluation that was designed to bring together researchers from multiple communities in order to foster novel approaches to this problem. The task was to identify keywords from sentences reverberantly mixed into audio backgrounds binaurally recorded in a busy domestic environment. The challenge was designed to model the essential difficulties of the multisource environment problem while remaining on a scale that would make it accessible to a wide audience. Compared to previous ASR evaluations, a particular novelty of the task is that the utterances to be recognised were provided in a continuous audio background rather than as pre-segmented utterances, thus allowing a range of background modelling techniques to be employed. The challenge attracted thirteen submissions. This paper describes the challenge problem, gives an overview of the systems that were entered, and compares them against both a baseline recognition system and human performance. The paper discusses insights gained from the challenge and lessons learnt for the design of future evaluations of this kind.
Document type:
Journal article
Computer Speech and Language, Elsevier, 2013, 27 (3), pp.621-633. 〈10.1016/j.csl.2012.10.004〉

https://hal.inria.fr/hal-00743529
Contributor: Emmanuel Vincent
Submitted on: Friday, 19 October 2012 - 13:31:22
Last modified on: Wednesday, 16 May 2018 - 11:23:03
Document(s) archived on: Sunday, 20 January 2013 - 03:39:18

File

barker_CSL12.pdf
Files produced by the author(s)

Citation

Jon Barker, Emmanuel Vincent, Ning Ma, Heidi Christensen, Phil Green. The PASCAL CHiME Speech Separation and Recognition Challenge. Computer Speech and Language, Elsevier, 2013, 27 (3), pp.621-633. 〈10.1016/j.csl.2012.10.004〉. 〈hal-00743529〉
