The PASCAL CHiME Speech Separation and Recognition Challenge

Jon Barker; Emmanuel Vincent; Ning Ma; Heidi Christensen; Phil Green

doi:10.1016/j.csl.2012.10.004

Journal article, Computer Speech and Language, Year: 2013

Jon Barker

Role: Author

Department of Computer Sciences [Sheffield]

Emmanuel Vincent

Role: Author
PersonId: 1256
IdHAL: emmanuelv
ORCID: 0000-0002-0183-7289
IdRef: 089360176

Speech and sound data modeling and processing

Analysis, perception and recognition of speech

Ning Ma

Role: Author
PersonId: 873361

Department of Computer Sciences [Sheffield]

Heidi Christensen

Role: Author

Department of Computer Sciences [Sheffield]

Phil Green

Role: Author

Department of Computer Sciences [Sheffield]

Abstract

Distant microphone speech recognition systems that operate with humanlike robustness remain a distant goal. The key difficulty is that operating in everyday listening conditions entails processing a speech signal that is reverberantly mixed into a noise background composed of multiple competing sound sources. This paper describes a recent speech recognition evaluation that was designed to bring together researchers from multiple communities in order to foster novel approaches to this problem. The task was to identify keywords from sentences reverberantly mixed into audio backgrounds binaurally recorded in a busy domestic environment. The challenge was designed to model the essential difficulties of the multisource environment problem while remaining on a scale that would make it accessible to a wide audience. Compared to previous ASR evaluations, a particular novelty of the task is that the utterances to be recognised were provided in a continuous audio background rather than as pre-segmented utterances, thus allowing a range of background modelling techniques to be employed. The challenge attracted thirteen submissions. This paper describes the challenge problem, provides an overview of the systems that were entered, and compares them against both a baseline recognition system and human performance. The paper discusses insights gained from the challenge and lessons learnt for the design of future such evaluations.

Keywords

Speech recognition; Source separation; Noise robustness

Domains

Signal and image processing [eess.SP]

Main file

barker_CSL12.pdf (151.49 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-00743529

Submitted on: Friday, 19 October 2012, 13:31:22

Last modified on: Monday, 11 September 2023, 17:41:19

Long-term archiving on: Sunday, 20 January 2013, 03:39:18

Dates and versions

hal-00743529, version 1 (19-10-2012)

Identifiers

HAL Id: hal-00743529, version 1
DOI: 10.1016/j.csl.2012.10.004

Cite

Jon Barker, Emmanuel Vincent, Ning Ma, Heidi Christensen, Phil Green. The PASCAL CHiME Speech Separation and Recognition Challenge. Computer Speech and Language, 2013, 27 (3), pp.621-633. ⟨10.1016/j.csl.2012.10.004⟩. ⟨hal-00743529⟩

Collections

EC-PARIS UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA IRISA-D5 UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE UR1-MATH-NUM

983 views

1732 downloads
