A combined evaluation of established and new approaches for speech recognition in varied reverberation conditions

Sunit Sivasankaran; Emmanuel Vincent; Irina Illina

doi:10.1016/j.csl.2017.02.003

Journal article, Computer Speech and Language, Year: 2017


Sunit Sivasankaran

Role: Author

Speech Modeling for Facilitating Oral-Based Communication

Emmanuel Vincent

Role: Author
PersonId : 1256
IdHAL : emmanuelv
ORCID : 0000-0002-0183-7289
IdRef : 089360176

Speech Modeling for Facilitating Oral-Based Communication

Irina Illina

Role: Author
PersonId : 15663
IdHAL : irina-illina
IdRef : 120731746

Speech Modeling for Facilitating Oral-Based Communication

Abstract

Robustness to reverberation is a key concern for distant-microphone automatic speech recognition (ASR). Various approaches have been proposed, including single-channel or multichannel dereverberation, robust feature extraction, alternative acoustic models, and acoustic model adaptation. However, to the best of our knowledge, a detailed study of these techniques in varied reverberation conditions is still missing from the literature. In this paper, we conduct a series of experiments to assess the impact of various dereverberation and acoustic model adaptation approaches on ASR performance across the range of reverberation conditions found in real domestic environments. We consider both established approaches such as weighted prediction error (WPE) dereverberation and newer approaches such as learning hidden unit contribution (LHUC) adaptation, whose performance has not been reported before in this context, and we employ them in combination. Our results indicate that applying WPE dereverberation to a reverberated test utterance and decoding it with a deep neural network (DNN) acoustic model, trained on multi-condition reverberated speech with feature-space maximum likelihood linear regression (fMLLR) transformed features, outperforms more recent approaches and helps significantly reduce the word error rate (WER).
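For readers unfamiliar with the metric named in the abstract, WER is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between the recognizer output and the reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition; it is not taken from the paper, and the function name `word_error_rate` and the example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative sketch of the conventional WER definition; the paper's
    own scoring tools are not described in this record.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one deletion ("on") and one substitution
    # ("light" -> "lights") against a four-word reference.
    print(word_error_rate("turn on the light", "turn the lights"))
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.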

Keywords

dereverberation acoustic model adaptation robust ASR evaluation

Domains

Signal and image processing [eess.SP]

Main file

sivasankaran_CSL17.pdf (1.53 MB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-01461382

Submitted on: Wednesday, February 8, 2017, 10:16:06

Last modified on: Thursday, February 1, 2024, 10:04:58

Long-term archiving on: Tuesday, May 9, 2017, 12:34:38

Dates and versions

hal-01461382 , version 1 (08-02-2017)

Identifiers

HAL Id : hal-01461382 , version 1
DOI : 10.1016/j.csl.2017.02.003

Cite

Sunit Sivasankaran, Emmanuel Vincent, Irina Illina. A combined evaluation of established and new approaches for speech recognition in varied reverberation conditions. Computer Speech and Language, 2017, 46, pp.444-460. ⟨10.1016/j.csl.2017.02.003⟩. ⟨hal-01461382⟩


Collections

UNIV-RENNES1 CNRS INRIA IRISA GRID5000 UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES SILECS UR1-MATH-NUM

440 views

534 downloads
