A combined evaluation of established and new approaches for speech recognition in varied reverberation conditions

Sunit Sivasankaran 1, Emmanuel Vincent 1, Irina Illina 1
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: Robustness to reverberation is a key concern for distant-microphone automatic speech recognition (ASR). Various approaches have been proposed, including single-channel or multichannel dereverberation, robust feature extraction, alternative acoustic models, and acoustic model adaptation. However, to the best of our knowledge, a detailed study of these techniques in varied reverberation conditions is still missing in the literature. In this paper, we conduct a series of experiments to assess the impact of various dereverberation and acoustic model adaptation approaches on ASR performance in the range of reverberation conditions found in real domestic environments. We consider both established approaches such as weighted prediction error (WPE) dereverberation and newer approaches such as learning hidden unit contribution (LHUC) adaptation, whose performance has not been reported before in this context, and we employ them in combination. Our results indicate that performing WPE dereverberation on a reverberated test utterance and decoding it with a deep neural network (DNN) acoustic model trained on multi-condition reverberated speech with feature-space maximum likelihood linear regression (fMLLR) transformed features outperforms more recent approaches and significantly reduces the word error rate (WER).
Document type:
Journal article
Computer Speech and Language, Elsevier, 2017, 46, pp.444-460

Cited literature [54 references]

https://hal.inria.fr/hal-01461382
Contributor: Emmanuel Vincent
Submitted on: Wednesday, February 8, 2017 - 10:16:06
Last modified on: Thursday, January 11, 2018 - 06:27:31
Document(s) archived on: Tuesday, May 9, 2017 - 12:34:38

File

sivasankaran_CSL17.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01461382, version 1

Citation

Sunit Sivasankaran, Emmanuel Vincent, Irina Illina. A combined evaluation of established and new approaches for speech recognition in varied reverberation conditions. Computer Speech and Language, Elsevier, 2017, 46, pp.444-460. 〈hal-01461382〉
