An extended experimental investigation of DNN uncertainty propagation for noise robust ASR

Karan Nathwani 1 Juan Morales-Cordovilla 2 Sunit Sivasankaran 1 Irina Illina 1 Emmanuel Vincent 1
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: Automatic speech recognition (ASR) in noisy environments remains a challenging goal. Recently, the idea of estimating the uncertainty about the features obtained after speech enhancement and propagating it to dynamically adapt deep neural network (DNN) based acoustic models has raised some interest. However, the results in the literature were reported on simulated noisy datasets for a limited variety of uncertainty estimators. We found that they vary significantly in different conditions. Hence, the main contribution of this work is to assess DNN uncertainty decoding performance for different data conditions and different uncertainty estimation/propagation techniques. In addition, we propose a neural network based uncertainty estimator and compare it with other uncertainty estimators. We report detailed ASR results on the CHiME-2 and CHiME-3 datasets. We find that, on average, uncertainty propagation provides similar relative improvement on real and simulated data and that the proposed uncertainty estimator performs significantly better than the one in [1]. We also find that the improvement is consistent, but it depends on the signal-to-noise ratio (SNR) and the noise environment.
Document type:
Conference paper
5th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2017), Mar 2017, San Francisco, United States. 2017

Cited literature: 29 references

https://hal.inria.fr/hal-01446441
Contributor: Emmanuel Vincent
Submitted on: Wednesday, January 25, 2017 - 23:59:18
Last modified on: Thursday, January 11, 2018 - 06:27:31
Document(s) archived on: Wednesday, April 26, 2017 - 18:53:17

File

nathwani_HSCMA17.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01446441, version 1

Citation

Karan Nathwani, Juan Morales-Cordovilla, Sunit Sivasankaran, Irina Illina, Emmanuel Vincent. An extended experimental investigation of DNN uncertainty propagation for noise robust ASR. 5th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2017), Mar 2017, San Francisco, United States. 2017. 〈hal-01446441〉

