Discriminative uncertainty estimation for noise robust ASR

Dung Tien Tran (1), Emmanuel Vincent (2), Denis Jouvet (2)
(1) PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
(2) MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: We consider the problem of uncertainty estimation for noise-robust ASR. Existing uncertainty estimation techniques improve ASR accuracy, but they still exhibit a gap compared to the use of oracle uncertainty. This comes partly from the highly non-linear feature transformation and from additional assumptions such as Gaussian distribution and independence between frequency bins in the spectral domain. In this paper, we propose a method to rescale the estimated feature-domain full uncertainty covariance matrix in a state-dependent fashion according to a discriminative criterion. The state-dependent and feature index-dependent scaling factors are learned from development data. Experimental evaluation on Track 1 of the 2nd CHiME challenge data shows that discriminative rescaling leads to better results than generative rescaling. Moreover, discriminative rescaling of the Wiener uncertainty estimator leads to 12% relative word error rate reduction compared to discriminative rescaling of the alternative estimator in [1].
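The abstract does not give the rescaling formula, so the following is only a minimal sketch of what a state-dependent, feature index-dependent rescaling of a full covariance matrix could look like. It assumes the rescaled matrix takes the form S Σ S with S = diag(sqrt(w)), which keeps the result symmetric and positive semidefinite; this form, the function name `rescale_uncertainty`, the state labels, and all numeric values are hypothetical and not taken from the paper. Learning the factors from development data with a discriminative criterion is not shown.

```python
import math


def rescale_uncertainty(cov, scales):
    """Rescale a full covariance matrix with per-feature factors.

    cov    : n x n list of lists, estimated uncertainty covariance for one frame
    scales : n nonnegative factors, one per feature index, for one state

    Returns S * cov * S with S = diag(sqrt(scales)). This symmetric form is an
    assumption of this sketch, not the formula from the paper.
    """
    n = len(cov)
    if len(scales) != n or any(len(row) != n for row in cov):
        raise ValueError("cov must be n x n and scales must have length n")
    if any(s < 0 for s in scales):
        raise ValueError("scaling factors must be nonnegative")
    root = [math.sqrt(s) for s in scales]
    return [[root[i] * cov[i][j] * root[j] for j in range(n)] for i in range(n)]


# Hypothetical 2-dimensional uncertainty covariance for a single frame.
cov = [[4.0, 1.0],
       [1.0, 9.0]]

# Hypothetical scaling factors: one vector per state, one entry per feature index.
state_scales = {
    "state_a": [1.0, 1.0],   # leaves the estimate unchanged
    "state_b": [0.25, 4.0],  # shrinks feature 0, inflates feature 1
}

# One rescaled covariance per state, as a decoder would need per frame.
rescaled = {state: rescale_uncertainty(cov, w) for state, w in state_scales.items()}
```

With unit factors the estimate is returned unchanged, while the diagonal entries of the other state are multiplied by the corresponding factors and the off-diagonal entries by the geometric mean of the two factors involved.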
Document type:
Conference paper
40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015, Apr 2015, Brisbane, Queensland, Australia

Cited literature: 17 references

https://hal.inria.fr/hal-01103969
Contributor: Dung Tran
Submitted on: Friday, January 16, 2015 - 10:19:52
Last modified on: Thursday, January 11, 2018 - 06:27:31
Document(s) archived on: Friday, September 11, 2015 - 06:52:38

File

Icassp2015_bmmi.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01103969, version 1

Citation

Dung Tien Tran, Emmanuel Vincent, Denis Jouvet. Discriminative uncertainty estimation for noise robust ASR. 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015, Apr 2015, Brisbane, Queensland, Australia. 〈hal-01103969〉
