Discriminative uncertainty estimation for noise robust ASR

Dung Tien Tran 1 Emmanuel Vincent 2 Denis Jouvet 2
1 PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
2 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : We consider the problem of uncertainty estimation for noise-robust ASR. Existing uncertainty estimation techniques improve ASR accuracy, but a gap remains compared to the use of oracle uncertainty. This gap comes partly from the highly non-linear feature transformation and partly from additional assumptions, such as a Gaussian distribution and independence between frequency bins in the spectral domain. In this paper, we propose a method to rescale the estimated feature-domain full uncertainty covariance matrix in a state-dependent fashion according to a discriminative criterion. The scaling factors, which depend on both the state and the feature index, are learned from development data. Experimental evaluation on Track 1 of the 2nd CHiME challenge data shows that discriminative rescaling leads to better results than generative rescaling. Moreover, discriminative rescaling of the Wiener uncertainty estimator leads to a 12% relative word error rate reduction compared to discriminative rescaling of the alternative estimator in [1].
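The rescaling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact form of the rescaling, so the code assumes the scaled covariance is D Σ D with D = diag(sqrt(scales)), and it assumes standard uncertainty decoding, in which the rescaled uncertainty covariance is added to the covariance of the state's Gaussian. All function names and the toy values are hypothetical, and the discriminative learning of the scaling factors is not shown.

```python
import math


def rescale_uncertainty(cov, scales):
    """Rescale a full uncertainty covariance with per-feature-index factors.

    Assumed form (not taken from the paper): D * cov * D with
    D = diag(sqrt(scales)), which keeps the matrix symmetric.
    """
    root = [math.sqrt(s) for s in scales]
    n = len(cov)
    return [[root[i] * cov[i][j] * root[j] for j in range(n)] for i in range(n)]


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def log_gaussian(x, mean, cov):
    """Log density of a multivariate Gaussian with full covariance."""
    n = len(x)
    low = cholesky(cov)
    # Solve low * z = (x - mean) by forward substitution.
    z = []
    for i in range(n):
        acc = (x[i] - mean[i]) - sum(low[i][k] * z[k] for k in range(i))
        z.append(acc / low[i][i])
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + sum(v * v for v in z))


def uncertainty_decoding_loglik(feat, unc_cov, state_mean, state_cov, state_scales):
    """State log-likelihood with the rescaled uncertainty added to the state covariance."""
    scaled = rescale_uncertainty(unc_cov, state_scales)
    n = len(feat)
    total = [[state_cov[i][j] + scaled[i][j] for j in range(n)] for i in range(n)]
    return log_gaussian(feat, state_mean, total)


# Toy two-dimensional example with made-up values.
feature = [0.5, -0.2]
uncertainty = [[0.4, 0.1], [0.1, 0.3]]
mean_q = [0.0, 0.0]
cov_q = [[1.0, 0.0], [0.0, 1.0]]
scales_q = [1.5, 0.8]  # hypothetical state-dependent, feature index-dependent factors
print(uncertainty_decoding_loglik(feature, uncertainty, mean_q, cov_q, scales_q))
```

In this sketch, each HMM state would carry its own vector of scaling factors, so the same estimated uncertainty can be inflated for some states and deflated for others before decoding.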
Document type :
Conference papers

https://hal.inria.fr/hal-01103969
Contributor : Dung Tran
Submitted on : Friday, January 16, 2015 - 10:19:52 AM
Last modification on : Saturday, March 30, 2019 - 1:26:35 AM
Long-term archiving on : Friday, September 11, 2015 - 6:52:38 AM

File

Icassp2015_bmmi.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01103969, version 1

Citation

Dung Tien Tran, Emmanuel Vincent, Denis Jouvet. Discriminative uncertainty estimation for noise robust ASR. 40th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2015, Apr 2015, Brisbane, Queensland, Australia. ⟨hal-01103969⟩
