Journal articles

Uncertainty-based learning of acoustic models from noisy data

Alexey Ozerov 1 Mathieu Lagrange 2 Emmanuel Vincent 3, 4 
3 METISS - Speech and sound data modeling and processing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
4 PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: We consider the problem of acoustic modeling of noisy speech data, where the uncertainty over the data is given by a Gaussian distribution. While this uncertainty has been exploited at the decoding stage via uncertainty decoding, its use at the training stage remains limited to static model adaptation. We introduce a new expectation-maximisation (EM) based technique, which we call uncertainty training, that allows us to train Gaussian mixture models (GMMs) or hidden Markov models (HMMs) directly from noisy data with dynamic uncertainty. We evaluate the potential of this technique for a GMM-based speaker recognition task on speech data corrupted by real-world domestic background noise, using a state-of-the-art signal enhancement technique and various uncertainty estimation techniques as a front-end. Compared to conventional training, the proposed training algorithm yields a 1% to 2% absolute improvement in speaker recognition accuracy when training from matched, unmatched, or multi-condition noisy data. With minor modifications, this algorithm is also applicable to maximum a posteriori (MAP) or maximum likelihood linear regression (MLLR) acoustic model adaptation from noisy data, and to data other than audio.
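
The abstract does not give the update equations, so the following is only a minimal sketch of what one EM iteration of this kind could look like for a diagonal-covariance GMM. It assumes that each observed feature vector equals an unknown clean vector plus zero-mean Gaussian noise whose per-frame diagonal variance is known, and it applies the standard EM updates that follow from that model. The function name `uncertainty_em_step`, the array layout, and the restriction to diagonal covariances are illustrative assumptions, not the authors' implementation, and the paper's exact updates may differ.

```python
import numpy as np


def uncertainty_em_step(y, var_n, weights, means, variances):
    """One EM iteration for a diagonal-covariance GMM fitted to noisy features.

    Hypothetical helper for illustration only.
    y:         (T, D) observed (noisy or enhanced) feature vectors
    var_n:     (T, D) per-frame diagonal uncertainty variances, assumed known
    weights:   (K,)   mixture weights
    means:     (K, D) component means
    variances: (K, D) component diagonal variances
    Returns updated (weights, means, variances) and the log-likelihood of the
    data under the parameters passed in.
    """
    # E-step, part 1: under the assumed model each frame is distributed as
    # N(y_t; mu_k, sigma_k^2 + var_n_t), so the uncertainty inflates the variance.
    total_var = variances[None, :, :] + var_n[:, None, :]      # (T, K, D)
    diff = y[:, None, :] - means[None, :, :]                   # (T, K, D)
    log_lik = -0.5 * np.sum(
        np.log(2.0 * np.pi * total_var) + diff**2 / total_var, axis=2
    )                                                          # (T, K)
    log_post = np.log(weights)[None, :] + log_lik
    log_norm = np.logaddexp.reduce(log_post, axis=1, keepdims=True)
    gamma = np.exp(log_post - log_norm)                        # responsibilities

    # E-step, part 2: posterior mean and variance of the clean feature given
    # the observation and the component (a per-dimension Wiener-style gain).
    gain = variances[None, :, :] / total_var
    x_hat = means[None, :, :] + gain * diff
    x_var = variances[None, :, :] * (1.0 - gain)

    # M-step: conventional GMM updates, with the clean-feature posterior
    # moments in place of the raw observations.
    nk = gamma.sum(axis=0)                                     # (K,)
    new_weights = nk / y.shape[0]
    new_means = np.einsum("tk,tkd->kd", gamma, x_hat) / nk[:, None]
    second_moment = (x_hat - new_means[None, :, :]) ** 2 + x_var
    new_variances = np.einsum("tk,tkd->kd", gamma, second_moment) / nk[:, None]
    return new_weights, new_means, new_variances, float(log_norm.sum())
```

When `var_n` is zero everywhere, the gain equals one, `x_hat` equals `y`, and the step reduces to conventional EM training of a GMM, which is the baseline the abstract compares against.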

Contributor: Emmanuel Vincent
Submitted on: Wednesday, April 17, 2013 - 10:47:50 PM
Last modification on: Friday, May 6, 2022 - 4:26:02 PM
Long-term archiving on: Thursday, July 18, 2013 - 4:10:14 AM


Alexey Ozerov, Mathieu Lagrange, Emmanuel Vincent. Uncertainty-based learning of acoustic models from noisy data. Computer Speech and Language, 2013, 27 (3), pp.874-894. ⟨10.1016/j.csl.2012.07.002⟩. ⟨hal-00717992v2⟩


