Uncertainty-based learning of acoustic models from noisy data

Alexey Ozerov; Mathieu Lagrange; Emmanuel Vincent

doi:10.1016/j.csl.2012.07.002

Journal article, Computer Speech and Language, Year: 2013


Alexey Ozerov

Role: Author

Technicolor [Cesson Sévigné]

Mathieu Lagrange

Role: Author
PersonId : 4329
IdHAL : mathieu-lagrange

Institut de Recherche et Coordination Acoustique/Musique

Emmanuel Vincent

Role: Author
PersonId : 1256
IdHAL : emmanuelv
ORCID : 0000-0002-0183-7289
IdRef : 089360176

Speech and sound data modeling and processing

Analysis, perception and recognition of speech

Abstract

We consider the problem of acoustic modeling of noisy speech data, where the uncertainty over the data is given by a Gaussian distribution. While this uncertainty has been exploited at the decoding stage via uncertainty decoding, its usage at the training stage remains limited to static model adaptation. We introduce a new Expectation Maximisation (EM) based technique, which we call uncertainty training, that allows us to train Gaussian mixture models (GMMs) or hidden Markov models (HMMs) directly from noisy data with dynamic uncertainty. We evaluate the potential of this technique for a GMM-based speaker recognition task on speech data corrupted by real-world domestic background noise, using a state-of-the-art signal enhancement technique and various uncertainty estimation techniques as a front-end. Compared to conventional training, the proposed training algorithm yields a 1% to 2% absolute improvement in speaker recognition accuracy when training from matched, unmatched or multi-condition noisy data. With minor modifications, the algorithm is also applicable to maximum a posteriori (MAP) or maximum likelihood linear regression (MLLR) acoustic model adaptation from noisy data, and to data other than audio.
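The idea of training a GMM with EM when each observation carries its own Gaussian uncertainty can be illustrated with a minimal one-dimensional sketch. It is not taken from the article and may differ from the authors' algorithm in its details. It assumes the observation model y[n] = x[n] + e[n], where x[n] is the unobserved clean feature drawn from the GMM and e[n] ~ N(0, noise_vars[n]) is the front-end error with known variance. The function name `uncertainty_em`, its parameters, and the variance floor are hypothetical choices made for illustration.

```python
import math


def gauss_pdf(y, mean, var):
    """Density of the 1-D Gaussian N(y; mean, var)."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def uncertainty_em(ys, noise_vars, means, variances, weights,
                   n_iter=50, var_floor=1e-6):
    """EM for a 1-D GMM fitted to noisy samples with per-sample uncertainty.

    Assumed model (illustrative): ys[n] = x[n] + e[n], with x[n] drawn from
    the GMM and e[n] ~ N(0, noise_vars[n]). Setting every noise variance to
    zero reduces the updates to conventional EM training.
    """
    means, variances, weights = list(means), list(variances), list(weights)
    n_comp = len(means)
    for _ in range(n_iter):
        sum_g = [0.0] * n_comp   # accumulated responsibilities
        sum_x = [0.0] * n_comp   # accumulated posterior means of x
        sum_xx = [0.0] * n_comp  # accumulated posterior second moments of x
        for y, s in zip(ys, noise_vars):
            # E-step: responsibilities use the model variance plus the
            # sample's uncertainty variance, as in uncertainty decoding.
            lik = [weights[k] * gauss_pdf(y, means[k], variances[k] + s)
                   for k in range(n_comp)]
            total = max(sum(lik), 1e-300)  # guard against underflow
            for k in range(n_comp):
                g = lik[k] / total
                # Posterior of the clean feature given component k
                # (Wiener-style gain between model and uncertainty).
                gain = variances[k] / (variances[k] + s)
                x_hat = means[k] + gain * (y - means[k])
                x_var = (1.0 - gain) * variances[k]
                sum_g[k] += g
                sum_x[k] += g * x_hat
                sum_xx[k] += g * (x_hat ** 2 + x_var)
        # M-step: re-estimate the clean-feature GMM from posterior moments.
        for k in range(n_comp):
            if sum_g[k] <= 0.0:
                continue  # leave an empty component unchanged
            weights[k] = sum_g[k] / len(ys)
            means[k] = sum_x[k] / sum_g[k]
            variances[k] = max(sum_xx[k] / sum_g[k] - means[k] ** 2, var_floor)
    return means, variances, weights
```

Under this model, calling the function with the estimated per-sample uncertainties should recover variances close to those of the clean features, whereas calling it with all-zero uncertainties (conventional training) absorbs the noise into the component variances.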

Keywords

Noisy data; Training; Uncertainty; Classification; Acoustic model; Gaussian mixture model; Hidden Markov model; Expectation-maximization

Domains

Signal and Image Processing [eess.SP]

Main file

ozerov_CSL12.pdf (476.33 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-00717992

Submitted on: Wednesday 17 April 2013, 22:47:50

Last modified on: Monday 11 September 2023, 17:41:19

Long-term archiving on: Thursday 18 July 2013, 04:10:14

Dates and versions

hal-00717992 , version 1 (15-07-2012)

hal-00717992 , version 2 (17-04-2013)

Identifiers

HAL Id : hal-00717992 , version 2
DOI : 10.1016/j.csl.2012.07.002

Cite

Alexey Ozerov, Mathieu Lagrange, Emmanuel Vincent. Uncertainty-based learning of acoustic models from noisy data. Computer Speech and Language, 2013, 27 (3), pp.874-894. ⟨10.1016/j.csl.2012.07.002⟩. ⟨hal-00717992v2⟩

Collections

EC-PARIS UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA IRCAM IRISA-D5 UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES INSA-GROUPE UR1-MATH-NUM
