Modeling inter-speaker variability in speech recognition

Gwenael Cloarec; Denis Jouvet

doi:10.1109/ICASSP.2008.4518663

Conference paper, Year: 2008

Gwenael Cloarec

Role: Author

France Télécom R&D

Denis Jouvet

Role: Author
PersonId : 15904
IdHAL : denis-jouvet
IdRef : 029418666

France Télécom R&D

Abstract

This paper details a method for taking the influence of variability into account in HMM-based speech recognition. The set of Gaussian components of the mixtures represents the entire acoustic space covered for all possible variability values. For each utterance to be recognized, the corresponding variability value is estimated and used to dynamically weight and/or constrain the acoustic space for each pdf. To do so, the weight coefficients of the Gaussian mixtures are made dependent on the variability value. As an example, the variability considered is inter-speaker variability, which is handled through speaker classes. Taking into account, for each utterance, the four speaker classes that best match the utterance signal leads to a significant word error rate reduction on a continuous speech recognition task, compared to standard speaker-independent modeling.
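The weighting idea in the abstract can be illustrated with a minimal sketch. It assumes diagonal-covariance Gaussians shared across speaker classes, one mixture-weight vector per class, and per-class log-likelihood scores for the utterance as given input. The abstract does not say how the selected classes are combined, so the score-proportional interpolation below is an assumption, and every function and variable name is hypothetical rather than taken from the paper.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def top_n_classes(class_scores, n=4):
    """Pick the n best-scoring speaker classes for an utterance.

    class_scores holds one log-likelihood per class (assumed to be given).
    Returns (class_index, normalized_weight) pairs; the normalization is an
    assumption made for this sketch.
    """
    order = sorted(range(len(class_scores)),
                   key=lambda c: class_scores[c], reverse=True)
    best = order[:n]
    top = class_scores[best[0]]
    unnorm = [math.exp(class_scores[c] - top) for c in best]
    total = sum(unnorm)
    return [(c, u / total) for c, u in zip(best, unnorm)]


def utterance_weights(class_weights, selected):
    """Interpolate the class-dependent mixture weights of the selected classes."""
    n_comp = len(class_weights[0])
    return [sum(p * class_weights[c][k] for c, p in selected)
            for k in range(n_comp)]


def log_pdf(x, means, variances, weights):
    """Log-likelihood of frame x under the shared Gaussians with given weights."""
    terms = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
        if w > 0.0  # zero-weight components are excluded from the pdf
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Toy setup: two shared 1-D Gaussians, three speaker classes.
means = [[0.0], [3.0]]
variances = [[1.0], [1.0]]
class_weights = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
class_scores = [-10.0, -50.0, -12.0]  # hypothetical utterance scores

selected = top_n_classes(class_scores, n=2)
weights = utterance_weights(class_weights, selected)
frame_score = log_pdf([0.2], means, variances, weights)
```

In this toy setup the Gaussians never change between utterances; only the mixture weights do, which is what lets a single component set cover the whole acoustic space while each utterance emphasizes the region matching its estimated speaker classes.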

Domains

Signal and Image Processing [eess.SP]

Contributor: Denis Jouvet

https://inria.hal.science/inria-00616509

Submitted on: Monday, 22 August 2011, 18:02:22

Last modified on: Wednesday, 8 November 2017, 18:46:02

Dates and versions

inria-00616509 , version 1 (22-08-2011)

Identifiers

HAL Id : inria-00616509 , version 1
DOI : 10.1109/ICASSP.2008.4518663

Cite

Gwenael Cloarec, Denis Jouvet. Modeling inter-speaker variability in speech recognition. ICASSP-2008 (IEEE International Conference on Acoustics, Speech and Signal Processing, 2008), Mar 2008, Las Vegas, United States. ⟨10.1109/ICASSP.2008.4518663⟩. ⟨inria-00616509⟩
