Conference paper, Year: 2008

Modeling inter-speaker variability in speech recognition

Gwenael Cloarec, Denis Jouvet

Abstract

This paper details a method for accounting for the influence of variability in HMM-based speech recognition. The set of Gaussian components of the mixtures represents the entire acoustic space covered by all possible variability values. For each utterance to be recognized, the corresponding variability value is estimated and used to dynamically weight and/or constrain the acoustic space for each pdf. To do so, the weight coefficients of the Gaussian mixtures are made dependent on the variability value. As an example, the variability considered is inter-speaker variability, which is handled through speaker classes. Taking into account, for each utterance, the four speaker classes that best match the utterance signal leads to a significant word error rate reduction on a continuous speech recognition task, compared to standard speaker-independent modeling.
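The idea of shared Gaussian components with class-dependent mixture weights can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy parameters, the use of per-class scores to rank speaker classes, and the uniform averaging of the selected classes' weight vectors are all assumptions made for illustration, since the abstract does not specify how the best-matching classes are combined.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def top_classes(class_scores, n_best):
    """Indices of the n_best highest-scoring speaker classes (scores are hypothetical)."""
    ranked = sorted(range(len(class_scores)),
                    key=lambda c: class_scores[c], reverse=True)
    return ranked[:n_best]

def combined_weights(class_weights, selected):
    """Average the per-class mixture weights over the selected classes.

    Uniform averaging is an assumed combination rule, not taken from the paper.
    """
    n_comp = len(class_weights[0])
    return [sum(class_weights[c][k] for c in selected) / len(selected)
            for k in range(n_comp)]

def state_log_likelihood(x, means, variances, weights):
    """Log-likelihood of x under shared Gaussians with utterance-specific weights."""
    terms = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
        if w > 0.0  # a zero weight removes the component from the acoustic space
    ]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

# Toy setup: one pdf with three shared 1-D Gaussians and three speaker classes.
means = [[0.0], [2.0], [4.0]]
variances = [[1.0], [1.0], [1.0]]
class_weights = [
    [0.8, 0.2, 0.0],  # class 0 never uses the third component
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
]
class_scores = [-10.0, -12.0, -30.0]  # hypothetical per-utterance class scores

selected = top_classes(class_scores, n_best=2)
weights = combined_weights(class_weights, selected)
loglik = state_log_likelihood([1.0], means, variances, weights)
```

In this sketch the Gaussian means and variances are never modified. Only the weights change from utterance to utterance, which is how the acoustic space of each pdf is reweighted or, when a weight is zero, constrained.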
No file deposited

Dates and versions

inria-00616509, version 1 (22-08-2011)

Identifiers

Cite

Gwenael Cloarec, Denis Jouvet. Modeling inter-speaker variability in speech recognition. ICASSP-2008 (IEEE International Conference on Acoustics, Speech and Signal Processing, 2008), Mar 2008, Las Vegas, United States. ⟨10.1109/ICASSP.2008.4518663⟩. ⟨inria-00616509⟩