Conference papers

Modeling inter-speaker variability in speech recognition

Abstract : This paper describes a method for taking the influence of variability into account in HMM-based speech recognition. The set of Gaussian components of the mixtures represents the entire acoustic space covered by all possible variability values. For each utterance to be recognized, the corresponding variability value is estimated and used to dynamically weight and/or constrain the acoustic space for each pdf. To this end, the weight coefficients of the Gaussian mixtures are made dependent on the variability value. As an example, the variability considered is inter-speaker variability, which is handled through speaker classes. Taking into account, for each utterance, the four speaker classes that best match the utterance signal leads to a significant word error rate reduction on a continuous speech recognition task, compared to standard speaker-independent modeling.
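The idea of class-dependent mixture weights over a shared set of Gaussian components can be illustrated with a minimal sketch. This is not the authors' implementation: the 1-D Gaussians, the toy numbers, the function names (`mixture_loglik`, `select_and_combine`), the likelihood-based ranking of classes, and the plain averaging of the selected classes' weights are all assumptions made for illustration only.

```python
import math

def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian (toy stand-in for a real acoustic model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def mixture_loglik(frames, components, weights):
    """Sum over frames of log sum_k w_k N(x; mu_k, var_k).

    Components with zero weight are skipped, which is how a class can
    constrain the acoustic space to a subset of the shared Gaussians.
    """
    total = 0.0
    for x in frames:
        terms = [math.log(w) + gaussian_logpdf(x, mu, var)
                 for w, (mu, var) in zip(weights, components) if w > 0.0]
        total += logsumexp(terms)
    return total

def select_and_combine(frames, components, class_weights, n_best=4):
    """Rank classes by utterance likelihood, keep the n_best classes,
    and average their weight vectors (averaging is an assumption here)."""
    scores = {c: mixture_loglik(frames, components, w)
              for c, w in class_weights.items()}
    best = sorted(scores, key=scores.get, reverse=True)[:n_best]
    combined = [sum(class_weights[c][k] for c in best) / len(best)
                for k in range(len(components))]
    return best, combined

# Hypothetical toy data: three shared Gaussians (mean, variance) and
# three speaker classes that differ only in their mixture weights.
components = [(-2.0, 1.0), (0.0, 1.0), (2.0, 1.0)]
class_weights = {
    "A": [0.8, 0.2, 0.0],
    "B": [0.1, 0.8, 0.1],
    "C": [0.0, 0.2, 0.8],
}
frames = [1.8, 2.1, 2.4]  # made-up "utterance" observations

best, combined = select_and_combine(frames, components, class_weights, n_best=2)
```

Here the Gaussian components are shared by all classes and only the weights change per utterance, which mirrors the abstract's description; the abstract reports results with four selected classes, whereas the toy call uses `n_best=2` because only three classes are defined.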
Contributor : Denis Jouvet
Submitted on : Monday, August 22, 2011 - 6:02:22 PM
Last modification on : Wednesday, November 8, 2017 - 6:46:02 PM



Gwenael Cloarec, Denis Jouvet. Modeling inter-speaker variability in speech recognition. ICASSP-2008 (IEEE International Conference on Acoustics, Speech and Signal Processing, 2008), Mar 2008, Las Vegas, United States. ⟨10.1109/ICASSP.2008.4518663⟩. ⟨inria-00616509⟩


