On-Line Frame-Synchronous Noise Compensation

Vincent Barreaud (1), Irina Illina (1), Dominique Fohr (1)
(1) PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: In real-life speech recognition applications, mismatch between training and testing data is known to degrade performance. This mismatch is mostly due to various unknown noise sources that corrupt the incoming features. Moreover, the mismatch function cannot be considered stationary. When stochastic models are used for recognition, there are two possible approaches to improving robustness. First, adaptation techniques, such as Parallel Model Combination (PMC), modify the parameters of the HMMs so that the transformed stochastic models better characterize the distorted features. Second, the corrupted features can be compensated with a transformation estimated from the noise characteristics. This second approach includes techniques such as Cepstral Mean Subtraction (CMS), Spectral Subtraction (SS) and Stochastic Matching. The method developed here belongs to this category. Frame-synchronous algorithms are naturally appealing and are usually used to cope with non-stationary noise sources. The most popular frame-synchronous technique is CMS: the mean of the incoming sequence of cepstra is computed and subtracted from the next observation. We believe that this method can be enhanced by taking into account the statistics of the HMMs used during recognition. For each time frame, a transformation is applied to the incoming noisy feature in order to compensate for the effect of the environment. This transformed feature is then integrated into the Viterbi process: a forward probability is computed for every state of the models. The largest forward probability gives the most probable emitting state given the set of previous observations. The distance from this state to the transformed feature is then used to re-estimate the transformation to be applied to the next noisy feature.
Thus, this on-line algorithm performs compensation in parallel with recognition. Contrary to CMS, SS or adaptation techniques, it makes no hypothesis about the nature of the noise and requires no specific model training. Simple transformations such as bias or linear functions give good results. More complex solutions, such as model-specific transforms, could be studied. Our noise compensation algorithm is evaluated on the VODIS database, recorded in a moving car. For each task, our technique significantly outperforms the classical methods. For instance, the algorithm gave an error rate improvement of 9.48% over PMC, 12.25% over SS and 27.74% over CMS for the phonetic numbers recognition task.
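The per-frame loop described in the abstract can be sketched as follows for the simplest transformation, an additive bias. This is an illustrative sketch only, not the authors' implementation: the function name compensate_online, the spherical Gaussian state model with shared variance, and the smoothed update rule controlled by the hypothetical parameter rate are all assumptions, since the abstract does not specify how the transformation is re-estimated.

```python
def compensate_online(frames, means, log_trans, log_init, rate=0.1, var=1.0):
    """Bias compensation interleaved with Viterbi scoring (illustrative sketch).

    frames    : list of feature vectors (lists of floats), one per time frame
    means     : one mean vector per HMM state (assumed spherical Gaussians)
    log_trans : log_trans[i][j] is the log transition probability from i to j
    log_init  : log_init[j] is the log initial probability of state j
    rate, var : assumed smoothing factor and shared emission variance
    """
    n_states = len(means)
    bias = [0.0] * len(frames[0])  # current estimate of the additive mismatch
    delta = None                   # Viterbi forward log-scores per state
    compensated, best_states = [], []
    for x in frames:
        # 1. Apply the current transformation to the incoming noisy feature.
        y = [xi - bi for xi, bi in zip(x, bias)]
        # 2. Update the forward scores of every state with the compensated
        #    feature (Gaussian log-likelihood, constant term dropped).
        emit = [-sum((yi - mi) ** 2 for yi, mi in zip(y, mu)) / (2.0 * var)
                for mu in means]
        if delta is None:
            delta = [log_init[j] + emit[j] for j in range(n_states)]
        else:
            delta = [max(delta[i] + log_trans[i][j] for i in range(n_states))
                     + emit[j] for j in range(n_states)]
        # 3. The largest forward score designates the most probable state.
        best = max(range(n_states), key=lambda j: delta[j])
        # 4. Use the distance between that state's mean and the compensated
        #    feature to re-estimate the bias for the next frame (assumed rule).
        residual = [yi - mi for yi, mi in zip(y, means[best])]
        bias = [bi + rate * ri for bi, ri in zip(bias, residual)]
        compensated.append(y)
        best_states.append(best)
    return compensated, best_states, bias
```

With a constant offset added to every clean feature, the bias estimate in this sketch moves toward that offset as long as the best state is identified correctly. A linear transformation would replace the subtraction in step 1 and the update in step 4.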
Document type:
Conference paper
The 15th International Congress of Phonetic Sciences - ICPhS 2003, Aug 2003, Barcelona, Spain, 4 p, 2003

Contributor: Publications Loria
Submitted on: Tuesday, 26 September 2006 - 09:41:15
Last modified on: Thursday, 11 January 2018 - 06:19:57


  • HAL Id : inria-00099791, version 1



Vincent Barreaud, Irina Illina, Dominique Fohr. On-Line Frame-Synchronous Noise Compensation. The 15th International Congress of Phonetic Sciences - ICPhS 2003, Aug 2003, Barcelona, Spain, 4 p, 2003. 〈inria-00099791〉