A Bayesian network for time-frequency speech modeling and recognition

Khalid Daoudi 1, Dominique Fohr 1, Christophe Antoine 1
1 PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: In this paper, we propose a new speech model: a Bayesian network (BN) built in the time-frequency domain. Unlike HMMs, this BN models the frequency dynamics well, in particular the asynchrony between sub-bands. The experiments we carried out show that, as a consequence, speech is modeled with higher fidelity. Moreover, our model makes it possible to perform multi-band speech recognition without all the drawbacks of the usual multi-band approach, in which each sub-band is modeled independently by an HMM. This makes our model well suited to the case where speech is corrupted by band-limited noise. We present experiments on an isolated digit recognition task, in clean and noisy conditions. The results show that the BN framework is very promising for speech modeling and recognition.
Document type:
Conference paper
International Conference on Artificial Intelligence and Soft Computing, May 2001, Cancun, Mexico, 5 p, 2001
https://hal.inria.fr/inria-00100524
Contributor: Publications Loria
Submitted on: Tuesday, September 26, 2006 - 14:46:28
Last modified on: Thursday, January 11, 2018 - 06:19:55

Identifiers

  • HAL Id : inria-00100524, version 1

Citation

Khalid Daoudi, Dominique Fohr, Christophe Antoine. A Bayesian network for time-frequency speech modeling and recognition. International Conference on Artificial Intelligence and Soft Computing, May 2001, Cancun, Mexico, 5 p, 2001. 〈inria-00100524〉
