Missing data mask models with global frequency and temporal constraints

Sébastien Demange (1), Christophe Cerisara (1), Jean-Paul Haton (1)
(1) PAROLE - Analysis, perception and recognition of speech
INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: Missing data recognition has been developed to increase noise robustness in automatic speech recognition. Many different factors, including the speech decoding process itself, must be considered to locate the masks. In this work, we consider Bayesian models of the masks, in which every spectral feature is classified as reliable or masked independently of the rest of the signal. This classification strategy can produce small, unrelated "spots", whereas experiments suggest that oracle reliable and unreliable features tend to cluster into time-frequency blocks. We call this undesired effect the "checkerboard" effect. In this paper, we propose a new Bayesian missing data classifier that integrates frequency and temporal constraints in order to reduce, or avoid, this effect. The proposed classifier is evaluated on the Aurora2 connected digit corpus. Integrating such constraints into the missing data classification leads to significant improvements in recognition accuracy.
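The "checkerboard" effect described in the abstract can be illustrated with a toy sketch. It is not the classifier proposed in the paper: the neighbourhood constraint is approximated here by a plain 3x3 mean over time-frequency neighbours, the posterior values are invented for illustration, and all function names are hypothetical.

```python
def independent_mask(posteriors, threshold=0.5):
    """Label each time-frequency cell on its own: 1 = reliable, 0 = masked."""
    return [[1 if p > threshold else 0 for p in row] for row in posteriors]


def smoothed_mask(posteriors, threshold=0.5):
    """Label each cell from the mean posterior of its 3x3 neighbourhood.

    Assumption: a simple stand-in for frequency and temporal constraints,
    not the model used in the paper.
    """
    n_freq, n_time = len(posteriors), len(posteriors[0])
    mask = []
    for f in range(n_freq):
        row = []
        for t in range(n_time):
            neighbours = [
                posteriors[i][j]
                for i in range(max(0, f - 1), min(n_freq, f + 2))
                for j in range(max(0, t - 1), min(n_time, t + 2))
            ]
            row.append(1 if sum(neighbours) / len(neighbours) > threshold else 0)
        mask.append(row)
    return mask


def isolated_cells(mask):
    """Count cells whose label differs from all 4-connected neighbours."""
    n_freq, n_time = len(mask), len(mask[0])
    count = 0
    for f in range(n_freq):
        for t in range(n_time):
            adjacent = [
                mask[i][j]
                for i, j in ((f - 1, t), (f + 1, t), (f, t - 1), (f, t + 1))
                if 0 <= i < n_freq and 0 <= j < n_time
            ]
            if all(label != mask[f][t] for label in adjacent):
                count += 1
    return count


# Hypothetical posteriors P(reliable) on a 4-band x 6-frame grid:
# a reliable block on the left, a masked block on the right, two outliers.
posteriors = [
    [0.9, 0.8, 0.9, 0.2, 0.1, 0.2],
    [0.8, 0.3, 0.9, 0.1, 0.7, 0.1],
    [0.9, 0.9, 0.8, 0.2, 0.1, 0.2],
    [0.8, 0.9, 0.9, 0.1, 0.2, 0.1],
]

raw = independent_mask(posteriors)
smooth = smoothed_mask(posteriors)
print(isolated_cells(raw), isolated_cells(smooth))  # prints: 2 0
```

On this toy grid, per-cell thresholding leaves two isolated "spots", while the neighbourhood-based decision recovers two clean time-frequency blocks.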
Document type:
Conference paper
Ninth International Conference on Spoken Language Processing - Interspeech 2006 - ICSLP, Sep 2006, Pittsburgh, Pennsylvania/USA, 2006

https://hal.inria.fr/inria-00103574
Contributor: Sébastien Demange
Submitted on: Wednesday, 4 October 2006 - 16:56:44
Last modified on: Friday, 9 February 2018 - 13:20:05
Document(s) archived on: Tuesday, 6 April 2010 - 18:12:38

Identifiers

  • HAL Id : inria-00103574, version 1

Citation

Sébastien Demange, Christophe Cerisara, Jean-Paul Haton. Missing data mask models with global frequency and temporal constraints. Ninth International Conference on Spoken Language Processing - Interspeech 2006 - ICSLP, Sep 2006, Pittsburgh, Pennsylvania/USA, 2006. 〈inria-00103574〉
