On noise masking for automatic missing data speech recognition: a survey and discussion

Christophe Cerisara (1), Sébastien Demange (1), Jean-Paul Haton (1)
(1) PAROLE - Analysis, perception and recognition of speech, INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
Abstract: Automatic speech recognition (ASR) has reached very high levels of performance in controlled conditions. However, performance degrades significantly when environmental noise is present during recognition. The major challenge today is to achieve robustness to adverse conditions, so that automatic speech recognizers can be used in real-world situations. Missing data theory is an attractive and promising approach. Unlike other denoising methods, missing data recognition does not match the whole signal against the acoustic models; instead, it treats the parts of the signal that are corrupted by noise as missing. While speech recognition with missing data can be handled efficiently by methods such as data imputation or marginalization, accurately identifying the missing parts (also called masks) remains a very challenging task. This paper reviews the main approaches that have been proposed to address this problem. The objective is to survey the mask estimation methods proposed so far and to open the domain to related research that could be adapted to this difficult problem. To restrict the range of methods, only single-microphone techniques are considered.
Document type: Journal article
Computer Speech and Language, Elsevier, 2007, 21 (3), pp.443-457

https://hal.inria.fr/inria-00160554
Contributor: Christophe Cerisara
Submitted on: Friday, 6 July 2007 - 11:40:07
Last modified on: Friday, 9 February 2018 - 13:20:01

Identifiers

  • HAL Id: inria-00160554, version 1

Citation

Christophe Cerisara, Sébastien Demange, Jean-Paul Haton. On noise masking for automatic missing data speech recognition: a survey and discussion. Computer Speech and Language, Elsevier, 2007, 21 (3), pp.443-457. 〈inria-00160554〉
