Evaluation of PNCC and extended spectral subtraction methods for robust speech recognition

Thibaut Fux 1, Denis Jouvet 1
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: This paper evaluates the robustness of different speech recognition approaches with respect to signal-to-noise ratio (SNR), signal level, and the presence of non-speech data before and after the utterances to be recognized. Three types of noise-robust features are considered: Power Normalized Cepstral Coefficients (PNCC), Mel-Frequency Cepstral Coefficients (MFCC) computed after applying an extended spectral subtraction method, and the embedded denoising features of recent Sphinx versions. Although removing C0 from MFCC-based features leads to a slight decrease in speech recognition performance, it makes the speech recognition system independent of the speech signal level. With multi-condition training, the three sets of noise-robust features behave rather similarly with respect to SNR and the presence of non-speech data. Overall, the best performance is achieved with the extended spectral subtraction approach. In addition, the performance of the PNCC features appears to depend on the initialization of the normalization factor.
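The abstract's remark that removing C0 makes the system independent of the signal level can be illustrated numerically. The sketch below is not the authors' front-end: it assumes cepstra are obtained as an unnormalized DCT-II of log filterbank energies with no flooring, and the band energies, the gain, and the helper name `dct_ii` are hypothetical values chosen for illustration. Under those assumptions, scaling the signal by a constant gain adds the same offset to every log energy, and the DCT maps that offset entirely onto C0.

```python
import numpy as np

def dct_ii(log_energies, n_ceps):
    """Unnormalized DCT-II of log filterbank energies, returning C0..C(n_ceps-1)."""
    n = len(log_energies)
    k = np.arange(n_ceps)[:, None]          # cepstral index
    m = np.arange(n)[None, :]               # filterbank channel index
    basis = np.cos(np.pi * k * (m + 0.5) / n)
    return basis @ log_energies

# Hypothetical filterbank energies for a single frame (26 channels).
rng = np.random.default_rng(0)
band_energies = rng.uniform(0.1, 10.0, size=26)

# Hypothetical amplitude gain; band powers scale by gain**2.
gain = 4.0

ceps = dct_ii(np.log(band_energies), 13)
ceps_scaled = dct_ii(np.log(gain**2 * band_energies), 13)

# Difference between the cepstra of the scaled and unscaled frame.
diff = ceps_scaled - ceps
print("change in C0:      ", diff[0])
print("max change C1..C12:", np.max(np.abs(diff[1:])))
```

In this simplified setting only C0 changes with the gain, while C1 to C12 are unchanged up to rounding error. Real front-ends that floor energies, or that apply denoising before the logarithm, may deviate from this idealized behavior.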
Document type:
Conference paper
EUSIPCO 2015 - 23rd European Signal Processing Conference, Aug 2015, Nice, France. Proceedings EUSIPCO 2015
Cited literature: 13 references

https://hal.inria.fr/hal-01183645
Contributor: Denis Jouvet
Submitted on: Monday, 10 August 2015 - 15:00:30
Last modified on: Thursday, 11 January 2018 - 06:27:31
Document(s) archived on: Wednesday, 11 November 2015 - 10:20:36

File

Eusipco-FinalUpdate-June2015--...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01183645, version 1

Citation

Thibaut Fux, Denis Jouvet. Evaluation of PNCC and extended spectral subtraction methods for robust speech recognition. EUSIPCO 2015 - 23rd European Signal Processing Conference, Aug 2015, Nice, France. Proceedings EUSIPCO 2015. 〈hal-01183645〉

Metrics

Record views: 225

File downloads: 500