Convolutional Neural Networks for Speaker-Independent Speech Recognition

Abstract: In this work we analyze a neural network structure capable of achieving a degree of invariance to speaker vocal tracts for speech recognition applications. We show that invariance to a speaker’s pitch can be built into the classification stage of the speech recognition process using convolutional neural networks, whereas past attempts sought invariance in the feature set fed to the classification stage. We conduct experiments on the segment-level phoneme classification task using convolutional neural networks and compare them with neural network structures previously used in speech recognition, primarily the time-delay neural network and the standard multilayer perceptron. The results show that convolutional neural networks can in many cases outperform the classical structures.
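
The abstract's central idea, that invariance can live in the classifier rather than in the features, can be illustrated with a minimal sketch. It assumes a toy 12-bin magnitude spectrum, a hand-picked three-tap filter, and global max pooling; none of these values or function names come from the thesis, and the architecture actually studied there is not described on this page. A shift of the spectral peak along the frequency axis stands in, very crudely, for a difference between speakers.

```python
def correlate_valid(signal, kernel):
    """Slide `kernel` along `signal` without padding; return one dot product per offset."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def shift_up(signal, bins):
    """Move a spectrum toward higher frequency bins, zero-filling the low end."""
    return [0.0] * bins + signal[: len(signal) - bins]


# Hypothetical spectrum with a single formant-like peak (illustrative values only).
spectrum = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
# Hypothetical filter matched to the peak shape; a trained network would learn its filters.
kernel = [1.0, 3.0, 1.0]

responses = correlate_valid(spectrum, kernel)
shifted_responses = correlate_valid(shift_up(spectrum, 2), kernel)

# The convolution output moves with the input (equivariance) ...
peak_position = responses.index(max(responses))
shifted_peak_position = shifted_responses.index(max(shifted_responses))

# ... while max pooling over all offsets discards the position (invariance).
pooled = max(responses)
shifted_pooled = max(shifted_responses)
```

Here the filter response peaks at a different offset after the shift, yet the pooled feature is the same in both cases, so a classifier reading only the pooled value cannot tell the two inputs apart. Practical networks usually pool over small local windows, which gives tolerance to small shifts only.
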
Document type:
Student theses -- Hal-inria+
Machine Learning [stat.ML]. 2011

Cited literature [39 references]

https://hal.inria.fr/hal-01142043
Contributor: Eugene Belilovsky
Submitted on: Tuesday, 14 April 2015 - 13:28:09
Last modified on: Saturday, 18 April 2015 - 22:09:53
Document(s) archived on: Tuesday, 18 April 2017 - 18:54:16

Identifiers

  • HAL Id: hal-01142043, version 1

Citation

Eugene Belilovsky. Convolutional Neural Networks for Speaker-Independent Speech Recognition. Machine Learning [stat.ML]. 2011. 〈hal-01142043〉

Metrics

Record views: 62

File downloads: 470