
Convolutional Neural Networks for Speaker-Independent Speech Recognition

Abstract: In this work we analyze a neural network structure capable of achieving a degree of invariance to differences in speaker vocal tracts for speech recognition applications. We show that invariance to a speaker's pitch can be built into the classification stage of the speech recognition process using convolutional neural networks, whereas past attempts have sought to achieve this invariance in the feature set supplied to the classification stage. We conduct experiments on the segment-level phoneme classification task using convolutional neural networks and compare them to neural network structures previously used in speech recognition, primarily the time-delay neural network and the standard multilayer perceptron. The results show that convolutional neural networks can in many cases outperform these classical structures.
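
As an illustration of the idea described in the abstract, the sketch below shows how convolution and pooling along the frequency axis of a spectrogram segment can build a degree of pitch-shift tolerance into the classifier itself. It is a minimal, assumed example: the layer sizes, 40 mel bands, 15-frame segments, and 39 phoneme classes are illustrative choices, not taken from the thesis.

import torch
import torch.nn as nn

class PhonemeCNN(nn.Module):
    """Segment-level phoneme classifier (illustrative hyperparameters)."""
    def __init__(self, n_freq_bins=40, n_frames=15, n_phonemes=39):
        super().__init__()
        self.features = nn.Sequential(
            # Convolve across frequency (and time); sharing weights across
            # frequency bands is what gives tolerance to spectral shifts
            # caused by different vocal tracts / pitch.
            nn.Conv2d(1, 32, kernel_size=(8, 3), padding=(0, 1)),
            nn.ReLU(),
            # Max-pool along frequency: a formant displaced by a few bins
            # still activates the same feature map.
            nn.MaxPool2d(kernel_size=(3, 1)),
        )
        freq_out = (n_freq_bins - 8 + 1) // 3
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * freq_out * n_frames, 256),
            nn.ReLU(),
            nn.Linear(256, n_phonemes),
        )

    def forward(self, x):
        # x: (batch, 1, n_freq_bins, n_frames) log-mel segment
        return self.classifier(self.features(x))

# Example: two 40-band, 15-frame segments
logits = PhonemeCNN()(torch.randn(2, 1, 40, 15))
print(logits.shape)  # torch.Size([2, 39])

A plain multilayer perceptron applied to the same segment would have to relearn each spectral pattern at every frequency position, which is the contrast the thesis experiments examine.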
Document type: Master thesis

Cited literature: 39 references

https://hal.inria.fr/hal-01142043
Contributor: Eugene Belilovsky
Submitted on: Tuesday, April 14, 2015 - 1:28:09 PM
Last modification on: Saturday, April 18, 2015 - 10:09:53 PM
Long-term archiving on: Tuesday, April 18, 2017 - 6:54:16 PM

Identifiers

  • HAL Id: hal-01142043, version 1

Citation

Eugene Belilovsky. Convolutional Neural Networks for Speaker-Independent Speech Recognition. Machine Learning [stat.ML]. 2011. ⟨hal-01142043⟩
