Conference papers

Combining Forward-based and Backward-based Decoders for Improved Speech Recognition Performance

Denis Jouvet (1), Dominique Fohr (1)
(1) PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : Combining the outputs of several speech recognizers is a known way of increasing speech recognition performance, and the ROVER approach handles such combinations efficiently. In this paper we show that the best performance is not achieved by combining the outputs of the best set of recognizers, but rather by combining the outputs of recognizers that rely on different processing components, and in particular on a different order (backward vs. forward) for processing speech frames. Indeed, much better speech recognition results were obtained by combining outputs of Sphinx-based recognizers with outputs of Julius-based recognizers than by combining the same number of outputs from Sphinx-based recognizers only, even though the individual Sphinx-based systems performed better than the individual Julius-based recognizers. Further experiments were conducted using Sphinx-based tools to process speech frames in reverse order (i.e. backward in time). The results clearly show that combining forward-based and backward-based decoders provides a significant improvement over a combination of forward-only or backward-only decoders. Experiments were conducted on the ESTER2 and ETAPE speech corpora. Overall, combining Sphinx-based and Julius-based systems led to an 18.6% word error rate on the ESTER2 test data, and a 24.5% word error rate on the ETAPE test data.
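The results above are reported as word error rates. As a reminder of what that figure measures, here is a minimal sketch, assuming the usual definition (substitutions, deletions and insertions from a word-level edit distance, divided by the number of reference words). The function name and the whitespace tokenization are illustrative assumptions; this is not the scoring tool used in the paper, which is not named in this record.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; tokenization is a plain
    whitespace split, with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

With a four-word reference and one substituted word, the function returns 0.25, i.e. a 25% word error rate.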
Contributor : Denis Jouvet
Submitted on : Friday, June 14, 2013 - 3:54:54 PM
Last modification on : Saturday, June 25, 2022 - 7:46:53 PM


  • HAL Id : hal-00834282, version 1


Denis Jouvet, Dominique Fohr. Combining Forward-based and Backward-based Decoders for Improved Speech Recognition Performance. InterSpeech - 14th Annual Conference of the International Speech Communication Association - 2013, Aug 2013, Lyon, France. ⟨hal-00834282⟩


