About Combining Forward and Backward-Based Decoders for Selecting Data for Unsupervised Training of Acoustic Models

Denis Jouvet 1, Dominique Fohr 1
1 PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : This paper introduces the combination of speech decoders for selecting automatically transcribed speech data for unsupervised training or adaptation of acoustic models. Here, the combination relies on a forward-based and a backward-based decoder. The best performance is achieved by selecting automatically transcribed speech segments for which the Sphinx forward-based and the Julius backward-based transcription systems produce the same word hypotheses; this selection process outperforms confidence-measure-based selection. Results are reported and discussed for adaptation and for full training from scratch, using data resulting from various selection processes, either alone or in addition to the baseline manually transcribed data. Overall, adding the segments selected in this way to the manually transcribed data leads to significant word error rate reductions on the ESTER2 data compared to the baseline system trained only on manually transcribed speech data.
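The agreement-based selection described in the abstract can be illustrated with a minimal sketch. It assumes that each decoder's output has already been parsed into a mapping from a segment identifier to a list of words. The function name, the segment identifiers, and the data layout are hypothetical and are not the actual Sphinx or Julius output formats, nor the authors' implementation.

```python
from typing import Dict, List


def select_agreeing_segments(
    forward_hyps: Dict[str, List[str]],
    backward_hyps: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Keep only segments for which both decoders give the same word sequence.

    forward_hyps / backward_hyps: hypothetical mappings from a segment id to
    the word hypotheses of the forward-based and backward-based decoder.
    """
    selected = {}
    for seg_id, fwd_words in forward_hyps.items():
        bwd_words = backward_hyps.get(seg_id)
        # A segment missing from one decoder's output cannot be confirmed.
        if bwd_words is not None and fwd_words == bwd_words:
            selected[seg_id] = fwd_words
    return selected


# Toy data (invented for illustration only).
forward = {
    "seg1": ["hello", "everyone"],
    "seg2": ["the", "weather", "is", "nice"],
    "seg3": ["thank", "you"],
}
backward = {
    "seg1": ["hello", "everyone"],
    "seg2": ["the", "weather", "is", "mild"],
}

selected = select_agreeing_segments(forward, backward)
print(sorted(selected))
```

In this toy example only "seg1" is kept: "seg2" differs in one word and "seg3" has no backward hypothesis. The selected segments, with their automatic transcriptions, would then serve as additional training or adaptation data.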
Document type :
Conference papers

Cited literature [20 references]

https://hal.inria.fr/hal-01090483
Contributor : Denis Jouvet
Submitted on : Wednesday, December 3, 2014 - 3:52:29 PM
Last modification on : Tuesday, December 18, 2018 - 4:38:02 PM
Long-term archiving on: Monday, March 9, 2015 - 5:50:43 AM

File

IS14-ForwardBackwardDecodingFo...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01090483, version 1

Citation

Denis Jouvet, Dominique Fohr. About Combining Forward and Backward-Based Decoders for Selecting Data for Unsupervised Training of Acoustic Models. INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Sep 2014, Singapore, Singapore. ⟨hal-01090483⟩
