About Combining Forward and Backward-Based Decoders for Selecting Data for Unsupervised Training of Acoustic Models

Denis Jouvet (1), Dominique Fohr (1)
(1) PAROLE - Analysis, perception and recognition of speech
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: This paper introduces the combination of speech decoders for selecting automatically transcribed speech data for unsupervised training or adaptation of acoustic models. The combination uses a forward-based and a backward-based decoder. The best performance is achieved by selecting automatically transcribed speech segments for which the Sphinx forward-based and the Julius backward-based transcription systems produce the same word hypotheses; this selection process outperforms confidence-measure-based selection. Results are reported and discussed for adaptation and for full training from scratch, using data from various selection processes, either alone or in addition to the baseline manually transcribed data. Overall, adding the automatically transcribed segments selected by this agreement criterion to the manually transcribed data leads to significant word error rate reductions on the ESTER2 data, compared to the baseline system trained only on manually transcribed speech data.
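The agreement criterion described in the abstract can be illustrated with a minimal sketch. It assumes that each decoder's output is already available as a mapping from segment ID to a hypothesis string; the function names, the data layout, and the normalization (lowercasing and whitespace tokenization) are hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import Dict, List


def normalize(hypothesis: str) -> List[str]:
    """Split a hypothesis into lowercase word tokens (an assumed normalization)."""
    return hypothesis.lower().split()


def select_agreeing_segments(forward_hyps: Dict[str, str],
                             backward_hyps: Dict[str, str]) -> List[str]:
    """Return the IDs of segments whose two hypotheses are the same word sequence."""
    selected = []
    for seg_id, fwd in forward_hyps.items():
        bwd = backward_hyps.get(seg_id)
        # Skip segments for which only one decoder produced a hypothesis.
        if bwd is None:
            continue
        if normalize(fwd) == normalize(bwd):
            selected.append(seg_id)
    return selected


# Toy hypotheses, invented for illustration only.
forward = {"seg1": "good evening everyone", "seg2": "the weather is fine", "seg3": "thank you"}
backward = {"seg1": "good evening everyone", "seg2": "the weather is fined"}
selected_ids = select_agreeing_segments(forward, backward)
print(selected_ids)
```

In this toy example only `seg1` is kept: `seg2` differs in one word and `seg3` has no backward hypothesis. The selected segments, with their automatic transcriptions, would then be the candidates for adaptation or training data.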
Document type:
Conference paper
INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Sep 2014, Singapore, Singapore. 2014

Cited literature: 20 references

https://hal.inria.fr/hal-01090483
Contributor: Denis Jouvet
Submitted on: Wednesday, December 3, 2014 - 15:52:29
Last modified on: Thursday, January 11, 2018 - 06:25:24
Document(s) archived on: Monday, March 9, 2015 - 05:50:43

File

IS14-ForwardBackwardDecodingFo...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01090483, version 1

Citation

Denis Jouvet, Dominique Fohr. About Combining Forward and Backward-Based Decoders for Selecting Data for Unsupervised Training of Acoustic Models. INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Sep 2014, Singapore, Singapore. 2014. 〈hal-01090483〉

Metrics

Record views: 225
File downloads: 124