About Combining Forward and Backward-Based Decoders for Selecting Data for Unsupervised Training of Acoustic Models

Denis Jouvet; Dominique Fohr

Conference paper. Year: 2014


Denis Jouvet

Role: Author
PersonId: 15904
IdHAL: denis-jouvet
IdRef: 029418666

Analysis, perception and recognition of speech

Dominique Fohr

Role: Author
PersonId: 15652
IdHAL: dominique-fohr
IdRef: 031092942

Analysis, perception and recognition of speech

Abstract

This paper introduces the combination of speech decoders as a means of selecting automatically transcribed speech data for unsupervised training or adaptation of acoustic models. The combination relies on a forward-based and a backward-based decoder. The best performance is achieved by selecting the automatically transcribed speech segments for which the Sphinx forward-based and the Julius backward-based transcription systems produce the same word hypotheses; this selection process outperforms confidence-measure-based selection. Results are reported and discussed for adaptation and for full training from scratch, using data resulting from various selection processes, either alone or in addition to the baseline manually transcribed data. Overall, adding the automatically transcribed segments on which the two recognizers agree to the manually transcribed data leads to significant word error rate reductions on the ESTER2 data, compared to the baseline system trained only on manually transcribed speech data.
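The agreement-based selection rule described above can be illustrated with a minimal sketch. It assumes that each decoder's output has already been loaded into a dictionary mapping a segment identifier to its list of hypothesized words; the function name, the dictionary layout, and the toy data are hypothetical and are not taken from the paper or from the Sphinx or Julius toolkits.

```python
from typing import Dict, List


def select_agreeing_segments(
    forward_hyps: Dict[str, List[str]],
    backward_hyps: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Keep only segments whose two word hypotheses are identical.

    Both arguments map a (hypothetical) segment id to the word sequence
    produced by one decoder for that segment.
    """
    selected: Dict[str, List[str]] = {}
    for seg_id, fwd_words in forward_hyps.items():
        bwd_words = backward_hyps.get(seg_id)
        # A segment missing from the second decoder's output cannot agree.
        if bwd_words is None:
            continue
        # Exact match of the full word sequence (assumed agreement criterion).
        if fwd_words == bwd_words:
            selected[seg_id] = fwd_words
    return selected


# Toy, made-up hypotheses for three segments.
forward_hyps = {
    "seg1": ["bonjour", "à", "tous"],
    "seg2": ["il", "fait", "beau"],
    "seg3": ["merci"],
}
backward_hyps = {
    "seg1": ["bonjour", "à", "tous"],
    "seg2": ["il", "fait", "chaud"],
}

selected = select_agreeing_segments(forward_hyps, backward_hyps)
print(sorted(selected))
```

In this sketch only the segment on which both hypothesis lists match word for word is retained; the retained segments would then be added, with their automatic transcriptions, to the training or adaptation data.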

Keywords

unsupervised training; combining recognizer outputs; data selection; LVCSR; speech recognition

Domains

Signal and Image Processing [eess.SP]

Main file

IS14-ForwardBackwardDecodingForUnsupervisedTraining-V1.3.pdf (322.44 KB)

Origin: Files produced by the author(s)


https://inria.hal.science/hal-01090483

Submitted on: Wednesday, December 3, 2014, 15:52:29

Last modified on: Monday, September 11, 2023, 17:41:19

Long-term archiving on: Monday, March 9, 2015, 05:50:43

Dates and versions

hal-01090483, version 1 (December 3, 2014)

Identifiers

HAL Id: hal-01090483, version 1

Cite

Denis Jouvet, Dominique Fohr. About Combining Forward and Backward-Based Decoders for Selecting Data for Unsupervised Training of Acoustic Models. INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Sep 2014, Singapore, Singapore. ⟨hal-01090483⟩


Collections

CNRS INRIA UNIV-LORRAINE INRIA2 LORIA LORIA-NLPKD

148 views

159 downloads
