Discriminative importance weighting of augmented training data for acoustic model training

Sunit Sivasankaran 1, Emmanuel Vincent 1, Irina Illina 1
1 MULTISPEECH - Speech Modeling for Facilitating Oral-Based Communication
Inria Nancy - Grand Est, LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract: DNN-based acoustic models require a large amount of training data. Parametric data augmentation techniques, such as adding noise or reverberation or changing the speech rate, are often employed to increase the dataset size and improve ASR performance. So far, the choice of augmentation techniques and their parameters has been handled heuristically. In this work, we propose an algorithm that automatically weights data perturbed using a variety of augmentation techniques and/or parameters. The weights are learned discriminatively so as to minimize the frame error rate, using standard gradient descent in an iterative manner. Experiments were performed on the CHiME-3 dataset, with data augmentation done by adding noise at different SNRs. The proposed data weighting algorithm yielded a 15% relative WER improvement over the unweighted augmented dataset. Interestingly, the resulting distribution of SNRs in the weighted training set differs significantly from that of the test set.
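The abstract does not spell out the weight update rule, so the following is only a minimal sketch of how per-condition weights could enter a frame-level training loss. It is not the authors' algorithm. The function names (`condition_weights`, `weighted_frame_loss`), the softmax parameterization of the weights, and the grouping of frames by an integer condition index (for example, an SNR bin) are all assumptions made for illustration.

```python
import numpy as np


def condition_weights(scores):
    """Map unconstrained scores to positive weights summing to 1 (softmax).

    Assumption: one score per augmentation condition, e.g. one per SNR bin.
    """
    z = np.asarray(scores, dtype=float)
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()


def weighted_frame_loss(log_probs, targets, condition_ids, weights):
    """Weighted mean of the frame-level cross-entropy.

    log_probs:     (n_frames, n_classes) log posteriors from the acoustic model
    targets:       (n_frames,) integer class labels
    condition_ids: (n_frames,) index of the augmentation condition of each frame
    weights:       (n_conditions,) non-negative importance weights
    """
    log_probs = np.asarray(log_probs, dtype=float)
    targets = np.asarray(targets)
    condition_ids = np.asarray(condition_ids)
    # Cross-entropy of each frame: minus the log posterior of the true class.
    frame_ce = -log_probs[np.arange(len(targets)), targets]
    # Each frame inherits the weight of the condition it was generated with.
    frame_w = np.asarray(weights, dtype=float)[condition_ids]
    return float((frame_w * frame_ce).sum() / frame_w.sum())


# Toy usage: four frames, two classes, two hypothetical SNR conditions.
log_probs = np.log(np.array([[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.5, 0.5]]))
targets = np.array([0, 0, 1, 1])
condition_ids = np.array([0, 0, 1, 1])
uniform = condition_weights([0.0, 0.0])
loss = weighted_frame_loss(log_probs, targets, condition_ids, uniform)
```

With uniform weights this reduces to the ordinary mean cross-entropy, which corresponds to the unweighted augmented training set used as the baseline. Non-uniform weights shift the emphasis of training toward some conditions; how those weights are learned to reduce the frame error rate is described in the paper itself, not in this sketch.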
Document type:
Conference paper
42nd International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2017), Mar 2017, New Orleans, United States

Cited literature [31 references]

https://hal.inria.fr/hal-01415759
Contributor: Emmanuel Vincent
Submitted on: Monday, March 6, 2017 - 13:35:30
Last modified on: Thursday, January 11, 2018 - 06:27:31
Document(s) archived on: Wednesday, June 7, 2017 - 13:54:19

File

sivasankaran_ICASSP17.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01415759, version 2

Citation

Sunit Sivasankaran, Emmanuel Vincent, Irina Illina. Discriminative importance weighting of augmented training data for acoustic model training. 42nd International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2017), Mar 2017, New Orleans, United States. 〈hal-01415759v2〉

Metrics

Record views: 621

File downloads: 207