SEARNN: Training RNNs with global-local losses

Rémi Leblond 1, Jean-Baptiste Alayrac 2, Anton Osokin 3, Simon Lacoste-Julien 4
1 SIERRA - Statistical Machine Learning and Parsimony
DI-ENS - Département d'informatique de l'École normale supérieure, CNRS - Centre National de la Recherche Scientifique, Inria de Paris
3 WILLOW - Models of visual object recognition and scene understanding
Inria de Paris, DI-ENS - Département d'informatique de l'École normale supérieure
Abstract: We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the "learning to search" (L2S) approach to structured prediction. RNNs have been widely successful in structured prediction applications such as machine translation or parsing, and are commonly trained using maximum likelihood estimation (MLE). Unfortunately, this training loss is not always an appropriate surrogate for the test error: by only maximizing the ground truth probability, it fails to exploit the wealth of information offered by structured losses. Further, it introduces discrepancies between training and predicting (such as exposure bias) that may hurt test performance. Instead, SEARNN leverages test-alike search space exploration to introduce global-local losses that are closer to the test error. We demonstrate improved performance over MLE on three different tasks: OCR, spelling correction and text chunking. Finally, we propose a subsampling strategy to enable SEARNN to scale to large vocabulary sizes.
Document type:
Preprint, working paper
12 pages. 2017

Cited literature [26 references]

https://hal.inria.fr/hal-01665263
Contributor: Rémi Leblond
Submitted on: Friday, 22 December 2017 - 13:39:55
Last modified on: Thursday, 26 April 2018 - 10:29:04

Identifiers

  • HAL Id: hal-01665263, version 1
  • arXiv: 1706.04499

Citation

Rémi Leblond, Jean-Baptiste Alayrac, Anton Osokin, Simon Lacoste-Julien. SEARNN: Training RNNs with global-local losses. 12 pages. 2017. 〈hal-01665263〉
