Quality of syntactic implication of RL-based sentence summarization

Hoa T Le; Christophe Cerisara; Claire Gardent

Conference paper, Year: 2020

Hoa T Le

Role: Author

Natural Language Processing : representations, inference and semantics

Christophe Cerisara

Role: Author
PersonId : 2353
IdHAL : christophe-cerisara
IdRef : 102700168

Natural Language Processing : representations, inference and semantics

Claire Gardent

Role: Author
PersonId : 3949
IdHAL : claire-gardent
ORCID : 0000-0002-3805-6662
IdRef : 034104593

Natural Language Processing : representations, inference and semantics

Abstract

Work on summarization has explored both reinforcement learning (RL) optimization using ROUGE as a reward and syntax-aware models, such as models whose input is enriched with part-of-speech (POS) tags and/or dependency information. However, it is not clear what the respective impact of these approaches is beyond the standard ROUGE evaluation metric, which arguably fails to capture several important qualitative aspects of texts. In particular, RL-based summarization is becoming increasingly popular. In this paper, we provide a detailed comparison of these two approaches and of their combination along several dimensions that relate to the perceived quality of the generated summaries: How many words are repeated in the output? How close to the ground truth is the generated distribution of part-of-speech tags? What is the impact of sentence length? How good are relevance and grammaticality? Using the standard Gigaword sentence summarization task, we compare an RL self-critical sequence training (SCST) method with syntax-aware models that leverage POS tags and/or dependency information. We show that on all qualitative evaluations, the combined model gives the best results, but also that training with RL alone, without any syntactic information, already gives nearly as good results as syntax-aware models, with fewer parameters and faster training convergence.
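
As a rough illustration of two notions named in the abstract, the sketch below counts repeated words in a generated summary and computes a self-critical (SCST) loss for a single sequence, where the reward of the greedy decode serves as the baseline for a sampled decode. This is a minimal sketch and not the authors' implementation: the function names, the exact definition of "repeated words", and the example values are all assumptions made for illustration, and the reward would be a ROUGE score in the setting the abstract describes.

```python
from collections import Counter
from typing import Sequence


def repeated_word_count(tokens: Sequence[str]) -> int:
    """Count tokens that repeat an earlier token in the same summary.

    Assumed definition (hypothetical): every occurrence of a word
    beyond its first counts as one repetition.
    """
    counts = Counter(tokens)
    return sum(c - 1 for c in counts.values() if c > 1)


def scst_loss(sample_reward: float, greedy_reward: float,
              sample_log_prob: float) -> float:
    """Self-critical loss for one sequence.

    The greedy decode's reward is the baseline: the loss is
    -(r(sampled) - r(greedy)) * log p(sampled).
    """
    advantage = sample_reward - greedy_reward
    return -advantage * sample_log_prob


# Illustrative values only; they are not taken from the paper.
summary = "police arrest man after man flees police".split()
print(repeated_word_count(summary))
print(scst_loss(sample_reward=0.42, greedy_reward=0.35, sample_log_prob=-3.0))
```

When the sampled summary scores higher than the greedy one, the advantage is positive, so minimizing this loss raises the log-probability of the sampled sequence; when it scores lower, the same update lowers it.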

Domains

Document and Text Processing

Main file

1912.05493.pdf (330.77 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-02883327

Submitted on: Monday, 29 June 2020, 09:03:27

Last modified on: Monday, 11 September 2023, 17:41:18

Dates and versions

hal-02883327, version 1 (29-06-2020)

Identifiers

HAL Id: hal-02883327, version 1
ARXIV : 1912.05493

Cite

Hoa T Le, Christophe Cerisara, Claire Gardent. Quality of syntactic implication of RL-based sentence summarization. AAAI Workshop on Engineering Dependable and Secure Machine Learning Systems 2020, Feb 2020, New York, United States. ⟨hal-02883327⟩

Collections

CNRS INRIA GRID5000 UNIV-LORRAINE LORIA LORIA-NLPKD LUE-UL IMPACT-OLKI SILECS ANR

86 views

106 downloads
