Combining Natural and Artificial Examples to Improve Implicit Discourse Relation Identification

Chloé Braud; Pascal Denis

Conference paper, Year: 2014

Chloé Braud

Role: Author
PersonId : 179583
IdHAL : chloe-braud
ORCID : 0000-0002-1874-3430
IdRef : 195813219

Analyse Linguistique Profonde à Grande Echelle ; Large-scale deep linguistic processing

Pascal Denis

Role: Author
PersonId : 1744
IdHAL : pascal-denis
IdRef : 031934684

Machine Learning in Information Networks

Abstract

This paper presents the first experiments on identifying implicit discourse relations (i.e., relations lacking an overt discourse connective) in French. Given the small amount of annotated data available for this task, our system resorts to additional data automatically labeled using unambiguous connectives, a method introduced by Marcu and Echihabi (2002). We first show that a system trained solely on these artificial data does not generalize well to natural implicit examples, echoing the conclusion reached by Sporleder and Lascarides (2008) for English. We then explain these initial results by analyzing the different types of distributional differences between natural and artificial implicit data. This leads us to propose a number of very simple methods, all inspired by work on domain adaptation, for combining the two types of data. Through various experiments on the French ANNODIS corpus, we show that our best system achieves an accuracy of 41.7%, a significant gain of 4.4% over a system trained solely on manually labeled data.
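To make the idea of "artificial" examples concrete, here is a minimal Python sketch of labeling data with unambiguous connectives. It is an illustration under stated assumptions, not the authors' implementation: the connective table, the relation names, and the function `make_artificial_example` are hypothetical, English stand-ins replace the French connectives, and the step of deleting the connective so that the pair resembles an implicit example follows the general approach attributed to Marcu and Echihabi (2002).

```python
import re

# Hypothetical connective-to-relation table, for illustration only.
# English stand-ins are used; the connective list actually used for
# French is not given in this record.
UNAMBIGUOUS_CONNECTIVES = {
    "because": "explanation",
    "but": "contrast",
    "for example": "elaboration",
}


def make_artificial_example(sentence):
    """Build one artificial implicit example from a sentence.

    Returns (arg1, arg2, relation) when the sentence contains exactly
    one listed connective with text on both sides, otherwise None.
    The connective itself is dropped from both arguments.
    """
    hits = []
    for connective, relation in UNAMBIGUOUS_CONNECTIVES.items():
        pattern = r"\b" + re.escape(connective) + r"\b"
        for match in re.finditer(pattern, sentence, flags=re.IGNORECASE):
            hits.append((match.start(), match.end(), relation))

    # Skip sentences with no connective or with several candidates.
    if len(hits) != 1:
        return None

    start, end, relation = hits[0]
    arg1 = sentence[:start].strip(" ,;")
    arg2 = sentence[end:].strip(" ,;.")

    # A connective at the very start or end leaves an empty argument.
    if not arg1 or not arg2:
        return None
    return arg1, arg2, relation


example = make_artificial_example("The match was cancelled because it rained.")
```

Under these assumptions, `example` holds the two arguments with the connective removed, paired with the relation that the connective signals. A real pipeline would need segmentation and position heuristics well beyond this simple string split.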

Domains

Computation and Language [cs.CL]

Main file

cbraud_pdenis_coling14.pdf (115.13 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01017151

Submitted on: Tuesday, July 1, 2014, 21:35:54

Last modified on: Friday, March 24, 2023, 14:52:59

Long-term archiving on: Wednesday, October 1, 2014, 13:41:29

Dates and versions

hal-01017151, version 1 (July 1, 2014)

Identifiers

HAL Id: hal-01017151, version 1

Cite

Chloé Braud, Pascal Denis. Combining Natural and Artificial Examples to Improve Implicit Discourse Relation Identification. COLING, Aug 2014, Dublin, Ireland. ⟨hal-01017151⟩

Collections

UNIV-PARIS7 UNIV-LILLE3 CNRS INRIA CRISTAL INRIA2 CRISTAL-MAGNET
