Combining Natural and Artificial Examples to Improve Implicit Discourse Relation Identification

Chloé Braud 1 Pascal Denis 2
1 ALPAGE - Analyse Linguistique Profonde à Grande Echelle ; Large-scale deep linguistic processing
Inria Paris-Rocquencourt, UPD7 - Université Paris Diderot - Paris 7
2 MAGNET - Machine Learning in Information Networks
LIFL - Laboratoire d'Informatique Fondamentale de Lille, Inria Lille - Nord Europe
Abstract: This paper presents the first experiments on identifying implicit discourse relations (i.e., relations lacking an overt discourse connective) in French. Given the small amount of annotated data available for this task, our system resorts to additional data automatically labeled using unambiguous connectives, a method introduced by Marcu and Echihabi (2002). We first show that a system trained solely on these artificial data does not generalize well to natural implicit examples, echoing the conclusion reached by Sporleder and Lascarides (2008) for English. We then explain these initial results by analyzing the different types of distribution differences between natural and artificial implicit data. This finally leads us to propose a number of very simple methods, all inspired by work on domain adaptation, for combining the two types of data. Through various experiments on the French ANNODIS corpus, we show that our best system achieves an accuracy of 41.7%, corresponding to a significant gain of 4.4% over a system trained solely on manually labeled data.
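As a rough illustration of the automatic labeling idea mentioned in the abstract, the sketch below builds an artificial implicit example by locating an unambiguous connective inside a sentence, removing it, and using the relation it signals as the label. This is a minimal sketch and not the authors' pipeline: the function name `make_artificial_example`, the English connectives, and the relation names in `CONNECTIVE_TO_RELATION` are all hypothetical placeholders, and a real system would rely on a French connective lexicon and proper segmentation rather than a regular expression.

```python
import re

# Hypothetical connective-to-relation table, for illustration only.
# The connectives and relation labels are assumptions, not taken from the paper.
CONNECTIVE_TO_RELATION = {
    "because": "explanation",
    "although": "contrast",
}


def make_artificial_example(sentence):
    """Return (arg1, arg2, relation) if a listed connective occurs mid-sentence.

    The connective is removed so that the resulting pair resembles an
    implicit example. Returns None when no usable connective is found.
    """
    for connective, relation in CONNECTIVE_TO_RELATION.items():
        # Match the connective as a whole word, ignoring case.
        pattern = r"\b" + re.escape(connective) + r"\b"
        match = re.search(pattern, sentence, flags=re.IGNORECASE)
        # Skip sentence-initial connectives: the first argument would be empty.
        if match is None or match.start() == 0:
            continue
        arg1 = sentence[:match.start()].strip(" ,;")
        arg2 = sentence[match.end():].strip(" ,;.")
        if arg1 and arg2:
            return arg1, arg2, relation
    return None


example = make_artificial_example("He stayed home because it was raining.")
print(example)
```

Pairs produced this way carry a label but no connective, which is what allows them to be added to the training data alongside manually annotated implicit examples.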
Document type:
Conference paper
COLING, Aug 2014, Dublin, Ireland. 2014

https://hal.inria.fr/hal-01017151
Contributor: Chloé Braud
Submitted on: Tuesday, July 1, 2014 - 21:35:54
Last modified on: Thursday, November 15, 2018 - 20:27:26
Document(s) archived on: Wednesday, October 1, 2014 - 13:41:29

File

cbraud_pdenis_coling14.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01017151, version 1

Citation

Chloé Braud, Pascal Denis. Combining Natural and Artificial Examples to Improve Implicit Discourse Relation Identification. COLING, Aug 2014, Dublin, Ireland. 2014. 〈hal-01017151〉
