Effectively long-distance dependencies in French : annotation and parsing evaluation

Abstract : We describe the annotation of cases of extraction in French whose previous annotations in the available French treebanks were insufficient to recover the correct predicate-argument dependency between the extracted element and its head. These are special cases of long-distance dependencies (LDDs), which we call effectively long-distance dependencies (eLDDs): the extracted element is actually separated from its head by one or more intervening heads (rather than zero, one or more, as in the general case). We found that extraction of a dependent of a finite verb is very rarely an eLDD (one case out of 420 000 tokens), but eLDDs corresponding to extraction out of an infinitival phrase are more frequent (one third of all occurrences of the accusative relative pronoun que), and eLDDs with extraction out of NPs are quite common (2/3 of the occurrences of the relative pronoun dont). We also use the annotated data in statistical dependency parsing experiments, and compare several parsing architectures able to recover non-local governors for extracted elements.
Document type :
Conference papers
TLT 11 - The 11th International Workshop on Treebanks and Linguistic Theories, Nov 2012, Lisbon, Portugal. 2012

Cited literature : 19 references

https://hal.inria.fr/hal-00769625
Contributor : Marie Candito
Submitted on : Wednesday, January 2, 2013 - 3:44:06 PM
Last modification on : Friday, August 31, 2018 - 9:25:46 AM
Document(s) archived on : Wednesday, April 3, 2013 - 3:48:28 AM

File

tlt_extraction_final.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00769625, version 1

Citation

Marie Candito, Djamé Seddah. Effectively long-distance dependencies in French : annotation and parsing evaluation. TLT 11 - The 11th International Workshop on Treebanks and Linguistic Theories, Nov 2012, Lisbon, Portugal. 2012. 〈hal-00769625〉
