Cheating a Parser to Death: Data-driven Cross-Treebank Annotation Transfer

Abstract : We present an efficient and accurate method for transferring annotations between two different treebanks of the same language. Using this method, we created a new instance of the French Treebank (Abeillé et al., 2003) that follows the Universal Dependencies annotation scheme and was provided to participants in the CoNLL 2017 Universal Dependencies parsing shared task (Zeman et al., 2017). Strong results from an evaluation against our gold standard (94.75% LAS and 99.40% UAS on the test set) demonstrate the quality of this new annotated data set and validate our approach.
Document type :
Conference papers

https://hal.inria.fr/hal-01798801
Contributor : Benoît Sagot
Submitted on : Wednesday, May 23, 2018 - 11:13:29 PM
Last modification on : Thursday, April 4, 2019 - 1:24:25 AM
Document(s) archived on : Friday, August 24, 2018 - 11:33:26 PM

File

1101.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01798801, version 1

Citation

Djamé Seddah, Éric Villemonte de La Clergerie, Benoît Sagot, Hector Martinez Alonso, Marie Candito. Cheating a Parser to Death: Data-driven Cross-Treebank Annotation Transfer. Eleventh International Conference on Language Resources and Evaluation (LREC 2018), May 2018, Miyazaki, Japan. ⟨hal-01798801⟩
