Delexicalized Word Embeddings for Cross-lingual Dependency Parsing

Mathieu Dehouck 1, 2, Pascal Denis 1
1 MAGNET - Machine Learning in Information Networks
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Abstract: This paper presents a new approach to the problem of cross-lingual dependency parsing, aiming at leveraging training data from different source languages to learn a parser in a target language. Specifically, this approach first constructs word vector representations that exploit structural (i.e., dependency-based) contexts but consider only the morpho-syntactic information associated with each word and its contexts. These delexicalized word embeddings, which can be trained on any set of languages and capture features shared across languages, are then used in combination with standard language-specific features to train a lexicalized parser in the target language. We evaluate our approach through experiments on a set of eight different languages that are part of the Universal Dependencies Project. Our main results show that using such delexicalized embeddings, trained in either a monolingual or a multilingual fashion, achieves significant improvements over monolingual baselines.
Document type:
Conference paper
EACL, Apr 2017, Valencia, Spain. EACL, 1, pp.241 - 250, 2017, EACL 2017. 〈http://eacl2017.org/〉. 〈10.18653/v1/E17-1023〉

https://hal.inria.fr/hal-01590639
Contributor: Team Magnet
Submitted on: Wednesday, 20 September 2017 - 19:16:25
Last modified on: Friday, 13 April 2018 - 01:28:41

File

E17-1023.pdf
Publisher files allowed on an open archive

Citation

Mathieu Dehouck, Pascal Denis. Delexicalized Word Embeddings for Cross-lingual Dependency Parsing. EACL, Apr 2017, Valencia, Spain. EACL, 1, pp.241 - 250, 2017, EACL 2017. 〈http://eacl2017.org/〉. 〈10.18653/v1/E17-1023〉. 〈hal-01590639〉
