Conference paper, Year: 2014

Cross-Lingual Semantic Similarity Measure for Comparable Articles

Abstract

In this research, we aim to find and compare cross-lingual articles on a specific topic, which requires a cross-lingual similarity measure. Such a measure can be based on bilingual dictionaries or on numerical methods such as Latent Semantic Indexing (LSI). In this paper, we use LSI in two ways to retrieve Arabic-English comparable articles. The first is monolingual: the English article is translated into Arabic and then mapped into the Arabic LSI space. The second is cross-lingual: Arabic and English documents are mapped into an Arabic-English LSI space. We then compare the LSI approaches with the dictionary-based approach on several English-Arabic parallel and comparable corpora. Results indicate that the cross-lingual LSI approach is competitive with the monolingual approach, and even better on some corpora. Moreover, both LSI approaches outperform the dictionary-based approach.
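The cross-lingual variant described in the abstract can be illustrated with a minimal sketch. The abstract does not give implementation details, so the following are assumptions made for illustration only: the LSI space is trained on aligned article pairs merged into single documents, new documents are mapped with the standard fold-in projection, and similarity is the cosine in the reduced space. The toy corpus, the `ar_`-prefixed stand-ins for Arabic tokens, and the helper names (`fit_lsi`, `fold_in`, `cosine`) are all hypothetical.

```python
import numpy as np

def build_vocab(docs):
    """Map each distinct whitespace-separated token to a row index."""
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    return {tok: i for i, tok in enumerate(vocab)}

def term_vector(doc, vocab):
    """Raw term-count vector; out-of-vocabulary tokens are ignored."""
    v = np.zeros(len(vocab))
    for tok in doc.split():
        if tok in vocab:
            v[vocab[tok]] += 1.0
    return v

def fit_lsi(docs, k):
    """Truncated SVD of the term-by-document matrix (rank k)."""
    vocab = build_vocab(docs)
    A = np.column_stack([term_vector(d, vocab) for d in docs])
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return vocab, U[:, :k], s[:k]

def fold_in(doc, vocab, U_k, s_k):
    """Project a document into the LSI space: S_k^{-1} U_k^T d."""
    return (U_k.T @ term_vector(doc, vocab)) / s_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical aligned pairs; "ar_" tokens stand in for Arabic words.
pairs = [
    ("economy bank market", "ar_iqtisad ar_bank ar_suq"),
    ("bank market money", "ar_bank ar_suq ar_mal"),
    ("football match goal", "ar_kura ar_mubarat ar_hadaf"),
    ("match goal team", "ar_mubarat ar_hadaf ar_fariq"),
]

# Assumption: each training document is an English article
# concatenated with its Arabic counterpart.
train = [en + " " + ar for en, ar in pairs]
vocab, U_k, s_k = fit_lsi(train, k=2)

en_doc = fold_in("bank market", vocab, U_k, s_k)
ar_same_topic = fold_in("ar_suq ar_mal", vocab, U_k, s_k)
ar_other_topic = fold_in("ar_hadaf ar_fariq", vocab, U_k, s_k)

sim_same = cosine(en_doc, ar_same_topic)
sim_other = cosine(en_doc, ar_other_topic)
print(f"same topic: {sim_same:.3f}, other topic: {sim_other:.3f}")
```

Under these assumptions, an English and an Arabic document that share no surface tokens can still score as similar, because terms from both languages co-occur in the merged training documents. The monolingual variant would reuse the same functions with an Arabic-only training corpus, after translating the English article into Arabic.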
Main file
MotazDavidKamelPolTAl2014.pdf (258.64 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01067687 , version 1 (23-09-2014)

Cite

Motaz Saad, David Langlois, Kamel Smaïli. Cross-Lingual Semantic Similarity Measure for Comparable Articles. Advances in Natural Language Processing - 9th International Conference on NLP, PolTAL 2014, Warsaw, Poland, September 17-19, 2014. Proceedings, Sep 2014, Warsaw, Poland. pp. 105-115, ⟨10.1007/978-3-319-10888-9_11⟩. ⟨hal-01067687⟩