Conference papers

Cross-Lingual Semantic Similarity Measure for Comparable Articles

Motaz Saad ¹, David Langlois ¹, Kamel Smaïli ¹
¹ SMarT - Statistical Machine Translation and Speech Modelization and Text
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : In this research, we aim to find and compare cross-lingual articles concerning a specific topic, which requires a cross-lingual similarity measure. Such a measure can be based on bilingual dictionaries or on numerical methods such as Latent Semantic Indexing (LSI). In this paper, we use LSI in two ways to retrieve Arabic-English comparable articles. The first is monolingual: the English article is translated into Arabic and then mapped into the Arabic LSI space. The second is cross-lingual: Arabic and English documents are mapped into an Arabic-English LSI space. We then compare both LSI approaches to the dictionary-based approach on several English-Arabic parallel and comparable corpora. Results indicate that the cross-lingual LSI approach is competitive with the monolingual approach, and even better on some corpora. Moreover, both LSI approaches outperform the dictionary-based approach.
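The cross-lingual variant described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the Arabic-English LSI space is trained on concatenated aligned document pairs (a common construction for cross-lingual LSI that the abstract does not confirm), that raw term counts are used without weighting, and that new documents are folded in with the usual projection onto the truncated left singular vectors. The toy corpus, the `ar_` placeholder tokens standing in for Arabic words, and all function names are hypothetical.

```python
import numpy as np

def build_vocab(docs):
    """Map each whitespace-separated token to a row index."""
    vocab = {}
    for doc in docs:
        for tok in doc.split():
            vocab.setdefault(tok, len(vocab))
    return vocab

def bow(doc, vocab):
    """Raw term-count vector; out-of-vocabulary tokens are ignored."""
    v = np.zeros(len(vocab))
    for tok in doc.split():
        if tok in vocab:
            v[vocab[tok]] += 1.0
    return v

def train_lsi(docs, k):
    """Truncated SVD of the term-by-document matrix, keeping k dimensions."""
    vocab = build_vocab(docs)
    A = np.stack([bow(d, vocab) for d in docs], axis=1)  # terms x documents
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return vocab, U[:, :k], s[:k]

def fold_in(doc, vocab, U_k, s_k):
    """Project a new document into the LSI space: diag(1/s_k) @ U_k.T @ d."""
    return (U_k.T @ bow(doc, vocab)) / s_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical aligned pairs: (English text, Arabic text as placeholder tokens).
pairs = [
    ("football match goal", "ar_football ar_match ar_goal"),
    ("football team goal", "ar_football ar_team ar_goal"),
    ("market bank economy", "ar_market ar_bank ar_economy"),
    ("bank economy inflation", "ar_bank ar_economy ar_inflation"),
]

# Assumption: each training document is the concatenation of an aligned pair,
# so words of both languages that co-occur share latent dimensions.
train_docs = [en + " " + ar for en, ar in pairs]
vocab, U_k, s_k = train_lsi(train_docs, k=2)

english_article = fold_in("football goal", vocab, U_k, s_k)
arabic_related = fold_in("ar_football ar_goal", vocab, U_k, s_k)
arabic_unrelated = fold_in("ar_bank ar_economy", vocab, U_k, s_k)

sim_related = cosine(english_article, arabic_related)
sim_unrelated = cosine(english_article, arabic_unrelated)
print(sim_related, sim_unrelated)
```

In this toy setting the English article shares no surface tokens with either Arabic article, yet the related pair scores much higher than the unrelated one because both fall on the same latent dimension. The monolingual variant would differ only in translating the English article into Arabic first and training the space on Arabic documents alone.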
Document type : Conference papers

Cited literature : 20 references
Contributor : Motaz Saad
Submitted on : Tuesday, September 23, 2014 - 6:41:51 PM
Last modification on : Saturday, October 16, 2021 - 11:26:09 AM

Motaz Saad, David Langlois, Kamel Smaïli. Cross-Lingual Semantic Similarity Measure for Comparable Articles. Advances in Natural Language Processing - 9th International Conference on NLP, PolTAL 2014, Warsaw, Poland, September 17-19, 2014. Proceedings, Sep 2014, Warsaw, Poland. pp.105--115, ⟨10.1007/978-3-319-10888-9_11⟩. ⟨hal-01067687⟩