Conference papers

Cross-Lingual Semantic Similarity Measure for Comparable Articles

Motaz Saad 1, David Langlois 1, Kamel Smaïli 1
1 SMarT - Statistical Machine Translation and Speech Modelization and Text
LORIA - NLPKD - Department of Natural Language Processing & Knowledge Discovery
Abstract : This research aims to find and compare cross-lingual articles concerning a specific topic, which requires a cross-lingual similarity measure. Such a measure can be based on bilingual dictionaries or on numerical methods such as Latent Semantic Indexing (LSI). In this paper, we use LSI in two ways to retrieve Arabic-English comparable articles. The first is monolingual: the English article is translated into Arabic and then mapped into the Arabic LSI space. The second is cross-lingual: Arabic and English documents are mapped into an Arabic-English LSI space. We then compare the LSI approaches to the dictionary-based approach on several English-Arabic parallel and comparable corpora. Results indicate that the performance of the cross-lingual LSI approach is competitive with that of the monolingual approach, and even better for some corpora. Moreover, both LSI approaches outperform the dictionary approach.
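To make the cross-lingual variant described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes raw term counts (no tf-idf or other weighting), a toy four-document parallel corpus, and placeholder tokens such as `ar_bank` standing in for Arabic words; the helper names (`train_lsi`, `fold_in`, `cosine`) and the choice of `k=2` are hypothetical. Each training document is the concatenation of an aligned English-Arabic pair, so that terms from both languages share one LSI space, and new documents are projected with the standard fold-in formula.

```python
import numpy as np

def build_vocab(docs):
    """Map each distinct token to a row index."""
    vocab = {}
    for doc in docs:
        for tok in doc.split():
            vocab.setdefault(tok, len(vocab))
    return vocab

def bow(doc, vocab):
    """Bag-of-words count vector; out-of-vocabulary tokens are ignored."""
    v = np.zeros(len(vocab))
    for tok in doc.split():
        if tok in vocab:
            v[vocab[tok]] += 1.0
    return v

def train_lsi(docs, k):
    """Truncated SVD of the term-document matrix; returns vocab, U_k, s_k."""
    vocab = build_vocab(docs)
    X = np.stack([bow(d, vocab) for d in docs], axis=1)  # terms x docs
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return vocab, U[:, :k], s[:k]

def fold_in(doc, vocab, U_k, s_k):
    """Project a document into the k-dimensional LSI space."""
    return (U_k.T @ bow(doc, vocab)) / s_k

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Toy parallel corpus (assumption): "ar_*" tokens stand in for Arabic words.
english = ["economy market bank", "football match goal",
           "market bank loan", "goal team match"]
arabic = ["ar_economy ar_market ar_bank", "ar_football ar_match ar_goal",
          "ar_market ar_bank ar_loan", "ar_goal ar_team ar_match"]

# Cross-lingual training: one document per aligned English-Arabic pair.
paired = [e + " " + a for e, a in zip(english, arabic)]
vocab, U_k, s_k = train_lsi(paired, k=2)

# Compare an English query against two Arabic-side documents.
en_vec = fold_in("bank loan market", vocab, U_k, s_k)
ar_finance = fold_in("ar_bank ar_market", vocab, U_k, s_k)
ar_sport = fold_in("ar_match ar_team", vocab, U_k, s_k)
sim_finance = cosine(en_vec, ar_finance)
sim_sport = cosine(en_vec, ar_sport)
```

In this toy setup the English finance query should score higher against the Arabic-side finance document than against the sport one, even though the two documents share no surface tokens with the query. The monolingual variant mentioned in the abstract would instead train on Arabic documents only and translate the English article before folding it in.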
Document type :
Conference papers

https://hal.inria.fr/hal-01067687
Contributor : Motaz Saad
Submitted on : Tuesday, September 23, 2014 - 6:41:51 PM
Last modification on : Tuesday, December 18, 2018 - 4:38:02 PM

File

MotazDavidKamelPolTAl2014.pdf
Files produced by the author(s)

Citation

Motaz Saad, David Langlois, Kamel Smaïli. Cross-Lingual Semantic Similarity Measure for Comparable Articles. Advances in Natural Language Processing - 9th International Conference on NLP, PolTAL 2014, Warsaw, Poland, September 17-19, 2014. Proceedings, Sep 2014, Warsaw, Poland. pp. 105-115, ⟨10.1007/978-3-319-10888-9_11⟩. ⟨hal-01067687⟩
