Vietnamese Sentence Similarity Based on Concepts

Abstract: We propose a novel method for measuring the semantic similarity of two sentences. The originality of the method lies in how it uses Wikipedia to explore the similarity of the concepts referred to in the sentences. The method also exploits Wiktionary to measure word-to-word similarity. The overall semantic similarity is a linear combination of word-to-word similarity, word-order similarity, and concept similarity. We build datasets consisting of 45 Vietnamese sentence pairs and evaluate the method on them. The results show that, in the best cases, concept similarity improves the performance of our method by more than 15 percentage points. The proposed method is language-independent and easy to employ, so it can readily be adopted to measure semantic similarity for sentences written in other languages.
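
The linear combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the weights or the way each component score is computed, so the function name, the default weights, and the assumption that every score lies in [0, 1] are hypothetical choices made only for illustration, not the authors' settings.

```python
def combined_similarity(word_sim, order_sim, concept_sim,
                        weights=(0.5, 0.2, 0.3)):
    """Weighted linear combination of three component similarity scores.

    The default weights are placeholders (assumption); the abstract does
    not state the values used in the paper.
    """
    w_word, w_order, w_concept = weights
    # Require a convex combination so the result stays in [0, 1].
    if abs(w_word + w_order + w_concept - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    for score in (word_sim, order_sim, concept_sim):
        if not 0.0 <= score <= 1.0:
            raise ValueError("each similarity score must lie in [0, 1]")
    return w_word * word_sim + w_order * order_sim + w_concept * concept_sim


if __name__ == "__main__":
    # Made-up component scores for one sentence pair.
    print(combined_similarity(0.8, 0.5, 0.6))
```

Setting the third weight to zero gives a baseline without concept similarity, which is one way to compare configurations with and without the concept component.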
Document type:
Conference paper
Khalid Saeed; Václav Snášel. 13th IFIP International Conference on Computer Information Systems and Industrial Management (CISIM), Nov 2014, Ho Chi Minh City, Vietnam. Springer, Lecture Notes in Computer Science, LNCS-8838, pp.243-253, 2014, Computer Information Systems and Industrial Management. 〈10.1007/978-3-662-45237-0_24〉

Cited literature: 19 references

https://hal.inria.fr/hal-01405592
Contributor: Hal Ifip
Submitted on: Wednesday, 30 November 2016 - 11:01:30
Last modified on: Thursday, 1 December 2016 - 01:04:16
Document(s) archived on: Monday, 27 March 2017 - 08:49:02

File

978-3-662-45237-0_24_Chapter.p...
Files produced by the author(s)

License

Distributed under a Creative Commons Attribution 4.0 International License

Citation

Hien Nguyen, Phuc Duong, Vinh Vo. Vietnamese Sentence Similarity Based on Concepts. Khalid Saeed; Václav Snášel. 13th IFIP International Conference on Computer Information Systems and Industrial Management (CISIM), Nov 2014, Ho Chi Minh City, Vietnam. Springer, Lecture Notes in Computer Science, LNCS-8838, pp.243-253, 2014, Computer Information Systems and Industrial Management. 〈10.1007/978-3-662-45237-0_24〉. 〈hal-01405592〉

Metrics

Record views: 62

File downloads: 82