Conference paper, Year: 2014

Vietnamese Sentence Similarity Based on Concepts

Hien T. Nguyen
  • Role: Author
  • PersonId: 994842
Phuc H. Duong
  • Role: Author
  • PersonId: 994892
Vinh T. Vo
  • Role: Author
  • PersonId: 994893

Abstract

We propose a novel method for measuring the semantic similarity of two sentences. The originality of the method lies in the way it uses Wikipedia to explore the similarity of the concepts referred to in the sentences. The method also exploits Wiktionary to measure word-to-word similarity. The overall semantic similarity is a linear combination of word-to-word similarity, word-order similarity, and concept similarity. We build datasets consisting of 45 Vietnamese sentence pairs and evaluate the method on these datasets. The results show that, in the best cases, concept similarity improves the performance of our method by more than 15 percentage points. The proposed method is language-independent and quite easy to employ, so it can readily be adopted to measure semantic similarity for sentences written in other languages.
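The linear combination described in the abstract can be illustrated with a minimal Python sketch. The function name `combined_similarity`, the default weights, and the assumptions that each component score lies in [0, 1] and that the weights sum to 1 are hypothetical choices made for illustration; the abstract does not give the actual weights or how each component score is computed.

```python
def combined_similarity(word_sim, order_sim, concept_sim,
                        weights=(0.5, 0.2, 0.3)):
    """Linearly combine three similarity scores into one overall score.

    Assumptions (not taken from the paper): every score lies in [0, 1],
    and the weights are non-negative and sum to 1, so the result also
    lies in [0, 1]. The default weights are placeholders.
    """
    scores = (word_sim, order_sim, concept_sim)
    if len(weights) != len(scores):
        raise ValueError("exactly three weights are required")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("each similarity score must lie in [0, 1]")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    # Weighted sum of the word-to-word, word-order, and concept scores.
    return sum(w * s for w, s in zip(weights, scores))


if __name__ == "__main__":
    # Hypothetical component scores for one sentence pair.
    print(combined_similarity(word_sim=0.8, order_sim=0.5, concept_sim=0.6))
```

Setting the third weight to 0 and rescaling the other two gives a baseline without concept similarity, which is one way to compare the kind of with and without configurations the abstract reports on.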
Main file
978-3-662-45237-0_24_Chapter.pdf (498.33 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01405592, version 1 (30-11-2016)

License

Attribution

Identifiers

Cite

Hien T. Nguyen, Phuc H. Duong, Vinh T. Vo. Vietnamese Sentence Similarity Based on Concepts. 13th IFIP International Conference on Computer Information Systems and Industrial Management (CISIM), Nov 2014, Ho Chi Minh City, Vietnam. pp.243-253, ⟨10.1007/978-3-662-45237-0_24⟩. ⟨hal-01405592⟩
