SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases

Simon Lacoste-Julien 1 Konstantina Palla 2 Alex Davies 2 Gjergji Kasneci 3 Thore Graepel 4 Zoubin Ghahramani 2
1 SIERRA - Statistical Machine Learning and Parsimony
DI-ENS - Département d'informatique de l'École normale supérieure, ENS Paris - École normale supérieure - Paris, Inria Paris-Rocquencourt, CNRS - Centre National de la Recherche Scientifique : UMR8548
Abstract: The Internet has enabled the creation of a growing number of large-scale knowledge bases in a variety of domains containing complementary information. Tools for automatically aligning these knowledge bases would make it possible to unify many sources of structured knowledge and answer complex queries. However, the efficient alignment of large-scale knowledge bases still poses a considerable challenge. Here, we present Simple Greedy Matching (SiGMa), a simple algorithm for aligning knowledge bases with millions of entities and facts. SiGMa is an iterative propagation algorithm that leverages both the structural information from the relationship graph and flexible similarity measures between entity properties in a greedy local search, which makes it scalable. Despite its greedy nature, our experiments indicate that SiGMa can efficiently match some of the world's largest knowledge bases with high accuracy. We provide additional experiments on benchmark datasets which demonstrate that SiGMa can outperform state-of-the-art approaches both in accuracy and efficiency.
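The greedy propagation idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `greedy_align`, the scoring formula, the `alpha` weight, the `threshold` value, and the toy data are all hypothetical choices made for this example.

```python
import heapq


def greedy_align(neighbors1, neighbors2, similarity, seeds, alpha=0.5, threshold=0.1):
    """Greedily build a one-to-one matching between two knowledge bases.

    neighbors1, neighbors2: dict mapping an entity to the set of entities it
        is related to (a simplified, unlabeled relationship graph).
    similarity: function (e1, e2) -> float in [0, 1] comparing entity properties.
    seeds: iterable of (e1, e2) pairs assumed to be correct matches.
    Entities are assumed to be mutually comparable (e.g. strings) so that heap
    ties can be broken deterministically.
    """
    matched1, matched2 = {}, {}

    def score(e1, e2):
        n1s = neighbors1.get(e1, set())
        n2s = neighbors2.get(e2, set())
        # Structural term: fraction of neighbors of e1 already matched to a neighbor of e2.
        compatible = sum(1 for n1 in n1s if matched1.get(n1) in n2s)
        graph_term = compatible / max(1, min(len(n1s), len(n2s)))
        return (1 - alpha) * similarity(e1, e2) + alpha * graph_term

    # Max-heap via negated scores; seeds get the highest possible priority.
    heap = [(-float("inf"), e1, e2) for e1, e2 in seeds]
    heapq.heapify(heap)

    while heap:
        neg_score, e1, e2 = heapq.heappop(heap)
        if -neg_score < threshold:
            break  # every remaining candidate scores at most this much
        if e1 in matched1 or e2 in matched2:
            continue  # stale entry: one side was matched in an earlier step
        matched1[e1] = e2
        matched2[e2] = e1
        # Propagate: neighbors of a new match become (re-scored) candidates.
        for n1 in neighbors1.get(e1, ()):
            if n1 in matched1:
                continue
            for n2 in neighbors2.get(e2, ()):
                if n2 not in matched2:
                    heapq.heappush(heap, (-score(n1, n2), n1, n2))
    return matched1


# Toy usage with made-up data.
kb1 = {"Paris": {"France"}, "France": {"Paris"},
       "Berlin": {"Germany"}, "Germany": {"Berlin"}}
kb2 = {"paris": {"france"}, "france": {"paris"},
       "berlin": {"germany"}, "germany": {"berlin"}}


def name_sim(a, b):
    # Hypothetical property similarity: case-insensitive exact name match.
    return 1.0 if a.lower() == b.lower() else 0.0


alignment = greedy_align(kb1, kb2, name_sim, seeds=[("Paris", "paris"), ("Berlin", "berlin")])
```

Only the neighbors of already-matched pairs are ever scored, which keeps the search local instead of comparing every entity of one knowledge base against every entity of the other.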
Document type:
Conference paper
KDD 2013 - The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug 2013, Chicago, United States. pp.572-580, 2013, 〈10.1145/2487575.2487592〉

https://hal.inria.fr/hal-00918671
Contributor: Simon Lacoste-Julien
Submitted on: Saturday, December 14, 2013 - 03:28:48
Last modified on: Thursday, January 11, 2018 - 01:55:18
Document(s) archived on: Tuesday, March 18, 2014 - 13:20:41

Files

fp0172-Lacoste-Julien.pdf
Explicit agreement for this submission

Citation

Simon Lacoste-Julien, Konstantina Palla, Alex Davies, Gjergji Kasneci, Thore Graepel, et al. SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases. KDD 2013 - The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug 2013, Chicago, United States. pp.572-580, 2013, 〈10.1145/2487575.2487592〉. 〈hal-00918671〉

Metrics

Record views: 265

File downloads: 174