SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases

Simon Lacoste-Julien; Konstantina Palla; Alex Davies; Gjergji Kasneci; Thore Graepel; Zoubin Ghahramani

doi:10.1145/2487575.2487592

Conference paper, Year: 2013

Simon Lacoste-Julien

Role: Author
PersonId: 1938
IdHAL: simon-lacoste-julien
ORCID: 0000-0001-6485-6180
IdRef: 22557781X

Statistical Machine Learning and Parsimony

Konstantina Palla

Role: Author

Cambridge Machine Learning Group

Alex Davies

Role: Author

Cambridge Machine Learning Group

Gjergji Kasneci

Role: Author

Max-Planck-Institut für Informatik

Thore Graepel

Role: Author

Microsoft Research [Redmond]

Zoubin Ghahramani

Role: Author

Cambridge Machine Learning Group

Abstract

The Internet has enabled the creation of a growing number of large-scale knowledge bases in a variety of domains containing complementary information. Tools for automatically aligning these knowledge bases would make it possible to unify many sources of structured knowledge and answer complex queries. However, the efficient alignment of large-scale knowledge bases still poses a considerable challenge. Here, we present Simple Greedy Matching (SiGMa), a simple algorithm for aligning knowledge bases with millions of entities and facts. SiGMa is an iterative propagation algorithm that leverages both the structural information from the relationship graph and flexible similarity measures between entity properties in a greedy local search, which makes it scalable. Despite its greedy nature, our experiments indicate that SiGMa can efficiently match some of the world's largest knowledge bases with high accuracy. We provide additional experiments on benchmark datasets which demonstrate that SiGMa can outperform state-of-the-art approaches both in accuracy and efficiency.
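The abstract describes the method only at a high level, so the following Python sketch is an illustration of the general idea rather than the authors' implementation. It assumes a toy representation (entity names as strings, symmetric neighbour sets for the relationship graph), a score that mixes a string similarity with the fraction of neighbours already matched, and a priority queue for the greedy step. The names `greedy_align`, `property_similarity`, `alpha`, and `threshold` are hypothetical and do not come from the paper.

```python
import heapq
from difflib import SequenceMatcher


def property_similarity(a, b):
    """Similarity in [0, 1] between two entity names (an illustrative choice)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def greedy_align(names1, names2, edges1, edges2, seeds, alpha=0.5, threshold=0.4):
    """Greedily align two toy knowledge bases (a sketch, not the paper's code).

    names1, names2: dict mapping entity id -> name string
    edges1, edges2: dict mapping entity id -> set of neighbour ids (assumed symmetric)
    seeds: list of (e1, e2) pairs assumed to be correct initial matches
    """
    matched = {}   # entity in KB 1 -> entity in KB 2
    used2 = set()  # entities of KB 2 that are already matched
    heap = []      # max-heap of candidate pairs, stored with negated scores

    def score(e1, e2):
        n1 = edges1.get(e1, set())
        n2 = edges2.get(e2, set())
        # Structural term: fraction of neighbours whose match is a neighbour of e2.
        agree = sum(1 for n in n1 if matched.get(n) in n2)
        graph_term = agree / max(len(n1), len(n2), 1)
        string_term = property_similarity(names1[e1], names2[e2])
        return (1 - alpha) * string_term + alpha * graph_term

    def accept(e1, e2):
        matched[e1] = e2
        used2.add(e2)
        # Propagation: neighbours of a new match become (re-scored) candidates.
        for n1 in edges1.get(e1, set()):
            if n1 in matched:
                continue
            for n2 in edges2.get(e2, set()):
                if n2 not in used2:
                    heapq.heappush(heap, (-score(n1, n2), n1, n2))

    for e1, e2 in seeds:
        if e1 not in matched and e2 not in used2:
            accept(e1, e2)

    while heap:
        neg_score, e1, e2 = heapq.heappop(heap)
        if -neg_score < threshold:
            break  # the best remaining candidate is too weak
        if e1 in matched or e2 in used2:
            continue  # one side was matched in the meantime
        accept(e1, e2)
    return matched


# Toy data, invented for illustration only.
names1 = {"a1": "Albert Einstein", "a2": "Ulm", "a3": "Physics"}
names2 = {"b1": "A. Einstein", "b2": "Ulm (city)", "b3": "physics"}
edges1 = {"a1": {"a2", "a3"}, "a2": {"a1"}, "a3": {"a1"}}
edges2 = {"b1": {"b2", "b3"}, "b2": {"b1"}, "b3": {"b1"}}

result = greedy_align(names1, names2, edges1, edges2, seeds=[("a1", "b1")])
print(result)
```

Each accepted pair is never revisited, which is what makes the search greedy; only the neighbourhood of a new match is re-scored, so the work per step stays local. How the published algorithm chooses seeds, scores, and stopping criteria is specified in the paper itself, not here.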

Domains

Artificial Intelligence [cs.AI]; Databases [cs.DB]; Information Retrieval [cs.IR]

Main file

fp0172-Lacoste-Julien.pdf (312.92 KB)

Origin: Explicit agreement for this submission

https://inria.hal.science/hal-00918671

Submitted on: Saturday, December 14, 2013, 03:28:48

Last modified on: Friday, April 19, 2024, 16:18:57

Long-term archiving on: Tuesday, March 18, 2014, 13:20:41

Dates and versions

hal-00918671, version 1 (14-12-2013)

Identifiers

HAL Id: hal-00918671, version 1
DOI: 10.1145/2487575.2487592

Cite

Simon Lacoste-Julien, Konstantina Palla, Alex Davies, Gjergji Kasneci, Thore Graepel, et al. SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases. KDD 2013 - The 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug 2013, Chicago, United States. pp.572-580, ⟨10.1145/2487575.2487592⟩. ⟨hal-00918671⟩

Collections

ENS-PARIS CNRS INRIA INRIA2 PSL

3166 views

438 downloads
