Simplifying Entity Resolution on Web Data with Schema-agnostic, Non-iterative Matching

Abstract : Entity Resolution (ER) aims to identify different descriptions in various Knowledge Bases (KBs) that refer to the same entity. ER is challenged by the Variety, Volume and Veracity of the descriptions published in the Web of Data. To address these challenges, we propose the MinoanER framework, which is fully automated and supports highly heterogeneous entities. MinoanER leverages a token-based similarity of entities to define a new metric that derives the similarity of neighboring entities from the most important relations, as indicated by statistics alone. For high efficiency, similarities are computed from a set of schema-agnostic blocks and processed in a non-iterative way that involves four threshold-free heuristics. We demonstrate that the effectiveness of MinoanER is comparable to that of existing ER tools over real KBs exhibiting low heterogeneity in terms of entity types and content. Yet, MinoanER outperforms state-of-the-art ER tools when matching highly heterogeneous KBs.
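
To make the idea of schema-agnostic blocks with a token-based similarity more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the MinoanER implementation: the function names, the toy KBs and the rarity-based token weight are all hypothetical choices made for this example, and the neighbor similarity and the four heuristics mentioned in the abstract are not covered.

```python
import math
import re
from collections import defaultdict


def tokens(description):
    """Return the lowercase tokens of all values, ignoring attribute names."""
    found = set()
    for value in description.values():
        found.update(re.findall(r"\w+", str(value).lower()))
    return found


def token_blocks(kb):
    """Map each token to the ids of the entities in `kb` that contain it."""
    blocks = defaultdict(set)
    for entity_id, description in kb.items():
        for token in tokens(description):
            blocks[token].add(entity_id)
    return blocks


def candidate_similarities(kb1, kb2):
    """Score every cross-KB pair that shares at least one token block."""
    blocks1, blocks2 = token_blocks(kb1), token_blocks(kb2)
    scores = defaultdict(float)
    for token in blocks1.keys() & blocks2.keys():
        ids1, ids2 = blocks1[token], blocks2[token]
        # Illustrative weight (an assumption of this sketch):
        # tokens that appear in fewer entities contribute more.
        weight = 1.0 / math.log2(len(ids1) * len(ids2) + 1)
        for id1 in ids1:
            for id2 in ids2:
                scores[(id1, id2)] += weight
    return dict(scores)


# Toy KBs with different attribute names (hypothetical data).
kb1 = {
    "a1": {"name": "Heraklion Archaeological Museum", "city": "Heraklion"},
    "a2": {"name": "Knossos Palace"},
}
kb2 = {
    "b1": {"label": "Archaeological Museum of Heraklion"},
    "b2": {"title": "Palace of Knossos", "place": "Crete"},
}

scores = candidate_similarities(kb1, kb2)
best = max(scores, key=scores.get)
```

Because blocks are keyed on tokens rather than on attribute names, the two KBs never need aligned schemas, and only pairs that share a block are ever compared. Each block is visited once, so the scoring pass is non-iterative in the sense that no match decision feeds back into later comparisons.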
Document type :
Conference papers

https://hal.inria.fr/hal-01718040
Contributor : Vassilis Christophides
Submitted on : Tuesday, February 27, 2018 - 9:24:33 AM
Last modification on : Friday, January 4, 2019 - 12:55:22 PM
Long-term archiving on : Monday, May 28, 2018 - 6:28:36 PM

File

PID5235409.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01718040, version 1

Citation

Vasilis Efthymiou, George Papadakis, Kostas Stefanidis, Vassilis Christophides. Simplifying Entity Resolution on Web Data with Schema-agnostic, Non-iterative Matching. ICDE 2018 - 34th IEEE International Conference on Data Engineering, Apr 2018, Paris, France. pp.1-4. ⟨hal-01718040⟩
