Journal Article, Information Visualization, Year: 2021

The Missing Path: Diagnosing Incompleteness in Linked Data

Marie Destandau
Jean-Daniel Fekete

Abstract

The Semantic Web is an interoperable ecosystem where data producers, such as libraries, public institutions, communities, and companies, publish and link heterogeneous resources. To support this heterogeneity, its format, RDF, allows describing collections of items that share some attributes but not necessarily all of them. This flexible framework leads to incompleteness and inconsistencies in information representation, which in turn leads to unreliable query results. To make their data reliable and usable, Linked Data producers need to provide the best possible level of completeness. We propose a novel visualization tool, "The Missing Path", to support data producers in diagnosing incompleteness in their data. It relies on dimensionality reduction techniques to create a map of RDF entities based on missing paths, revealing clusters of entities missing the same paths. The novelty of our work consists in describing the entities of interest as vectors of aggregated RDF paths of a fixed length. We show that identifying groups of items sharing a similar structure helps users find the cause of incompleteness for entire groups and allows them to decide if and how it has to be resolved. We describe our iterative design process and evaluation with Wikidata contributors.
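
The idea of describing entities as vectors of paths can be illustrated with a minimal sketch. It is not the authors' implementation: the toy triples, the entity names, the `MAX_DEPTH` bound, and all function names are hypothetical, and the dimensionality reduction step is replaced here by a plain grouping of entities that have exactly the same missing paths.

```python
from collections import defaultdict

# Toy triples (subject, predicate, object); every identifier is hypothetical.
TRIPLES = [
    ("book1", "author", "alice"),
    ("book1", "publisher", "acme"),
    ("book2", "author", "bob"),
    ("book3", "publisher", "acme"),
    ("book4", "publisher", "acme"),
    ("alice", "birthDate", "1970"),
    ("bob", "name", "Bob"),
]
ENTITIES = ["book1", "book2", "book3", "book4"]
MAX_DEPTH = 2  # assumed bound on path length


def paths_from(entity, triples, max_depth):
    """Return the set of predicate paths (as tuples) of length <= max_depth."""
    outgoing = defaultdict(list)
    for s, p, o in triples:
        outgoing[s].append((p, o))
    found = set()
    frontier = [(entity, ())]
    for _ in range(max_depth):
        next_frontier = []
        for node, path in frontier:
            for p, o in outgoing.get(node, []):
                new_path = path + (p,)
                found.add(new_path)
                next_frontier.append((o, new_path))
        frontier = next_frontier
    return found


def missing_path_vectors(entities, triples, max_depth):
    """Encode each entity as a binary vector: 1 where a path is missing."""
    per_entity = {e: paths_from(e, triples, max_depth) for e in entities}
    all_paths = sorted(set().union(*per_entity.values()))
    vectors = {
        e: tuple(0 if p in per_entity[e] else 1 for p in all_paths)
        for e in entities
    }
    return all_paths, vectors


def group_by_signature(vectors):
    """Group entities whose missing-path vectors are identical."""
    groups = defaultdict(list)
    for entity, vector in vectors.items():
        groups[vector].append(entity)
    return dict(groups)


all_paths, vectors = missing_path_vectors(ENTITIES, TRIPLES, MAX_DEPTH)
for signature, members in group_by_signature(vectors).items():
    missing = [p for p, flag in zip(all_paths, signature) if flag]
    print(members, "are missing", missing)
```

In this toy data, `book3` and `book4` end up in the same group because both lack every path that starts with `author`, which hints at a shared cause. A tool like the one described would instead project such vectors onto a 2D map so that similar, not only identical, signatures appear close together.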

Dates and versions

hal-02612896, version 1 (19-05-2020)

Cite

Marie Destandau, Jean-Daniel Fekete. The Missing Path: Diagnosing Incompleteness in Linked Data. Information Visualization, 2021, 20 (1), pp.66-82. ⟨10.1177/1473871621991539⟩. ⟨hal-02612896⟩