Query-Oriented Summarization of RDF Graphs

Šejla Čebirić (1, 2, 3), François Goasdoué (4), Ioana Manolescu (1, 2, 3)
1 CEDAR - Rich Data Analytics at Cloud Scale
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France
4 SHAMAN - Symbolic and Human-centric view of dAta MANagement
Abstract: The Resource Description Framework (RDF) is the W3C's graph data model for Semantic Web applications. We study the problem of RDF graph summarization: given an input RDF graph G, find an RDF graph HG that summarizes G as accurately as possible while being possibly orders of magnitude smaller than G. Summaries are intended to help with RDF graph exploration, as well as with query formulation and optimization. We devise four kinds of RDF graph summaries obtained as quotient graphs, whose equivalence relations reflect the similarity between nodes with respect to their types or connections. We also study whether these summaries enjoy the formal properties of representativeness (HG should represent as much information about G as possible) and accuracy (HG should avoid, to the extent possible, reflecting information that is not in G). Finally, we report on experiments conducted on several synthetic and real-life RDF graphs.
Šejla Čebirić, François Goasdoué, Ioana Manolescu. Query-Oriented Summarization of RDF Graphs. BDA (Bases de Données Avancées), Nov 2016, Poitiers, France. ⟨hal-01363625⟩
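
The quotient-graph idea mentioned in the abstract can be illustrated with a minimal sketch. It is not one of the paper's four summaries: the equivalence relation used here (two nodes are equivalent when they have the same set of outgoing properties) is an assumption chosen for illustration only, and the names `quotient_summary`, `outgoing_property_key`, and the toy triples are hypothetical.

```python
from collections import defaultdict


def outgoing_property_key(triples):
    """Return a function mapping each node to a label for its equivalence class.

    Assumption (illustrative only, not the paper's definition): two nodes are
    equivalent when they have exactly the same set of outgoing properties.
    """
    out = defaultdict(set)
    for s, p, _ in triples:
        out[s].add(p)

    def key(node):
        # Nodes with no outgoing property all fall into the class "{}".
        return "{" + ",".join(sorted(out.get(node, ()))) + "}"

    return key


def quotient_summary(triples, key):
    """Build the quotient graph: replace every node by its class label."""
    return {(key(s), p, key(o)) for s, p, o in triples}


# Toy graph G: two authors, each with one paper that has a title.
triples = [
    ("a1", "author", "p1"),
    ("a2", "author", "p2"),
    ("p1", "title", "t1"),
    ("p2", "title", "t2"),
]

key = outgoing_property_key(triples)
summary = quotient_summary(triples, key)
```

In this toy case the four edges of G collapse into two summary edges, and every edge of G has an image in the summary, which is the intuition behind representativeness for quotient-based summaries.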



