Query-oriented Summarization of RDF Graphs

Šejla Čebirić (1, 2), François Goasdoué (3, 1), Ioana Manolescu (1, 2)
1 OAK - Database optimizations and architectures for complex large data
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
3 SHAMAN - Symbolic and Human-centric view of dAta MANagement
Abstract: The Resource Description Framework (RDF) is the W3C’s graph data model for Semantic Web applications. We study the problem of RDF graph summarization: given an input RDF graph G, find an RDF graph SG which summarizes G as accurately as possible, while being possibly orders of magnitude smaller than the original graph. Our approach is query-oriented, i.e., querying a summary of a graph should reflect whether the query has some answers against this graph. The summaries are intended to help with query formulation and optimization. We introduce two summaries: a baseline, which is compact and simple and satisfies certain accuracy and representativeness properties but may oversimplify the RDF graph, and a refined one, which trades some of these properties for more accuracy in representing the structure.
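
The abstract does not say how either summary is built, so the following is only a toy sketch of the query-oriented idea, not the authors' baseline or refined summary. It assumes a simple quotient: nodes with the same sets of outgoing and incoming properties are merged into one summary node. The names `summarize` and `has_answers`, the example graph, and the two queries are all hypothetical. Because every triple of G maps to a triple of the summary, a query (with variables in subject and object positions) that has answers on G also has a match on the summary, which is one way to read the representativeness property mentioned above.

```python
from itertools import product

def summarize(triples):
    """Toy quotient summary (an assumption, not the paper's definition):
    merge nodes that share the same outgoing and incoming property sets."""
    out_props, in_props = {}, {}
    for s, p, o in triples:
        out_props.setdefault(s, set()).add(p)
        in_props.setdefault(o, set()).add(p)
    nodes = set(out_props) | set(in_props)
    signature = {
        n: (frozenset(out_props.get(n, ())), frozenset(in_props.get(n, ())))
        for n in nodes
    }
    # Assign one summary node per distinct signature, in a stable order.
    class_ids = {}
    for n in sorted(nodes):
        class_ids.setdefault(signature[n], f"N{len(class_ids)}")
    node_to_class = {n: class_ids[signature[n]] for n in nodes}
    summary = {(node_to_class[s], p, node_to_class[o]) for s, p, o in triples}
    return summary, node_to_class

def has_answers(triples, query):
    """Brute-force check that a conjunctive query has at least one match.
    Query terms starting with '?' are variables; properties are constants."""
    nodes = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    variables = sorted({t for s, _, o in query for t in (s, o) if t.startswith("?")})
    for values in product(nodes, repeat=len(variables)):
        binding = dict(zip(variables, values))
        if all((binding.get(s, s), p, binding.get(o, o)) in triples
               for s, p, o in query):
            return True
    return False

# Hypothetical input graph G: three authors, their employers, their papers.
G = {
    ("alice", "worksFor", "inria"), ("alice", "authorOf", "paper1"),
    ("bob", "worksFor", "cnrs"), ("bob", "authorOf", "paper2"),
    ("carol", "worksFor", "inria"), ("carol", "authorOf", "paper3"),
}
summary, node_to_class = summarize(G)

# q1 has answers on G; q2 does not (papers have no employer).
q1 = [("?x", "worksFor", "?y"), ("?x", "authorOf", "?z")]
q2 = [("?x", "authorOf", "?y"), ("?y", "worksFor", "?z")]
for name, q in (("q1", q1), ("q2", q2)):
    print(name, "on G:", has_answers(G, q), "| on summary:", has_answers(summary, q))
```

In this sketch the six triples of G collapse to two summary triples. The guarantee only runs one way: a match on such a quotient does not by itself prove that G has answers, which is the kind of accuracy trade-off the abstract alludes to.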
