Parallel Quotient Summarization of RDF Graphs

Paweł Guzewicz 1, 2 Ioana Manolescu 1, 2
2 CEDAR - Rich Data Analytics at Cloud Scale
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France
Abstract: Discovering the structure and content of an RDF graph is hard for human users because of its heterogeneity, complexity, and possibly large size. One class of tools for this task is structural RDF graph summaries, which allow users to grasp the different connections between RDF graph nodes. RDFQuotient graph summaries are a kind of structural summary that we developed. They are usually very compact, which makes them well suited to first-sight visual discovery. Existing algorithms for building these summaries are centralized and require the graph to fit in memory. In this work, we go beyond these limits and present novel algorithms for building RDFQuotient summaries in a parallel, shared-nothing architecture. We instantiate our algorithms on the Apache Spark platform; our experiments demonstrate the merit of our approach.
Document type: Conference paper

https://hal.inria.fr/hal-02106521
Contributor: Paweł Guzewicz
Submitted on: Tuesday, April 23, 2019 - 10:07:05 AM
Last modification on: Monday, July 8, 2019 - 2:59:23 PM

Citation

Paweł Guzewicz, Ioana Manolescu. Parallel Quotient Summarization of RDF Graphs. Semantic Big Data 2019, Jun 2019, Amsterdam, Netherlands. ⟨10.1145/3323878.3325809⟩. ⟨hal-02106521⟩
