Master thesis

Internship report: Quotient RDF graph summarization

Pawel Guzewicz 1, 2
1 CEDAR - Rich Data Analytics at Cloud Scale
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France
Abstract : This report describes the main results obtained during my six-month research internship at Inria Saclay. I worked under the supervision of Ioana Manolescu on the topic "Exploring RDF graphs: expressivity and scale". I had already been working with my supervisor since November 2017, before my employment began, so the internship continued our scientific collaboration in a full-time setting. The main focus of this research was the development of theoretical and practical solutions for the summarization of heterogeneous graphs, RDF graphs in particular. Throughout my internship I was involved in three projects: "Quotient RDF Summaries Based on Type Hierarchies", "Compact Summaries of Rich Heterogeneous Graphs" and "Distributed RDF Graph Summarization". My internship focused on large data graphs with heterogeneous structure, possibly featuring typed data and an ontology, such as RDF graphs. The aim is to find a compact yet informative representation of such graphs. Before my arrival in the team in March 2018, two new graph node equivalence relations, which lead to quotient summaries, had been introduced in a technical report [9]. The authors also presented extensions that capture the semantic information encoded in the RDF graph by treating the types and the ontology in a special way, along with an RDF graph summarization framework. Most of my internship was devoted to novel summarization algorithms. I also worked on an extended, novel treatment of typed triples during summarization.
Contributor : Pawel Guzewicz
Submitted on : Wednesday, September 26, 2018
Last modification on : Monday, February 10, 2020 - 6:14:08 PM
HAL Id : hal-01879898



Pawel Guzewicz. Internship report: Quotient RDF graph summarization. Databases [cs.DB]. 2018.



