Internship report: Quotient RDF graph summarization

Paweł Guzewicz 1, 2
1 CEDAR - Rich Data Analytics at Cloud Scale
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], Inria Saclay - Ile de France
Abstract : This report describes the main results obtained during my six-month research internship at Inria Saclay. I worked under the supervision of Ioana Manolescu on the topic "Exploring RDF graphs: expressivity and scale". I had already been working with my supervisor since November 2017, before my employment began, so the internship was a full-time continuation of our scientific collaboration. The research focused on developing theoretical and practical solutions for summarizing heterogeneous graphs, RDF graphs in particular. During the internship I was involved in three projects: "Quotient RDF Summaries Based on Type Hierarchies", "Compact Summaries of Rich Heterogeneous Graphs" and "Distributed RDF Graph Summarization". The work targets large data graphs with heterogeneous structure, possibly featuring typed data and an ontology, such as RDF graphs; the aim is to find a compact yet informative representation of such graphs. Before my arrival in the team in March 2018, two new graph node equivalence relations that lead to quotient summaries had been introduced in a technical report [9]. Its authors also presented extensions that capture the semantic information encoded in the RDF graph by treating the types and the ontology in a special way, along with an RDF graph summarization framework. Most of my internship was devoted to novel summarization algorithms. I also worked on an extended, novel treatment of typed triples during summarization.
Cited literature [44 references]

https://hal.inria.fr/hal-01879898
Contributor : Paweł Guzewicz
Submitted on : Wednesday, September 26, 2018 - 5:58:14 PM
Last modification on : Friday, June 14, 2019 - 1:58:55 AM
Long-term archiving on : Thursday, December 27, 2018 - 6:16:20 PM

File

report.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01879898, version 1

Citation

Paweł Guzewicz. Internship report: Quotient RDF graph summarization. Databases [cs.DB]. 2018. ⟨hal-01879898⟩
