Answering Provenance-Aware Queries on RDF Data Cubes under Memory Budgets

Luis Galárraga 1, 2, Kim Ahlstrøm 2, Katja Hose 2, Torben Pedersen 2
1 LACODAM - Large Scale Collaborative Data Mining
Inria Rennes – Bretagne Atlantique, IRISA-D7 - GESTION DES DONNÉES ET DE LA CONNAISSANCE
Abstract : The steadily growing popularity of semantic data on the Web and the support for aggregation queries in SPARQL 1.1 have propelled interest in Online Analytical Processing (OLAP) and data cubes in RDF. Query processing in such settings is challenging because SPARQL OLAP queries usually contain many triple patterns with grouping and aggregation. Moreover, one important factor of query answering on Web data is its provenance, i.e., metadata about its origin. Some applications in data analytics and access control require augmenting the data with provenance metadata and running queries that impose constraints on this provenance. This task is called provenance-aware query answering. In this paper, we investigate the benefit of caching some parts of an RDF cube augmented with provenance information when answering provenance-aware SPARQL queries. We propose provenance-aware caching (PAC), a caching approach based on a provenance-aware partitioning of RDF graphs, and a benefit model for RDF cubes and SPARQL queries with aggregation. Our results on real and synthetic data show that PAC significantly outperforms the LRU (least recently used) strategy and the Jena TDB native caching in terms of hit rate and response time.
Document type :
Conference papers

https://hal.inria.fr/hal-01931333
Contributor : Galárraga Luis
Submitted on : Thursday, November 22, 2018 - 4:49:55 PM
Last modification on : Friday, September 13, 2019 - 9:49:21 AM
Long-term archiving on : Saturday, February 23, 2019 - 3:59:25 PM

Citation

Luis Galárraga, Kim Ahlstrøm, Katja Hose, Torben Pedersen. Answering Provenance-Aware Queries on RDF Data Cubes under Memory Budgets. ISWC 2018 - 17th International Semantic Web Conference, Oct 2018, Monterey, United States. pp.547-565, ⟨10.1007/978-3-030-00671-6_32⟩. ⟨hal-01931333⟩
