Conference papers

Approximated Summarization of Data Provenance

Eleanor Ainy 1, Pierre Bourhis 2, Susan Davidson 3, Daniel Deutch 1, Tova Milo 1
2 LINKS - Linking Dynamic Data
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille - UMR 9189
Abstract : Many modern applications involve collecting large amounts of data from multiple sources, and then aggregating and manipulating it in intricate ways. The complexity of such applications, combined with the size of the collected data, makes it difficult to understand how the resulting information was derived. Data provenance has proven helpful in this respect; however, maintaining and presenting the full and exact provenance information may be infeasible due to its size and complexity. We therefore introduce the notion of approximated summarized provenance, which provides a compact representation of the provenance at the possible cost of information loss. Based on this notion, we present a novel provenance summarization algorithm which, given the semantics of the underlying data and the intended use of provenance, outputs a summary of the input provenance. Experiments measure the conciseness and accuracy of the resulting provenance summaries, and the improvement in provenance usage time.
Document type : Conference papers

Contributor : Inria Links
Submitted on : Sunday, October 4, 2015 - 5:52:59 PM
Last modification on : Wednesday, April 6, 2022 - 3:48:01 PM


  • HAL Id : hal-01211286, version 1


Eleanor Ainy, Pierre Bourhis, Susan Davidson, Daniel Deutch, Tova Milo. Approximated Summarization of Data Provenance. CIKM, Oct 2015, Melbourne, Australia. ⟨hal-01211286⟩


