Data Exchange with MapReduce: A First Cut

Abstract: Data exchange is one of the oldest database problems, being of both practical and theoretical interest. Given the pace at which heterogeneous data are published on the web, thanks to initiatives such as Linked Data and Open Science, the scalability of data exchange becomes crucial. Pivotal to data exchange is the chase, a fixpoint algorithm that evaluates both source-to-target constraints and target constraints in the data exchange process. In this paper, we investigate how new programming models such as MapReduce can be used to implement the chase on large-scale data sources. To the best of our knowledge, how to exchange data at scale has not been investigated so far. We present an initial solution for chasing source-to-target tuple-generating dependencies and target tuple-generating dependencies, and discuss open issues that need to be addressed to leverage MapReduce for the data exchange problem.
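
As a rough illustration of the idea summarized in the abstract, the following minimal sketch simulates a chase as a sequence of map/reduce rounds in plain Python. It is not the algorithm of the paper. The relations `Emp`, `Dept` and `Staff`, the two dependencies, and the helpers `map_st`, `map_target`, `run_round` and `chase` are all hypothetical. The sketch also assumes Skolem-style labeled nulls (one null per department) rather than a fresh null per trigger, so that duplicate triggers collapse in the reduce step.

```python
# Hypothetical s-t tgd:    Emp(name, dept)  ->  exists M . Dept(dept, M)
# Hypothetical target tgd: Dept(dept, mgr)  ->  Staff(mgr, dept)
# A fact is a pair (relation_name, tuple_of_values).

def map_st(fact):
    """Map step for the s-t tgd: emit the target facts a source fact triggers."""
    rel, args = fact
    if rel == "Emp":
        _name, dept = args
        # Assumption: the existential M becomes a labeled null keyed by dept.
        yield ("Dept", (dept, f"N_{dept}"))

def map_target(fact):
    """Map step for the target tgd: emit the facts a target fact triggers."""
    rel, args = fact
    if rel == "Dept":
        dept, mgr = args
        yield ("Staff", (mgr, dept))

def run_round(mapper, facts):
    """One simulated MapReduce round, run in memory."""
    groups = {}
    for fact in facts:
        for out in mapper(fact):
            # Shuffle: the emitted fact itself is the grouping key.
            groups.setdefault(out, []).append(out)
    # Reduce: keep one copy of each emitted fact.
    return set(groups)

def chase(source):
    """Apply the s-t tgd once, then the target tgd until a fixpoint is reached."""
    target = run_round(map_st, source)
    while True:
        new_facts = run_round(map_target, target) - target
        if not new_facts:
            return target
        target |= new_facts

source = {("Emp", ("ann", "db")), ("Emp", ("bob", "db"))}
result = chase(source)
```

Here both `Emp` facts trigger the same `Dept` fact, which the reduce step deduplicates, and the loop stops once a round adds nothing new. A real deployment would replace `run_round` with jobs on an actual MapReduce engine and would need joins for dependencies whose bodies span several atoms.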
Document type: Conference paper

https://hal.inria.fr/hal-01401594
Contributor: Angela Bonifati
Submitted on: Wednesday, November 23, 2016 - 3:50:11 PM
Last modification on: Wednesday, November 20, 2019 - 3:09:38 AM


Citation

Khalid Belhajjame, Angela Bonifati. Data Exchange with MapReduce: A First Cut. International Conference on Scientific and Statistical Database Management (SSDBM), Jul 2016, Budapest, Hungary. pp.22:1-22:4, ⟨10.1145/2949689.2949702⟩. ⟨hal-01401594⟩
