Pay-As-You-Go Data Integration Using Functional Dependencies

Naser Ayat 1 Hamideh Afsarmanesh 1 Reza Akbarinia 2 Patrick Valduriez 2
2 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: Setting up a full data integration system for many application contexts, e.g., web and scientific data management, requires significant human effort, which prevents it from scaling. In this paper, we propose IFD (Integration based on Functional Dependencies), a pay-as-you-go data integration system that integrates a given set of data sources and can incrementally integrate additional sources. IFD uses the background knowledge implied by functional dependencies to match the source schemas. Our system is built on a probabilistic data model that captures the uncertainty in data integration systems. Our performance evaluation shows significant gains in recall and precision compared to the baseline approaches. The results confirm the importance of functional dependencies and the contribution of a probabilistic data model to improving the quality of schema matching. The analytical study and experiments show that IFD scales well.

https://hal.inria.fr/hal-01542460
Contributor: Hal Ifip
Submitted on: Monday, June 19, 2017 - 5:01:39 PM
Last modification on: Tuesday, April 16, 2019 - 6:26:02 PM
Document(s) archived on: Friday, December 15, 2017 - 10:21:30 PM

File

978-3-642-32498-7_28_Chapter.p...
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Citation

Naser Ayat, Hamideh Afsarmanesh, Reza Akbarinia, Patrick Valduriez. Pay-As-You-Go Data Integration Using Functional Dependencies. International Cross-Domain Conference and Workshop on Availability, Reliability, and Security (CD-ARES), Aug 2012, Prague, Czech Republic. pp.375-389, ⟨10.1007/978-3-642-32498-7_28⟩. ⟨hal-01542460⟩
