Data Fusion: Resolving Conflicts from Multiple Sources

Abstract : Many data management applications, such as setting up Web portals, managing enterprise data, managing community data, and sharing scientific data, require integrating data from multiple sources. Each of these sources provides a set of values, and different sources can often provide conflicting values. To present quality data to users, it is critical to resolve conflicts and discover values that reflect the real world; this task is called data fusion. This paper describes a novel approach that finds true values from conflicting information when there is a large number of sources, among which some may copy from others. We present a case study on real-world data showing that the described algorithm can significantly improve the accuracy of truth discovery and is scalable when there is a large number of data sources.

https://hal.inria.fr/hal-01855720
Contributor : Laure Berti-Équille
Submitted on : Wednesday, August 8, 2018 - 2:22:54 PM
Last modification on : Thursday, August 9, 2018 - 1:13:55 AM

Identifiers

  • HAL Id : hal-01855720, version 1
  • ARXIV : 1503.00310

Citation

Xin Luna Dong, Laure Berti-Équille, Divesh Srivastava. Data Fusion: Resolving Conflicts from Multiple Sources. International Conference on Web-Age Information Management, Jun 2013, Beidaihe, China. pp.64-76. ⟨hal-01855720⟩
