The 11th International Workshop on Quality in DataBases in conjunction with VLDB 2016

Laure Berti-Equille; Christoph Quix; Verikat Gudivada; Rihan Hai; Hongzhi Wang

Proceedings. Year: 2016

Laure Berti-Equille

Role: Author
PersonId: 19540
IdHAL: laure-berti-equille
ORCID: 0000-0002-8046-0570
IdRef: 130675725

UMR 228 Espace-Dev, Espace pour le développement

Christoph Quix

Role: Author

Verikat Gudivada

Role: Author

Rihan Hai

Role: Author

Hongzhi Wang

Role: Author

Abstract

Data quality problems arise frequently when data is integrated from disparate sources. In the context of Big Data applications, data quality is becoming more important because of the unprecedented volume, large variety, and high velocity of the data. The challenges caused by the volume and velocity of Big Data have been addressed by many research projects and commercial solutions and can be partially solved by modern, scalable data management systems. However, variety remains a daunting challenge for Big Data Integration and also requires special methods for data quality management. Variety (or heterogeneity) exists at several levels: at the instance level, the same entity might be described with different attributes; at the schema level, the data is structured with various schemas; and at the level of the modeling language, different data models can be used (e.g., relational, XML, or a document-oriented JSON representation). This might lead to data quality issues in dimensions such as consistency, understandability, or completeness. The heterogeneity of data sources in the Big Data era requires new integration approaches that can handle the large volume and speed of the generated data as well as its variety and quality. Traditional ‘schema first’ approaches, as in the relational world with data warehouse systems and ETL (Extract-Transform-Load) processes, are inappropriate for a flexible and dynamically changing data management landscape. The requirement for pre-defined, explicit schemas is a limitation that has drawn the interest of many developers and researchers to NoSQL data management systems, as these systems are intended to provide data management features for large amounts of schema-less data. Nevertheless, a one-size-fits-all Big Data system is unlikely to meet all the requirements placed on data management systems today. Instead, multiple classes of systems, optimized for specific requirements or hardware platforms, will co-exist in a data management landscape.
Thus, heterogeneity and data quality are challenges for many Big Data applications. In some applications, limited data quality for individual data items does not cause serious problems when a huge amount of data is aggregated; however, data quality problems in data sources are often revealed when these sources are integrated with other information. Data quality has been characterized as ‘fitness for use’; thus, if data is used in a context other than the one originally planned, data quality might become an issue. Similar observations have also been made for data warehouses, which led to a separate research area on data warehouse quality. The workshop QDB 2016 aims at discussing recent advances and challenges in data quality management in database systems, and focuses especially on problems related to Big Data Integration and Big Data Quality. The workshop will provide a forum for the presentation of research results, a panel discussion, and a keynote speech.

Keywords

Big Data Integration and Quality

Domains

Web, Artificial Intelligence [cs.AI], Databases [cs.DB], Machine Learning [cs.LG]

https://inria.hal.science/hal-01856096

Submitted on: Thursday, August 9, 2018, 16:32:52

Last modified on: Wednesday, November 15, 2023, 10:52:16

Dates and versions

hal-01856096, version 1 (09-08-2018)

Identifiers

HAL Id: hal-01856096, version 1

Cite

Laure Berti-Equille, Christoph Quix, Verikat Gudivada, Rihan Hai, Hongzhi Wang. The 11th International Workshop on Quality in DataBases in conjunction with VLDB 2016. International Quality in Databases workshop (QDB 2016) in conjunction with VLDB 2016, Sep 2016, Delhi, India. 2016. ⟨hal-01856096⟩

Collections

UNIV-AVIGNON UNIV-AG AFRIQ UNIV-PERP ESPACE-DEV AGROPOLIS GUYANE MIPS UNIV-MONTPELLIER
