Measuring and Constraining Data Quality with Analytic Workflows - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper, year: 2008

Measuring and Constraining Data Quality with Analytic Workflows

Abstract

One challenging aspect of data quality modeling and management is providing flexible, declarative, and appropriate ways to express requirements on the quality of data. The paper presents a framework for specifying and checking constraints on data quality in relational database management systems (RDBMS). The evaluation of the quality of data (QoD) is based on declared data quality metrics that are computed and combined into so-called QoD analytic workflows. These workflows are designed as compositions of statistical methods and data mining techniques used to detect patterns of anomalies in data sets. Used as metadata, they characterize various quantifiable dimensions of data quality (e.g., completeness, freshness, consistency, accuracy). The paper proposes a query language extension for constraining data quality when querying both data and its associated QoD metadata. Probabilistic approximate constraints are checked to determine whether the quality of data is acceptable for building quality-constrained query results.
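To make the idea of quality-constrained querying concrete, here is a minimal Python sketch. It is not taken from the paper: the `Row` record, the two metric functions, the thresholds, and the reading of an approximate constraint as "at least a given fraction of rows meet the quality predicate" are all illustrative assumptions, and the paper's actual query language extension is not reproduced here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Optional


@dataclass
class Row:
    # Hypothetical record; None marks a missing value.
    name: Optional[str]
    email: Optional[str]
    last_updated: date


def completeness(row: Row) -> float:
    """Fraction of the row's data fields that are not missing."""
    fields = [row.name, row.email]
    return sum(v is not None for v in fields) / len(fields)


def freshness_days(row: Row, today: date) -> int:
    """Age of the row in days (lower means fresher)."""
    return (today - row.last_updated).days


def approx_constraint_holds(
    rows: List[Row], predicate: Callable[[Row], bool], min_fraction: float
) -> bool:
    """Assumed semantics: at least `min_fraction` of rows satisfy `predicate`."""
    if not rows:
        return True  # vacuously satisfied on empty input
    satisfied = sum(1 for r in rows if predicate(r))
    return satisfied / len(rows) >= min_fraction


today = date(2008, 8, 24)
rows = [
    Row("Ada", "ada@example.org", date(2008, 8, 1)),
    Row("Bob", None, date(2008, 7, 15)),
    Row(None, None, date(2007, 1, 1)),
    Row("Eve", "eve@example.org", date(2008, 8, 20)),
]


def acceptable(r: Row) -> bool:
    # Illustrative thresholds: half the fields present, updated within 60 days.
    return completeness(r) >= 0.5 and freshness_days(r, today) <= 60


# Return a result only if the data set as a whole meets the approximate constraint.
quality_ok = approx_constraint_holds(rows, acceptable, min_fraction=0.7)
result = [r for r in rows if acceptable(r)] if quality_ok else []
```

In this sketch the metrics play the role of QoD metadata attached to each row, and the fraction-based check stands in for a probabilistic approximate constraint: the query tolerates some low-quality rows but refuses to answer when too many fail.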
Main file
QDB2009-lbe.pdf (273.26 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01856351, version 1 (10-08-2018)

Identifiers

  • HAL Id: hal-01856351, version 1

Cite

Laure Berti-Equille. Measuring and Constraining Data Quality with Analytic Workflows. Proceedings of the 6th International Workshop on Quality in Databases in conjunction with the International Conference on Very Large Databases (VLDB 2008), Aug 2008, Auckland, New Zealand. ⟨hal-01856351⟩
