
Measuring and Constraining Data Quality with Analytic Workflows

Abstract : One challenging aspect of data quality modeling and management is to provide flexible, declarative and appropriate ways to express requirements on the quality of data. The paper presents a framework for specifying and checking constraints on data quality in RDBMS. The evaluation of the quality of data (QoD) is based on the declaration of data quality metrics that are computed and combined into so-called QoD analytic workflows. These workflows are designed as a composition of statistical methods and data mining techniques used to detect patterns of anomalies in the data sets. As metadata, they are used to characterize various quantifiable dimensions of data quality (e.g., completeness, freshness, consistency, accuracy). The paper proposes a query language extension for constraining data quality when querying both data and its associated QoD metadata. Probabilistic approximate constraints are checked to determine whether the quality of data is acceptable for building quality-constrained query results.
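As a rough illustration of the idea in the abstract, the sketch below computes one QoD metric (completeness) per record and then checks an approximate constraint over those scores. It is not the paper's framework or query language: the function names, the sample records, and the reading of "approximate" as "at least a given fraction of records reaches a threshold" are all assumptions made for this example.

```python
from typing import Optional, Sequence


def completeness(values: Sequence[Optional[object]]) -> float:
    """Fraction of non-missing values; None counts as missing."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def satisfies_approximately(
    scores: Sequence[float], threshold: float, min_fraction: float
) -> bool:
    """True if at least `min_fraction` of the scores reach `threshold`.

    Hypothetical stand-in for an approximate constraint: the constraint
    may be violated by some records, as long as enough of them satisfy it.
    """
    if not scores:
        return False
    ok = sum(s >= threshold for s in scores)
    return ok / len(scores) >= min_fraction


# Hypothetical sample records; None marks a missing attribute value.
rows = [
    {"name": "Ada", "email": "ada@example.org", "phone": None},
    {"name": "Bob", "email": None, "phone": None},
    {"name": "Cy", "email": "cy@example.org", "phone": "555-0100"},
    {"name": "Di", "email": "di@example.org", "phone": "555-0101"},
]

# One completeness score per record, kept alongside the data as metadata.
row_scores = [completeness(list(r.values())) for r in rows]

# Accept the data set if at least 75% of records are at least 60% complete.
accepted = satisfies_approximately(row_scores, threshold=0.6, min_fraction=0.75)
```

Here three of the four records reach the 0.6 completeness threshold, so the constraint holds at a required fraction of 0.75 and would fail at a stricter fraction such as 0.9. Other dimensions named in the abstract (freshness, consistency, accuracy) would need their own metric functions.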
Document type : Conference paper

Cited literature [25 references]
Contributor : Laure Berti-Equille
Submitted on : Friday, August 10, 2018 - 5:03:46 PM
Last modification on : Monday, April 4, 2022 - 9:28:20 AM
Long-term archiving on : Sunday, November 11, 2018 - 1:31:39 PM


Files produced by the author(s)


  • HAL Id : hal-01856351, version 1


Laure Berti-Equille. Measuring and Constraining Data Quality with Analytic Workflows. Proceedings of the 6th International Workshop on Quality in Databases in conjunction with the International Conference on Very Large Databases (VLDB 2008), Aug 2008, Auckland, New Zealand. ⟨hal-01856351⟩


