
Measuring and Constraining Data Quality with Analytic Workflows

Abstract : One challenging aspect of data quality modeling and management is providing flexible, declarative, and appropriate ways to express requirements on the quality of data. The paper presents a framework for specifying and checking constraints on data quality in relational database management systems (RDBMS). The evaluation of the quality of data (QoD) is based on the declaration of data quality metrics that are computed and combined into so-called QoD analytic workflows. These workflows are designed as compositions of statistical methods and data mining techniques used to detect patterns of anomalies in the data sets. As metadata, they characterize various quantifiable dimensions of data quality (e.g., completeness, freshness, consistency, accuracy). The paper proposes a query language extension for constraining data quality when querying both data and its associated QoD metadata. Probabilistic approximate constraints are checked to determine whether the quality of data is acceptable for building quality-constrained query results.
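As a rough illustration of the idea in the abstract, the following minimal sketch computes two QoD metrics (completeness and freshness) over in-memory rows and returns query results only when the metrics meet stated thresholds. It is not the framework or the query language extension from the paper: the function names, the row layout, and the threshold-with-tolerance check (a simplified stand-in for probabilistic approximate constraints) are all assumptions made for this example.

```python
from datetime import datetime, timedelta

# Hypothetical metric: share of rows with a non-missing value in `column`.
def completeness(rows, column):
    if not rows:
        return 0.0
    return sum(r.get(column) is not None for r in rows) / len(rows)

# Hypothetical metric: share of rows whose timestamp is at most `max_age` old.
def freshness(rows, column, now, max_age):
    if not rows:
        return 0.0
    fresh = sum(
        r.get(column) is not None and now - r[column] <= max_age
        for r in rows
    )
    return fresh / len(rows)

# Simplified stand-in for an approximate constraint:
# each metric must reach its threshold, minus an allowed tolerance.
def satisfies(metrics, constraints):
    return all(
        metrics[name] >= threshold - tolerance
        for name, (threshold, tolerance) in constraints.items()
    )

# Select rows, attach QoD metadata, and withhold the rows if constraints fail.
def quality_constrained_query(rows, predicate, metric_fns, constraints):
    selected = [r for r in rows if predicate(r)]
    metrics = {name: fn(selected) for name, fn in metric_fns.items()}
    ok = satisfies(metrics, constraints)
    return (selected if ok else []), metrics, ok

# Illustrative data only; not taken from the paper.
NOW = datetime(2024, 1, 10)
ROWS = [
    {"id": 1, "email": "a@x.org", "updated": datetime(2024, 1, 9)},
    {"id": 2, "email": None, "updated": datetime(2024, 1, 8)},
    {"id": 3, "email": "c@x.org", "updated": datetime(2023, 12, 1)},
    {"id": 4, "email": "d@x.org", "updated": datetime(2024, 1, 5)},
]
METRIC_FNS = {
    "completeness": lambda rs: completeness(rs, "email"),
    "freshness": lambda rs: freshness(rs, "updated", NOW, timedelta(days=7)),
}

# Lenient constraints: (threshold, tolerance) per metric.
result, metrics, ok = quality_constrained_query(
    ROWS, lambda r: True, METRIC_FNS,
    {"completeness": (0.8, 0.1), "freshness": (0.7, 0.0)},
)
print(metrics, ok, len(result))
```

Returning the metrics alongside the rows mirrors the abstract's notion of querying data together with its QoD metadata; a real system would presumably evaluate such constraints inside the database rather than in application code.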
Document type :
Conference papers

Cited literature [25 references]

https://hal.inria.fr/hal-01856351
Contributor : Laure Berti-Equille
Submitted on : Friday, August 10, 2018 - 5:03:46 PM
Last modification on : Wednesday, June 16, 2021 - 3:42:01 AM
Long-term archiving on : Sunday, November 11, 2018 - 1:31:39 PM

File

QDB2009-lbe.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01856351, version 1

Citation

Laure Berti-Equille. Measuring and Constraining Data Quality with Analytic Workflows. Proceedings of the 6th International Workshop on Quality in Databases in conjunction with the International Conference on Very Large Databases (VLDB 2008), Aug 2008, Auckland, New Zealand. ⟨hal-01856351⟩
