Documents Associated With Scientific Events (Year: 2009)

New Directions for Data Quality Mining

As data types and data structures change to keep up with evolving technologies and applications, data quality problems have also evolved and become more complex. Data streams, web logs, wikis, biomedical applications, video streams and social networking websites generate a mind-boggling variety of data types. Data quality mining, the use of data mining to manage, measure and improve data quality, has focused mostly on addressing each category of data glitch separately, as a static entity. In this tutorial we highlight new directions in data quality mining, particularly: (a) the applicability and effectiveness of the methodologies for various data types such as structured, semi-structured and stream data; (b) the detection of concomitant data glitches, such as outliers occurring in data that also contain missing values and duplicates; and (c) the design of sequential approaches to data quality mining, such as workflows composed of a sequence of tasks for data quality exploration and analysis. We give a brief overview of past work, introduce current research in this area, and highlight new directions and open problems in data quality mining. The tutorial includes extensive case studies, applications and practical examples.
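As a rough illustration of what "concomitant data glitches" in point (b) can look like in practice, the sketch below flags missing values, duplicate keys and outliers in a single pass and records which of them co-occur on the same record. It is not taken from the tutorial: the function name `flag_glitches`, the record layout, and the choice of a median/MAD-based modified z-score with a cutoff of 3.5 are all assumptions made for this example.

```python
from statistics import median

def flag_glitches(records, key="id", field="value", threshold=3.5):
    """Return (record, glitches) pairs, where glitches is the set of
    glitch types found on that record (hypothetical helper)."""
    # Robust location and scale, computed on the non-missing values only.
    # Duplicated rows are still included here, so they influence the statistics.
    values = [r[field] for r in records if r[field] is not None]
    med = median(values)
    mad = median(abs(v - med) for v in values)

    seen = set()
    flagged = []
    for r in records:
        glitches = set()
        if r[key] in seen:
            glitches.add("duplicate")  # key already seen on an earlier record
        seen.add(r[key])
        if r[field] is None:
            glitches.add("missing")
        elif mad > 0 and 0.6745 * abs(r[field] - med) / mad > threshold:
            glitches.add("outlier")  # modified z-score above the cutoff
        flagged.append((r, glitches))
    return flagged

# Made-up sample data: one missing value, one duplicated key, one extreme value.
records = [
    {"id": 1, "value": 10.0},
    {"id": 2, "value": 11.0},
    {"id": 3, "value": None},
    {"id": 4, "value": 9.5},
    {"id": 5, "value": 250.0},
    {"id": 5, "value": 250.0},
    {"id": 6, "value": 10.5},
]
result = flag_glitches(records)
for rec, glitches in result:
    print(rec["id"], sorted(glitches))
```

In this sample the second record with `id` 5 is both a duplicate and an outlier, which is the kind of overlap that treating each glitch type separately would miss.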
Main file
tutorial-KDD09-Berti-Equille-Dasu.pdf (2.21 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01856320, version 1 (10-08-2018)



Laure Berti-Équille, Tamraparni Dasu. New Directions for Data Quality Mining. International Conference on Knowledge Discovery and Data Mining (KDD 2009), Jun 2009, Paris, France. ⟨hal-01856320⟩