FRELS: Fast and Reliable Estimated Linguistic Summaries - Inria - Institut national de recherche en sciences et technologies du numérique
Conference paper. Year: 2019

FRELS: Fast and Reliable Estimated Linguistic Summaries

Abstract

The linguistic summarization of a dataset is a process whose complexity depends linearly on the size of the dataset and exponentially on the size of the fuzzy vocabulary. To summarize large datasets stored in relational databases (RDBs) efficiently, reliable estimated cardinalities can be derived from the data-distribution statistics maintained by the RDB management system, without expensive data scans. This paper proposes to improve the precision of such estimated summaries while preserving their efficiency, by enriching the statistics-based approach with local scan-based corrections where needed: the proposed FRELS method provides efficient strategies both for identifying where corrections are needed and for performing them. Experiments conducted on real data show that FRELS remains far more efficient than data-scan-based approaches to data summarization and is more precise than purely statistics-based approaches. Generating estimated linguistic summaries takes a couple of seconds, even for datasets containing millions of tuples, with a reliability of more than 95%.
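To make the idea of a statistics-based estimate concrete, here is a minimal Python sketch. It is not the FRELS algorithm from the paper: it assumes an equi-width histogram as the available statistic, a trapezoidal fuzzy term, and a uniform spread of values inside each bucket (approximated by the bucket midpoint). All names (`Trapezoid`, `Bucket`, `estimated_cardinality`, `buckets_needing_scan`) and the toy data are hypothetical. The last helper flags buckets whose membership degree differs between their two bounds, a simple stand-in for deciding where a local scan might be worthwhile.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal membership function: support [a, d], core [b, c]."""
    a: float
    b: float
    c: float
    d: float

    def mu(self, x: float) -> float:
        if self.b <= x <= self.c:
            return 1.0
        if x <= self.a or x >= self.d:
            return 0.0
        if x < self.b:
            return (x - self.a) / (self.b - self.a)
        return (self.d - x) / (self.d - self.c)


@dataclass(frozen=True)
class Bucket:
    """One histogram bucket [low, high) with its tuple count (assumed statistic)."""
    low: float
    high: float
    count: int


def estimated_cardinality(term: Trapezoid, histogram: Sequence[Bucket]) -> float:
    """Estimate the fuzzy cardinality from bucket counts only (no data scan).

    Assumption: values are spread uniformly in each bucket, so the
    membership degree at the midpoint stands in for the whole bucket.
    """
    return sum(b.count * term.mu((b.low + b.high) / 2) for b in histogram)


def exact_cardinality(term: Trapezoid, values: Sequence[float]) -> float:
    """Scan-based reference: sum of membership degrees over all values."""
    return sum(term.mu(v) for v in values)


def buckets_needing_scan(term: Trapezoid, histogram: Sequence[Bucket]) -> List[Bucket]:
    """Hypothetical heuristic: flag buckets whose membership differs at the bounds."""
    return [b for b in histogram if term.mu(b.low) != term.mu(b.high)]


# Toy data: ten ages and the matching equi-width histogram (width 10).
values = [22, 25, 27, 31, 34, 38, 41, 45, 52, 60]
histogram = [
    Bucket(20.0, 30.0, 3),
    Bucket(30.0, 40.0, 3),
    Bucket(40.0, 50.0, 2),
    Bucket(50.0, 60.0, 1),
    Bucket(60.0, 70.0, 1),
]
young = Trapezoid(0.0, 0.0, 30.0, 40.0)  # fully "young" up to 30, not at all from 40

print(estimated_cardinality(young, histogram))  # 4.5
print(exact_cardinality(young, values))         # approximately 4.7
print(buckets_needing_scan(young, histogram))   # only the [30, 40) bucket
```

On this toy data the estimate differs from the scan-based value only because of the [30, 40) bucket, where the membership degree varies; scanning just that bucket would close the gap without reading the rest of the data.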
Main file
versionSoumise.pdf (376.42 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02116137, version 1 (30-04-2019)

Identifiers

  • HAL Id: hal-02116137, version 1

Cite

Grégory Smits, Pierre Nerzic, Marie-Jeanne Lesot, Olivier Pivert. FRELS: Fast and Reliable Estimated Linguistic Summaries. IEEE International Conference on Fuzzy Systems, Jun 2019, New Orleans, United States. ⟨hal-02116137⟩