FRELS: Fast and Reliable Estimated Linguistic Summaries

Abstract : The linguistic summarization of a dataset is a process whose complexity depends linearly on the size of the dataset and exponentially on the size of the fuzzy vocabulary. To summarize large datasets stored in relational databases (RDBs) efficiently, reliable estimated cardinalities can be derived from the statistics about the data distribution that the RDB management system maintains, without expensive data scans. This paper proposes to improve the precision of such estimated summaries while preserving their efficiency, by enriching the statistics-based approach with local scan-based corrections when needed: the proposed FRELS method provides efficient strategies both for identifying where corrections are needed and for performing them. Experiments conducted on real data show that FRELS remains far more efficient than data-scan-based approaches to data summarization and offers better precision than purely statistics-based approaches. The generation of estimated linguistic summaries takes a couple of seconds, even for datasets containing millions of tuples, with a reliability of more than 95%.
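The idea of estimating cardinalities from statistics and correcting them locally can be illustrated with a minimal sketch. It is not the FRELS implementation: the abstract does not detail the statistics used or the correction strategies, so everything below is an assumption made for illustration. The sketch assumes a histogram of (lower bound, upper bound, tuple count) buckets, a uniform distribution inside each bucket, and trapezoidal fuzzy terms. The names `trapezoid`, `sampled_degrees`, `estimate_cardinality`, `corrected_cardinality` and `scan_bucket` are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

# Assumed statistic: (lower bound, upper bound, tuple count) per bucket.
Bucket = Tuple[float, float, int]
Membership = Callable[[float], float]


def trapezoid(a: float, b: float, c: float, d: float) -> Membership:
    """Trapezoidal membership function with support [a, d] and core [b, c]."""
    def mu(x: float) -> float:
        if b <= x <= c:
            return 1.0
        if x <= a or x >= d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)
        return (d - x) / (d - c)
    return mu


def sampled_degrees(mu: Membership, lo: float, hi: float, samples: int) -> List[float]:
    """Membership degrees at evenly spaced midpoints inside one bucket."""
    step = (hi - lo) / samples
    return [mu(lo + (i + 0.5) * step) for i in range(samples)]


def estimate_cardinality(mu: Membership, histogram: Iterable[Bucket],
                         samples: int = 10) -> float:
    """Statistics-only estimate: count times mean membership, no data scan."""
    total = 0.0
    for lo, hi, count in histogram:
        degrees = sampled_degrees(mu, lo, hi, samples)
        total += count * sum(degrees) / samples
    return total


def corrected_cardinality(mu: Membership, histogram: Iterable[Bucket],
                          scan_bucket: Callable[[float, float], Iterable[float]],
                          samples: int = 10) -> float:
    """Estimate with local corrections.

    Assumed criterion: a bucket is scanned only when the membership degree
    varies inside it, since the uniformity assumption matters only there.
    scan_bucket(lo, hi) is a hypothetical stand-in for a range query that
    returns the actual values stored in that bucket.
    """
    total = 0.0
    for lo, hi, count in histogram:
        degrees = sampled_degrees(mu, lo, hi, samples)
        if max(degrees) > min(degrees):
            total += sum(mu(v) for v in scan_bucket(lo, hi))
        else:
            total += count * degrees[0]
    return total
```

In this sketch, only the buckets that straddle the slope of a fuzzy term trigger a scan, so the amount of data read grows with the number of such buckets rather than with the size of the dataset. The midpoint sampling is a simple choice made for readability; an exact integral of the trapezoid over each bucket would serve equally well.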

https://hal.inria.fr/hal-02116137
Contributor : Grégory Smits
Submitted on : Tuesday, April 30, 2019 - 5:10:26 PM
Last modification on : Friday, July 5, 2019 - 3:26:03 PM

File

versionSoumise.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02116137, version 1

Citation

Grégory Smits, Pierre Nerzic, Marie-Jeanne Lesot, Olivier Pivert. FRELS: Fast and Reliable Estimated Linguistic Summaries. IEEE International Conference on Fuzzy Systems, Jun 2019, New Orleans, United States. ⟨hal-02116137⟩

Metrics

  • Record views : 28
  • File downloads : 515