Query Selectivity Estimation Based on Improved V-optimal Histogram by Introducing Information about Distribution of Boundaries of Range Query Conditions

Abstract : Selectivity estimation is used by a query optimizer for early estimation of the size of the data that satisfies a query condition. Selectivity is calculated using an estimator of the distribution of values of the attribute involved in the processed query condition. A histogram built on attribute values from a database may serve as such a representation of the distribution. The paper introduces a new query-distribution-aware V-optimal histogram (qda-V-optimal histogram), which is useful in selectivity estimation for a range query. It takes into account either a 1-D distribution of attribute values or a 2-D distribution of boundaries of already processed queries. The advantages of the qda-V-optimal histogram appear when it is applied to selectivity estimation of range query conditions that form so-called hot regions. To obtain the proposed error-optimal histogram we use a dynamic programming method and Fuzzy C-Means clustering of a set of range boundaries.
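As background for the abstract, the following is a minimal sketch of the classic V-optimal histogram built by dynamic programming, together with a range selectivity estimate derived from it. It is not the qda-V-optimal histogram proposed in the paper: it ignores the 2-D distribution of query boundaries and the Fuzzy C-Means step. The function names `v_optimal_buckets` and `estimate_selectivity`, the representation of the attribute domain as indices 0..n-1, and the uniform-within-bucket assumption are all illustrative choices, not taken from the paper.

```python
from typing import List, Tuple


def v_optimal_buckets(freqs: List[float], num_buckets: int) -> List[Tuple[int, int]]:
    """Split freqs into contiguous half-open buckets [start, end) that minimise
    the total within-bucket sum of squared errors (classic V-optimal criterion)."""
    n = len(freqs)
    if not 1 <= num_buckets <= n:
        raise ValueError("num_buckets must be between 1 and len(freqs)")

    # Prefix sums of f and f^2 give the SSE of any bucket in O(1).
    s = [0.0] * (n + 1)
    sq = [0.0] * (n + 1)
    for i, f in enumerate(freqs):
        s[i + 1] = s[i] + f
        sq[i + 1] = sq[i] + f * f

    def sse(i: int, j: int) -> float:
        total = s[j] - s[i]
        return (sq[j] - sq[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: minimal error covering freqs[0:j] with exactly k buckets.
    cost = [[inf] * (n + 1) for _ in range(num_buckets + 1)]
    cut = [[0] * (n + 1) for _ in range(num_buckets + 1)]
    cost[0][0] = 0.0
    for k in range(1, num_buckets + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + sse(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    cut[k][j] = i

    # Walk the stored cut points backwards to recover the bucket boundaries.
    buckets = []
    j = n
    for k in range(num_buckets, 0, -1):
        i = cut[k][j]
        buckets.append((i, j))
        j = i
    buckets.reverse()
    return buckets


def estimate_selectivity(freqs: List[float], buckets: List[Tuple[int, int]],
                         lo: int, hi: int) -> float:
    """Estimate the fraction of rows whose value index lies in [lo, hi],
    assuming values are spread uniformly inside each bucket."""
    total = sum(freqs)
    estimate = 0.0
    for start, end in buckets:
        overlap = min(end, hi + 1) - max(start, lo)
        if overlap > 0:
            estimate += sum(freqs[start:end]) * overlap / (end - start)
    return estimate / total
```

A real optimizer would store only the per-bucket totals rather than the full frequency list; the sketch recomputes them from `freqs` to stay short.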

https://hal.inria.fr/hal-01405574
Contributor : Hal Ifip
Submitted on : Wednesday, November 30, 2016 - 10:55:04 AM
Last modification on : Wednesday, October 10, 2018 - 8:52:05 PM
Long-term archiving on : Monday, March 27, 2017 - 7:54:04 AM

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Citation

Dariusz Augustyn. Query Selectivity Estimation Based on Improved V-optimal Histogram by Introducing Information about Distribution of Boundaries of Range Query Conditions. 13th IFIP International Conference on Computer Information Systems and Industrial Management (CISIM), Nov 2014, Ho Chi Minh City, Vietnam. pp.151-164, ⟨10.1007/978-3-662-45237-0_16⟩. ⟨hal-01405574⟩
