Conference papers

Similarity Aware Shuffling for the Distributed Execution of SQL Window Functions

Abstract : Window functions are extremely useful and have become increasingly popular, allowing ranking, cumulative sums and other analytic aggregations to be computed over a highly flexible and configurable sliding window. This expressiveness naturally comes at the expense of heavy computational requirements which, so far, have been addressed through optimizations around centralized approaches by work from both industry and academia. Distribution and parallelization have the potential to improve performance, but introduce several challenges associated with data distribution that may harm data locality. In this paper, we show how data similarity can be employed across partitions during the distributed execution of these operators to improve data co-locality between instances of a Distributed Query Engine and the associated data storage nodes. Our contribution attains average network gains of 3 times and is expected to scale as the number of instances increases. In the scenario with 8 nodes, we were able to attain bandwidth and time savings of 7.3 times and 2.61 times, respectively.
Cited literature [17 references]
Contributor : Hal Ifip
Submitted on : Friday, May 25, 2018 - 3:17:47 PM
Last modification on : Friday, May 25, 2018 - 3:50:02 PM
Long-term archiving on : Sunday, August 26, 2018 - 1:53:53 PM


Files produced by the author(s)


Distributed under a Creative Commons Attribution 4.0 International License



Fábio Coelho, Miguel Matos, José Pereira, Rui Oliveira. Similarity Aware Shuffling for the Distributed Execution of SQL Window Functions. 17th IFIP International Conference on Distributed Applications and Interoperable Systems (DAIS), Jun 2017, Neuchâtel, Switzerland. pp.3-18, ⟨10.1007/978-3-319-59665-5_1⟩. ⟨hal-01800128⟩


