HAL will be down for maintenance from Friday, June 10 at 4pm through Monday, June 13 at 9am. More information
Skip to Main content Skip to Navigation
Journal articles

Cost-Aware Partitioning for Efficient Large Graph Processing in Geo-Distributed Datacenters

Abstract : Graph processing is an emerging computation model for a wide range of applications and graph partitioning is important for optimizing the cost and performance of graph processing jobs. Recently, many graph applications store their data on geo-distributed datacenters (DCs) to provide services worldwide with low latency. This raises new challenges to existing graph partitioning methods, due to the multi-level heterogeneities in network bandwidth and communication prices in geo-distributed DCs. In this paper, we propose an efficient graph partitioning method named Geo-Cut, which takes both the cost and performance objectives into consideration for large graph processing in geo-distributed DCs.Geo-Cut adopts two optimization stages. First, we propose a cost-aware streaming heuristic and utilize the one-pass streaming graph partitioning method to quickly assign edges to different DCs while minimizing inter-DC data communication cost. Second, we propose two partition refinement heuristics which identify the performance bottlenecks of geo-distributed graph processing and refine the partitioning result obtained in the first stage to reduce the inter-DC data transfer time while satisfying the budget constraint. Geo-Cut can be also applied to partition dynamic graphs thanks to its lightweight runtime overhead. We evaluate the effectiveness and efficiency of Geo-Cut using real-world graphs with both real geo-distributed DCs and simulations. Evaluation results show that Geo-Cut can reduce the inter-DC data transfer time by up to 79% (42% as the median) and reduce the monetary cost by up to 75% (26% as the median) compared to state-of-the-art graph partitioning methods with a low overhead.
Complete list of metadata

Cited literature [37 references]  Display  Hide  Download

Contributor : Shadi Ibrahim Connect in order to contact the contributor
Submitted on : Tuesday, December 3, 2019 - 2:14:41 PM
Last modification on : Wednesday, April 27, 2022 - 3:53:40 AM
Long-term archiving on: : Wednesday, March 4, 2020 - 7:33:48 PM


Files produced by the author(s)



Amelie Chi Zhou, Bingkun Shen, Yao Xiao, Shadi Ibrahim, Bingsheng He. Cost-Aware Partitioning for Efficient Large Graph Processing in Geo-Distributed Datacenters. IEEE Transactions on Parallel and Distributed Systems, Institute of Electrical and Electronics Engineers, 2019, pp.1707-1723. ⟨10.1109/TPDS.2019.2955494⟩. ⟨hal-02389120⟩



Record views


Files downloads