Conference papers

On Achieving Efficient Data Transfer for Graph Processing in Geo-Distributed Datacenters

Abstract : Graph partitioning is important for optimizing the performance and communication cost of large graph processing jobs. Recently, many graph applications such as social networks store their data on geo-distributed datacenters (DCs) to provide services worldwide with low latency. This poses new challenges for existing graph partitioning methods, due to the costly Wide Area Network (WAN) usage and the multiple levels of network heterogeneity in geo-distributed DCs. In this paper, we propose a geo-aware graph partitioning method named G-Cut, which aims at minimizing the inter-DC data transfer time of graph processing jobs in geo-distributed DCs while satisfying the WAN usage budget. G-Cut adopts two novel optimization phases, which separately address the challenges of WAN usage and network heterogeneity. G-Cut can also be applied to partition dynamic graphs thanks to its lightweight runtime overhead. We evaluate the effectiveness and efficiency of G-Cut using real-world graphs, both on real geo-distributed DCs and in simulations. Evaluation results show that G-Cut can reduce the inter-DC data transfer time by up to 58% and the WAN usage by up to 70% compared to state-of-the-art graph partitioning methods, while incurring a low runtime overhead.

Contributor : Shadi Ibrahim
Submitted on : Tuesday, July 11, 2017 - 12:13:13 PM
Last modification on : Wednesday, April 27, 2022 - 4:42:04 AM
Long-term archiving on : Wednesday, January 24, 2018 - 8:26:08 PM





Amelie Zhou, Shadi Ibrahim, Bingsheng He. On Achieving Efficient Data Transfer for Graph Processing in Geo-Distributed Datacenters. ICDCS'17 : IEEE 37th International Conference on Distributed Computing Systems, Jun 2017, Atlanta, United States. ⟨10.1109/ICDCS.2017.98⟩. ⟨hal-01560187⟩


