Conference papers

Incorporating Probabilistic Optimizations for Resource Provisioning of Data Processing Workflows

Abstract : Workflows are an important model for big data processing, and resource provisioning is crucial to their performance. Recently, system variations in the cloud and in large-scale clusters, such as variations in I/O and network performance, have been observed to greatly affect workflow performance. Traditional resource provisioning methods overlook these variations and can therefore produce suboptimal provisioning results. In this paper, we provide a general solution for workflow performance optimization that takes system variations into account. Specifically, we model system variations as time-dependent random variables and take their probability distributions as optimization input. Despite its effectiveness, this solution incurs heavy computation overhead. We therefore propose three pruning techniques that simplify the workflow structure and reduce the probability evaluation overhead. We implement our techniques in a runtime library, which allows users to incorporate efficient probabilistic optimization into existing resource provisioning methods. Experiments show that probabilistic solutions can improve performance by 51% compared to state-of-the-art static solutions while satisfying the budget constraint, and that our pruning techniques can greatly reduce the overhead of probabilistic optimization.
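
The contrast between a static estimate and a probabilistic one can be illustrated with a minimal sketch. It is not the paper's runtime library or its pruning techniques: the workflow, the task names, the runtime samples, and the function names are all hypothetical, and the sketch ignores time dependence by drawing each task runtime from a fixed empirical distribution. It uses plain Monte Carlo sampling to estimate the probability that a small workflow finishes within a deadline, next to the single number a mean-based static estimate would give.

```python
import random
from statistics import mean

# Hypothetical workflow: task -> list of predecessor tasks (a small DAG).
WORKFLOW = {
    "load": [],
    "clean": ["load"],
    "stats": ["load"],
    "report": ["clean", "stats"],
}

# Hypothetical per-task runtime samples in seconds (e.g., from profiling),
# standing in for the probability distributions used as optimization input.
RUNTIME_SAMPLES = {
    "load": [10, 12, 30],
    "clean": [20, 22, 25],
    "stats": [5, 18, 40],
    "report": [8, 9, 10],
}


def makespan(runtimes):
    """Finish time of the last task, given one concrete runtime per task."""
    finish = {}
    # WORKFLOW is written in topological order, so predecessors come first.
    for task, preds in WORKFLOW.items():
        start = max((finish[p] for p in preds), default=0.0)
        finish[task] = start + runtimes[task]
    return max(finish.values())


def static_estimate():
    """Static view: replace each distribution with its mean."""
    return makespan({t: mean(s) for t, s in RUNTIME_SAMPLES.items()})


def probabilistic_estimate(deadline, trials=10_000, seed=0):
    """Monte Carlo estimate of P(makespan <= deadline)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        draw = {t: rng.choice(s) for t, s in RUNTIME_SAMPLES.items()}
        hits += makespan(draw) <= deadline
    return hits / trials


if __name__ == "__main__":
    deadline = static_estimate()
    print("static makespan estimate:", round(deadline, 2))
    print("P(makespan <= static estimate):", probabilistic_estimate(deadline))
```

Under these assumed samples, the static estimate is a single makespan value, while the sampled estimate shows how often that value would actually be met. Evaluating every candidate provisioning plan this way is the kind of repeated probability evaluation whose cost the abstract describes as heavy.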

Cited literature [32 references]
Contributor : Shadi Ibrahim
Submitted on : Tuesday, December 10, 2019 - 11:10:12 AM
Last modification on : Tuesday, January 5, 2021 - 4:26:24 PM
Long-term archiving on : Wednesday, March 11, 2020 - 3:19:28 PM


Files produced by the author(s)



Amelie Chi Zhou, Yao Xiao, Bingsheng He, Shadi Ibrahim, Reynold Cheng. Incorporating Probabilistic Optimizations for Resource Provisioning of Data Processing Workflows. ICPP 2019 - 48th International Conference on Parallel Processing, Aug 2019, Kyoto, Japan. pp.1-10, ⟨10.1145/3337821.3337847⟩. ⟨hal-02389078⟩


