Conference papers

Adaptive File Management for Scientific Workflows on the Azure Cloud

Radu Tudoran (1), Alexandru Costan (1), Rad Ramin Rezai (2), Goetz Brasche (2), Gabriel Antoniu (1)
(1) KerData - Scalable Storage for Clouds and Beyond, Inria Rennes – Bretagne Atlantique, IRISA-D1 - SYSTÈMES LARGE ÉCHELLE
(2) Cloud Team, EMIC - European Microsoft Innovation Center
Abstract: Scientific workflows typically communicate data between tasks using files. Currently, on public clouds, this is achieved by using the cloud storage services, which are unable to exploit the workflow semantics and are subject to low throughput and high latencies. To overcome these limitations, we propose an alternative leveraging data locality through direct file transfers between the compute nodes. We rely on the observation that workflows generate a set of common data access patterns that our solution exploits in conjunction with context information to self-adapt, choose the most adequate transfer protocol and expose the data layout within the virtual machines to the workflow engines. This file management system was integrated within the Microsoft Generic Worker workflow engine and was validated using synthetic benchmarks and a real-life application on the Azure cloud. The results show it can bring significant performance gains: up to 5x file transfer speedup compared to solutions based on standard cloud storage and over 25% application timespan reduction compared to Hadoop on Azure.
Document type: Conference papers

Cited literature: 22 references
Contributor: Radu Tudoran
Submitted on: Friday, January 10, 2014 - 10:57:26 AM
Last modification on: Saturday, November 19, 2022 - 3:43:20 AM
Long-term archiving on: Thursday, April 10, 2014 - 10:26:54 PM


Files produced by the author(s)


  • HAL Id: hal-00926748, version 1


Radu Tudoran, Alexandru Costan, Rad Ramin Rezai, Goetz Brasche, Gabriel Antoniu. Adaptive File Management for Scientific Workflows on the Azure Cloud. IEEE Big Data, Oct 2013, Santa Clara, United States. pp. 273-281. ⟨hal-00926748⟩


