DataSteward: Using Dedicated Compute Nodes for Scalable Data Management on Public Clouds

Radu Tudoran (1), Alexandru Costan (1), Gabriel Antoniu (1)
(1) KerData - Scalable Storage for Clouds and Beyond
Inria Rennes – Bretagne Atlantique, IRISA-D1 - SYSTÈMES LARGE ÉCHELLE
Abstract: A large spectrum of scientific applications, some generating data volumes exceeding petabytes, are currently being ported to clouds to build on their inherent elasticity and scalability. One of the critical needs in dealing with this "data deluge" is efficient, scalable and reliable storage. However, the storage services offered by cloud providers suffer from high latencies, trading performance for availability. One alternative is to federate the local virtual disks of the compute nodes into a globally shared storage used for large intermediate or checkpoint data. This collocated storage supports high throughput, but it can be very intrusive and is subject to failures that can stop the host node and degrade application performance. To deal with these limitations, we propose DataSteward, a data management system that provides a higher degree of reliability while remaining non-intrusive through the use of dedicated compute nodes. DataSteward harnesses the storage space of a set of dedicated VMs, selected using a topology-aware clustering algorithm, and its lifetime is tied to that of the deployment. To capitalize on this separation, we introduce a set of scientific data processing services on top of the storage layer that can overlap with the executing applications. We performed extensive experiments on hundreds of cores in the Azure cloud: compared to state-of-the-art node selection algorithms, we show up to 20% higher throughput, which improves the overall performance of a real-life scientific application by up to 45%.
Document type:
Conference paper
Proceedings of the 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, Jul 2013, Melbourne, Australia. IEEE, pp. 1057-1064, 2013. 〈10.1109/TrustCom.2013.129〉

https://hal.inria.fr/hal-00927283
Contributor: Radu Tudoran
Submitted on: Sunday, January 12, 2014 - 16:37:31
Last modified on: Wednesday, May 16, 2018 - 11:23:28
Document(s) archived on: Saturday, April 8, 2017 - 14:26:17

File

bare_conf.pdf
Files produced by the author(s)

Citation

Radu Tudoran, Alexandru Costan, Gabriel Antoniu. DataSteward: Using Dedicated Compute Nodes for Scalable Data Management on Public Clouds. Proceedings of the 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, Jul 2013, Melbourne, Australia. IEEE, pp. 1057-1064, 2013. 〈10.1109/TrustCom.2013.129〉. 〈hal-00927283〉
