Discovering and Leveraging Content Similarity to Optimize Collective On-Demand Data Access to IaaS Cloud Storage

Abstract: A critical feature of IaaS cloud computing is the ability to quickly disseminate the content of a shared dataset at large scale. In this context, a common pattern is the collective on-demand read, i.e., a large number of VM instances concurrently accessing the same VM image or dataset. Various techniques avoid I/O contention on the storage service that holds the dataset without relying on pre-broadcast. Most of them employ peer-to-peer collaboration: the VM instances exchange information about the content they accessed at runtime, so that missing data pieces can be fetched directly from each other rather than from the storage system. However, such techniques are often limited to the group that performs a collective read. Given the high data redundancy in large IaaS data centers and the multiple users that simultaneously run VM instance groups performing collective reads, an important opportunity arises: enabling unrelated VM instances belonging to different groups to collaborate and exchange common data in order to further reduce the I/O pressure on the storage system. This paper addresses the challenges posed by such a solution, which call for novel techniques to efficiently detect and leverage common data pieces across groups. To this end, we introduce a low-overhead fingerprint-based approach, which we evaluate and show to be efficient in practice for a representative scenario on dozens of nodes and a variety of group configurations.
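
The abstract does not specify how fingerprints are computed or exchanged, so the following is only a minimal sketch of the general idea of detecting common data pieces through content fingerprints. It assumes fixed-size chunks and SHA-256 digests; the names `chunk_fingerprints`, `common_chunks`, and `CHUNK_SIZE` are hypothetical and are not taken from the paper.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical chunk size in bytes, kept tiny for illustration


def chunk_fingerprints(data: bytes, chunk_size: int = CHUNK_SIZE) -> dict:
    """Map each chunk fingerprint (SHA-256 hex digest) to the chunk indices with that content."""
    index = {}
    for offset in range(0, len(data), chunk_size):
        digest = hashlib.sha256(data[offset:offset + chunk_size]).hexdigest()
        index.setdefault(digest, []).append(offset // chunk_size)
    return index


def common_chunks(a: bytes, b: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Return (index_in_a, index_in_b) pairs of chunks whose fingerprints match."""
    index_b = chunk_fingerprints(b, chunk_size)
    pairs = []
    for offset in range(0, len(a), chunk_size):
        digest = hashlib.sha256(a[offset:offset + chunk_size]).hexdigest()
        for j in index_b.get(digest, []):
            pairs.append((offset // chunk_size, j))
    return pairs


# Two made-up datasets read by unrelated groups; they share two chunks.
image_a = b"AAAABBBBCCCC"
image_b = b"XXXXBBBBAAAA"
shared = common_chunks(image_a, image_b)
```

In this toy setting, a VM instance reading `image_a` could in principle obtain any chunk listed in `shared` from a peer that already holds the matching chunk of `image_b`, instead of going to the storage service.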
Document type:
Conference paper
CCGrid'15: 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 2015, Shenzhen, China

Cited literature: 27 references

https://hal.inria.fr/hal-01138684
Contributor: Bogdan Nicolae
Submitted on: Thursday, April 2, 2015 - 14:29:39
Last modified on: Thursday, April 2, 2015 - 15:14:22
Document(s) archived on: Tuesday, April 18, 2017 - 09:02:13

File

paper.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01138684, version 1

Citation

Bogdan Nicolae, Andrzej Kochut, Alexei Karve. Discovering and Leveraging Content Similarity to Optimize Collective On-Demand Data Access to IaaS Cloud Storage. CCGrid'15: 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 2015, Shenzhen, China. 〈hal-01138684〉
