inria-00323248, version 1

Distributed Management of Massive Data: an Efficient Fine-Grain Data Access Scheme

Bogdan Nicolae a1, Gabriel Antoniu (preferred contact author) b1, Luc Bougé c1

VECPAR '08: Proceedings of the 8th International Conference on High Performance Computing for Computational Science 5336 (2008) 532-543

Abstract: This paper addresses the problem of efficiently storing and accessing massive data blocks in a large-scale distributed environment, while providing efficient fine-grain access to data subsets. This issue is crucial in the context of applications in the field of databases, data mining and multimedia. We propose a data sharing service based on distributed, RAM-based storage of data, while leveraging a DHT-based, natively parallel metadata management scheme. As opposed to the most commonly used grid storage infrastructures that provide mechanisms for explicit data localization and transfer, we provide a transparent access model, where data are accessed through global identifiers. Our proposal has been validated through a prototype implementation whose preliminary evaluation provides promising results.

  • a –  Université Rennes I
  • b –  INRIA
  • c –  Ecole Normale Supérieure de Cachan
  • 1 :  PARIS (INRIA - IRISA)
  • CNRS : UMR6074 – INRIA – École normale supérieure de Cachan - ENS Cachan – Institut National des Sciences Appliquées (INSA) - Rennes – Université de Rennes 1
  • Collaboration : Grid'5000
  • Domain: Computer Science / Parallel, distributed, and shared computing
  • Keywords: high performance distributed computing – large scale data sharing – distributed data management – lock-free – fine grain access
 
  • oai:hal.inria.fr:inria-00323248
  • Contributor:
  • Submitted on: Sunday, 12 October 2008, 22:45:21
  • Last modified on: Monday, 23 April 2012, 16:53:20