inria-00323248, version 1

Distributed Management of Massive Data: an Efficient Fine-Grain Data Access Scheme

Bogdan Nicolae a1, Gabriel Antoniu (corresponding author) b1, Luc Bougé c1

VECPAR '08: Proceedings of the 8th International Conference on High Performance Computing for Computational Science 5336 (2008) 532-543

Abstract: This paper addresses the problem of efficiently storing and accessing massive data blocks in a large-scale distributed environment, while providing efficient fine-grain access to data subsets. This issue is crucial in the context of applications in the field of databases, data mining and multimedia. We propose a data sharing service based on distributed, RAM-based storage of data, while leveraging a DHT-based, natively parallel metadata management scheme. As opposed to the most commonly used grid storage infrastructures that provide mechanisms for explicit data localization and transfer, we provide a transparent access model, where data are accessed through global identifiers. Our proposal has been validated through a prototype implementation whose preliminary evaluation provides promising results.

  • a –  Université Rennes I
  • b –  INRIA
  • c –  Ecole Normale Supérieure de Cachan
  • 1:  PARIS (INRIA - IRISA)
  • CNRS : UMR6074 – INRIA – École normale supérieure de Cachan - ENS Cachan – Institut National des Sciences Appliquées (INSA) - Rennes – Université de Rennes 1
  • Collaboration : Grid'5000
  • Domain : Computer Science/Distributed, Parallel, and Cluster Computing
  • Keywords : high performance distributed computing – large scale data sharing – distributed data management – lock-free – fine grain access
 
  • oai:hal.inria.fr:inria-00323248
  • Submitted on: Sunday, 12 October 2008 22:45:21
  • Updated on: Monday, 23 April 2012 16:53:20