Conference papers

Towards Scalable Data Management for Map-Reduce-based Data-Intensive Applications on Cloud and Hybrid Infrastructures

Gabriel Antoniu (1), Julien Bigot (2), Christophe Blanchet (3), Luc Bougé (1), François Briant (4), Franck Cappello (5, 6, 7), Alexandru Costan (1), Frédéric Desprez (2), Gilles Fedak (2), Sylvain Gault (2), Kate Keahey (8), Bogdan Nicolae (5, 6), Christian Pérez (2), Anthony Simonet (2), Frédéric Suter (9), Bing Tang (2), Raphael Terreux (3)
1 KerData - Scalable Storage for Clouds and Beyond
Inria Rennes – Bretagne Atlantique , IRISA-D1 - SYSTÈMES LARGE ÉCHELLE
2 AVALON - Algorithms and Software Architectures for Distributed and HPC Platforms
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
5 GRAND-LARGE - Global parallel and distributed computing
LRI - Laboratoire de Recherche en Informatique, LIFL - Laboratoire d'Informatique Fondamentale de Lille, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
Abstract : As Map-Reduce emerges as a leading programming paradigm for data-intensive computing, today's frameworks that support it still have substantial shortcomings that limit its potential scalability. In this paper we discuss several directions where there is room for progress: storage efficiency under massive data access concurrency, scheduling, volatility and fault-tolerance. We place our discussion in the perspective of the current evolution towards an increasing integration of large-scale distributed platforms (clouds, cloud federations, enterprise desktop grids, etc.). We propose an approach that aims to overcome the limitations of existing Map-Reduce frameworks, in order to achieve scalable, concurrency-optimized, fault-tolerant Map-Reduce data processing on hybrid infrastructures. This approach will be evaluated with real-life bio-informatics applications on existing Nimbus-powered cloud testbeds interconnected with desktop grids.
Contributor : Gabriel Antoniu
Submitted on : Friday, April 20, 2012 - 11:43:30 AM
Last modification on : Friday, November 18, 2022 - 9:24:15 AM
Long-term archiving on : Saturday, July 21, 2012 - 2:20:32 AM




  • HAL Id : hal-00684866, version 1


Gabriel Antoniu, Julien Bigot, Christophe Blanchet, Luc Bougé, François Briant, et al. Towards Scalable Data Management for Map-Reduce-based Data-Intensive Applications on Cloud and Hybrid Infrastructures. 1st International IBM Cloud Academy Conference - ICA CON 2012, Apr 2012, Research Triangle Park, North Carolina, United States. ⟨hal-00684866⟩


