Towards Scalable Data Management for Map-Reduce-based Data-Intensive Applications on Cloud and Hybrid Infrastructures

Abstract : As Map-Reduce emerges as a leading programming paradigm for data-intensive computing, the frameworks that support it today still have substantial shortcomings that limit its scalability. In this paper we discuss several directions where there is room for progress: storage efficiency under massively concurrent data access, scheduling, volatility and fault tolerance. We place our discussion in the perspective of the current evolution towards an increasing integration of large-scale distributed platforms (clouds, cloud federations, enterprise desktop grids, etc.). We propose an approach that aims to overcome the limitations of existing Map-Reduce frameworks, in order to achieve scalable, concurrency-optimized, fault-tolerant Map-Reduce data processing on hybrid infrastructures. This approach will be evaluated with real-life bio-informatics applications on existing Nimbus-powered cloud testbeds interconnected with desktop grids.
Document type :
Conference papers
1st International IBM Cloud Academy Conference - ICA CON 2012, Apr 2012, Research Triangle Park, North Carolina, United States. 2012


https://hal.inria.fr/hal-00684866
Contributor : Gabriel Antoniu
Submitted on : Friday, April 20, 2012 - 11:43:30 AM
Last modification on : Thursday, May 14, 2015 - 1:09:43 AM

File

ICACON2012-MapReduce.pdf

Identifiers

  • HAL Id : hal-00684866, version 1

Citation

Gabriel Antoniu, Julien Bigot, Christophe Blanchet, Luc Bougé, François Briant, et al. Towards Scalable Data Management for Map-Reduce-based Data-Intensive Applications on Cloud and Hybrid Infrastructures. 1st International IBM Cloud Academy Conference - ICA CON 2012, Apr 2012, Research Triangle Park, North Carolina, United States. 2012. <hal-00684866>

Metrics

Record views

516

Document downloads

214