J. Dean and S. Ghemawat, MapReduce: a flexible data processing tool, Communications of the ACM, vol.53, issue.1, pp.72-77, 2010.
DOI : 10.1145/1629175.1629198

T. White, Hadoop: The Definitive Guide, 2012.

P. Mell and T. Grance, The NIST Definition of Cloud Computing, NIST Special Publication 800-145, 2011.
DOI : 10.6028/NIST.SP.800-145

J. Anjos, G. Fedak, and C. Geyer, BIGhybrid: A Toolkit for Simulating MapReduce in Hybrid Infrastructures, 2014 International Symposium on Computer Architecture and High Performance Computing Workshop (SBAC-PADW), pp.132-137, 2014.

H. Casanova, A. Giersch, A. Legrand, M. Quinson, and F. Suter, Versatile, scalable, and accurate simulation of distributed applications and platforms, Journal of Parallel and Distributed Computing, vol.74, issue.10, pp.2899-2917, 2014.
DOI : 10.1016/j.jpdc.2014.06.008

URL : https://hal.archives-ouvertes.fr/hal-01017319

G. Fedak, H. He, and F. Cappello, BitDew: A programmable environment for large-scale data management and distribution, SC 2008: International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-12, 2008.
DOI : 10.1109/SC.2008.5213939

URL : https://hal.archives-ouvertes.fr/inria-00216126

M. Moca, G. Silaghi, and G. Fedak, Distributed Results Checking for MapReduce in Volunteer Computing, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW), pp.1847-1854, 2011.

B. Nicolae, D. Moise, G. Antoniu, L. Bougé, and M. Dorier, BlobSeer: Bringing high throughput under heavy concurrency to Hadoop Map-Reduce applications, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp.1-11, 2010.
DOI : 10.1109/IPDPS.2010.5470433

URL : https://hal.archives-ouvertes.fr/inria-00456801

W. Kolberg, P. Marcos, J. Anjos, A. Miyazaki, C. Geyer et al., MRSG - A MapReduce simulator over SimGrid, Parallel Computing, vol.39, issue.4-5, pp.233-244, 2013.
DOI : 10.1016/j.parco.2013.02.001

URL : https://hal.archives-ouvertes.fr/hal-00931855

J. Anjos, I. Carrera, W. Kolberg, A. Tibola, L. Arantes et al., MRA++: Scheduling and data placement on MapReduce for heterogeneous environments, Future Generation Computer Systems, vol.42, pp.22-35, 2015.
DOI : 10.1016/j.future.2014.09.001

URL : https://hal.archives-ouvertes.fr/hal-01197424

D. Kondo, B. Javadi, A. Iosup, and D. Epema, The Failure Trace Archive: Enabling Comparative Analysis of Failures in Diverse Distributed Systems, 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, pp.398-407, 2010.
DOI : 10.1109/CCGRID.2010.71

URL : https://hal.archives-ouvertes.fr/inria-00433523

Grid'5000, URL : https://www.grid5000.fr/mediawiki/index, 2015.

G. Antoniu, J. Bigot, C. Blanchet, L. Bouge, F. Briant et al., Scalable data management for map-reduce-based data-intensive applications: a view for cloud and hybrid infrastructures, International Journal of Cloud Computing, vol.2, issue.2/3, pp.150-170, 2013.
DOI : 10.1504/IJCC.2013.055265

B. Tang and G. Fedak, Analysis of Data Reliability Tradeoffs in Hybrid Distributed Storage Systems, 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), pp.1546-1555, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00757091

L. Lu, H. Jin, X. Shi, and G. Fedak, Assessing MapReduce for Internet Computing: A Comparison of Hadoop and BitDew-MapReduce, 2012 ACM/IEEE 13th International Conference on Grid Computing, pp.76-84, 2012.
DOI : 10.1109/Grid.2012.31

URL : https://hal.archives-ouvertes.fr/hal-00757070

Q. Zou, X. Li, W. Jiang, Z. Lin, G. Li et al., Survey of MapReduce frame operation in bioinformatics, Briefings in Bioinformatics, vol.15, issue.4, 2013.
DOI : 10.1093/bib/bbs088

A. McKenna, M. Hanna, E. Banks, A. Sivachenko, K. Cibulskis et al., The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data, Genome Research, vol.20, issue.9, pp.1297-1303, 2010.
DOI : 10.1101/gr.107524.110

R. Kinsella, A. Kahari, S. Haider, J. Zamora, G. Proctor et al., Ensembl BioMarts: a hub for data retrieval across taxonomic space, Database, vol.2011, pp.1-9, 2011.
DOI : 10.1093/database/bar030

C. Jayalath, J. Stephen, and P. Eugster, From the Cloud to the Atmosphere: Running MapReduce across Data Centers, IEEE Transactions on Computers, vol.63, issue.1, pp.74-87, 2014.
DOI : 10.1109/TC.2013.121

R. Tudoran, A. Costan, R. Wang, L. Bouge, and G. Antoniu, Bridging Data in the Clouds: An Environment-Aware System for Geographically Distributed Data Transfers, 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp.92-101, 2014.
DOI : 10.1109/CCGrid.2014.86

URL : https://hal.archives-ouvertes.fr/hal-00978153

K. Krish, A. Anwar, and A. Butt, hatS: A Heterogeneity-Aware Tiered Storage for Hadoop, 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp.502-511, 2014.
DOI : 10.1109/CCGrid.2014.51

Z. Zheng, Y. Gui, F. Wu, and G. Chen, STAR: Strategy-Proof Double Auctions for Multi-Cloud, Multi-Tenant Bandwidth Reservation, IEEE Transactions on Computers, issue.99, pp.1-14, 2014.

D. Anderson, BOINC: A System for Public-Resource Computing and Storage, Fifth IEEE/ACM International Workshop on Grid Computing, pp.4-10, 2004.
DOI : 10.1109/GRID.2004.14

G. Fedak, C. Germain, V. Neri, and F. Cappello, XtremWeb: a generic global computing system, Proceedings of the First IEEE/ACM International Symposium on Cluster Computing and the Grid, pp.582-587, 2001.

S. Delamare, G. Fedak, D. Kondo, and O. Lodygensky, SpeQuloS, Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing, HPDC '12, pp.173-186, 2012.
DOI : 10.1145/2287076.2287106

URL : https://hal.archives-ouvertes.fr/hal-00757074

S. Ostermann, K. Plankensteiner, R. Prodan, T. Fahringer, M. Guarracino et al., GroudSim: An Event-Based Simulation Framework for Computational Grids and Clouds, Lecture Notes in Computer Science, pp.305-313, 2010.

R. Calheiros, R. Ranjan, A. Beloglazov, C. De Rose, and R. Buyya, CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms, Software: Practice and Experience, vol.41, issue.1, pp.23-50, 2011.
DOI : 10.1002/spe.995

A. Sulistio, U. Cibej, S. Venugopal, B. Robic, and R. Buyya, A toolkit for modelling and simulating data Grids: an extension to GridSim, Concurrency and Computation: Practice and Experience, vol.20, issue.13, pp.1591-1609, 2008.
DOI : 10.1002/cpe.1307

M. Bux and U. Leser, DynamicCloudSim, Proceedings of the 2nd ACM SIGMOD Workshop on Scalable Workflow Execution Engines and Technologies, SWEET '13, pp.85-99, 2015.
DOI : 10.1145/2499896.2499897

R. Buyya, R. Ranjan, and R. Calheiros, InterCloud: Utility-oriented Federation of Cloud Computing Environments for Scaling of Application Services, Proceedings of the 10th International Conference on Algorithms and Architectures for Parallel Processing - Volume Part I, ICA3PP'10, pp.13-31, 2010.

A. Kohne, M. Spohr, L. Nagel, and O. Spinczyk, FederatedCloudSim, Proceedings of the 2nd International Workshop on CrossCloud Systems, CCB '14, pp.1-3, 2014.
DOI : 10.1145/2676662.2676674

W. Tang, J. Jenkins, F. Meyer, R. Ross, R. Kettimuthu et al., Data-Aware Resource Scheduling for Multicloud Workflows: A Fine-Grained Simulation Approach, 2014 IEEE 6th International Conference on Cloud Computing Technology and Science, pp.887-892, 2014.
DOI : 10.1109/CloudCom.2014.19

J. Schad, J. Dittrich, and J. Quiané-Ruiz, Runtime measurements in the cloud, Proc. VLDB Endow., pp.460-471, 2010.
DOI : 10.14778/1920841.1920902

Y. Chen, A. Ganapathi, R. Griffith, and R. Katz, The Case for Evaluating MapReduce Performance Using Workload Suites, 2011 IEEE 19th Annual International Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication Systems, pp.390-399, 2011.
DOI : 10.1109/MASCOTS.2011.12

J. Ekanayake, S. Pallickara, and G. Fox, MapReduce for Data Intensive Scientific Analyses, 2008 IEEE Fourth International Conference on eScience, pp.277-284, 2008.
DOI : 10.1109/eScience.2008.59

C. Dobre and F. Xhafa, Parallel Programming Paradigms and Frameworks in Big Data Era, International Journal of Parallel Programming, vol.13, issue.4, pp.710-738, 2014.
DOI : 10.1007/s10766-013-0272-7

R. Tudoran, A. Costan, and G. Antoniu, MapIterativeReduce, Proceedings of the Third International Workshop on MapReduce and its Applications, MapReduce '12, pp.9-16, 2012.
DOI : 10.1145/2287016.2287019

URL : https://hal.archives-ouvertes.fr/hal-00684814

T. Hirofuchi and A. Lebre, Adding Virtual Machine Abstractions Into SimGrid: A First Step Toward the Simulation of Infrastructure-as-a-Service Concerns, 2013 International Conference on Cloud and Green Computing, pp.175-180, 2013.
DOI : 10.1109/CGC.2013.33

URL : https://hal.archives-ouvertes.fr/hal-00861848

B. Javadi, D. Kondo, J. Vincent, and D. Anderson, Discovering Statistical Models of Availability in Large Distributed Systems: An Empirical Study of SETI@home, IEEE Transactions on Parallel and Distributed Systems, vol.22, issue.11, 2011.
DOI : 10.1109/TPDS.2011.50

URL : https://hal.archives-ouvertes.fr/hal-00788745

S. Huang, J. Huang, J. Dai, T. Xie, and B. Huang, The HiBench benchmark suite: Characterization of the MapReduce-based data analysis, 2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW), pp.41-51, 2010.

S. Makridakis, Time-Series Analysis and Forecasting: An Update and Evaluation, International Statistical Review / Revue Internationale de Statistique, vol.46, issue.3, pp.255-278, 1978.
DOI : 10.2307/1402374

R. Jain, The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling, 1991.