5G - high-speed radio waves that can make your portfolio fly, 2019.

Fascinating Google Search Statistics, 2018.

M. Abebe, K. Daudjee, B. Glasbergen, and Y. Tian, EC-Store: Bridging the Gap between Storage and Latency in Distributed Erasure Coded Systems, Proceedings of the 38th IEEE International Conference on Distributed Computing Systems (ICDCS), pp.255-266, 2018.

R. L. Ackoff, From data to wisdom, Journal of Applied Systems Analysis, vol.16, pp.3-9, 1989.

A. Ahmed and G. Pierre, Docker-pi: Docker Container Deployment in Fog Computing Infrastructures, International Journal of Cloud Computing, pp.1-20, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02271434

R. K. Ahuja, M. Kodialam, A. K. Mishra, and J. B. Orlin, Computational investigations of maximum flow algorithms, European Journal of Operational Research, 1997.

Akamai's state of the internet connectivity report, 2017.

M. Al-Fares, A. Loukissas, and A. Vahdat, A Scalable, Commodity Data Center Network Architecture, Proceedings of the ACM SIGCOMM Conference on Data Communication (SIGCOMM), pp.63-74, 2008.

W. Allcock, GridFTP: Protocol Extensions to FTP for the Grid, Global Grid Forum, 2003.

W. Allcock, J. Bresnahan, R. Kettimuthu, M. Link, C. Dumitrescu et al., The Globus Striped GridFTP Framework and Server, Proceedings of the ACM/IEEE Conference on Supercomputing (SC), 2005.

A. Homepage, , 2019.

. Amazon and . Pricing, , 2019.

Amazon Elastic Container Service, 2019.

Amazon EMR: Easily Run and Scale Apache Spark, Hadoop, HBase, Presto, Hive, and other Big Data Frameworks, 2019.

Amazon Glacier: Long-term, secure, durable object storage for data archiving, 2019.

G. Ananthanarayanan, A. Ghodsi, S. Shenker, and I. Stoica, Disk-locality in Datacenter Computing Considered Irrelevant, Proceedings of the USENIX Workshop on Hot Topics in Operating Systems (HotOS), 2011.

A. Anwar, M. Mohamed, V. Tarasov, M. Littley, L. Rupprecht et al., Improving Docker Registry Design Based on Production Workload Analysis, Proceedings of the 16th USENIX Conference on File and Storage Technologies (FAST), pp.265-278, 2018.

Apache Flink, 2019.

Apache Hadoop, 2019.

Apache Spark, 2019.

A. Archer, K. Aydin, M. H. Bateni, V. Mirrokni, A. Schild et al., Cache-aware Load Balancing of Data Center Applications, Proceedings of the VLDB Endowment, vol.12, pp.709-723, 2019.

AUFS - Another Union Filesystem, 2018.

AWS Elastic Compute Cloud (EC2), 2019.

AWS Elastic MapReduce, 2019.

AWS Global Infrastructure, 2018.

AWS: Regions and Availability Zones, 2019.

AWS Simple Storage Service (S3), 2019.

Azure Blob Storage, 2019.

V. Bahl, Emergence of micro datacenter (cloudlets/edges) for mobile computing, 2015.

A. C. Baktir, A. Ozgovde, and C. Ersoy, How Can Edge Computing Benefit from Software-Defined Networking: A Survey, Use Cases and Future Directions, IEEE Communications Surveys and Tutorials PP, pp.1-1, 2017.

S. Balakrishnan, R. Black, A. Donnelly, P. England, A. Glass et al., Pelican: A Building Block for Exascale Cold Data Storage, Proceedings of the 11th USENIX Conference on Operating Systems Design and Implementation (OSDI), pp.351-365, 2014.

H. Ballani, P. Costa, T. Karagiannis, and A. Rowstron, Towards Predictable Datacenter Networks, Proceedings of the ACM SIGCOMM Conference on Data Communication (SIGCOMM), pp.242-253, 2011.

G. Basu, S. Nadgowda, and A. Verma, LVD: Lean Virtual Disks, Proceedings of the 15th International Middleware Conference (Middleware), pp.25-36, 2014.

S. Bazarbayev, M. Hiltunen, K. Joshi, W. H. Sanders, and R. Schlichting, Content-Based Scheduling of Virtual Machines (VMs) in the Cloud, Proceedings of the 33rd IEEE International Conference on Distributed Computing Systems (ICDCS), pp.93-101, 2013.

O. Beaumont, T. Lambert, L. Marchal, and B. Thomas, Data-Locality Aware Dynamic Schedulers for Independent Tasks with Replicated Inputs, Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp.1206-1213, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01878977

P. Bellavista and A. Zanni, Feasibility of Fog Computing Deployment Based on Docker Containerization over RaspberryPi, Proceedings of the 18th International Conference on Distributed Computing and Networking (ICDCN), vol.16, pp.1-16, 2017.

J. Benet, IPFS - Content Addressed, Versioned, P2P File System, 2014.

A. B. Bondi, Characteristics of Scalability and Their Impact on Performance, Proceedings of the 2nd International Workshop on Software and Performance (WOSP), pp.195-203, 2000.

F. Bonomi, R. Milito, J. Zhu, and S. Addepalli, Fog Computing and Its Role in the Internet of Things, Proceedings of the 1st Edition of the MCC Workshop on Mobile Cloud Computing (MCC), pp.13-16, 2012.

E. A. Brewer, Towards robust distributed systems (abstract), Proceedings of the 19th Annual ACM Symposium on Principles of Distributed Computing (PODC), p.7, 2000.

C. Broekema, R. V. van Nieuwpoort, and H. E. Bal, Exascale high performance computing in the square kilometer array, Proceedings of the workshop on High-Performance Computing for Astronomy Date, pp.9-16, 2012.

N. Bronson, Z. Amsden, G. Cabrera, P. Chakka, P. Dimov et al., TAO: Facebook's Distributed Data Store for the Social Graph, Proceedings of the USENIX Annual Technical Conference (ATC), pp.49-60, 2013.

R. Buyya, R. Ranjan, and R. N. Calheiros, InterCloud: Utility-Oriented Federation of Cloud Computing Environments for Scaling of Application Services, Algorithms and Architectures for Parallel Processing, pp.13-31, 2010.

B. Calder, J. Wang, A. Ogus, N. Nilakantan, A. Skjolsvold et al., Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency, Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP), pp.143-157, 2011.

R. N. Calheiros, R. Ranjan, and R. Buyya, Virtual machine provisioning based on analytical performance and QoS in cloud computing environments, Proceedings of the International Conference on Parallel Processing, pp.295-304, 2011.

L. Chen, S. Liu, B. Li, and B. Li, Scheduling jobs across geo-distributed datacenters with max-min fairness, Proceedings of the IEEE International Conference on Computer Communications (INFOCOM), pp.1-9, 2017.

P. M. Chen, E. K. Lee, G. A. Gibson, R. H. Katz, and D. A. Patterson, RAID: High-performance, Reliable Secondary Storage, ACM Computing Surveys (CSUR), vol.26, pp.145-185, 1994.

Y. Chen, S. Alspaugh, D. Borthakur, and R. Katz, Energy efficiency for large-scale mapreduce workloads with significant interactive analysis, Proceedings of the 7th ACM European Conference on Computer Systems (EuroSys), pp.43-56, 2012.

Y. Chen, S. Alspaugh, and R. Katz, Interactive Analytical Processing in Big Data Systems: A Cross-industry Study of MapReduce Workloads, Proceedings of the VLDB Endowment (PVLDB), vol.5, pp.1802-1813, 2012.

Y. Chen, S. Alspaugh, and R. H. Katz, Design insights for MapReduce from diverse production workloads, 2012.

Y. L. Chen, S. Mu, J. Li, C. Huang, J. Li et al., Giza: Erasure Coding Objects across Global Data Centers, Proceedings of the USENIX Annual Technical Conference (ATC), pp.539-551, 2017.

M. Chowdhury, S. Kandula, and I. Stoica, Leveraging Endpoint Flexibility in Data-Intensive Clusters, ACM SIGCOMM Computer Communication Review, vol.43, pp.231-242, 2013.

Cloud Bigtable: A petabyte-scale, fully managed NoSQL database service for large analytical and operational workloads, 2019.

B. Cohen, Incentives build robustness in BitTorrent, 2003.

Configure back ends for OpenStack Glance, 2019.

J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost et al., Spanner: Google's Globally Distributed Database, ACM Transactions on Computer Systems (TOCS), vol.31, p.8, 2013.

P. Covington, J. Adams, and E. Sargin, Deep neural networks for YouTube recommendations, Proceedings of the 10th ACM Conference on Recommender Systems, pp.191-198, 2016.

J. Darrous and S. Ibrahim, Enabling Data Processing under Erasure Coding in the Fog, Proceedings of the 48th International Conference on Parallel Processing (ICPP), 2019.
URL : https://hal.archives-ouvertes.fr/hal-02388835

J. Darrous, S. Ibrahim, and C. Perez, Is it time to revisit Erasure Coding in Data-intensive clusters?, Proceedings of the 27th IEEE International Symposium on the Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), pp.165-178, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02263116

J. Darrous, S. Ibrahim, A. C. Zhou, and C. Perez, Nitro: Network-Aware Virtual Machine Image Management in Geo-Distributed Clouds, Proceedings of the 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp.553-562, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01745405

J. Darrous, T. Lambert, and S. Ibrahim, On the Importance of container images placement for service provisioning in the Edge, Proceedings of the 28th International Conference on Computer Communications and Networks (ICCCN), pp.1-9, 2019.

Azure Data Lake: A no-limits data lake to power intelligent action, 2019.

Data preservation: Preserving data for future generations, 2019.

J. Dean and S. Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, Proceedings of the 6th Symposium on Operating Systems Design & Implementation (OSDI), vol.6, pp.10-10, 2004.

G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman et al., Dynamo: Amazon's highly available key-value store, ACM SIGOPS Operating Systems Review, vol.41, pp.205-220, 2007.

T. Dillon, C. Wu, and E. Chang, Cloud computing: issues and challenges, Proceedings of the 24th IEEE International Conference on Advanced Information Networking and Applications (AINA), pp.27-33, 2010.

A. G. Dimakis, P. B. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran, Network Coding for Distributed Storage Systems, IEEE Transactions on Information Theory, vol.56, issue.9, pp.4539-4551, 2010.

F. Dinu and T. S. Ng, RCMP: Enabling Efficient Recomputation Based Failure Resilience for Big Data Analytics, Proceedings of the 28th International Parallel and Distributed Processing Symposium (IPDPS), pp.962-971, 2014.

F. Dinu and T. S. Ng, Understanding the Effects and Implications of Compute Node Related Failures in Hadoop, Proceedings of the 21st International Symposium on High-Performance Parallel and Distributed Computing (HPDC), pp.187-198, 2012.

Docker homepage, 2018.

Docker Hub, 2019.

Docker Hub Image Index, 2019.

L. Du, T. Wo, R. Yang, and C. Hu, Cider: A rapid docker container deployment system through sharing network storage, Proceedings of the IEEE 19th International Conference on High Performance Computing and Communications; IEEE 15th International Conference on Smart City; IEEE 3rd International Conference on Data Science and Systems, pp.332-339, 2017.

J. Ekanayake, S. Pallickara, and G. Fox, MapReduce for Data Intensive Scientific Analyses, Proceedings of the IEEE Fourth International Conference on eScience, pp.277-284, 2008.

Y. Elkhatib, B. Porter, H. B. Ribeiro, M. F. Zhani, J. Qadir et al., On Using Micro-Clouds to Deliver the Fog, IEEE Internet Computing, vol.21, pp.8-15, 2017.

Erasure Code Support in OpenStack Swift, 2019.

Eucalyptus, 2018.

Executive Summary: Data Growth, Business Opportunities, and the IT Imperatives, 2014.

Facebook puts 10,000 Blu-ray discs in low-power storage system, 2014.

B. Fan, W. Tantisiriroj, L. Xiao, and G. Gibson, DiskReduce: RAID for Data-intensive Scalable Computing, Proceedings of the 4th Annual Workshop on Petascale Data Storage, pp.6-10, 2009.

J. Fang, S. Wan, and X. He, RAFI: Risk-Aware Failure Identification to Improve the RAS in Erasure-coded Data Centers, Proceedings of the USENIX Annual Technical Conference (ATC), pp.495-506, 2018.

Fast Data Transfer - FDT, 2019.

W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, An updated performance comparison of virtual machines and Linux containers, Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp.171-172, 2015.

A. Fikes, Colossus, successor to Google File System, 2010.

File Transfer Protocol (FTP), 2019.

D. Ford, F. Labelle, F. I. Popovici, M. Stokely, V.-A. Truong et al., Availability in Globally Distributed Storage Systems, Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation (OSDI), pp.61-74, 2010.

I. Foster and C. Kesselman, The Grid 2: Blueprint for a new computing infrastructure, 2003.

G Suite, 2019.

J. Gantz and D. Reinsel, Extracting Value from Chaos, 2011.

P. X. Gao, A. Narayan, S. Karandikar, J. Carreira, S. Han et al., Network Requirements for Resource Disaggregation, Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI), pp.249-264, 2016.

P. G. Lopez, A. Montresor, D. Epema, A. Datta, T. Higashino et al., Edge-centric Computing: Vision and Challenges, 2015.

M. R. Garey and D. S. Johnson, Computers and intractability, vol.29, 2002.

S. Garfinkel, Architects of the information society: 35 years of the Laboratory for Computer Science at MIT, 1999.

P. Garraghan, P. Townend, and J. Xu, An Empirical Failure-Analysis of a Large-Scale Cloud Computing Environment, Proceedings of the 15th IEEE International Symposium on High-Assurance Systems Engineering, pp.113-120, 2014.

F. Gens, Clouds and Beyond: Positioning for the Next 20 Years in Enterprise IT, 2009.

S. Ghemawat, H. Gobioff, and S. Leung, The Google File System, Proceedings of the 9th ACM Symposium on Operating Systems Principles (SOSP), pp.29-43, 2003.

I. Gog, M. Schwarzkopf, A. Gleave, R. N. M. Watson, and S. Hand, Firmament: Fast, Centralized Cluster Scheduling at Scale, Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), pp.99-115, 2016.

Google Cloud Platform: Regions and Zones, 2019.

Google Datacenter Locations, 2018.

Google, EVERYTHING at Google runs in a container, 2014.

Google Kubernetes Engine, 2019.

A. Greenberg, J. R. Hamilton, N. Jain, S. Kandula, C. Kim et al., VL2: A Scalable and Flexible Data Center Network, Proceedings of the ACM SIGCOMM Conference on Data Communication (SIGCOMM), vol.39, pp.51-62, 2009.

H. S. Gunawi, M. Hao, R. O. Suminto, A. Laksono, A. D. Satria et al., Why Does the Cloud Stop Computing?: Lessons from Hundreds of Service Outages, Proceedings of the 7th ACM Symposium on Cloud Computing (SoCC), pp.1-16, 2016.

K. Ha, P. Pillai, G. Lewis, S. Simanta, S. Clinch et al., The impact of mobile multimedia applications on data center consolidation, Proceedings of the IEEE international conference on cloud engineering (IC2E), pp.166-176, 2013.

A. Haeberlen, A. Mislove, and P. Druschel, Glacier: Highly durable, decentralized storage despite massive correlated failures, Proceedings of the 2nd Symposium on Networked Systems Design & Implementation (NSDI), vol.2, pp.143-158, 2005.

T. Harter, B. Salmon, R. Liu, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau, Slacker: Fast Distribution with Lazy Docker Containers, Proceedings of the 14th USENIX Conference on File and Storage Technologies (FAST), pp.181-195, 2016.

HDFS Erasure Coding.

HDFS-RAID Wiki, 2011.

HDInsight: Easy, cost-effective, enterprise-grade service for open source analytics, 2019.

A. J. G. Hey, S. Tansley, and K. M. Tolle, The Fourth Paradigm: Data-Intensive Scientific Discovery, vol.1, 2009.

C. Hong and B. Varghese, Resource Management in Fog/Edge Computing: A Survey on Architectures, Infrastructure, and Algorithms, ACM Computing Surveys (CSUR), vol.52, p.37, 2019.

C. Hong, S. Kandula, R. Mahajan, M. Zhang, V. Gill et al., Achieving High Utilization with Software-driven WAN, Proceedings of the ACM SIGCOMM Computer Communication Review, vol.43, pp.15-26, 2013.

How many bytes for, 2008.

How Much Data is Produced Every Day?, 2016.

How Much Information?, 2003.

K. Hsieh, A. Harlap, N. Vijaykumar, D. Konomis, G. R. Ganger et al., Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds, Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI), pp.629-647, 2017.

W. Hsu and G. L. Nemhauser, Easy and hard bottleneck location problems, Discrete Applied Mathematics, vol.1, issue.3, pp.209-215, 1979.

P. Hu, S. Dhelim, H. Ning, and T. Qiu, Survey on fog computing: architecture, key technologies, applications and open issues, Journal of Network and Computer Applications, 2017.

Z. Hu, B. Li, and J. Luo, Flutter: Scheduling Tasks Closer to Data Across Geo-Distributed Datacenters, Proceedings of the 35th Annual IEEE International Conference on Computer Communications (INFOCOM), pp.1-9, 2016.

C. Huang, H. Simitci, Y. Xu, A. Ogus, B. Calder et al., Erasure Coding in Windows Azure Storage, Proceedings of the USENIX Annual Technical Conference (ATC), pp.15-26, 2012.

S. Huang, J. Huang, J. Dai, T. Xie, and B. Huang, The HiBench benchmark suite: Characterization of the MapReduce-based data analysis, Proceedings of the 26th IEEE International Conference on Data Engineering Workshops (ICDEW 2010), pp.41-51, 2010.

C.-C. Hung, G. Ananthanarayanan, P. Bodik, L. Golubchik, M. Yu et al., VideoEdge: Processing Camera Streams using Hierarchical Clusters, Proceedings of the IEEE/ACM Symposium on Edge Computing (SEC), pp.115-131, 2018.

C.-C. Hung, G. Ananthanarayanan, L. Golubchik, M. Yu, and M. Zhang, Wide-area Analytics with Multiple Resources, Proceedings of the 13th European Conference on Computer Systems (EuroSys), vol.12, p.16, 2018.

S. Ibrahim, B. He, and H. Jin, Towards Pay-as-you-consume Cloud Computing, Proceedings of the IEEE International Conference on Services Computing (SCC), pp.370-377, 2011.

S. Ibrahim, H. Jin, L. Lu, B. He, G. Antoniu et al., Maestro: Replica-Aware Map Scheduling for MapReduce, Proceedings of the 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp.435-442, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00670813

S. Ibrahim, H. Jin, L. Lu, S. Wu, B. He et al., LEEN: Locality/Fairness-Aware Key Partitioning for MapReduce in the Cloud, Proceedings of the 2nd IEEE International Conference on Cloud Computing Technology and Science (CloudCom), pp.17-24, 2010.

IDC: Expect 175 zettabytes of data worldwide by 2025, 2018.

Intel Intelligent Storage Acceleration Library Homepage, 2019.

IPFS GitHub Homepage, 2019.

ISA-L Performance, 2017.

M. Isard, V. Prabhakaran, J. Currey, U. Wieder, K. Talwar et al., Quincy: Fair Scheduling for Distributed Computing Clusters, Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles (SOSP), pp.261-276, 2009.

B. I. Ismail, E. M. Goortani, M. B. Ab Karim, W. M. Tat, S. Setapa et al., Evaluation of docker as edge computing platform, Proceedings of the IEEE International Conference on Open Systems (ICOS), pp.130-135, 2015.

N. Jain and R. Potharaju, When the Network Crumbles: An Empirical Study of Cloud Network Failures and their Impact on Services, Proceedings of the 4th annual Symposium on Cloud Computing (SoCC), p.15, 2013.

S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski et al., B4: Experience with a Globally-deployed Software Defined WAN, Proceedings of the ACM SIGCOMM Conference on Data Communication (SIGCOMM), pp.3-14, 2013.

K. R. Jayaram, C. Peng, Z. Zhang, M. Kim, H. Chen et al., An Empirical Analysis of Similarity in Virtual Machine Images, Proceedings of the Middleware Industry Track Workshop (Middleware), p.6, 2011.

H. Jin, S. Ibrahim, T. Bell, W. Gao, D. Huang et al., Cloud Types and Services, Handbook of Cloud Computing, pp.335-355, 2010.

H. Jin, S. Ibrahim, T. Bell, L. Qi, H. Cao et al., Tools and Technologies for Building Clouds, Cloud Computing, pp.3-20, 2010.

H. Jin, S. Ibrahim, L. Qi, H. Cao, S. Wu et al., The MapReduce Programming Model and Implementations, Cloud Computing: Principles and Paradigms, pp.373-390, 2011.

K. Jin and E. L. Miller, The Effectiveness of Deduplication on Virtual Machine Disk Images, Proceedings of SYSTOR 2009: The Israeli Experimental Systems Conference, p.7, 2009.

E. Jonas, Q. Pu, S. Venkataraman, I. Stoica, and B. Recht, Occupy the Cloud: Distributed Computing for the 99%, Proceedings of the Symposium on Cloud Computing (SoCC), pp.445-451, 2017.

Just how big is Amazon's AWS business? (hint: it's absolutely massive), 2014.

Just one autonomous car will use 4,000 GB of data/day, 2016.

S. Kandula, S. Sengupta, A. Greenberg, P. Patel, and R. Chaiken, The Nature of Data Center Traffic: Measurements and Analysis, Proceedings of the ACM SIGCOMM Conference on Internet Measurement (IMC), pp.202-208, 2009.

J. Kangasharju, J. Roberts, and K. Ross, Object replication strategies in content distribution networks, Computer Communications, vol.25, pp.376-383, 2002.

W. Kangjin, Y. Yong, L. Ying, L. Hanmei, and M. Lin, FID: A Faster Image Distribution System for Docker Platform, Proceedings of the 2nd IEEE International Workshops on Foundations and Applications of Self* Systems, pp.191-198, 2017.

A. Karve and A. Kochut, Redundancy Aware Virtual Disk Mobility for Cloud Computing, Proceedings of the IEEE 6th International Conference on Cloud Computing (CLOUD), pp.35-42, 2013.

Kernel Virtual Machine (KVM), 2019.

O. Khan, R. Burns, J. Plank, W. Pierce, and C. Huang, Rethinking Erasure Codes for Cloud File Systems: Minimizing I/O for Recovery and Degraded Reads, Proceedings of the 10th USENIX Conference on File and Storage Technologies (FAST), p.20, 2012.

G. Khanna, U. Catalyurek, T. Kurc, R. Kettimuthu, P. Sadayappan et al., Using Overlays For Efficient Data Transfer Over Shared Wide-Area Networks, Proceedings of the ACM/IEEE conference on Supercomputing (SC), p.47, 2008.

G. Khanna, U. Catalyurek, T. Kurc, R. Kettimuthu, P. Sadayappan et al., A Dynamic Scheduling Approach for Coordinated Wide-Area Data Transfers using GridFTP, Proceedings of the IEEE International Symposium on Parallel and Distributed Processing, pp.1-12, 2008.

N. Kim, J. Cho, and E. Seo, Energy-credit scheduler: An energy-aware virtual machine scheduler for cloud systems, Future Generation Computer Systems (FGCS), pp.128-137, 2014.

A. Kivity, D. Laor, G. Costa, P. Enberg, N. Har'El et al., OSv-Optimizing the Operating System for Virtual Machines, Proceedings of the USENIX Annual Technical Conference, pp.61-72, 2014.

K. Kloudas, M. Mamede, N. Preguiça, and R. Rodrigues, Pixida: Optimizing Data Parallel Jobs in Wide-area Data Analytics, Proceedings of the VLDB Endowment (PVLDB), vol.9, pp.72-83, 2015.

A. Kochut and A. Karve, Leveraging local image redundancy for efficient virtual machine provisioning, Proceedings of the IEEE Network Operations and Management Symposium (NOMS), pp.179-187, 2012.

K. R. Krish, A. Anwar, and A. R. Butt, hatS: A Heterogeneity-Aware Tiered Storage for Hadoop, Proceedings of the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp.502-511, 2014.

Kubernetes homepage, 2018.

J. Kubiatowicz, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton et al., OceanStore: An Architecture for Global-scale Persistent Storage, ACM SIGPLAN Notices, vol.35, pp.190-201, 2000.

P. Kumar and H. Huang, Falcon: Scaling IO Performance in Multi-SSD Volumes, Proceedings of the USENIX Annual Technical Conference (ATC), pp.41-53, 2017.

G. M. Kurtzer, V. Sochat, and M. W. Bauer, Singularity: Scientific containers for mobility of compute, PLOS ONE, vol.12, pp.1-20, 2017.

T. Kurze, M. Klems, D. Bermbach, A. Lenk, S. Tai et al., Cloud federation, Cloud Computing, pp.32-38, 2011.

Y. Kwon, M. Balazinska, B. Howe, and J. Rolia, SkewTune: Mitigating Skew in MapReduce Applications, Proceedings of the ACM International Conference on Management of Data (SIGMOD), pp.25-36, 2012.

C. Lai, S. Jiang, L. Yang, S. Lin, G. Sun et al., Atlas: Baidu's key-value storage system for cloud data, Proceedings of the 31st Symposium on Mass Storage Systems and Technologies (MSST), pp.1-14, 2015.

A. Lakshman and P. Malik, Cassandra: a Decentralized Structured Storage System, ACM SIGOPS Operating Systems Review, vol.44, pp.35-40, 2010.

D. Laney, 3D data management: Controlling data volume, velocity and variety, META group research note, vol.6, p.1, 2001.

J. J. Levandoski, P.-Å. Larson, and R. Stoica, Identifying hot and cold data in main-memory databases, Proceedings of the 29th International Conference on Data Engineering (ICDE), pp.26-37, 2013.

LevelDB GitHub Homepage, 2019.

H. Li, A. Ghodsi, M. Zaharia, S. Shenker, and I. Stoica, Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks, Proceedings of the 5th annual Symposium on Cloud Computing (SoCC), 2014.

J. Li and B. Li, On Data Parallelism of Erasure Coding in Distributed Storage Systems, Proceedings of the 37th IEEE International Conference on Distributed Computing Systems (ICDCS), pp.45-56, 2017.

J. Li and B. Li, Parallelism-Aware Locally Repairable Code for Distributed Storage Systems, Proceedings of the 38th IEEE International Conference on Distributed Computing Systems (ICDCS), pp.87-98, 2018.

R. Li, P. P. C. Lee, and Y. Hu, Degraded-First Scheduling for MapReduce in Erasure-Coded Storage Clusters, Proceedings of the 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), pp.419-430, 2014.

Libtorrent homepage, 2019.

A. Liguori and E. Van Hensbergen, Experiences with Content Addressable Storage and Virtual Disks, Proceedings of the 1st Conference on I/O Virtualization (WIOV), pp.5-5, 2008.

C. Lin, P. Liu, and J. Wu, Energy-Aware Virtual Machine Dynamic Provision and Scheduling for Cloud Computing, Proceedings of the 4th IEEE International Conference on Cloud Computing (CLOUD), pp.736-737, 2011.

C. Lin, Y. Bi, G. Han, J. Yang, H. Zhao et al., Scheduling for Time-Constrained Big-File Transfer Over Multiple Paths in Cloud Computing, IEEE Transactions on Emerging Topics in Computational Intelligence, vol.2, issue.1, pp.25-40, 2018.

Linux scp command, 2019.

Linux Traffic Control, 2006.

H. Liu, B. He, X. Liao, and H. Jin, Towards Declarative and Data-centric Virtual Machine Image Management in IaaS Clouds, IEEE Transactions on Cloud Computing, 2017.

W. Liu, B. Tieman, R. Kettimuthu, and I. Foster, A Data Transfer Framework for Large-Scale Science Experiments, Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing (HPDC), pp.717-724, 2010.

X. Lu, N. S. Islam, M. Wasi-ur-Rahman, J. Jose, H. Subramoni et al., High-Performance Design of Hadoop RPC with RDMA over InfiniBand, Proceedings of the 42nd International Conference on Parallel Processing, pp.641-650, 2013.

M. Luby, R. Padovani, T. J. Richardson, L. Minder, and P. Aggarwal, Liquid Cloud Storage, ACM Trans. Storage, vol.15, p.49, 2019.

F. J. MacWilliams and N. J. A. Sloane, The theory of error-correcting codes, vol.16, 1977.

A. Madhavapeddy, R. Mortier, C. Rotsos, D. Scott, B. Singh et al., Unikernels: Library operating systems for the cloud, ACM SIGPLAN Notices, vol.48, pp.461-472, 2013.

F. Manco, C. Lupu, F. Schmidt, J. Mendes, S. Kuenzer et al., My VM is Lighter (and Safer) than your Container, Proceedings of the 26th Symposium on Operating Systems Principles (SOSP), pp.218-233, 2017.

MapReduce & Hadoop Algorithms in Academic Papers, 2011.

J. Martins, M. Ahmed, C. Raiciu, V. Olteanu, M. Honda et al., ClickOS and the Art of Network Function Virtualization, Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI), pp.459-473, 2014.

M. Mcloughlin, The QCOW2 Image Format, 2019.

Microsoft 365, 2019.

Microsoft Hyper-V, 2019.

Microsoft VHD Image Format.

S. Moon, J. Lee, and Y. S. Kee, Introducing SSDs to the Hadoop MapReduce Framework, Proceedings of the 7th IEEE International Conference on Cloud Computing, pp.272-279, 2014.

N. Mor, Edge Computing: Scaling resources within multiple administrative domains, 2019.

S. Muralidhar, W. Lloyd, S. Roy, C. Hill, E. Lin et al., f4: Facebook's Warm BLOB Storage System, Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation, pp.383-398, 2014.

P. Nath, M. A. Kozuch, D. R. O'Hallaron, J. Harkes, M. Satyanarayanan et al., Design Tradeoffs in Applying Content Addressable Storage to Enterprise-scale Systems Based on Virtual Machines, Proceedings of the USENIX Annual Technical Conference, pp.6-6, 2006.

S. Nathan, R. Ghosh, T. Mukherjee, and K. Narayanan, CoMICon: A Co-Operative Management System for Docker Container Images, Proceedings of the IEEE International Conference on Cloud Engineering (IC2E), pp.116-126, 2017.

Netflix Case Study, 2016.

C.-H. Ng, M. Ma, T.-Y. Wong, P. P. C. Lee, and J. C. S. Lui, Live Deduplication Storage of Virtual Machine Images in an Open-source Cloud, Proceedings of the 12th ACM/IFIP/USENIX International Conference on Middleware (Middleware), pp.81-100, 2011.
URL : https://hal.archives-ouvertes.fr/hal-01597754

T. L. Nguyen, R. Nou, and A. Lebre, YOLO: Speeding Up VM and Docker Boot Time by Reducing I/O Operations, Proceedings of the 25th International European Conference on Parallel and Distributed Computing, pp.273-287, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02172288

S. Niazi, M. Ronström, S. Haridi, and J. Dowling, Size Matters: Improving the Performance of Small Files in Hadoop, Proceedings of the 19th International Middleware Conference (Middleware), pp.26-39, 2018.

B. Nicolae, A. Kochut, and A. Karve, Discovering and Leveraging Content Similarity to Optimize Collective On-Demand Data Access to IaaS Cloud Storage, Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp.211-220, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01138684

V. Nitu, B. Teabe, A. Tchana, C. Isci, and D. Hagimont, Welcome to Zombieland: Practical and Energy-efficient Memory Disaggregation in a Datacenter, Proceedings of the 13th EuroSys Conference (EuroSys), p.16, 2018.

D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman et al., The Eucalyptus open-source cloud-computing system, Proceedings of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid), pp.124-131, 2009.

Object replication in Swift Ocata, 2019.

P. Olivier, D. Chiba, S. Lankes, C. Min, and B. Ravindran, A Binary-Compatible Unikernel, Proceedings of the 15th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE), pp.59-73, 2019.

OpenFog Reference Architecture for Fog Computing, 2017.

OpenNebula: Open Source Data Center Virtualization, 2018.

OpenStack, 2018.

OpenStack Storage (Swift), 2018.

Opentracker homepage, 2019.

Overlay Filesystem, 2018.

M. Ovsiannikov, S. Rus, D. Reeves, P. Sutter, S. Rao et al., The Quantcast File System, Proceedings of the VLDB Endowment (PVLDB), vol.6, pp.1092-1101, 2013.

Oz GitHub repository, 2019.

Packer: Build Automated Machine Images, 2019.

L. Pamies-Juarez, F. Blagojević, R. Mateescu, and C. Guyot, Opening the Chrysalis: On the Real Repair Performance of MSR Codes, Proceedings of the 14th USENIX Conference on File and Storage Technologies (FAST), pp.81-94, 2016.

R. Patgiri and A. Ahmed, Big Data: The V's of the Game Changer Paradigm, Proceedings of the IEEE 18th International Conference on High Performance Computing and Communications; IEEE 14th International Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems, pp.17-24, 2016.

A. Pavlo, E. Paulson, A. Rasin, D. J. Abadi et al., A Comparison of Approaches to Large-Scale Data Analysis, Proceedings of the ACM International Conference on Management of Data (SIGMOD), pp.165-178, 2009.

C. Peng, M. Kim, Z. Zhang, and H. Lei, VDN: Virtual Machine Image Distribution Network for Cloud Data Centers, Proceedings of the IEEE International Conference on Computer Communications (INFOCOM), pp.181-189, 2012.

T. Phan, S. Ibrahim, A. C. Zhou, G. Aupy, and G. Antoniu, Energy-Driven Straggler Mitigation in MapReduce, Proceedings of the 23rd International European Conference on Parallel and Distributed Computing, pp.385-398, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01560044

I. Pietri and R. Sakellariou, Mapping virtual machines onto physical machines in cloud computing: A survey, ACM Computing Surveys (CSUR), vol.49, p.49, 2016.

, Cross-platform lib for process and system monitoring in Python, 2019.

Q. Pu, G. Ananthanarayanan, P. Bodik, S. Kandula, A. Akella, P. Bahl, and I. Stoica, Low Latency Geo-distributed Data Analytics, Proceedings of the ACM Conference on Special Interest Group on Data Communication (SIGCOMM), pp.421-434, 2015.

L. Qiu, V. N. Padmanabhan, and G. M. Voelker, On the placement of Web server replicas, Proceedings of the IEEE International Conference on Computer Communications (INFOCOM), vol.3, pp.1587-1596, 2001.

F. Quesnel, A. Lèbre, and M. Südholt, Cooperative and reactive scheduling in large-scale virtualized platforms with DVMS, Concurrency and Computation: Practice and Experience, vol.25, pp.1643-1655, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00675315

A. Rabkin, M. Arye, S. Sen, V. S. Pai, and M. J. Freedman, Aggregation and Degradation in JetStream: Streaming Analytics in the Wide Area, Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation (NSDI), pp.275-288, 2014.

K. V. Rashmi, M. Chowdhury, and J. Kosaian, EC-cache: Load-balanced, Low-latency Cluster Caching with Online Erasure Coding, Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation (OSDI), pp.401-417, 2016.

K. V. Rashmi, P. Nakkiran, J. Wang, N. B. Shah, and K. Ramchandran, Having Your Cake and Eating It Too: Jointly Optimal Erasure Codes for I/O, Storage, and Network-bandwidth, Proceedings of the 13th USENIX Conference on File and Storage Technologies (FAST), pp.81-94, 2015.

K. V. Rashmi, N. B. Shah, D. Gu, H. Kuang, D. Borthakur et al., A Solution to the Network Challenges of Data Recovery in Erasure-coded Distributed Storage Systems: A Study on the Facebook Warehouse Cluster, Proceedings of the 5th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage), 2013.

K. V. Rashmi, N. B. Shah, D. Gu, H. Kuang, D. Borthakur et al., A "Hitchhiker's" Guide to Fast and Efficient Data Reconstruction in Erasure-coded Data Centers, ACM SIGCOMM Computer Communication Review, vol.44, pp.331-342, 2014.

K. Razavi, A. Ion, and T. Kielmann, Squirrel: Scatter Hoarding VM Image Contents on IaaS Compute Nodes, Proceedings of the 23rd International Symposium on High-performance Parallel and Distributed Computing (HPDC), pp.265-278, 2014.

K. Razavi and T. Kielmann, Scalable Virtual Machine Deployment Using VM Image Caches, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC), 2013.

, Real-time Video Analytics: the killer app for edge computing, 2018.

, Redis, 2018.

I. Reed and G. Solomon, Polynomial codes over certain finite fields, Journal of the Society for Industrial and Applied Mathematics, vol.8, pp.300-304, 1960.

J. Reich, O. Laadan, E. Brosh, A. Sherman, V. Misra et al., VMTorrent: Scalable P2P Virtual Machine Streaming, Proceedings of the International Conference on emerging Networking EXperiments and Technologies (CoNEXT), pp.289-300, 2012.

D. Reimer, A. Thomas, G. Ammons, T. Mummert, B. Alpern et al., Opening Black Boxes: Using Semantic Information to Combat Virtual Machine Image Sprawl, Proceedings of the 4th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments (VEE), pp.111-120, 2008.

X. Ren, G. Ananthanarayanan, A. Wierman, and M. Yu, Hopper: Decentralized speculation-aware cluster scheduling at scale, ACM SIGCOMM Computer Communication Review, vol.45, pp.379-392, 2015.

E. G. Renart, J. Diaz-Montes, and M. Parashar, Data-Driven Stream Processing at the Edge, Proceedings of the 1st IEEE International Conference on Fog and Edge Computing (ICFEC), pp.31-40, 2017.

, Report: AWS Market Share Is Triple Azure's, 2017.

L. Rizzo, Effective erasure codes for reliable computer communication protocols, ACM SIGCOMM Computer Communication Review, vol.27, pp.24-36, 1997.

B. Robič and J. Mihelič, Solving the k-center problem efficiently with a dominating set algorithm, Journal of Computing and Information Technology, vol.13, pp.225-234, 2005.

R. Rodrigues and B. Liskov, High Availability in DHTs: Erasure Coding vs. Replication, Proceedings of the Peer-to-Peer Systems IV, 2005.

, Rsync home page, 2019.

C. Ruiz, S. Harrache, M. Mercier, and O. Richard, Reconstructable Software Appliances with Kameleon, ACM SIGOPS Operating Systems Review, vol.49, pp.80-89, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01334135

P. Russom, Big Data Analytics, 2011.

H. Salimi, M. Najafzadeh, and M. Sharifi, Advantages, Challenges and Optimizations of Virtual Machine Scheduling in Cloud Computing Environments, International Journal of Computer Theory and Engineering, vol.4, issue.2, p.189, 2012.

M. Sathiamoorthy, M. Asteris, D. Papailiopoulos, A. G. Dimakis, R. Vadali et al., XORing Elephants: Novel Erasure Codes for Big Data, Proceedings of the VLDB Endowment, vol.6, pp.325-336, 2013.

M. Satyanarayanan, V. Bahl, R. Caceres, and N. Davies, The Case for VM-Based Cloudlets in Mobile Computing, IEEE Pervasive Computing, vol.8, pp.14-23, 2009.

R. Schwarzkopf, M. Schmidt, M. Rüdiger, and B. Freisleben, Efficient Storage of Virtual Machine Images, Proceedings of the 3rd Workshop on Scientific Cloud Computing (ScienceCloud), pp.51-60, 2012.

, Scientists will use a supercomputer to simulate the universe, 2019.

D. Shankar, X. Lu, and D. Panda, High-Performance and Resilient Key-Value Store with Online Erasure Coding for Big Data Workloads, Proceedings of the 37th IEEE International Conference on Distributed Computing Systems (ICDCS), pp.527-537, 2017.

P. Sharma, L. Chaufournier, P. Shenoy, and Y. C. Tay, Containers and Virtual Machines at Scale: A Comparative Study, Proceedings of the 17th International Middleware Conference (Middleware), vol.1, pp.1-1, 2016.

Z. Shen, Z. Zhang, A. Kochut, A. Karve, H. Chen et al., VMAR: Optimizing I/O performance and resource utilization in the cloud, Proceedings of the ACM/IFIP/USENIX International Conference on Distributed Systems Platforms and Open Distributed Processing, pp.183-203, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01480776

J. Shi, Y. Qiu, U. F. Minhas, L. Jiao, C. Wang et al., Clash of the Titans: MapReduce vs. Spark for Large Scale Data Analytics, Proceedings of the VLDB Endowment (PVLDB), vol.8, issue.13, 2015.

W. Shi and B. Hong, Towards Profitable Virtual Machine Placement in the Data Center, Proceedings of the 4th IEEE International Conference on Utility and Cloud Computing (UCC), pp.138-145, 2011.

W. Shi, J. Cao, Q. Zhang, Y. Li, and L. Xu, Edge computing: Vision and challenges, IEEE Internet of Things Journal, vol.3, pp.637-646, 2016.

W. Shi and S. Dustdar, The Promise of Edge Computing, Computer, vol.49, pp.78-81, 2016.

K. Shvachko, H. Kuang, S. Radia, and R. Chansler, The Hadoop Distributed File System, Proceedings of the 26th IEEE Symposium on Mass Storage Systems and Technologies (MSST), vol.10, pp.1-10, 2010.

D. Skourtis, L. Rupprecht, V. Tarasov, and N. Megiddo, Carving perfect layers out of Docker images, Proceedings of the 11th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud), 2019.

J. E. Smith and R. Nair, The architecture of virtual machines, Computer, vol.38, pp.32-38, 2005.

S. Soltesz, H. Pötzl, M. E. Fiuczynski, A. Bavier, and L. Peterson, Container-based Operating System Virtualization: A Scalable, High-performance Alternative to Hypervisors, Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems (EuroSys), pp.275-287, 2007.

M. Stonebraker, What Does 'big Data' Mean?, 2012.

, Swarm mode overview, 2019.

C. Tang, FVD: A High-performance Virtual Machine Image Format for Cloud, Proceedings of the USENIX Conference on USENIX Annual Technical Conference (ATC), pp.18-18, 2011.

, Tcconfig homepage, 2019.

, The Internet Topology Zoo, 2019.

, The NIST Definition of Cloud Computing, 2011.

, The OpenStack Marketplace, 2019.

S. The and . Project, , 2019.

, The world's most valuable resource is no longer oil, but data, 2017.

R. Tudoran, A. Costan, R. Wang, L. Bougé, and G. Antoniu, Bridging Data in the Clouds: An Environment-Aware System for Geographically Distributed Data Transfers, Proceedings of the 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp.92-101, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00978153

, Under the Hood: Scheduling MapReduce jobs more efficiently with Corona, 2012.

, Unikernels: Rethinking Cloud Infrastructure, 2019.

, Vagrant, 2018.

B. Varghese, N. Wang, D. S. Nikolopoulos, and R. Buyya, Feasibility of fog computing, 2017.

V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal, M. Konar et al., Apache Hadoop YARN: Yet Another Resource Negotiator, Proceedings of the 4th Annual Symposium on Cloud Computing (SoCC), p.5, 2013.

, VirtualBox VDI image format, 2019.

R. Viswanathan, G. Ananthanarayanan, and A. Akella, CLARINET: WAN-Aware Optimization for Analytics Queries, Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), pp.435-450, 2016.

, VMware Virtual Disk Format 1.1, 2019.

M. A. Voinea, A. Uta, and A. Iosup, POSUM: A Portfolio Scheduler for MapReduce Workloads, Proceedings of the IEEE International Conference on Big Data (Big Data), pp.351-357, 2018.

, vOpenData dashboard, 2019.

, vSphere Hypervisor, 2019.

Y. Wang, X. Que, W. Yu, D. Goldenberg, and D. Sehgal, Hadoop Acceleration Through Network Levitated Merge, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), vol.57, pp.1-57, 2011.

R. Wartel, T. Cass, B. Moreira, E. Roche, M. Guijarro et al., Image Distribution Mechanisms in Large Scale Cloud Providers, Proceedings of the IEEE 2nd International Conference on Cloud Computing Technology and Science (CloudCom), pp.112-117, 2010.

H. Weatherspoon and J. D. Kubiatowicz, Erasure coding vs. replication: A quantitative comparison, Proceedings of the International Workshop on Peer-to-Peer Systems, pp.328-337, 2002.

S. A. Weil, S. A. Brandt, E. L. Miller, D. D. E. Long, and C. Maltzahn, Ceph: A Scalable, High-performance Distributed File System, Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI), pp.307-320, 2006.

J. Welser, Cognitive Computing: Augmenting Human Capability, 2011.

, What are Availability Zones in Azure?, 2019.

, What is Edge Computing: The Network Edge Explained, 2018.

E. Wilder-james, What is big data? An introduction to the big data landscape, 2012.

, Windows Azure Regions, 2018.

S. Wu, Y. Wang, W. Luo, S. Di, H. Chen et al., ACStor: Optimizing Access Performance of Virtual Disk Images in Clouds, IEEE Transactions on Parallel and Distributed Systems (TPDS), vol.28, pp.2414-2427, 2017.

Y. Wu, Z. Zhang, C. Wu, C. Guo, Z. Li et al., Orchestrating Bulk Data Transfers across Geo-Distributed Datacenters, IEEE Transactions on Cloud Computing, vol.5, pp.112-125, 2015.

, Xen Project hypervisor, 2019.

X. Xu, H. Jin, S. Wu, and Y. Wang, Rethink the Storage of Virtual Machine Images in Clouds, Future Generation Computer Systems (FGCS), p.50, 2015.

G. Yadgar and M. Gabel, Avoiding the Streetlight Effect: I/O Workload Analysis with SSDs in Mind, Proceedings of the 8th USENIX Conference on Hot Topics in Storage and File Systems (HotStorage), pp.36-40, 2016.

X. Yang, B. Nasser, M. Surridge, and S. Middleton, A business-oriented cloud federation model for real-time applications, Future Generation Computer Systems (FGCS), vol.28, pp.1158-1167, 2012.

X. Yao, C. Wang, and M. Zhang, EC-Shuffle: Dynamic Erasure Coding Optimization for Efficient and Reliable Shuffle in Spark, Proceedings of the 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp.41-51, 2019.

S. Yi, Z. Hao, Z. Qin, and Q. Li, Fog Computing: Platform and Applications, Proceedings of the 3rd IEEE Workshop on Hot Topics in Web Systems and Technologies (HotWeb), pp.73-78, 2015.

O. Yildiz, S. Ibrahim, and G. Antoniu, Enabling Fast Failure Recovery in Shared Hadoop Clusters: Towards Failure-Aware Scheduling, Future Generation Computer Systems (FGCS), vol.74, pp.208-219, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01338336

O. Yildiz, S. Ibrahim, T. A. Phuong, and G. Antoniu, Chronos: Failure-aware scheduling in shared Hadoop clusters, Proceedings of the IEEE International Conference on Big Data (Big Data), pp.313-318, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01203001

M. M. T. Yiu, H. H. W. Chan, and P. P. C. Lee, Erasure Coding for Small Objects in In-memory KV Storage, Proceedings of the 10th ACM International Systems and Storage Conference (SYSTOR), p.14, 2017.

Y. Yu, R. Huang, W. Wang, J. Zhang, and K. B. Letaief, SPcache: Load-balanced, Redundancy-free Cluster Caching with Selective Partition, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC), pp.1-13, 2018.

M. Zaharia, D. Borthakur, J. Sen Sarma, K. Elmeleegy, S. Shenker et al., Delay Scheduling: A Simple Technique for Achieving Locality and Fairness in Cluster Scheduling, Proceedings of the 5th European Conference on Computer Systems (EuroSys), pp.265-278, 2010.

M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, Spark: Cluster Computing with Working Sets, Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing (HotCloud), vol.10, p.95, 2010.

H. Zhang, G. Ananthanarayanan, P. Bodík, M. Philipose, V. Bahl et al., Live Video Analytics at Scale with Approximation and Delay-Tolerance, Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI), pp.377-392, 2017.

H. Zhang, M. Dong, and H. Chen, Efficient and Available In-memory KV-Store with Hybrid Erasure Coding and Replication, Proceedings of the 14th USENIX Conference on File and Storage Technologies (FAST), pp.167-180, 2016.

X. Zhang and F. Xu, Survey of Research on Big Data Storage, Proceedings of the 12th International Symposium on Distributed Computing and Applications to Business, Engineering Science, pp.76-80, 2013.

Z. Zhang, Z. Li, K. Wu, D. Li, H. Li et al., VMThunder: Fast Provisioning of Large-Scale Virtual Machine Clusters, IEEE Transactions on Parallel and Distributed Systems (TPDS), vol.25, pp.3328-3338, 2014.

Z. Zhang, A. Deshpande, X. Ma, E. Thereska, and D. Narayanan, Does erasure coding have a role to play in my data center?, Tech. rep., Microsoft Research, 2010.

Z. Zhang, A. Wang, K. Zheng, U. Maheswara, G. Vinayakumar et al., Introduction to HDFS Erasure Coding in Apache Hadoop, 2015.

C. Zheng, L. Rupprecht, V. Tarasov, D. Thain, M. Mohamed et al., Wharf: Sharing Docker Images in a Distributed File System, Proceedings of the ACM Symposium on Cloud Computing (SoCC), pp.174-185, 2018.

A. C. Zhou, S. Ibrahim, and B. He, On Achieving Efficient Data Transfer for Graph Processing in Geo-Distributed Datacenters, Proceedings of the 37th IEEE International Conference on Distributed Computing Systems (ICDCS), pp.1397-1407, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01560187

A. C. Zhou, T. Phan, S. Ibrahim, and B. He, Energy-Efficient Speculative Execution Using Advanced Reservation for Heterogeneous Clusters, Proceedings of the 47th International Conference on Parallel Processing (ICPP), vol.8, pp.1-8, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01807496

W. Zhou, P. Ning, X. Zhang, G. Ammons, R. Wang et al., Always Up-to-date: Scalable Offline Patching of VM Images in a Compute Cloud, Proceedings of the 26th Annual Computer Security Applications Conference (ACSAC), pp.377-386, 2010.

B. Zhu, K. Li, and H. Patterson, Avoiding the Disk Bottleneck in the Data Domain Deduplication File System, Proceedings of the 6th USENIX Conference on File and Storage Technologies (FAST), vol.8, pp.1-14, 2008.