Applications Resilience on Clouds
Abstract
Cloud computing infrastructures support system and network fault tolerance. They transparently repair and prevent communication and software errors. They also allow jobs and data to be duplicated and migrated to protect against hardware failures. However, only limited work has been done so far on application resilience, i.e., the ability to resume normal execution after errors and abnormal executions in distributed environments and clouds. This paper addresses open issues and solutions for application error detection and management. It also gives an overview of a testbed used to design, deploy, execute, monitor, restart and resume distributed applications on cloud infrastructures in case of failures.
Origin: Files produced by the author(s)