Conference papers

Applications Resilience on Clouds

Toan Nguyen 1, *, Jean-Antoine Desideri 1, Laurentiu Trifan 1
* Corresponding author
1 OPALE - Optimization and control, numerical algorithms and integration of complex multidiscipline systems governed by PDE
CRISAM - Inria Sophia Antipolis - Méditerranée, JAD - Laboratoire Jean Alexandre Dieudonné : UMR6621
Abstract : Cloud computing infrastructures support system and network fault tolerance. They transparently repair and prevent communication and software errors. They also allow duplication and migration of jobs and data to prevent hardware failures. However, only limited work has been done so far on application resilience, i.e., the ability to resume normal execution after errors and abnormal executions in distributed environments and clouds. This paper addresses open issues and solutions for application error detection and management. It also gives an overview of a testbed used to design, deploy, execute, monitor, restart and resume distributed applications on cloud infrastructures in case of failures.

Cited literature [21 references]

https://hal.inria.fr/hal-00766625
Contributor : Toan Nguyen
Submitted on : Tuesday, December 18, 2012 - 3:22:22 PM
Last modification on : Monday, December 14, 2020 - 5:00:21 PM
Long-term archiving on : Tuesday, March 19, 2013 - 3:57:03 AM

File

HPCS2012.pdf
Files produced by the author(s)

Citation

Toan Nguyen, Jean-Antoine Desideri, Laurentiu Trifan. Applications Resilience on Clouds. HPCS - International Conference High Performance Computing and Simulation - 2012, Waleed W. Smari, Jul 2012, Madrid, Spain. pp.60-66, ⟨10.1109/HPCSim.2012.6266891⟩. ⟨hal-00766625⟩
