Fault Tolerance in Petascale/Exascale Systems: Current Knowledge, Challenges and Research Opportunities, International Journal of High Performance Computing Applications, vol.23, issue.3, pp.212-226, 2009.
DOI : 10.1177/1094342009106189
Toward Exascale Resilience, International Journal of High Performance Computing Applications, vol.23, issue.4, pp.374-388, 2009.
DOI : 10.1177/1094342009347767
Live Migration of Virtual Machines, Proc. of the 2nd Symp. on Networked Systems Design and Implementation, pp.273-286, 2005.
A higher order estimate of the optimum checkpoint interval for restart dumps, Future Generation Computer Systems, vol.22, issue.3, pp.303-312, 2006.
DOI : 10.1016/j.future.2004.11.016
The workload on parallel supercomputers: modeling the characteristics of rigid jobs, Journal of Parallel and Distributed Computing, vol.63, issue.11, pp.1105-1122, 2003.
DOI : 10.1016/S0743-7315(03)00108-4
Overcoming the difficulties created by the volatile nature of desktop grids through understanding, prediction and redundancy, 2009.
A first order approximation to the optimum checkpoint interval, Communications of the ACM, vol.17, issue.9, pp.530-531, 1974.
DOI : 10.1145/361147.361115