S. M. da Cruz, C. E. Paulino, D. de Oliveira, M. L. Campos, and M. Mattoso, Capturing distributed provenance metadata from cloud-based scientific workflows, Journal of Information and Data Management, vol.2, issue.1, p.43, 2011.

T. Samak, D. Gunter, M. Goode, E. Deelman, G. Juve et al., Failure analysis of distributed scientific workflows executing in the cloud, 8th International Conference on Network and Service Management (CNSM) and 2012 Workshop on Systems Virtualization Management (SVM), pp.46-54, 2012.

Y. Liang, Y. Zhang, M. Jette, A. Sivasubramaniam, and R. Sahoo, BlueGene/L failure analysis and prediction models, International Conference on Dependable Systems and Networks (DSN), pp.425-434, 2006.

C. Pham, P. Cao, Z. Kalbarczyk, and R. K. Iyer, Toward a high availability cloud: Techniques and challenges, IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN 2012), pp.1-6, 2012.
DOI : 10.1109/DSNW.2012.6264687

X. Chen, C. D. Lu, and K. Pattabiraman, Failure Analysis of Jobs in Compute Clouds: A Google Cluster Case Study, 2014 IEEE 25th International Symposium on Software Reliability Engineering, 2014.
DOI : 10.1109/ISSRE.2014.34

K. V. Vishwanath and N. Nagappan, Characterizing cloud computing hardware reliability, Proceedings of the 1st ACM symposium on Cloud computing, SoCC '10, 2010.
DOI : 10.1145/1807128.1807161

A. Bala and I. Chana, Fault tolerance: challenges, techniques and implementation in cloud computing, IJCSI International Journal of Computer Science Issues, vol.9, issue.1, ISSN 1694-0814, 2012.

K. Plankensteiner, R. Prodan, T. Fahringer, A. Kertesz, and P. Kacsuk, Fault-tolerant behavior in state-of-the-art grid workflow management systems, 2007.

E. M. Bahsi, Dynamic Workflow Management For Large Scale Scientific Applications, 2006.

E. Deelman, Y. Gil, M. Ellisman, T. Fahringer, G. Fox et al., Examining the challenges of scientific workflows, IEEE Computer, issue.12, pp.40-66, 2007.

E. Deelman and Y. Gil, Managing Large-Scale Scientific Workflows in Distributed Environments: Experiences and Challenges, 2006 Second IEEE International Conference on e-Science and Grid Computing (e-Science'06), p.144, 2006.
DOI : 10.1109/E-SCIENCE.2006.261077

B. Ludäscher, I. Altintas, S. Bowers, J. Cummings, T. Critchlow et al., Scientific Process Automation and Workflow Management, Scientific Data Management: Challenges, Existing Technology, and Deployment, Computational Science Series, pp.476-508, 2009.
DOI : 10.1201/9781420069815-c13

E. Kail, A. Bánáti, P. Kacsuk, and M. Kozlovszky, Provenance based adaptive and dynamic workflows, 15th IEEE International Symposium on Computational Intelligence and Informatics, pp.215-219, 2014.

A. Das, On Fault Tolerance of Resources in Computational Grids, International Journal of Grid Computing & Applications, vol.3, issue.3, pp.1-10, 2012.
DOI : 10.5121/ijgca.2012.3301

P. A. Mouallem and M. Vouk, A fault tolerance framework for Kepler-based distributed scientific workflows, 2011.

R. A. Alsoghayer, Risk assessment models for resource failure in grid computing. Thesis, 2011.

A. Bánáti, P. Kacsuk, and M. Kozlovszky, Towards Flexible Provenance and Workflow Manipulation in Scientific Workflows, Proceedings of CGW 14, 2014.

E. Kail, P. Kacsuk, and M. Kozlovszky, A novel approach to user-steering in scientific workflows, 2015 IEEE 10th Jubilee International Symposium on Applied Computational Intelligence and Informatics, 2015.
DOI : 10.1109/SACI.2015.7208205