L. Alvisi, K. Bhatia, and K. Marzullo, Causality Tracking in Causal Message-logging Protocols, Distrib. Comput, vol.15, pp.1-15, 2002.

L. Alvisi and K. Marzullo, Message Logging: Pessimistic, Optimistic, Causal, and Optimal, IEEE Trans. on Software Engineering, vol.24, pp.149-159, 1998.

M. S. Ardekani, R. P. Singh, N. Agrawal, D. B. Terry, and R. O. Suminto, Rivulet: A Fault-tolerant Platform for Smart-home Applications, Proc. of Middleware '17, pp.41-54, 2017.

F. Bonomi, R. Milito, J. Zhu, and S. Addepalli, Fog Computing and its Role in the Internet of Things, Proc. of MCC'12, pp.13-16, 2012.

F. Boyer, O. Gruber, and D. Pous, Robust Reconfigurations of Component Assemblies, Proc. of ICSE'13, pp.13-22, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00966078

F. Boyer, O. Gruber, and G. Salaün, Specifying and Verifying the Synergy Reconfiguration Protocol with LOTOS NT and CADP, Proc. of FM'11, vol.6664, pp.103-117, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00648909

A. Brogi, A. Canciani, and J. Soldani, Fault-Aware Management Protocols for Multi-Component Applications, Journal of Systems and Software, vol.139, pp.189-210, 2018.

S. K. Card, G. G. Robertson, and J. D. Mackinlay, The Information Visualizer, an Information Workspace, Proc. of CHI '91, pp.181-186, 1991.

K. M. Chandy and L. Lamport, Distributed Snapshots: Determining Global States of Distributed Systems, ACM Trans. Comput. Syst, vol.3, pp.63-75, 1985.

E. N. Elnozahy, L. Alvisi, Y. Wang, and D. B. Johnson, A Survey of Rollback-recovery Protocols in Message-passing Systems, ACM Comput. Surv, vol.34, pp.375-408, 2002.

X. Etchevers, G. Salaün, F. Boyer, T. Coupaye, and N. D. Palma, Reliable Self-deployment of Distributed Cloud Applications, Softw., Pract. Exper, vol.47, pp.3-20, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01290465

C. J. Fidge, Timestamps in Message-Passing Systems that Preserve the Partial Ordering, Proc. of the 11th Australian Computer Science Conference, pp.56-66, 1988.

G. Friedrich, M. Fugini, E. Mussi, B. Pernici, and G. Tagni, Exception Handling for Repair in Service-Based Processes, IEEE Trans. Software Eng, vol.36, issue.2, pp.198-215, 2010.

T. N. Gia, A. Rahmani, T. Westerlund, P. Liljeberg, and H. Tenhunen, Fault Tolerant and Scalable IoT-Based Architecture for Health Monitoring, Proc. of IEEE SAS. IEEE, pp.1-6, 2015.

Y. Huang and C. Kintala, Software Fault Tolerance in the Application Layer, Software fault tolerance, vol.3, pp.231-248, 1995.

A. Khunteta and P. Kumar, An Analysis of Checkpointing Algorithms for Distributed Mobile Systems, Int. Journal on Computer Sci. and Eng, vol.2, pp.1314-1326, 2010.

T. H. Lai and T. H. Yang, On Distributed Snapshots, Inform. Process. Lett, vol.25, pp.153-158, 1987.

B. Lampson and H. E. Sturgis, Crash Recovery in a Distributed Data Storage System, 1979.

J. B. Leners, H. Wu, W. Hung, M. K. Aguilera, and M. Walfish, Detecting Failures in Distributed Systems with the Falcon Spy Network, Proc. of SOSP '11, pp.279-294, 2011.

L. Letondeur, F. Ottogalli, and T. Coupaye, A Demo of Application Lifecycle Management for IoT Collaborative Neighborhood in the Fog, IEEE Fog World Congress, pp.1-6, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01635340

F. Mattern, Virtual Time and Global States of Distributed Systems, Parallel and Distributed Algorithms, pp.215-226, 1988.

F. Mattern, Efficient Algorithms for Distributed Snapshots and Global Virtual Time Approximation, J. Parallel and Distrib. Comput, vol.18, pp.423-434, 1993.

R. B. Miller, Response Time in Man-Computer Conversational Transactions, Proc. of AFIPS '68 (Fall, part I), pp.267-277, 1968.

B. Randell, System Structure for Software Fault Tolerance, SIGPLAN Not, vol.10, pp.437-449, 1975.

R. Strom and S. Yemini, Optimistic Recovery in Distributed Systems, ACM Trans. Comput. Syst, vol.3, pp.204-226, 1985.

D. Terry, Toward a New Approach to IoT Fault Tolerance, Computer, vol.49, pp.80-83, 2016.

Y. Xia, X. Etchevers, L. Letondeur, T. Coupaye, and F. Desprez, Combining Hardware Nodes and Software Components Ordering-based Heuristics for Optimizing the Placement of Distributed IoT Applications in the Fog, Proc. of SAC'18, pp.751-760, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01908928

J. Xu and R. H. Netzer, Adaptive Independent Checkpointing for Reducing Rollback Propagation, Proc. 5th IEEE SPDP, pp.754-761, 1993.
DOI : 10.1109/spdp.1993.395456

F. Zambonelli, On the Effectiveness of Distributed Checkpoint Algorithms for Domino-Free Recovery, Proc. of HPDC'98. IEEE, pp.124-131, 1998.

S. Zhou, K. Lin, J. Na, C. Chuang, and C. Shih, Supporting Service Adaptation in Fault Tolerant Internet of Things, Proc. of SOCA '15. IEEE, pp.65-72, 2015.