L. Alvisi and K. Marzullo, Message logging: pessimistic, optimistic, causal, and optimal, IEEE Transactions on Software Engineering, vol.24, issue.2, pp.149-159, 1998.
DOI : 10.1109/32.666828

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.12.78

L. Alvisi, S. Rao, S. A. Husain, A. de Mel, and E. Elnozahy, An analysis of communication induced checkpointing, Digest of Papers. Twenty-Ninth Annual International Symposium on Fault-Tolerant Computing, p.242, 1999.
DOI : 10.1109/FTCS.1999.781058

R. Baldoni, A communication-induced checkpointing protocol that ensures rollback-dependency trackability, Proceedings of IEEE 27th International Symposium on Fault Tolerant Computing, p.68, 1997.
DOI : 10.1109/FTCS.1997.614079

B. Bhargava and S. Lian, Independent checkpointing and concurrent rollback for recovery in distributed systems-an optimistic approach, Proceedings of the Seventh Symposium on Reliable Distributed Systems, pp.3-12, 1988.
DOI : 10.1109/RELDIS.1988.25775

A. Bouteiller, T. Herault, G. Krawezik, P. Lemarinier, and F. Cappello, MPICH-V Project: A Multiprotocol Automatic Fault-Tolerant MPI, International Journal of High Performance Computing Applications, vol.20, issue.3, pp.319-333, 2006.
DOI : 10.1177/1094342006067469

URL : https://hal.archives-ouvertes.fr/hal-00688637

A. Bouteiller, T. Ropars, G. Bosilca, C. Morin, and J. Dongarra, Reasons for a pessimistic or optimistic message logging protocol in MPI uncoordinated failure recovery, 2009 IEEE International Conference on Cluster Computing and Workshops, 2009.
DOI : 10.1109/CLUSTR.2009.5289157

URL : https://hal.archives-ouvertes.fr/inria-00424017

F. Cappello, A. Geist, B. Gropp, L. Kale, B. Kramer et al., Toward Exascale Resilience, International Journal of High Performance Computing Applications, vol.23, issue.4, pp.374-388, 2009.
DOI : 10.1177/1094342009347767

F. Cappello, A. Guermouche, and M. Snir, On Communication Determinism in Parallel HPC Applications, 2010 Proceedings of 19th International Conference on Computer Communications and Networks, 2010.
DOI : 10.1109/ICCCN.2010.5560143

K. Chandy and L. Lamport, Distributed snapshots: determining global states of distributed systems, ACM Transactions on Computer Systems, vol.3, issue.1, pp.63-75, 1985.
DOI : 10.1145/214451.214456

E. N. Elnozahy, L. Alvisi, Y. Wang, and D. B. Johnson, A survey of rollback-recovery protocols in message-passing systems, ACM Computing Surveys, vol.34, issue.3, pp.375-408, 2002.
DOI : 10.1145/568522.568525

D. B. Johnson and W. Zwaenepoel, Sender-Based Message Logging, Digest of Papers: The 17th Annual International Symposium on Fault-Tolerant Computing, pp.14-19, 1987.

R. Koo and S. Toueg, Checkpointing and Rollback-Recovery for Distributed Systems, Proceedings of 1986 ACM Fall joint computer conference, ACM '86, pp.1150-1158, 1986.
DOI : 10.1109/TSE.1987.232562

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.91.1616

L. Lamport, Time, clocks, and the ordering of events in a distributed system, Communications of the ACM, vol.21, issue.7, pp.558-565, 1978.
DOI : 10.1145/359545.359563

N. Neves and W. K. Fuchs, Using time to improve the performance of coordinated checkpointing, Proceedings of IEEE International Computer Performance and Dependability Symposium, pp.282-291, 1996.
DOI : 10.1109/IPDS.1996.540229

R. A. Oldfield, S. Arunagiri, P. J. Teller, S. Seelam, M. R. Varela et al., Modeling the Impact of Checkpoints on Next-Generation Systems, 24th IEEE Conference on Mass Storage Systems and Technologies (MSST 2007), pp.30-46, 2007.
DOI : 10.1109/MSST.2007.4367962

R. Riesen, Communication Patterns, Workshop on Communication Architecture for Clusters CAC'06, 2006.

T. Ropars and C. Morin, Active Optimistic Message Logging for Reliable Execution of MPI Applications, 15th International Euro-Par Conference, pp.615-626, 2009.

URL : https://hal.archives-ouvertes.fr/inria-00424002

B. Schroeder and G. A. Gibson, Understanding failures in petascale computers, Journal of Physics: Conference Series, vol.78, issue.1, p.012022, 2007.
DOI : 10.1088/1742-6596/78/1/012022

Q. O. Snell, A. R. Mikler, and J. L. Gustafson, NetPIPE: A Network Protocol Independent Performance Evaluator, IASTED International Conference on Intelligent Information Management and Systems, 1996.