Algorithm-based fault tolerance applied to P2P computing networks

Thomas Roche 1 Jean-Louis Roch 1, * Mathieu Cunche 2, *
* Corresponding author
1 MOAIS - PrograMming and scheduling design fOr Applications in Interactive Simulation
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
2 PLANETE - Protocols and applications for the Internet
Inria Grenoble - Rhône-Alpes, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: P2P computing platforms are subject to a wide range of attacks. In this paper, we propose a generalisation of the earlier disk-less checkpointing approach for fault tolerance in High Performance Computing systems. Our contribution is twofold. First, instead of restricting ourselves to 2D checksums, which tolerate only a small number of node failures, we propose to base disk-less checkpointing on linear codes in order to tolerate a potentially large number of faults. Second, we compare and analyse Low Density Parity Check (LDPC) codes against classical Reed-Solomon (RS) codes with respect to different fault models suited to P2P systems. Our LDPC disk-less checkpointing method is well suited when only node disconnections are considered, but cannot deal with Byzantine peers. Our RS disk-less checkpointing method tolerates such Byzantine errors, but is restricted to exact finite field computations.
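As an illustration only, the idea of basing disk-less checkpointing on a linear code over a finite field can be sketched as follows. This is not the authors' implementation: it is a minimal Reed-Solomon-style example, assuming a small prime field (P = 257), one integer of state per node, and hypothetical helper names (`lagrange_eval`, `encode`, `recover`). It covers only node disconnections (erasures), not the decoding needed to correct Byzantine errors.

```python
P = 257  # small prime (assumption); all arithmetic is exact, modulo P


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (xi, yi) points, modulo P."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, r):
    """Compute r checksum values for k data values in [0, P).
    Node i holds data[i]; checksum node k + j holds the j-th checksum.
    Requires k + r <= P so that all evaluation points are distinct."""
    k = len(data)
    points = list(enumerate(data))
    return [lagrange_eval(points, k + j) for j in range(r)]


def recover(survivors, k):
    """Rebuild all k data values from any k surviving
    (node_index, value) pairs, whether data or checksum nodes."""
    if len(survivors) < k:
        raise ValueError("too many nodes lost to recover the data")
    pts = survivors[:k]
    return [lagrange_eval(pts, i) for i in range(k)]


# Usage: 4 data nodes, 2 checksum nodes, then nodes 1 and 3 disconnect.
data = [10, 200, 33, 7]
codeword = data + encode(data, 2)
survivors = [(i, v) for i, v in enumerate(codeword) if i not in {1, 3}]
restored = recover(survivors, len(data))
print(restored == data)
```

With r checksum nodes, this sketch can rebuild the data after any r disconnections, because any k of the k + r values determine the underlying polynomial. Working modulo a prime is what keeps the computation exact, which reflects the finite field restriction mentioned in the abstract.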
Document type:
Conference paper
IEEE First International Conference on Advances in P2P Systems, Oct 2009, Sliema, Malta. IEEE, pp.144 - 149, 2009, 2009 First International Conference on Advances in P2P Systems (AP2PS 2009). 〈10.1109/AP2PS.2009.30〉


https://hal.inria.fr/hal-00786217
Contributor: Jean-Louis Roch
Submitted on: Friday, February 8, 2013 - 10:18:10
Last modified on: Wednesday, April 11, 2018 - 01:51:43
Document(s) archived on: Thursday, May 9, 2013 - 03:54:24

File

2009-06-ap2ps.pdf
Files produced by the author(s)

Citation

Thomas Roche, Jean-Louis Roch, Mathieu Cunche. Algorithm-based fault tolerance applied to P2P computing networks. IEEE First International Conference on Advances in P2P Systems, Oct 2009, Sliema, Malta. IEEE, pp.144 - 149, 2009, 2009 First International Conference on Advances in P2P Systems (AP2PS 2009). 〈10.1109/AP2PS.2009.30〉. 〈hal-00786217〉
