G. E. Fagg and J. Dongarra, FT-MPI: Fault Tolerant MPI, Supporting Dynamic Applications in a Dynamic World, Proc. 7th EuroPVM/MPI, pp.346-353, 2000.
DOI : 10.1007/3-540-45255-9_47

J. R. De-souza, E. Argollo, A. Duarte, D. Rexachs, and E. Luque, Fault tolerant master-worker over a multi-cluster architecture, Proc. of ParCo, pp.465-472, 2005.

T. Leblanc, R. Anand, E. Gabriel, and J. Subhlok, VolpexMPI: An MPI Library for Execution of Parallel Applications on Volatile Nodes, Proc. of EuroPVM/MPI 2009, pp.124-133, 2009.
DOI : 10.1007/978-3-642-03770-2_19

D. Buntinas, C. Coti, T. Herault, P. Lemarinier, L. Pilard et al., Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant MPI Protocols, Future Generation Computer Systems, vol.24, issue.1, pp.73-84, 2008.
DOI : 10.1016/j.future.2007.02.002
URL : https://hal.archives-ouvertes.fr/hal-00688644

A. Chien, B. Calder, S. Elbert, and K. Bhatia, Entropia: architecture and performance of an enterprise desktop grid system, Journal of Parallel and Distributed Computing, vol.63, issue.5, pp.597-610, 2003.
DOI : 10.1016/S0743-7315(03)00006-6

E. Byun, S. Choi, M. Baik, J. Gil, C. Park et al., MJSA: Markov job scheduler based on availability in desktop grid computing environment, Future Generation Computer Systems, vol.23, issue.4, pp.616-622, 2007.
DOI : 10.1016/j.future.2006.09.004

D. Nurmi, J. Brevik, and R. Wolski, Modeling Machine Availability in Enterprise and Wide-Area Distributed Computing Environments, Proc. of Euro-Par, 2005.
DOI : 10.1007/11549468_50

R. Wolski, D. Nurmi, and J. Brevik, An Analysis of Availability Distributions in Condor, 2007 IEEE International Parallel and Distributed Processing Symposium, 2007.
DOI : 10.1109/IPDPS.2007.370523

B. Javadi, D. Kondo, J. Vincent, and D. Anderson, Mining for statistical models of availability in large-scale distributed systems: An empirical study of SETI@home, 2009 IEEE International Symposium on Modeling, Analysis & Simulation of Computer and Telecommunication Systems, 2009.
DOI : 10.1109/MASCOT.2009.5367061
URL : https://hal.archives-ouvertes.fr/hal-00788912

X. Ren, S. Lee, R. Eigenmann, and S. Bagchi, Prediction of Resource Availability in Fine-Grained Cycle Sharing Systems Empirical Evaluation, Journal of Grid Computing, vol.3, issue.4, pp.173-195, 2007.
DOI : 10.1007/s10723-007-9077-5

B. Hong and V. K. Prasanna, Adaptive Allocation of Independent Tasks to Maximize Throughput, IEEE Transactions on Parallel and Distributed Systems, vol.18, issue.10, pp.1420-1435, 2007.
DOI : 10.1109/TPDS.2007.1042

J. M. Bahi, S. Contassot-vivier, and R. Couturier, Parallel Iterative Algorithms: From Sequential to Grid Computing, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00336524

A. Heddaya and K. Park, Mapping parallel iterative algorithms onto workstation networks, Proceedings of 3rd IEEE International Symposium on High Performance Distributed Computing, pp.211-218, 1994.
DOI : 10.1109/HPDC.1994.340242

A. Legrand, H. Renard, Y. Robert, and F. Vivien, Mapping and load-balancing iterative computations on heterogeneous clusters with shared links, IEEE Transactions on Parallel and Distributed Systems, vol.15, pp.546-558, 2004.
URL : https://hal.archives-ouvertes.fr/hal-00789426

D. Kondo, A. Chien, and H. Casanova, Resource Management for Rapid Application Turnaround on Enterprise Desktop Grids, Proceedings of the ACM/IEEE SC2004 Conference, 2004.
DOI : 10.1109/SC.2004.50

D. Zhou and V. Lo, Wave Scheduler: Scheduling for Faster Turnaround Time in Peer-Based Desktop Grid Systems, Proc. of the 11th JSSPP Workshop, 2005.
DOI : 10.1007/11605300_10

T. Estrada, D. Flores, M. Taufer, P. Teller, A. Kerstens et al., The Effectiveness of Threshold-Based Scheduling Policies in BOINC Projects, 2006 Second IEEE International Conference on e-Science and Grid Computing (e-Science'06), 2006.
DOI : 10.1109/E-SCIENCE.2006.261172

C. Anglano, J. Brevik, M. Canonico, D. Nurmi, and R. Wolski, Fault-aware scheduling for Bag-of-Tasks applications on Desktop Grids, 2006 7th IEEE/ACM International Conference on Grid Computing, pp.56-63, 2006.
DOI : 10.1109/ICGRID.2006.310998

T. Estrada, O. Fuentes, and M. Taufer, A distributed evolutionary method to design scheduling policies for volunteer computing, ACM SIGMETRICS Performance Evaluation Review, vol.36, issue.3, pp.40-49, 2008.
DOI : 10.1145/1481506.1481515

J. Wingstrom and H. Casanova, Probabilistic allocation of tasks on desktop grids, 2008 IEEE International Symposium on Parallel and Distributed Processing, 2008.
DOI : 10.1109/IPDPS.2008.4536450

E. Heien, D. Anderson, and K. Hagihara, Computing Low Latency Batches with Unreliable Workers in Volunteer Computing Environments, Journal of Grid Computing, vol.7, issue.4, pp.501-518, 2009.
DOI : 10.1007/s10723-009-9131-6

N. Fujimoto and K. Hagihara, Near-optimal dynamic task scheduling of independent coarse-grained tasks onto a computational grid, Proceedings of the 2003 International Conference on Parallel Processing, 2003.
DOI : 10.1109/ICPP.2003.1240603

C. Moretti, T. Faltemier, D. Thain, and P. Flynn, Challenges in Executing Data Intensive Biometric Workloads on a Desktop Grid, 2007 IEEE International Parallel and Distributed Processing Symposium, 2007.
DOI : 10.1109/IPDPS.2007.370671

T. Toyoma, Y. Yamada, and K. Konishi, A Resource Management System for Data-Intensive Applications in Desktop Grid Environments, Proc. of PDCS, 2006.

H. He, G. Fedak, B. Tang, and F. Cappello, BLAST Application with Data-Aware Desktop Grid Middleware, 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, pp.284-291, 2009.
DOI : 10.1109/CCGRID.2009.91
URL : https://hal.archives-ouvertes.fr/hal-00684869

R. Guerraoui and A. Schiper, Software-based replication for fault tolerance, Computer, vol.30, issue.4, pp.68-74, 1997.
DOI : 10.1109/2.585156

P. Stelling, C. Dematteis, I. Foster, C. Kesselman, C. Lee et al., A fault detection service for wide area distributed computations, Proceedings of the Seventh International Symposium on High Performance Distributed Computing, pp.117-128, 1999.
DOI : 10.1109/HPDC.1998.709981

W. Gropp, MPICH2: A New Start for MPI Implementations, PVM/MPI, p.7, 2002.
DOI : 10.1007/3-540-45825-5_5

J. Håstad, Some optimal inapproximability results, STOC '97, pp.1-10, 1997.

D. Kondo, B. Javadi, A. Iosup, and D. Epema, The Failure Trace Archive: Enabling Comparative Analysis of Failures in Diverse Distributed Systems, 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, 2010.
DOI : 10.1109/CCGRID.2010.71
URL : https://hal.archives-ouvertes.fr/inria-00433523