Pegasus, a workflow management system for science automation, Future Generation Computer Systems, vol.46, pp.17-35, 2015. ,
DOI : 10.1016/j.future.2014.10.008
URL : https://doi.org/10.1016/j.future.2014.10.008
ASKALON: A Development and Grid Computing Environment for Scientific Workflows, Workflows for e-Science, pp.450-471, 2007. ,
DOI : 10.1007/978-1-84628-757-2_27
Swift: A language for distributed parallel scripting, Parallel Computing, vol.37, issue.9, pp.633-652, 2011. ,
DOI : 10.1016/j.parco.2011.05.005
URL : http://www.ci.uchicago.edu/~wilde/SwiftParallelScripting.Parco2011.pdf
The Taverna workflow suite: designing and executing workflows of Web Services on the desktop, web or in the cloud, Nucleic Acids Research, vol.41, issue.W1, p.328, 2013. ,
DOI : 10.1093/nar/gkt328
Kepler: an extensible system for design and execution of scientific workflows, Proceedings. 16th International Conference on Scientific and Statistical Database Management, 2004., pp.423-424, 2004. ,
DOI : 10.1109/SSDM.2004.1311241
Makeflow, Proceedings of the 1st ACM SIGMOD Workshop on Scalable Workflow Execution Engines and Technologies, SWEET '12, p.1, 2012. ,
DOI : 10.1145/2443416.2443417
Enabling In-situ Execution of Coupled Scientific Workflow on Multi-core Platform, 2012 IEEE 26th International Parallel and Distributed Processing Symposium, pp.1352-1363, 2012. ,
DOI : 10.1109/IPDPS.2012.122
Computational complexity of PERT problems, Networks, vol.18, issue.2, pp.139-147, 1988. ,
DOI : 10.1002/net.3230180206
Scheduling: Theory, Algorithms, and Systems, 2016. ,
The Complexity of Enumeration and Reliability Problems, SIAM Journal on Computing, vol.8, issue.3, pp.410-421, 1979. ,
DOI : 10.1137/0208032
The Complexity of Counting Cuts and of Computing the Probability that a Graph is Connected, SIAM Journal on Computing, vol.12, issue.4, pp.777-788, 1983. ,
DOI : 10.1137/0212053
A note on the complexity of network reliability problems, IEEE Trans. Inf. Theory, vol.47, pp.1971-1988, 2004. ,
The recognition of series parallel digraphs, Proc. 11th ACM Symp. Theory of Computing, ser. STOC '79, pp.1-12, 1979. ,
Parallel algorithms for series parallel graphs, pp.277-289, 1996. ,
DOI : 10.1007/3-540-61680-2_62
URL : http://archive.cs.uu.nl/pub/RUU/CS/techreps/CS-1996/1996-13.ps.gz
A Mapping Algorithm for Parallel Sparse Cholesky Factorization, SIAM Journal on Scientific Computing, vol.14, issue.5, pp.1253-1257, 1993. ,
DOI : 10.1137/0914074
On the Optimum Checkpoint Selection Problem, SIAM Journal on Computing, vol.13, issue.3, 1984. ,
DOI : 10.1137/0213039
URL : http://graal.ens-lyon.fr/%7Eabenoit/CR02/papers/chains1.pdf
Pegasus workflow generator, 2014. ,
Scheduling under Uncertainty: Bounding the Makespan Distribution, Computational Discrete Mathematics: Advanced Lectures, pp.79-97, 2001. ,
DOI : 10.1007/3-540-45506-X_7
Correlation-Aware Heuristics for Evaluating the Distribution of the Longest Path Length of a DAG with Random Weights, IEEE Transactions on Parallel and Distributed Systems, vol.27, issue.11, 2016. ,
DOI : 10.1109/TPDS.2016.2528983
URL : https://hal.archives-ouvertes.fr/hal-01412922
The Completion Time of PERT Networks, Journal of the Operational Research Society, vol.34, issue.2, pp.155-158, 1983. ,
DOI : 10.1057/jors.1983.27
Computing the Expected Makespan of Task Graphs in the Presence of Silent Errors, 2016 45th International Conference on Parallel Processing Workshops (ICPPW), 2016. ,
DOI : 10.1109/ICPPW.2016.34
URL : https://hal.archives-ouvertes.fr/hal-01354711
Probability and Computing: Randomized Algorithms and Probabilistic Analysis, 2005. ,
DOI : 10.1017/CBO9780511813603
Letter to the Editor: Monte Carlo Methods and the PERT Problem, Operations Research, vol.11, issue.5, pp.839-860, 1963. ,
DOI : 10.1287/opre.11.5.839
Community Resources for Enabling Research in Distributed Scientific Workflows, 2014 IEEE 10th International Conference on e-Science, pp.177-184, 2014. ,
DOI : 10.1109/eScience.2014.44
Characterization of scientific workflows, 2008 Third Workshop on Workflows in Support of Large-Scale Science, pp.1-10, 2008. ,
DOI : 10.1109/WORKS.2008.4723958
Characterizing and profiling scientific workflows, Future Generation Computer Systems, vol.29, issue.3, pp.682-692, 2013. ,
DOI : 10.1016/j.future.2012.08.015
Pegasus: A Framework for Mapping Complex Scientific Workflows onto Distributed Systems, Scientific Programming, pp.219-237, 2005. ,
DOI : 10.1155/2005/128026
URL : https://doi.org/10.1155/2005/128026
Checkpointing Workflows for Fail-Stop Errors, 2017 IEEE International Conference on Cluster Computing (CLUSTER), 2017. ,
DOI : 10.1109/CLUSTER.2017.14
URL : https://hal.archives-ouvertes.fr/hal-01559967
Design for a Soft Error Resilient Dynamic Task-Based Runtime, 2015 IEEE International Parallel and Distributed Processing Symposium, pp.765-774, 2015. ,
DOI : 10.1109/IPDPS.2015.81
Performance Under Failures of DAG-based Parallel Computing, CCGRID '09, 2009. ,
DOI : 10.1109/ccgrid.2009.55
URL : http://www.cs.iit.edu/~scs/psfiles/3622a236.pdf
A novel adaptive checkpointing method based on information obtained from workflow structure, Computer Science, vol.17, issue.3, 2016. ,
DOI : 10.7494/csci.2016.17.3.387
Adaptive Task Checkpointing and Replication: Toward Efficient Fault-Tolerant Grids, IEEE Transactions on Parallel and Distributed Systems, vol.20, issue.2, pp.180-190, 2009. ,
DOI : 10.1109/TPDS.2008.93
Fault-Tolerant Dynamic Task Graph Scheduling, SC14: International Conference for High Performance Computing, Networking, Storage and Analysis, pp.719-730, 2014. ,
DOI : 10.1109/SC.2014.64
Nanocheckpoints: A task-based asynchronous dataflow framework for efficient and scalable checkpoint/restart, 23rd Euromicro PDP, pp.99-102, 2015. ,
CRC-Based Memory Reliability for Task-Parallel HPC Applications, 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp.1101-1112, 2016. ,
DOI : 10.1109/IPDPS.2016.70
Algorithm-based fault tolerance for matrix operations, IEEE Trans. Comput, vol.33, issue.6, pp.518-528, 1984. ,
Algorithm-based fault tolerance applied to high performance computing, Journal of Parallel and Distributed Computing, vol.69, issue.4, pp.410-416, 2009. ,
DOI : 10.1016/j.jpdc.2008.12.002
Fault tolerant preconditioned conjugate gradient for sparse linear system solution, Proceedings of the 26th ACM international conference on Supercomputing, ICS '12, 2012. ,
DOI : 10.1145/2304576.2304588
Lightweight Silent Data Corruption Detection Based on Runtime Data Analysis for HPC Applications, Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, HPDC '15, 2015. ,
DOI : 10.1145/1810085.1810120
Detecting silent data corruption through data dynamic monitoring for scientific applications, SIGPLAN Notices, pp.381-382, 2014. ,
Fault-secure scheduling of arbitrary task graphs to multiprocessor systems, Proceeding International Conference on Dependable Systems and Networks. DSN 2000, pp.203-212, 2000. ,
DOI : 10.1109/ICDSN.2000.857536
A Novel Bicriteria Scheduling Heuristics Providing a Guaranteed Global System Failure Rate, IEEE Transactions on Dependable and Secure Computing, vol.6, issue.4, pp.241-254, 2009. ,
DOI : 10.1109/TDSC.2008.50
URL : https://hal.archives-ouvertes.fr/hal-00746768
Designing and Modelling Selective Replication for Fault-Tolerant HPC Applications, 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2017. ,
DOI : 10.1109/CCGRID.2017.40
Assessing general-purpose algorithms to cope with fail-stop and silent errors, ACM Trans. Parallel Computing, vol.3, issue.2, 2016. ,
DOI : 10.1007/978-3-319-17248-4_11
URL : https://hal.archives-ouvertes.fr/hal-01066664
Scheduling Computational Workflows on Failure-Prone Platforms, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, pp.2-26, 2016. ,
DOI : 10.1109/IPDPSW.2015.33
URL : https://hal.archives-ouvertes.fr/hal-01075100
Replication-Based Fault-Tolerance for Large-Scale Graph Processing, 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, pp.562-573, 2014. ,
DOI : 10.1109/DSN.2014.58
A bi-criteria scheduling heuristic for distributed embedded systems under reliability and real-time constraints, International Conference on Dependable Systems and Networks, 2004, 2004. ,
DOI : 10.1109/DSN.2004.1311904
Modeling stream processing applications for dependability evaluation, 2011 IEEE/IFIP 41st International Conference on Dependable Systems & Networks (DSN), 2011. ,
DOI : 10.1109/DSN.2011.5958256
A survey of graph layout problems, ACM Computing Surveys, vol.34, issue.3, pp.313-356, 2002. ,
DOI : 10.1145/568522.568523