Task-based FMM for heterogeneous architectures, Concurrency and Computation: Practice and Experience, vol.28, issue.9, pp.2608-2629, 2015.
URL : https://hal.archives-ouvertes.fr/hal-00974674
Multifrontal QR Factorization for Multicore Architectures over Runtime Systems, Euro-Par 2013 Parallel Processing, vol.8097, pp.521-532, 2013.
URL : https://hal.archives-ouvertes.fr/hal-01220611
StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, Concurrency and Computation: Practice and Experience, vol.23, issue.2, pp.187-198, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00384363
DAG Scheduling Using a Lookahead Variant of the Heterogeneous Earliest Finish Time Algorithm, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, pp.27-34, 2010.
Optimization and parallelization of the boundary element method for the wave equation, Theses, 2016.
Optimization of a discontinuous Galerkin solver with OpenCL and StarPU, International Journal on Finite, vol.15, issue.1, pp.1-19, 2020.
URL : https://hal.archives-ouvertes.fr/hal-01942863
Impact study of data locality on task-based applications through the Heteroprio scheduler, PeerJ Computer Science, vol.5, p.e190, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02120736
Algorithms and Complexity, Complex Scheduling, pp.29-115, 2011.
Scheduling independent tasks to reduce mean finishing time, Communications of the ACM, vol.17, issue.7, pp.382-387, 1974.
An efficient scheduling scheme using estimated execution time for heterogeneous computing systems, The Journal of Supercomputing, vol.65, issue.2, pp.886-902, 2013.
Permutation distance
Hybrid Static/dynamic Scheduling for Already Optimized Dense Matrix Factorization, 2012 IEEE 26th International Parallel and Distributed Processing Symposium, 2012.
URL : https://hal.archives-ouvertes.fr/inria-00631348
The Multifrontal Solution of Indefinite Sparse Symmetric Linear, ACM Transactions on Mathematical Software, vol.9, issue.3, pp.302-325, 1983.
Static scheduling algorithms for allocating directed task graphs to multiprocessors, ACM Computing Surveys, vol.31, issue.4, pp.406-471, 1999.
Minimizing schedule length subject to minimum flow time, SIAM J. Comput., vol.18, issue.2, pp.314-326, 1989.
Degree-of-Node Task Scheduling of Fine-Grained Parallel Programs on Heterogeneous Systems, Journal of Computer Science and Technology, vol.34, issue.5, pp.1096-1108, 2019.
Task-based multifrontal QR solver for heterogeneous architectures, 2015.
URL : https://hal.archives-ouvertes.fr/tel-01386600
GEODYNAMICS, Matrix: JGD_Forest/TF16
Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Transactions on Parallel and Distributed Systems, vol.13, issue.3, pp.260-274, 2002.
Smart multi-task scheduling for OpenCL programs on CPU/GPU heterogeneous platforms, 2014 21st International Conference on High Performance Computing (HiPC), pp.1-10, 2014.
Dynamic critical-path scheduling: an effective technique for allocating task graphs to multiprocessors, IEEE Transactions on Parallel and Distributed Systems, vol.7, issue.5, pp.506-521, 1996.
Thermal-Aware Task Scheduling for Energy Minimization in Heterogeneous Real-Time MPSoC Systems, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol.35, issue.8, pp.1269-1282, 2016.
Figure 1: Example of clustering a DAG of 7 nodes targeting cluster of M = 2 nodes.
Figure 2: Heteroprio schematic view.
Example executions for the 3 cases
Execution times of the 28 heuristics on consecutive executions
Figure 5: Emulated execution times against cluster granularity G for different test cases, different machine configurations (colors) and different strategies (nodes).
Figure 9: Execution example with four tasks. B and C are uncertain tasks and here B did not write on the data, while C did. Consequently, the RS has disabled or enabled the other tasks accordingly.
Slowdown of each method, part 1