R. D. Blumofe, C. F. Joerg, B. C. Kuszmaul, C. E. Leiserson, K. H. Randall et al., Cilk: An efficient multithreaded runtime system, Journal of parallel and distributed computing, vol.37, 1996.

T. Gautier, J. V. Lima, N. Maillard, and B. Raffin, Xkaapi: A runtime system for data-flow task programming on heterogeneous architectures, IEEE Intl. Symposium on Parallel and Distributed Processing, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00799904

A. Duran, E. Ayguadé, R. M. Badia, J. Labarta, L. Martinell et al., OmpSs: a proposal for programming heterogeneous multi-core architectures, Paral. Proces. Letters, vol.21, 2011.

C. Augonnet, S. Thibault, R. Namyst, and P. Wacrenier, StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, SI:EuroPar'09, vol.23, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00384363

E. Agullo, O. Beaumont, L. Eyraud-dubois, J. Herrmann, S. Kumar et al., Bridging the gap between performance and bounds of cholesky factorization on heterogeneous platforms, IEEE Intl. Parallel and Distributed Processing Symp. Workshop, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01120507

L. Stanisic, S. Thibault, A. Legrand, B. Videau, and J. Méhaut, Faithful Performance Prediction of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures, Concurrency and Computation: Practice and Experience, p.16, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01147997

L. Stanisic, E. Agullo, A. Buttari, A. Guermouche, A. Legrand et al., Fast and Accurate Simulation of Multithreaded Sparse Linear Algebra Solvers, The 21st IEEE International Conference on Parallel and Distributed Systems, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01180272

V. G. Pinto, L. M. Schnorr, L. Stanisic, A. Legrand, S. Thibault et al., A visual performance analysis framework for task based parallel applications running on hybrid clusters, Concurrency and Computation: Practice and Experience, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01616632

C. Augonnet, J. Clet-ortega, S. Thibault, and R. Namyst, Data-Aware Task Scheduling on Multi-Accelerator based Platforms, 16th Intl. Conference on Parallel and Distributed Systems, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00523937

S. Toledo, A survey of out-of-core algorithms in numerical linear algebra, Ext. Mem. Alg. and Vis, vol.50, pp.161-179, 1999.

E. Agullo, C. Augonnet, J. Dongarra, H. Ltaief, R. Namyst et al., Faster, Cheaper, Better -a Hybridization Methodology to Develop Linear Algebra Software for GPUs, GPU Computing Gems, vol.2, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00547847

E. Agullo, A. Buttari, A. Guermouche, and F. Lopez, Implementing multifrontal sparse solvers for multicore architectures with sequential task flow runtime systems, ACM Tr. Math. Softw, vol.43, issue.2, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01333645

M. Sergent, D. Goudin, S. Thibault, and O. Aumage, Controlling the Memory Subscription of Distributed Applications with a Task-Based Runtime System, 21st Intl. Workshop on High-Level Paral. Prog. Models and Supportive Environments, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01284004

B. Haugen, S. Richmond, J. Kurzak, C. A. Steed, and J. Dongarra, Visualizing execution traces with task dependencies, The 2nd Workshop on Visual Perf. Analysis, 2015.

A. Huynh, D. Thain, M. Pericàs, and K. Taura, DAGViz: A DAG Visualization Tool for Analyzing Task-parallel Program Traces, Proceedings of the 2nd Workshop on Visual Performance Analysis, ser. VPA '15, vol.3, pp.1-3, 2015.

A. Muddukrishna, P. A. Jonsson, A. Podobas, and M. Brorsson, Grain graphs: OpenMP performance analysis made easy, 24th ACM SIG-PLAN Symp. on Principles and Practice of Paral. Prog, 2016.

R. Keller, S. Brinkmann, J. Gracia, and C. Niethammer, Temanejo: Debugging of thread-based task-parallel programs in StarSs, Proc. of the 5th Intl. Workshop on Paral. Tools for High Performance Computing, pp.131-137, 2011.

A. Knüpfer, H. Brunst, J. Doleschal, M. Jurenz, M. Lieber et al., The Vampir performance analysis toolset, Proc. of the 2nd Intl. Workshop on Parallel Tools for High Performance Computing, pp.139-155, 2008.

V. Pillet, J. Labarta, T. Cortes, and S. Girona, Paraver: A Tool to Visualize and Analyze Parallel Code, Proceedings of WoTUG-18: Transputer and occam Developments, pp.17-31, 1995.

K. Coulomb, M. Faverge, J. Jazeix, O. Lagrasse, J. Marcoueille et al., Visual trace explorer (ViTE), Tech. Rep, 2009.

K. E. Isaacs, P. Bremer, I. Jusufi, T. Gamblin, A. Bhatele et al., Combing the communication hairball: Visualizing parallel execution traces using logical time, IEEE Trans. on visualization and computer graphics, vol.20, issue.12, pp.2349-2358, 2014.

G. Ceballos, T. Grass, A. Hugo, and D. Black-schaffer, Analyzing performance variation of task schedulers with TaskInsight, Parallel Computing, vol.75, pp.11-27, 2018.

M. Pericàs, A. Amer, K. Taura, and S. Matsuoka, Analysis of data reuse in task-parallel runtimes, High Perf. Comp. Syst. Perf. Modeling, Bench. and Simul, pp.73-87, 2014.