H. Topcuoglu, S. Hariri, and M. Wu, Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Transactions on Parallel and Distributed Systems, vol.13, issue.3, pp.260-274, 2002.
DOI : 10.1109/71.993206

E. Schulte, D. Davison, T. Dye, and C. Dominik, A Multi-Language Computing Environment for Literate Programming and Reproducible Research, Journal of Statistical Software, vol.46, issue.3, 2012.
DOI : 10.18637/jss.v046.i03

E. Agullo, G. Bosilca, B. Bramas, C. Castagnede, O. Coulaud et al., Poster: Matrices over Runtime Systems at Exascale, 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, pp.1332-1332, 2012.
DOI : 10.1109/SC.Companion.2012.168

C. Augonnet, S. Thibault, R. Namyst, and P. Wacrenier, StarPU: a unified platform for task scheduling on heterogeneous multicore architectures, Concurrency and Computation: Practice and Experience, vol.23, issue.2, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00384363

E. Agullo, J. Demmel, J. Dongarra, B. Hadri, J. Kurzak et al., Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects, Journal of Physics: Conference Series, vol.180, issue.1, 2009.
DOI : 10.1088/1742-6596/180/1/012037

A. Duran, E. Ayguadé, R. M. Badia, J. Labarta, L. Martinell et al., OmpSs: A Proposal for Programming Heterogeneous Multi-Core Architectures, Parallel Processing Letters, vol.21, issue.02, 2011.
DOI : 10.1142/S0129626411000151

G. Bosilca, A. Bouteiller, A. Danalis, T. Herault, P. Lemarinier et al., DAGuE: A generic distributed DAG engine for High Performance Computing, Parallel Computing, Extensions for Next-Generation Parallel Programming Models, pp.37-51, 2012.
DOI : 10.1016/j.parco.2011.10.003

C. Augonnet, O. Aumage, N. Furmento, R. Namyst, and S. Thibault, StarPU-MPI: Task Programming over Clusters of Machines Enhanced with Accelerators, Proceedings of the 19th European Conference on Recent Advances in the Message Passing Interface, ser. EuroMPI'12, pp.298-299, 2012.
DOI : 10.1007/978-3-642-33518-1_40
URL : https://hal.archives-ouvertes.fr/hal-00725477

S. Ohshima, S. Katagiri, K. Nakajima, S. Thibault, and R. Namyst, Implementation of FEM Application on GPU with StarPU, SIAM Conference on Computational Science and Engineering, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00926144

V. Martínez, D. Michéa, F. Dupros, O. Aumage, S. Thibault et al., Towards Seismic Wave Modeling on Heterogeneous Many-Core Architectures Using Task-Based Runtime System, 2015 27th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2015.
DOI : 10.1109/SBAC-PAD.2015.33

E. Agullo, L. Giraud, A. Guermouche, S. Nakov, and J. Roman, Task-based Conjugate Gradient: from multi-GPU towards heterogeneous architectures, Inria Bordeaux Research Report, vol.8912, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01334734

X. Lacoste, M. Faverge, P. Ramet, S. Thibault, and G. Bosilca, Taking Advantage of Hybrid Systems for Sparse Direct Solvers via Task-Based Runtimes, 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, pp.29-38, 2014.
DOI : 10.1109/IPDPSW.2014.9
URL : https://hal.archives-ouvertes.fr/hal-00925017

R. L. Graham, Bounds for Certain Multiprocessing Anomalies, Bell System Technical Journal, vol.45, issue.9, pp.1563-1581, 1966.
DOI : 10.1002/j.1538-7305.1966.tb01709.x

K. Coulomb, M. Faverge, J. Jazeix, O. Lagrasse, J. Marcoueille et al., Visual Trace Explorer (ViTE).

L. M. Schnorr, M. Faverge, F. Trahay, B. de Oliveira Stein, and J. Chassin de Kergommeaux, The Paje trace file format, Tech. Rep., UFRGS, 2016.

V. Pillet, J. Labarta, T. Cortes, and S. Girona, Paraver: A tool to visualize and analyze parallel code, Proceedings of WoTUG-18: Transputer and occam Developments, pp.17-31, 1995.

A. Knüpfer, H. Brunst, J. Doleschal, M. Jurenz, M. Lieber et al., The Vampir Performance Analysis Tool-Set, Tools for High Performance Computing, pp.139-155, 2008.

A. Huynh, D. Thain, M. Pericàs, and K. Taura, DAGViz, Proceedings of the 2nd Workshop on Visual Performance Analysis, VPA '15, pp.1-3, 2015.
DOI : 10.1145/2835238.2835241

B. Haugen, S. Richmond, J. Kurzak, C. A. Steed, and J. Dongarra, Visualizing Execution Traces with Task Dependencies, Proceedings of the 2nd Workshop on Visual Performance Analysis, VPA '15, pp.1-2, 2015.

L. M. Schnorr and A. Legrand, Visualizing More Performance Data Than What Fits on Your Screen, Tools for High Performance Computing 2012, pp.149-162, 2013.
DOI : 10.1007/978-3-642-37349-7_10
URL : https://hal.archives-ouvertes.fr/hal-00737651

G. Pagano and V. Marangozova-Martin, FrameSoC Workbench: Facilitating Trace Analysis through a Consistent User Interface, Inria, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00977887

V. Danjean, R. Namyst, and P. Wacrenier, An Efficient Multi-level Trace Toolkit for Multi-threaded Applications, European Conference on Parallel Processing, pp.166-175, 2005.
DOI : 10.1007/11549468_21
URL : https://hal.archives-ouvertes.fr/hal-00360309

E. Agullo, O. Beaumont, L. Eyraud-Dubois, J. Herrmann, S. Kumar et al., Bridging the Gap between Performance and Bounds of Cholesky Factorization on Heterogeneous Platforms, 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015.
DOI : 10.1109/IPDPSW.2015.35
URL : https://hal.archives-ouvertes.fr/hal-01120507

E. Agullo, A. Buttari, A. Guermouche, and F. Lopez, Multifrontal QR Factorization for Multicore Architectures over Runtime Systems, European Conf. on Parallel Processing, pp.521-532, 2013.
DOI : 10.1007/978-3-642-40047-6_53
URL : https://hal.archives-ouvertes.fr/hal-01220611