Cilk: An efficient multithreaded runtime system, Journal of parallel and distributed computing, vol.37, 1996. ,
Xkaapi: A runtime system for data-flow task programming on heterogeneous architectures, IEEE Intl. Symposium on Parallel and Distributed Processing, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00799904
OmpSs: a proposal for programming heterogeneous multi-core architectures, Paral. Proces. Letters, vol.21, 2011. ,
StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, SI:EuroPar'09, vol.23, 2011. ,
URL : https://hal.archives-ouvertes.fr/inria-00384363
Bridging the gap between performance and bounds of cholesky factorization on heterogeneous platforms, IEEE Intl. Parallel and Distributed Processing Symp. Workshop, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01120507
Faithful Performance Prediction of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures, Concurrency and Computation: Practice and Experience, p.16, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01147997
Fast and Accurate Simulation of Multithreaded Sparse Linear Algebra Solvers, The 21st IEEE International Conference on Parallel and Distributed Systems, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01180272
A visual performance analysis framework for task based parallel applications running on hybrid clusters, Concurrency and Computation: Practice and Experience, 2018. ,
URL : https://hal.archives-ouvertes.fr/hal-01616632
Data-Aware Task Scheduling on Multi-Accelerator based Platforms, 16th Intl. Conference on Parallel and Distributed Systems, 2010. ,
URL : https://hal.archives-ouvertes.fr/inria-00523937
A survey of out-of-core algorithms in numerical linear algebra, Ext. Mem. Alg. and Vis, vol.50, pp.161-179, 1999. ,
Faster, Cheaper, Better -a Hybridization Methodology to Develop Linear Algebra Software for GPUs, GPU Computing Gems, vol.2, 2010. ,
URL : https://hal.archives-ouvertes.fr/inria-00547847
Implementing multifrontal sparse solvers for multicore architectures with sequential task flow runtime systems, ACM Tr. Math. Softw, vol.43, issue.2, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01333645
Controlling the Memory Subscription of Distributed Applications with a Task-Based Runtime System, 21st Intl. Workshop on High-Level Paral. Prog. Models and Supportive Environments, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01284004
Visualizing execution traces with task dependencies, The 2nd Workshop on Visual Perf. Analysis, 2015. ,
DAGViz: A DAG Visualization Tool for Analyzing Task-parallel Program Traces, Proceedings of the 2nd Workshop on Visual Performance Analysis, ser. VPA '15, vol.3, pp.1-3, 2015. ,
Grain graphs: OpenMP performance analysis made easy, 24th ACM SIG-PLAN Symp. on Principles and Practice of Paral. Prog, 2016. ,
Temanejo: Debugging of thread-based task-parallel programs in StarSs, Proc. of the 5th Intl. Workshop on Paral. Tools for High Performance Computing, pp.131-137, 2011. ,
The Vampir performance analysis toolset, Proc. of the 2nd Intl. Workshop on Parallel Tools for High Performance Computing, pp.139-155, 2008. ,
Paraver: A Tool to Visualize and Analyze Parallel Code, Proceedings of WoTUG-18: Transputer and occam Developments, pp.17-31, 1995. ,
Visual trace explorer (ViTE), Tech. Rep, 2009. ,
Combing the communication hairball: Visualizing parallel execution traces using logical time, IEEE Trans. on visualization and computer graphics, vol.20, issue.12, pp.2349-2358, 2014. ,
Analyzing performance variation of task schedulers with TaskInsight, Parallel Computing, vol.75, pp.11-27, 2018. ,
Analysis of data reuse in task-parallel runtimes, High Perf. Comp. Syst. Perf. Modeling, Bench. and Simul, pp.73-87, 2014. ,