OpenMP 3.0 tasking implementation in OpenUH, 2009. ,
Task-Based FMM for Multicore Architectures, SIAM Journal on Scientific Computing, vol.36, issue.1, pp.66-93, 2014. ,
DOI : 10.1137/130915662
URL : https://hal.archives-ouvertes.fr/hal-00807368
Task-based FMM for heterogeneous architectures, Concurrency and Computation: Practice and Experience, pp.2608-2629, 2016. ,
StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par, 2009. ,
The design of openmp tasks, Transactions on Parallel and Distributed Systems, 2009. ,
Fast hierarchical algorithms for generating Gaussian random fields, Research Report, vol.8811, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01228519
Dague: A generic distributed dag engine for high performance computing, Parallel Computing, 2012. ,
Implementing OmpSs support for regions of data in architectures with multiple address spaces, Proceedings of the 27th international ACM conference on International conference on supercomputing, ICS '13, 2013. ,
DOI : 10.1145/2464996.2465017
Parallel algorithms, 2008. ,
URL : https://hal.archives-ouvertes.fr/hal-00789466
Gpu hybrid implementation and model-driven scheduling of the fast multipole method, Proceedings of Workshop on General Purpose Processing Using GPUs, pp.64-64 ,
The implementation of the cilk-5 multithreaded language, Conference on Programming Language Design and Implementation, 1998. ,
Locality-aware work stealing on multi-cpu and multi-gpu architectures, Workshop on Programmability Issues for Heterogeneous Multicores (MULTIPROG), 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00780890
A fast algorithm for particle simulations, Journal of Computational Physics, vol.73, issue.2, pp.325-348, 1987. ,
DOI : 10.1016/0021-9991(87)90140-9
A ROSE-Based OpenMP 3.0 Research Compiler Supporting Multiple Runtime Libraries, Beyond Loop Level Parallelism in OpenMP: Accelerators, Tasking and More, 2010. ,
DOI : 10.1007/978-3-642-13217-9_2
Data-driven execution of fast multipole methods, Concurrency and Computation: Practice and Experience, pp.1935-1946, 2013. ,
A tuned and scalable fast multipole method as a preeminent algorithm for exascale systems, International Journal of High Performance Computing Applications, vol.26, issue.4, pp.337-346, 2012. ,
DOI : 10.1177/1094342011429952