StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, Concurrency and Computation: Practice and Experience, Euro-Par, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00384363
Diagnosis, Tuning, and Redesign for Multicore Performance: A Case Study of the Fast Multipole Method, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, 2010.
DOI : 10.1109/SC.2010.19
Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp.1-15, 2010.
DOI : 10.1109/IPDPS.2010.5470415
Hybrid MPI-Thread Parallelization of the Fast Multipole Method, Sixth International Symposium on Parallel and Distributed Computing (ISPDC'07), p.52, 2007.
DOI : 10.1109/ISPDC.2007.29
URL : https://hal.archives-ouvertes.fr/inria-00131001
High performance BLAS formulation of the adaptive Fast Multipole Method, Proceedings of the International Conference of Computational Methods in Sciences and Engineering, pp.3-4177, 2005.
DOI : 10.1016/j.mcm.2009.08.039
URL : https://hal.archives-ouvertes.fr/hal-01146520
The black-box fast multipole method, Journal of Computational Physics, vol.228, issue.23, pp.8712-8725, 2009.
DOI : 10.1016/j.jcp.2009.08.031
A fast algorithm for particle simulations, Journal of Computational Physics, vol.73, issue.2, pp.325-348, 1987.
DOI : 10.1016/0021-9991(87)90140-9
A new version of the Fast Multipole Method for the Laplace equation in three dimensions, Acta Numerica, vol.6, pp.229-269, 1997.
DOI : 10.1017/S0962492900002725
Data-Driven Execution of Fast Multipole Methods, CoRR, abs, 1203.
Optimized M2L Kernels for the Chebyshev Interpolation based Fast Multipole Method, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00746089
Fast directional multilevel summation for oscillatory kernels based on Chebyshev interpolation, Journal of Computational Physics, vol.231, issue.4, pp.1175-1196, 2012.
DOI : 10.1016/j.jcp.2011.09.027
A parallel hashed Oct-Tree N-body algorithm, Proceedings of the 1993 ACM/IEEE conference on Supercomputing, Supercomputing '93, pp.12-21, 1993.
DOI : 10.1145/169627.169640
A tuned and scalable fast multipole method as a preeminent algorithm for exascale systems, International Journal of High Performance Computing Applications, vol.26, issue.4, pp.337-346, 2012.
DOI : 10.1177/1094342011429952
Dynamic prioritization for parallel traversal of irregularly structured spatio-temporal graphs, 2011.