Top500 supercomputer sites, Supercomputer, vol.13, pp.89-111, 1997.
Analyzing and mitigating the impact of manufacturing variability in power-constrained supercomputing, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, ser. SC '15, 2015.
Static LU decomposition on heterogeneous platforms, International Journal of High Performance Computing Applications, vol.15, pp.310-323, 2001.
URL : https://hal.archives-ouvertes.fr/hal-00856641
ScaLAPACK User's Guide, Society for Industrial and Applied Mathematics, USA, 1997.
Communication avoiding Gaussian elimination, SC '08: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, pp.1-12, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00277901
Heterogeneous distribution of computations solving linear algebra problems on networks of heterogeneous computers, Journal of Parallel and Distributed Computing, vol.61, issue.4, p.520, 2001.
Partitioning a square into rectangles: NP-completeness and approximation algorithms, Algorithmica, vol.34, pp.217-239, 2002.
URL : https://hal.archives-ouvertes.fr/hal-00807407
Matrix multiplication on heterogeneous platforms, IEEE Transactions on Parallel and Distributed Systems, vol.12, issue.10, pp.1033-1051, 2001.
An approximation algorithm for dissecting a rectangle into rectangles with specified areas, Discrete Applied Mathematics, vol.155, issue.4, pp.523-537, 2007.
A proposal for a heterogeneous cluster ScaLAPACK (dense linear solvers), IEEE Transactions on Computers, vol.50, issue.10, pp.1052-1070, 2001.
URL : https://hal.archives-ouvertes.fr/hal-00808287
With extreme computing, the rules have changed, Computing in Science & Engineering, vol.19, issue.3, p.52, 2017.
On runtime systems for task-based programming on heterogeneous platforms, Habilitation à diriger des recherches, Université de Bordeaux, 2018.
URL : https://hal.archives-ouvertes.fr/tel-01959127
Flexible development of dense linear algebra algorithms on massively parallel architectures with DPLASMA, 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, pp.1432-1441, 2011.
Faster, cheaper, better - a hybridization methodology to develop linear algebra software for GPUs, GPU Computing Gems, vol.2, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00547847
Using static allocation algorithms for matrix matrix multiplication on multicores and GPUs, ICPP 2018 - 47th International Conference on Parallel Processing, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01670678
StarPU: A unified platform for task scheduling on heterogeneous multicore architectures, SI:EuroPar'09, vol.23, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00384363
Faithful performance prediction of a dynamic task-based runtime system for heterogeneous multi-core architectures, Concurrency and Computation: Practice and Experience, p.16, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01147997
Implementing multifrontal sparse solvers for multicore architectures with sequential task flow runtime systems, ACM Transactions on Mathematical Software, vol.43, issue.2, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01333645
StarPU-MPI: Task programming over clusters of machines enhanced with accelerators, Proceedings of the 19th European Conference on Recent Advances in the Message Passing Interface, ser. EuroMPI'12, pp.298-299, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00725477
NewMadeleine: A fast communication scheduling engine for high performance networks, 2007 IEEE International Parallel and Distributed Processing Symposium, pp.1-8, 2007.
Open MPI: Goals, concept, and design of a next generation MPI implementation, European Parallel Virtual Machine/Message Passing Interface Users' Group Meeting, pp.97-104, 2004.
Versatile, scalable, and accurate simulation of distributed applications and platforms, Journal of Parallel and Distributed Computing, vol.74, issue.10, pp.2899-2917, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01017319
A visual performance analysis framework for task based parallel applications running on hybrid clusters, Concurrency and Computation: Practice and Experience, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01616632
Visual performance analysis of memory behavior in a task-based runtime on hybrid platforms, 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2019.
URL : https://hal.archives-ouvertes.fr/hal-02275363
Evaluation of programming models to address load imbalance on distributed multicore CPUs: A case study with block low-rank factorization, 2019 IEEE/ACM Parallel Applications Workshop, Alternatives To MPI (PAW-ATM), pp.25-36, 2019.
Using dynamic broadcasts to improve task-based runtime performances, Euro-Par 2020: 26th International European Conference on Parallel and Distributed Computing, 2020.