Bridging the Gap between Performance and Bounds of Cholesky Factorization on Heterogeneous Platforms

Emmanuel Agullo 1; Olivier Beaumont 2, 3; Lionel Eyraud-Dubois 2, 3; Julien Herrmann 4; Suraj Kumar 1, 3, 5, 6; Loris Marchal 4, 7; Samuel Thibault 6, 2, 5
1 HiePACS - High-End Parallel Algorithms for Challenging Numerical Simulations
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
3 Realopt - Reformulations based algorithms for Combinatorial Optimization
LaBRI - Laboratoire Bordelais de Recherche en Informatique, IMB - Institut de Mathématiques de Bordeaux, Inria Bordeaux - Sud-Ouest
4 ROMA - Optimisation des ressources : modèles, algorithmes et ordonnancement
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
5 RUNTIME - Efficient runtime systems for parallel architectures
Inria Bordeaux - Sud-Ouest, UB - Université de Bordeaux, CNRS - Centre National de la Recherche Scientifique : UMR5800
6 STORM - STatic Optimizations, Runtime Methods
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
Abstract: We consider the problem of allocating and scheduling dense linear algebra applications on fully heterogeneous platforms made of CPUs and GPUs. More specifically, we focus on the Cholesky factorization, since it exhibits the main features of such problems. Indeed, the relative performance of CPUs and GPUs depends heavily on the subroutine: GPUs are, for instance, much more efficient at processing regular kernels such as matrix-matrix multiplications than more irregular kernels such as matrix factorization. In this context, one solution is to rely on dynamic scheduling and resource allocation mechanisms such as those provided by PaRSEC or StarPU. In this paper, we analyze the performance of dynamic schedulers based on both actual executions and simulations, and we investigate how adding static rules, derived from an offline analysis of the problem, to their decision process can improve their performance, up to reaching improved theoretical performance bounds that we introduce.
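To make the remark about sub-routines concrete, here is a minimal sketch that counts the tasks of each kernel type in a tiled Cholesky factorization. It assumes the usual right-looking tiled formulation and the standard LAPACK/BLAS kernel names (POTRF, TRSM, SYRK, GEMM). Neither is stated in the abstract, and the function name is hypothetical.

```python
def tiled_cholesky_task_counts(n_tiles):
    """Count tasks per kernel type for an n_tiles x n_tiles tiled matrix.

    Assumption: right-looking tiled Cholesky on the lower triangle,
    with one task per tile operation.
    """
    counts = {"POTRF": 0, "TRSM": 0, "SYRK": 0, "GEMM": 0}
    for k in range(n_tiles):
        counts["POTRF"] += 1              # factor diagonal tile (k, k)
        for i in range(k + 1, n_tiles):
            counts["TRSM"] += 1           # triangular solve on tile (i, k)
        for i in range(k + 1, n_tiles):
            counts["SYRK"] += 1           # symmetric update of tile (i, i)
            for j in range(k + 1, i):
                counts["GEMM"] += 1       # multiply-update of tile (i, j)
    return counts


if __name__ == "__main__":
    for n in (4, 10, 20):
        print(n, tiled_cholesky_task_counts(n))
```

Under these assumptions the number of GEMM tasks grows cubically with the number of tiles while the number of POTRF tasks grows only linearly, which suggests why the allocation of each kernel type to CPUs or GPUs matters for overall performance.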
Document type:
Conference paper
Heterogeneity in Computing Workshop 2015, May 2015, Hyderabad, India. 2015

https://hal.inria.fr/hal-01120507
Contributor: Suraj Kumar
Submitted on: Wednesday, February 25, 2015 - 18:47:24
Last modified on: Monday, September 18, 2017 - 09:52:09
Document(s) archived on: Tuesday, May 26, 2015 - 11:25:16

File

Camera_ready.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01120507, version 1

Citation

Emmanuel Agullo, Olivier Beaumont, Lionel Eyraud-Dubois, Julien Herrmann, Suraj Kumar, et al. Bridging the Gap between Performance and Bounds of Cholesky Factorization on Heterogeneous Platforms. Heterogeneity in Computing Workshop 2015, May 2015, Hyderabad, India. 2015. <hal-01120507>

Metrics

Record views: 735
Document downloads: 362