A Visual Performance Analysis Framework for Task-based Parallel Applications running on Hybrid Clusters

Abstract: Programming paradigms in High-Performance Computing have been shifting towards task-based models, which adapt readily to heterogeneous and scalable supercomputers. The performance of a task-based application depends heavily on the runtime's scheduling heuristics and on its ability to exploit computing and communication resources. Unfortunately, traditional performance analysis strategies are ill-suited to understanding task-based runtime systems and applications: they expect regular behavior with distinct communication and computation phases, while task-based applications exhibit no clear phases. Moreover, the finer granularity of task-based applications typically induces stochastic behavior that leads to irregular structures that are difficult to analyze. Furthermore, understanding performance issues generally requires combining application structure, scheduler, and hardware information. This paper presents a flexible framework that combines several sources of information and supports custom visualization panels for understanding and pinpointing performance problems caused by bad scheduling decisions in task-based applications. Three case studies using StarPU-MPI, a task-based multi-node runtime system, show how our framework can be used to study the performance of the well-known Cholesky factorization. Performance improvements include better task partitioning among the multi-(GPU,core) resources to get closer to theoretical lower bounds, improved MPI pipelining in multi-(node,core,GPU) settings to reduce the slow start, and changes in the runtime system to increase MPI bandwidth, with gains of up to 13% in the total makespan.
Document type:
Journal article
Concurrency and Computation: Practice and Experience, Wiley, 2018, 30 (18), pp.1-31. 〈https://onlinelibrary.wiley.com/doi/abs/10.1002/cpe.4472〉. 〈10.1002/cpe.4472〉

Cited literature: 40 references

https://hal.inria.fr/hal-01616632
Contributor: Samuel Thibault
Submitted on: Tuesday, 17 July 2018 - 14:44:56
Last modified on: Thursday, 11 October 2018 - 08:48:05
Document(s) archived on: Thursday, 18 October 2018 - 15:42:35

File

CCPE_article_submitted_2018_02...
Files produced by the author(s)

Citation

Vinicius Garcia Pinto, Lucas Schnorr, Luka Stanisic, Arnaud Legrand, Samuel Thibault, et al. A Visual Performance Analysis Framework for Task-based Parallel Applications running on Hybrid Clusters. Concurrency and Computation: Practice and Experience, Wiley, 2018, 30 (18), pp.1-31. 〈https://onlinelibrary.wiley.com/doi/abs/10.1002/cpe.4472〉. 〈10.1002/cpe.4472〉. 〈hal-01616632v2〉
