A sampling-based approach for communication libraries auto-tuning

Élisabeth Brunet 1 François Trahay 1 Alexandre Denis 2, 3 Raymond Namyst 2, 3
2 RUNTIME - Efficient runtime systems for parallel architectures
Inria Bordeaux - Sud-Ouest, UB - Université de Bordeaux, CNRS - Centre National de la Recherche Scientifique : UMR5800
Abstract: Communication performance is a critical issue in HPC applications, and many solutions have been proposed in the literature (algorithms, protocols, etc.). Meanwhile, computing nodes have become massively multicore, leading to a real imbalance between the number of communication sources and the number of physical communication resources. It is therefore now mandatory to share network boards between computation flows, and to take this sharing into account when optimizing communication. In previous papers, we proposed a model and a framework for on-the-fly optimization of multiplexed concurrent communication flows, and implemented this model in the NewMadeleine communication library. This library features optimization strategies that can, for example, aggregate several messages to reduce the number of packets emitted on the network, or split messages to use several NICs at the same time. In this paper, we study the tuning of these dynamic optimization strategies. We show that some parameters and thresholds (rendezvous threshold, aggregation packet size) depend on the actual hardware, both host and NICs. We propose and implement a method based on sampling of the actual hardware to auto-tune our strategies. Moreover, we show that multi-rail can greatly benefit from performance predictions. We propose an approach for multi-rail that dynamically balances the data between NICs using predictions based on sampling.
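The multi-rail idea in the abstract (balancing data between NICs from sampled performance) can be illustrated with a minimal sketch. It is not the NewMadeleine implementation: the function names, the sample tables, and the use of linear interpolation with bisection are all assumptions made for illustration. Each NIC is described by a hypothetical table of sampled (message size, transfer time) pairs, and the split is chosen so that both NICs are predicted to finish at about the same time.

```python
from bisect import bisect_left

def predict_time(samples, size):
    """Predict the transfer time of `size` bytes from sampled (size, time) pairs.

    `samples` must be sorted by size and hold at least two points.
    Linear interpolation is an assumption of this sketch.
    """
    if size <= 0:
        return 0.0  # nothing to send on this NIC
    sizes = [s for s, _ in samples]
    # Pick the segment around `size`; extrapolate from the end segments.
    i = min(max(bisect_left(sizes, size), 1), len(samples) - 1)
    (s0, t0), (s1, t1) = samples[i - 1], samples[i]
    return t0 + (t1 - t0) * (size - s0) / (s1 - s0)

def split_message(size, samples_a, samples_b):
    """Return (bytes_on_a, bytes_on_b) with the smallest predicted finish time."""
    def finish(x):
        # Both chunks travel in parallel, so the slower one decides.
        return max(predict_time(samples_a, x),
                   predict_time(samples_b, size - x))

    # Bisection on the number of bytes given to NIC A: its predicted time
    # grows with x while the predicted time on NIC B shrinks.
    lo, hi = 0, size
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if predict_time(samples_a, mid) < predict_time(samples_b, size - mid):
            lo = mid
        else:
            hi = mid
    # Also consider not splitting at all (everything on one NIC).
    best = min((0, lo, hi, size), key=finish)
    return best, size - best

# Hypothetical sampling results: (size in bytes, time in microseconds).
NIC_A = [(1000, 3.0), (1_000_000, 1002.0)]
NIC_B = [(1000, 6.0), (1_000_000, 2004.0)]

if __name__ == "__main__":
    for msg_size in (100, 900_000):
        print(msg_size, split_message(msg_size, NIC_A, NIC_B))
```

With these made-up tables, a small message is predicted to complete fastest on a single NIC, because the latency of the second NIC outweighs any bandwidth gain, while a large message is split unevenly in favor of the faster NIC.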
Keywords: MPI, NewMadeleine, MadMPI
Document type: Conference paper
IEEE International Conference on Cluster Computing, Sep 2011, Austin, United States. 2011

https://hal.inria.fr/inria-00605735
Contributor: Alexandre Denis
Submitted on: Monday, July 4, 2011 - 11:20:38
Last modified on: Thursday, February 9, 2017 - 15:21:58
Document(s) archived on: Monday, November 12, 2012 - 09:56:45

File

main.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00605735, version 1

Citation

Élisabeth Brunet, François Trahay, Alexandre Denis, Raymond Namyst. A sampling-based approach for communication libraries auto-tuning. IEEE International Conference on Cluster Computing, Sep 2011, Austin, United States. 2011. 〈inria-00605735〉
