Modeling Non-Uniform Memory Access on Large Compute Nodes with the Cache-Aware Roofline Model

Abstract: NUMA platforms and emerging memory architectures with on-package high-bandwidth memories bring new opportunities and challenges in bridging the gap between computing power and memory performance. Heterogeneous memory machines exhibit several performance trade-offs, depending on the kind of memory used and on whether it is read or written. Finding memory performance upper bounds subject to such trade-offs is of broad interest for measuring computing system performance. In particular, the state-of-the-art Cache-Aware Roofline Model (CARM) represents application performance relative to the platform's performance bounds in order to troubleshoot performance issues. In this paper, we present a Locality-Aware extension (LARM) of the CARM to model NUMA platform bottlenecks, such as contention and remote access. In addition, this paper contributes the design and validation of a novel hybrid memory bandwidth model. This hybrid model quantifies the achievable bandwidth upper bound under the trade-offs described above with less than 3% error. Hence, when comparing application performance with the maximum attainable performance, software designers can now rely on more accurate information.
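As a rough illustration of the kind of bound a roofline model expresses, the sketch below computes the generic roofline limit: attainable performance is the smaller of the peak compute rate and the memory bandwidth multiplied by the arithmetic intensity. This is only the textbook roofline formula, not the LARM or the hybrid bandwidth model of the paper, whose details are not given in the abstract. The function name, the platform figures, and the "local"/"remote" bandwidth labels are hypothetical values chosen for illustration.

```python
def roofline_bound(peak_gflops: float, bandwidth_gbs: float, intensity: float) -> float:
    """Generic roofline bound in GFLOP/s.

    peak_gflops   -- peak compute throughput (GFLOP/s)
    bandwidth_gbs -- sustained memory bandwidth (GB/s)
    intensity     -- arithmetic intensity (FLOP per byte moved)
    """
    if intensity <= 0:
        raise ValueError("arithmetic intensity must be positive")
    # Performance is capped either by compute or by memory traffic.
    return min(peak_gflops, bandwidth_gbs * intensity)


# Hypothetical platform figures, NOT taken from the paper.
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = {"local": 100.0, "remote": 40.0}

if __name__ == "__main__":
    for memory, bandwidth in BANDWIDTH_GBS.items():
        for intensity in (0.5, 4.0, 32.0):
            bound = roofline_bound(PEAK_GFLOPS, bandwidth, intensity)
            print(f"{memory:>6} memory, AI={intensity:>4}: {bound:.1f} GFLOP/s")
```

With these assumed numbers, a kernel at a fixed arithmetic intensity sits under a lower roof when its data is served from the slower memory, which is the kind of locality effect a NUMA-aware roofline has to capture.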
Document type:
Journal article
IEEE Transactions on Parallel and Distributed Systems, Institute of Electrical and Electronics Engineers, In press, 〈10.1109/TPDS.2018.2883056〉

Cited literature [9 references]

https://hal.inria.fr/hal-01924951
Contributor: Brice Goglin
Submitted on: Friday, 16 November 2018 - 13:09:01
Last modified on: Tuesday, 27 November 2018 - 01:20:11

File

HAL.pdf
Files produced by the author(s)

Citation

Nicolas Denoyelle, Brice Goglin, Aleksandar Ilic, Emmanuel Jeannot, Leonel Sousa. Modeling Non-Uniform Memory Access on Large Compute Nodes with the Cache-Aware Roofline Model. IEEE Transactions on Parallel and Distributed Systems, Institute of Electrical and Electronics Engineers, In press, 〈10.1109/TPDS.2018.2883056〉. 〈hal-01924951〉
