A Topology-Aware Performance Monitoring Tool for Shared Resource Management in Multicore Systems

Nicolas Denoyelle 1, 2 Brice Goglin 1, 2 Emmanuel Jeannot 1, 2, *
* Corresponding author
1 TADAAM - Topology-Aware System-Scale Data Management for High-Performance Computing
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
Abstract: Nowadays, performance optimization involves careful data and task placement to match the needs of parallel applications to the underlying hardware topology. Monitoring application behavior provides useful information, but that information still needs to be matched with the actual placement, for instance to understand whether bottlenecks are caused by the sequential code itself or by shared resources in parallel programs. We propose an insightful monitoring tool based on two cornerstones of hardware performance counter monitoring and hardware locality modeling, namely PAPI and hwloc, respectively. It enables a dynamic visual analysis of parallel applications' phases at runtime, revealing their possibly variable and heterogeneous behaviors and needs. A purpose-designed application shows that the topology-aware visual representation of hardware counters can help figure out shared resource bottlenecks and ease the task placement decision process in runtime systems.

1 Introduction

The memory wall makes data locality increasingly important on the road to exascale. Data and computing tasks have to be colocated to better exploit the performance of parallel platforms. Many research projects focus on locality-aware data and/or task placement, for parallel programming models ranging from MPI and OpenMP to graphs of tasks. However, finding out which placement is best remains a difficult exercise that depends on the topology and characteristics of the hardware and on the application needs. Indeed, the hardware is increasingly complex, and software affinities can be of different kinds. For instance, memory-bound tasks may prefer being scattered all across the machine, while, on the contrary, communication and synchronization may require keeping them close. Runtime systems require help identifying these needs and bottlenecks before they can place tasks accordingly.
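
The contrast between scattered and close placements can be illustrated with a minimal sketch. It is not taken from the tool described here: the function names, the two-level machine model (sockets containing cores) and the core numbering (core id = socket index × cores per socket + local index) are all assumptions made for illustration only.

```python
from typing import List


def compact_placement(n_tasks: int, n_sockets: int, cores_per_socket: int) -> List[int]:
    """Hypothetical 'close' policy: fill one socket before using the next."""
    if n_tasks > n_sockets * cores_per_socket:
        raise ValueError("more tasks than cores")
    # With the assumed numbering, consecutive core ids share a socket.
    return list(range(n_tasks))


def scatter_placement(n_tasks: int, n_sockets: int, cores_per_socket: int) -> List[int]:
    """Hypothetical 'scattered' policy: deal tasks round-robin across sockets."""
    if n_tasks > n_sockets * cores_per_socket:
        raise ValueError("more tasks than cores")
    # Task t goes to socket (t mod n_sockets), on that socket's next free core.
    return [(t % n_sockets) * cores_per_socket + t // n_sockets
            for t in range(n_tasks)]


def socket_of(core: int, cores_per_socket: int) -> int:
    """Socket index of a core under the assumed numbering."""
    return core // cores_per_socket


if __name__ == "__main__":
    # 4 tasks on an assumed machine with 2 sockets of 4 cores each.
    print("compact:", compact_placement(4, 2, 4))
    print("scatter:", scatter_placement(4, 2, 4))
```

Under these assumptions, the compact policy keeps all four tasks on one socket, which favors communication and synchronization, whereas the scatter policy spreads them over both sockets, which gives memory-bound tasks access to more memory bandwidth and cache.
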
Performance monitoring is a very active software area that offers many tools to gather information about the execution of tasks, the bottlenecks, etc. In this paper, we introduce a new way to analyze performance by crossing the roads of performance monitoring and topology-aware placement. We propose an extension of the Hardware Locality software (hwloc [2]) that enhances its graphical
Document type:
Conference paper
Springer. Proceedings of Euro-Par 2015: Parallel Processing Workshops, Aug 2015, Vienna, Austria. 2015, Lecture Notes in Computer Science

https://hal.inria.fr/hal-01183083
Contributor: Brice Goglin
Submitted on: Thursday, August 6, 2015 - 11:04:36
Last modified on: Thursday, January 11, 2018 - 06:27:21
Document(s) archived on: Wednesday, April 26, 2017 - 09:41:19

File

ROME-workshop-camera-ready.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01183083, version 1

Citation

Nicolas Denoyelle, Brice Goglin, Emmanuel Jeannot. A Topology-Aware Performance Monitoring Tool for Shared Resource Management in Multicore Systems. Springer. Proceedings of Euro-Par 2015: Parallel Processing Workshops, Aug 2015, Vienna, Austria. 2015, Lecture Notes in Computer Science. 〈hal-01183083〉
