Improving Parallel System Performance with a NUMA-aware Load Balancer

Laércio L. Pilla; Christiane Pousa Ribeiro; Daniel Cordeiro; Abhinav Bhatele; Philippe O. A. Navaux; Jean-François Mehaut; Laxmikant V. Kalé

Report (Research Report), Year: 2011

Laércio L. Pilla

Role: Author

Instituto de Informática da UFRGS

Christiane Pousa Ribeiro

Role: Author
PersonId : 856332

Middleware efficiently scalable

Daniel Cordeiro

Role: Author
PersonId : 8493
IdHAL : dcordeiro
ORCID : 0000-0003-4971-7355
IdRef : 166410160

PrograMming and scheduling design fOr Applications in Interactive Simulation

Laboratoire d'Informatique de Grenoble

Abhinav Bhatele

Role: Author

University of Illinois at Urbana-Champaign [Urbana]

Philippe O. A. Navaux

Role: Author

Instituto de Informática da UFRGS

Jean-François Mehaut

Role: Author
PersonId : 6046
IdHAL : jean-francois-mehaut
ORCID : 0000-0003-1047-7462
IdRef : 086451227

Middleware efficiently scalable

Laxmikant V. Kalé

Role: Author

Department of Computer Science [UIUC]

Abstract

Multi-core nodes with Non-Uniform Memory Access (NUMA) are now a common architecture for high performance computing. On such NUMA nodes, the shared memory is physically distributed across memory banks connected by a network. As a result, memory access costs may vary depending on the distance between the processing unit and the memory bank. A key element in improving performance on these machines is therefore dealing with memory affinity. We propose a NUMA-aware load balancer that combines information about the NUMA topology with the statistics captured by the Charm++ runtime system. We present speedups of up to 1.8 for synthetic benchmarks running on different NUMA platforms. We also show improvements over existing load balancing strategies, both in benchmark performance and in the time spent on load balancing. In addition, by avoiding unnecessary migrations, our algorithm incurs migration overheads up to seven times smaller than those of the other strategies.
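
To make the idea in the abstract concrete, the following is a minimal sketch of how per-task load statistics and a NUMA distance table might be combined in a greedy placement. It is not the algorithm from the report, and it does not use the Charm++ API. The names `Task`, `numa_aware_rebalance`, and `alpha`, the cost formula, and the distance values (10 for local, 20 for remote, similar in style to what `numactl --hardware` reports) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    load: float  # measured execution time of the task (hypothetical units)
    core: int    # core the task currently runs on


def numa_aware_rebalance(tasks, core_to_node, distance, alpha=1.0):
    """Greedy sketch (an assumption, not the report's method).

    Returns a dict mapping task name to its new core. The cost of placing a
    task on a core is the load the core would then carry, plus a penalty that
    grows with the NUMA distance between the task's current node and the
    candidate node. `alpha` weights that penalty.
    """
    assigned = {core: 0.0 for core in core_to_node}
    placement = {}
    # Place the heaviest tasks first, as greedy heuristics commonly do.
    for task in sorted(tasks, key=lambda t: t.load, reverse=True):
        src = core_to_node[task.core]

        def cost(core):
            dst = core_to_node[core]
            # Relative extra distance: 0.0 when the task stays on its node.
            extra = distance[src][dst] / distance[src][src] - 1.0
            return assigned[core] + task.load + alpha * task.load * extra

        # Ties go to the current core first, which avoids needless migrations.
        best = min(assigned, key=lambda c: (cost(c), c != task.core, c))
        placement[task.name] = best
        assigned[best] += task.load
    return placement


# Hypothetical machine: 4 cores, 2 NUMA nodes, all tasks start on core 0.
core_to_node = {0: 0, 1: 0, 2: 1, 3: 1}
distance = [[10, 20], [20, 10]]
tasks = [Task("a", 4.0, 0), Task("b", 3.0, 0), Task("c", 2.0, 0), Task("d", 1.0, 0)]

placement = numa_aware_rebalance(tasks, core_to_node, distance, alpha=0.5)
print(placement)  # {'a': 0, 'b': 1, 'c': 2, 'd': 3}
```

In this toy run the two heaviest tasks stay on their original NUMA node and only the lighter ones move to the remote node. Raising `alpha` makes remote placements more expensive, so fewer tasks leave their node, at the price of a less even load.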

Keywords

balancing; non-uniform memory access; memory contention; performance; object migration

Domains

Distributed, Parallel, and Cluster Computing [cs.DC]

Contributor: Arnaud Legrand

https://inria.hal.science/hal-00788813

Submitted on: Friday, 15 February 2013, 11:17:00

Last modified on: Thursday, 4 April 2024, 21:02:08

Dates and versions

hal-00788813 , version 1 (15-02-2013)

Identifiers

HAL Id : hal-00788813 , version 1

Cite

Laércio L. Pilla, Christiane Pousa Ribeiro, Daniel Cordeiro, Abhinav Bhatele, Philippe O. A. Navaux, et al. Improving Parallel System Performance with a NUMA-aware Load Balancer. [Research Report] Inria. 2011. ⟨hal-00788813⟩

Collections

UGA CNRS INRIA LIG LIG_SRCPR LIG_SRCPR_MOAIS INRIA2 LARA POLYTECH-GRENOBLE LIG_SIDCH
