Emulating High Performance Linpack on a Commodity Server at the Scale of a Supercomputer

The Linpack benchmark, in particular the High-Performance Linpack (HPL) implementation, has emerged as the de facto standard benchmark to rank supercomputers in the TOP500. With a power consumption of several MW on a TOP500 machine, test-running HPL on the whole machine for hours is extremely expensive. With core counts beyond 100,000 being common and sometimes even reaching into the millions, optimizing the HPL parameters (problem size, grid arrangement, granularity, collective operation algorithms, etc.) for the specific network topology and performance is essential. Such optimization can be particularly time-consuming and can hardly be done through simple mathematical performance models. In this article, we explain how we both extended SimGrid's SMPI simulator and slightly modified HPL to allow a fast emulation of HPL on a single commodity computer at the scale of a supercomputer. More precisely, we take as a motivating use case the large-scale run performed on the Stampede cluster at TACC in 2013, when it was ranked 6th in the TOP500. While this qualification run required dedicating 6,006 computing nodes of the supercomputer and more than 120 TB of RAM for more than 2 hours, we manage to simulate a similar configuration on a commodity computer with 19 GB of RAM in about 62 hours. Combined with a careful modeling of Stampede, this simulation allows us to evaluate the performance that would have been obtained using the freely available version of HPL. This performance turns out to be much lower than the reported one, which was obtained using a closed-source version specifically designed by Intel engineers. Our simulation allows us to hint at where the main algorithmic improvements must have been made in HPL.
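
The resource figures quoted in the abstract can be put side by side with a little arithmetic. The sketch below is only illustrative: it drops the "more than" and "about" qualifiers, assumes decimal units (1 TB = 1000 GB), and uses the hypothetical helper name `resource_comparison`, so every result is approximate.

```python
# Figures quoted in the abstract; qualifiers such as "more than" and "about"
# are dropped, so all derived values are rough estimates.
REAL_NODES = 6006    # Stampede nodes dedicated to the qualification run
REAL_RAM_TB = 120    # "more than 120 TB" of RAM
REAL_HOURS = 2       # "more than 2 hours"
SIM_RAM_GB = 19      # RAM of the commodity computer used for the emulation
SIM_HOURS = 62       # "about 62 hours"

GB_PER_TB = 1000     # assumption: decimal units


def resource_comparison():
    """Return rough ratios between the real run and the emulated one."""
    real_node_hours = REAL_NODES * REAL_HOURS
    sim_node_hours = 1 * SIM_HOURS  # a single commodity computer
    return {
        "memory_reduction": REAL_RAM_TB * GB_PER_TB / SIM_RAM_GB,
        "slowdown": SIM_HOURS / REAL_HOURS,
        "real_node_hours": real_node_hours,
        "sim_node_hours": sim_node_hours,
        "node_hours_ratio": real_node_hours / sim_node_hours,
    }


if __name__ == "__main__":
    for name, value in resource_comparison().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions, the emulation needs roughly 6,300 times less memory and about 190 times fewer node-hours than the real run, at the cost of a wall-clock time about 31 times longer.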

Domains

Distributed, Parallel, and Cluster Computing [cs.DC]; Modeling and Simulation

Main file

paper.pdf (675.9 KB)

Origin: Files produced by the author(s)

Contributor: Franz C. Heinrich

https://inria.hal.science/hal-01654804

Submitted on: Monday, December 4, 2017, 14:10:21

Last modified on: Thursday, April 4, 2024, 21:35:15

Dates and versions

hal-01654804, version 1 (04-12-2017)

Identifiers

HAL Id: hal-01654804, version 1

Cite

Tom Cornebize, Franz C Heinrich, Arnaud Legrand, Jérôme Vienne. Emulating High Performance Linpack on a Commodity Server at the Scale of a Supercomputer. 2017. ⟨hal-01654804⟩

Collections

UGA CNRS INRIA LIG GRID5000 LIG_SRCPR INRIA2 TDS-MACS LIG-SRCPR-POLARIS SILECS LIG_SIDCH
