Preprints, Working Papers, ... Year: 2017

Emulating High Performance Linpack on a Commodity Server at the Scale of a Supercomputer

Tom Cornebize, Franz C Heinrich, Arnaud Legrand, Jérôme Vienne

Abstract

The Linpack benchmark, in particular the High-Performance Linpack (HPL) implementation, has emerged as the de facto standard benchmark to rank supercomputers in the TOP500. With a power consumption of several MW on a TOP500 machine, test-running HPL on the whole machine for hours is extremely expensive. With core counts beyond 100,000 being common and sometimes even ranging into the millions, optimizing HPL parameters (problem size, grid arrangement, granularity, collective operation algorithms, etc.) specifically for the network topology and performance is essential. Such optimization can be particularly time-consuming and can hardly be done through simple mathematical performance models. In this article, we explain how we both extended SimGrid's SMPI simulator and slightly modified HPL to allow a fast emulation of HPL on a single commodity computer at the scale of a supercomputer. More precisely, we take as a motivating use case the large-scale run performed on the Stampede cluster at TACC in 2013, when it was ranked 6th in the TOP500. While this qualification run required dedicating 6,006 computing nodes of the supercomputer and more than 120 TB of RAM for more than 2 hours, we manage to simulate a similar configuration on a commodity computer with 19 GB of RAM in about 62 hours. Combined with a careful modeling of Stampede, this simulation allows us to evaluate the performance that would have been obtained using the freely available version of HPL. This performance turns out to be much lower than what was reported, which was obtained using a closed-source version specifically designed by Intel engineers. Our simulation allows us to hint at where the main algorithmic improvements must have been made in HPL.
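To give a feel for two of the HPL parameters named in the abstract (problem size and grid arrangement), here is a minimal Python sketch. It is not taken from the paper: the helper names `max_matrix_order` and `process_grids` are hypothetical, and it assumes a dense N × N matrix of 8-byte doubles filling all 120 TB (decimal terabytes) and one MPI process per node, i.e. 6,006 processes.

```python
import math

BYTES_PER_DOUBLE = 8  # HPL factorizes a dense matrix of double-precision values


def max_matrix_order(memory_bytes: float) -> int:
    """Largest N such that an N x N matrix of doubles fits in memory_bytes."""
    return math.isqrt(int(memory_bytes) // BYTES_PER_DOUBLE)


def process_grids(num_processes: int) -> list[tuple[int, int]]:
    """All (P, Q) with P * Q == num_processes and P <= Q, most square first."""
    grids = [
        (p, num_processes // p)
        for p in range(1, math.isqrt(num_processes) + 1)
        if num_processes % p == 0
    ]
    # A smaller gap between Q and P means a more nearly square grid.
    return sorted(grids, key=lambda grid: grid[1] - grid[0])


if __name__ == "__main__":
    # Assumption: the whole 120 TB (decimal) is devoted to the matrix.
    n = max_matrix_order(120e12)
    print(f"Matrix order that fits in 120 TB: N = {n:,}")

    # Assumption: one MPI process per node, hence 6,006 processes.
    for p, q in process_grids(6006)[:3]:
        print(f"Candidate process grid: P = {p}, Q = {q}")
```

Under these assumptions the matrix order comes out near 3.87 million, and the most nearly square grid for 6,006 processes is 77 × 78. Even this tiny search space hints at why tuning by repeated full-scale runs is costly and why a fast emulation is attractive.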
Main file
paper.pdf (675.9 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01654804, version 1 (04-12-2017)

Identifiers

  • HAL Id: hal-01654804, version 1

Cite

Tom Cornebize, Franz C Heinrich, Arnaud Legrand, Jérôme Vienne. Emulating High Performance Linpack on a Commodity Server at the Scale of a Supercomputer. 2017. ⟨hal-01654804⟩