Using Simulation to Evaluate and Tune the Performance of Dynamic Load Balancing of an Over-decomposed Geophysics Application

Abstract : Finite difference methods are, in general, well suited to execution on parallel machines and are thus commonplace in High Performance Computing. Yet, despite their apparent regularity, they often exhibit load imbalance that damages their efficiency. In this article, we first characterize the spatial and temporal load imbalance of Ondes3D, a seismic wave propagation simulator used to conduct regional scale risk assessment. Our analysis reveals that this imbalance originates from the structure of the input data and from low-level CPU optimizations. Such dynamic imbalance should, therefore, be quite common and can not be solved by any static approach or classical code reorganization. An effective solution for such scenarios, incurring minimal code modification, is to use AMPI/CHARM++. By over-decomposing the application, the CHARM++ runtime can dynamically rebalance the load by migrating data and computation at the granularity of an MPI rank. We show that this approach is effective to balance the spatial/temporal dynamic load of the application, thus drastically reducing its execution time. However, this approach requires a careful selection of the load balancing algorithm, its activation frequency, and of the over-decomposition level. These choices are, unfortunately, quite dependent on application structure and platform characteristics. Therefore, we propose a methodology that leverages the capabilities of the SimGrid simulation framework and allows to conduct such study at low experimental cost. Our approach relies on a combination of emulation, simulation, and application modeling that requires minimal code modification and yet manages to capture both spatial and temporal load imbalance and to faithfully predict the performance of dynamic load balancing. We evaluate the quality of our simulation by systematically comparing simulation results with the outcome of real executions and demonstrate how this approach can be used to quickly find the optimal load balancing configuration for a given application/hardware configuration.
Complete list of metadatas

Cited literature [27 references]  Display  Hide  Download

https://hal.inria.fr/hal-01391401
Contributor : Arnaud Legrand <>
Submitted on : Thursday, February 16, 2017 - 3:34:18 PM
Last modification on : Thursday, October 11, 2018 - 8:48:05 AM
Long-term archiving on : Wednesday, May 17, 2017 - 8:16:27 PM

File

RR_9031.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01391401, version 2

Citation

Rafael Keller Tesser, Lucas Mello Schnorr, Arnaud Legrand, Fabrice Dupros, Philippe Navaux. Using Simulation to Evaluate and Tune the Performance of Dynamic Load Balancing of an Over-decomposed Geophysics Application. [Research Report] RR-9031, INRIA Grenoble - Rhone-Alpes. 2017, pp.21. ⟨hal-01391401v2⟩

Share

Metrics

Record views

926

Files downloads

307