Understanding and Guiding the Computing Resource Management in a Runtime Stacking Context

Arthur Loussert 1
1 STORM - STatic Optimizations, Runtime Methods
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
Abstract : With the advent of multicore and manycore processors as building blocks of HPC supercomputers, many applications shift from relying solely on a distributed programming model (e.g., MPI) to mixing distributed and shared-memory models (e.g., MPI+OpenMP). This leads to a better exploitation of shared-memory communications and reduces the overall memory footprint. However, this evolution has a large impact on the software stack, as application developers typically mix several programming models to scale over a large number of multicore nodes while coping with their hierarchical depth. One side effect of this programming approach is runtime stacking: mixing multiple models involves various runtime libraries being alive at the same time. Dealing with different runtime systems may lead to a large number of execution flows that may not efficiently exploit the underlying resources.

We first present a study of runtime stacking. It introduces stacking configurations and categories to describe how stacking can appear in applications. We explore runtime-stacking configurations (spatial and temporal), focusing on thread/process placement on hardware resources from different runtime libraries. We build this taxonomy based on the analysis of state-of-the-art runtime stacking and programming models.

We then propose algorithms to detect the misuse of compute resources when running a hybrid parallel application. We have implemented these algorithms inside a dynamic tool called the Overseer. This tool monitors applications and outputs resource usage to the user with respect to the application timeline, focusing on overloading and underloading of compute resources.

Finally, we propose a second external tool, called the Overmind, which monitors thread/process management and (re)maps threads and processes to the underlying cores, taking into account the hardware topology and the application behavior. By capturing a global view of resource usage, the Overmind adapts the process/thread placement and aims at taking the best decision to enhance the use of each compute node inside a supercomputer. We demonstrate the relevance of our approach and show that our low-overhead implementation is able to achieve good performance even when running with configurations that would otherwise have resulted in poor resource usage.
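
As a rough illustration of what overloading and underloading of compute resources mean in this context, the following minimal Python sketch counts how many execution flows are pinned to each core and greedily moves surplus flows to idle cores. It is not the algorithm used by the Overseer or the Overmind, which the abstract does not detail: the function names, the thread labels, and the flat list of cores are all assumptions made for the example, and hardware topology is ignored entirely.

```python
from collections import Counter


def classify_cores(placement, all_cores):
    """Return (overloaded, idle) core lists.

    placement: hypothetical mapping of thread label -> core id.
    A core is 'overloaded' if more than one flow is pinned to it,
    and 'idle' if no flow is pinned to it.
    """
    load = Counter(placement.values())
    overloaded = sorted(c for c in all_cores if load[c] > 1)
    idle = sorted(c for c in all_cores if load[c] == 0)
    return overloaded, idle


def rebalance(placement, all_cores):
    """Greedy remapping: move surplus flows from overloaded cores to idle ones.

    Assumption: every core is equivalent (no NUMA or cache topology).
    """
    new_placement = dict(placement)
    load = Counter(new_placement.values())
    idle = [c for c in sorted(all_cores) if load[c] == 0]
    for thread in sorted(new_placement):
        core = new_placement[thread]
        if load[core] > 1 and idle:
            target = idle.pop(0)
            load[core] -= 1
            load[target] += 1
            new_placement[thread] = target
    return new_placement


# Two MPI processes, each spawning two OpenMP threads, all packed on cores 0 and 1
# of a four-core node (an invented runtime-stacking scenario).
placement = {"mpi0-omp0": 0, "mpi0-omp1": 0, "mpi1-omp0": 1, "mpi1-omp1": 1}
cores = [0, 1, 2, 3]
overloaded, idle = classify_cores(placement, cores)
balanced = rebalance(placement, cores)
```

In this invented scenario, cores 0 and 1 each host two flows while cores 2 and 3 sit idle; after the greedy pass every core hosts exactly one flow. A real tool would also have to weigh topology and application behavior, as the abstract notes.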
Cited literature [68 references]

https://hal.inria.fr/tel-02438652
Contributor : Arthur Loussert
Submitted on : Tuesday, January 14, 2020 - 12:10:30 PM
Last modification on : Thursday, January 23, 2020 - 9:03:44 AM
Long-term archiving on : Wednesday, April 15, 2020 - 5:18:04 PM

File

these.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : tel-02438652, version 1

Citation

Arthur Loussert. Understanding and Guiding the Computing Resource Management in a Runtime Stacking Context. Distributed, Parallel, and Cluster Computing [cs.DC]. Université de Bordeaux, 2019. English. ⟨tel-02438652⟩
