Opportunities for Partitioning Non-Volatile Memory DIMMs between Co-scheduled Jobs on HPC Nodes

Brice Goglin 1, Andrès Rubio Proaño 1
1 TADAAM - Topology-Aware System-Scale Data Management for High-Performance Computing
LaBRI - Laboratoire Bordelais de Recherche en Informatique, Inria Bordeaux - Sud-Ouest
Abstract : The emergence of non-volatile memory DIMMs such as Intel Optane DCPMM blurs the line between usual volatile memory and persistent storage by enabling byte-accessible persistent memory with reasonable performance. This new hardware supports many possible use cases for high-performance applications, from high-performance storage to very high-capacity volatile memory (terabytes). However, the numerous ways to configure the memory subsystem raise the question of how to configure nodes to satisfy applications' needs (memory, storage, fault tolerance, etc.). We focus on the issue of partitioning HPC nodes with NVDIMMs in the context of co-scheduling multiple jobs. We show that the basic NVDIMM configuration modes would require node reboots and expensive hardware configuration. Moreover, they do not allow the co-scheduling of all kinds of jobs, and they do not always allow locality to be taken into account during resource allocation. We then show that using 1-Level-Memory and the Device DAX mode by default is a good compromise. In this configuration, NVDIMMs may be easily used and partitioned for storage and memory-bound applications with locality awareness.

https://hal.inria.fr/hal-02173336
Contributor : Andrès Rubio Proaño
Submitted on : Thursday, July 4, 2019 - 2:02:30 PM
Last modification on : Thursday, September 26, 2019 - 4:04:03 PM

File

article.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02173336, version 1

Citation

Brice Goglin, Andrès Rubio Proaño. Opportunities for Partitioning Non-Volatile Memory DIMMs between Co-scheduled Jobs on HPC Nodes. Euro-Par 2019: Parallel Processing Workshops, Aug 2019, Göttingen, Germany. ⟨hal-02173336⟩
