Preprint, Working Paper. Year: 2024

Scheduling distributed I/O resources in HPC systems

Abstract

This paper investigates how to optimize I/O performance when accessing distributed I/O resources in high-performance computing (HPC) environments. I/O resources, such as I/O forwarding nodes and object storage targets (OSTs), are shared among applications: each application has access to a subset of them, and multiple applications can access the same resources. We propose heuristics that schedule these distributed I/O resources in two steps for a given set of applications: determining how many I/O resources each application will use (allocation), and then which resources it will use (placement). We discuss a wide range of information about application characteristics that the scheduling algorithms can use. Although more application knowledge is associated with better performance, our analysis indicates that well-chosen decisions made with limited information can still yield significant improvements in most scenarios. This research provides insights into the trade-off between the depth of application characterization and the practicality of scheduling I/O resources.
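The two-step structure described in the abstract (allocation, then placement) can be illustrated with a minimal Python sketch. This is not the authors' heuristic: the function names `allocate` and `place`, the per-application `demands` metric, the proportional-share allocation rule, and the least-loaded placement rule are all assumptions chosen only to make the two steps concrete.

```python
# Illustrative sketch only; names and rules are hypothetical, not from the paper.

def allocate(demands, num_resources):
    """Step 1 (allocation): decide HOW MANY resources each application uses.

    Assumption: each application gets a share proportional to its demand,
    with at least one resource and at most all of them. Demands are
    assumed to be positive numbers.
    """
    total = sum(demands.values())
    counts = {}
    for app, demand in demands.items():
        share = round(num_resources * demand / total)
        counts[app] = max(1, min(num_resources, share))
    return counts


def place(counts, demands, num_resources):
    """Step 2 (placement): decide WHICH resources each application uses.

    Assumption: a greedy rule that gives each application the currently
    least-loaded resources and spreads its demand evenly over them.
    Resources may end up shared by several applications.
    """
    load = [0.0] * num_resources
    placement = {}
    # Handle applications with the highest per-resource demand first.
    order = sorted(counts, key=lambda a: demands[a] / counts[a], reverse=True)
    for app in order:
        k = counts[app]
        # Pick the k least-loaded resources (ties broken by index).
        chosen = sorted(range(num_resources), key=lambda r: (load[r], r))[:k]
        for r in chosen:
            load[r] += demands[app] / k
        placement[app] = sorted(chosen)
    return placement, load


# Hypothetical input: relative I/O demand of three applications, four OSTs.
demands = {"A": 60, "B": 30, "C": 10}
counts = allocate(demands, num_resources=4)
placement, load = place(counts, demands, num_resources=4)
print(counts)     # {'A': 2, 'B': 1, 'C': 1}
print(placement)  # {'A': [0, 1], 'B': [2], 'C': [3]}
```

Separating the two functions mirrors the split in the abstract: the allocation rule can be changed (for example, to use less application knowledge) without touching the placement rule, and vice versa.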
Main file: main.pdf (1.49 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-04394004, version 1 (15-01-2024)

License

Attribution

Identifiers

  • HAL Id: hal-04394004, version 1

Cite

Alexis Bandet, Francieli Boito, Guillaume Pallez. Scheduling distributed I/O resources in HPC systems. 2024. ⟨hal-04394004⟩