Architecture for the Next Generation System Management Tools for High Performance Computing Platforms

Geoffroy Vallée 1 Thomas Naughton 1 Anand Tikotekar 1 Jérôme Gallard 2, * Stephen Scott 1 Christine Morin 2
* Corresponding author
2 PARIS - Programming distributed parallel systems for large scale numerical simulation
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, ENS Cachan - École normale supérieure - Cachan, Inria Rennes – Bretagne Atlantique
Abstract: Today, computational scientists mainly run parallel or distributed applications and try to scale up to obtain more results or greater data precision. As a result, they use more and more distributed resources: local large-scale HPC systems (such as clusters or MPP systems), grids, or even clouds. These platforms are difficult to manage because they differ in nature, each abstracting to a different degree some of the complexity created by resource distribution. For instance, clusters and MPP systems are located on a single site and are composed of different "partitions" (e.g., I/O nodes, compute nodes). In grids, "virtual organizations" (VOs) are one of the main concepts; since VOs are global and multi-user, they abstract the complexity of both local resource management and account management away from the users. Finally, clouds provide a high degree of abstraction via the concept of "services", which can be implemented via direct privileged access to the hardware or the use of Internet-based services. All of these cases, however, require local management of resources and some kind of coordination (e.g., between partitions, remote sites, or different administration domains). This document presents a detailed description of the architecture of our novel system-management tool, which can be used to manage clusters/MPP systems, grids, and clouds. The architecture is based on three concepts: (i) Virtual System Environment (VSE), (ii) Virtual Organizations (VOs), and (iii) Virtual Platforms (VPs).
Document type: Report
[Research Report] RR-7062, INRIA. 2009

https://hal.inria.fr/inria-00424107
Contributor: Jérôme Gallard
Submitted on: Wednesday, October 14, 2009 - 09:59:27
Last modified on: Wednesday, May 16, 2018 - 11:23:04
Document(s) archived on: Tuesday, October 16, 2012 - 12:12:18

File

RR-7062.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00424107, version 1

Citation

Geoffroy Vallée, Thomas Naughton, Anand Tikotekar, Jérôme Gallard, Stephen Scott, et al. Architecture for the Next Generation System Management Tools for High Performance Computing Platforms. [Research Report] RR-7062, INRIA. 2009. 〈inria-00424107〉
