
Architecture for the Next Generation System Management Tools for Distributed Computing Platforms

Jérôme Gallard 1 Geoffroy Vallée 2 Thomas Naughton 2 Adrien Lebre 3, 4 Stephen Scott 2 Christine Morin 1
1 MYRIADS - Design and Implementation of Autonomous Distributed Systems
Inria Rennes – Bretagne Atlantique, IRISA-D1 - SYSTÈMES LARGE ÉCHELLE
3 ASCOLA - Aspect and composition languages
Inria Rennes – Bretagne Atlantique, Département informatique - EMN, LINA - Laboratoire d'Informatique de Nantes Atlantique
Abstract : In order to get more results or greater accuracy, computational scientists mainly execute parallel or distributed applications and try to scale these applications up. Accordingly, they use more and more distributed resources: local large-scale HPC systems, grids, or even clouds. However, in most cases, the use and management of such platforms is static. Indeed, the application generally has to be adapted to the environment, rather than the environment being adapted to the application's needs. In addition, platforms are managed through the concept of time and space partitioning, mainly via batch schedulers: time partitioning enables the execution of several applications on the same resources, and space partitioning enables the execution of applications across several distributed resources. This leads to some usage limitations, where applications can only be executed on a subset of the available resources. Therefore, scientists have to manage technical details related to the execution of their applications on each target HPC platform, which can result in application modifications, rather than focusing on the science. In this article, we advocate for a system management tool enabling the transparent configuration of the HPC platform and the customization of the execution environment for large-scale HPC systems (such as clusters or MPPs), grids, and clouds. We propose a new approach to manage these systems in a more dynamic way, where the resources can be configured and reconfigured automatically and transparently. The proposed solution does not remove the benefit of resource management systems such as batch systems (they still provide a well-known interface for job submission), but rather redefines the underlying system capabilities. Our approach is based on a refinement of the concepts of emulation and virtualization introduced by Goldberg.
Furthermore, the proposed approach leads to the definition of a method that provides a unique interface to scientists for the deployment and management of their applications on HPC platforms. This method is based on two concepts: (i) the Virtual System Environment (VSE), and (ii) the Virtual Platforms (VPs).

Contributor : Jérôme Gallard
Submitted on : Tuesday, June 22, 2010 - 6:00:57 PM
Last modification on : Thursday, January 7, 2021 - 4:25:27 PM
Long-term archiving on : Monday, October 22, 2012 - 2:41:30 PM




  • HAL Id : inria-00494328, version 1


Jérôme Gallard, Geoffroy Vallée, Thomas Naughton, Adrien Lebre, Stephen Scott, et al. Architecture for the Next Generation System Management Tools for Distributed Computing Platforms. [Research Report] RR-7325, INRIA. 2010. ⟨inria-00494328⟩


