Snooze: A Scalable, Fault-Tolerant and Distributed Consolidation Manager for Large-Scale Clusters

Abstract: Intelligent workload consolidation and dynamic cluster adaptation offer a great opportunity for energy savings in current large-scale clusters. Because these environments are heterogeneous, scalable, fault-tolerant and distributed consolidation managers are needed to manage their workload efficiently and thereby conserve energy and reduce operating costs. However, most consolidation managers available today do not fulfill these requirements: they are mostly centralized and designed to operate solely in virtualized environments. In this work, we present the architecture of Snooze, a novel scalable, fault-tolerant and distributed consolidation manager that can dynamically consolidate the workload of a large-scale cluster that is heterogeneous in both software and hardware and is composed of resources using virtualization and Single System Image (SSI) technologies. To this end, we introduce a common cluster monitoring and management API, which provides uniform and transparent access to the features of the underlying platforms. Our architecture is open to future technologies and can be easily extended with new monitoring metrics and algorithms. Finally, a comprehensive use case study demonstrates the feasibility of our approach to managing the energy consumption of a large-scale cluster.
Document type:
Conference paper
2010 IEEE/ACM International Conference on Green Computing and Communications (GreenCom-2010), Dec 2010, Hangzhou, China. 2010

Cited literature: 20 references

https://hal.inria.fr/inria-00529702
Contributor: Eugen Feller
Submitted on: Thursday, 10 November 2011 - 14:20:56
Last modified on: Wednesday, 16 May 2018 - 11:23:31
Document(s) archived on: Thursday, 15 November 2012 - 11:37:30

File

efeller.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00529702, version 1

Citation

Eugen Feller, Louis Rilling, Christine Morin, Renaud Lottiaux, Daniel Leprince. Snooze: A Scalable, Fault-Tolerant and Distributed Consolidation Manager for Large-Scale Clusters. 2010 IEEE/ACM International Conference on Green Computing and Communications (GreenCom-2010), Dec 2010, Hangzhou, China. 2010. 〈inria-00529702〉

Metrics

Record views

880

File downloads

188