ALTA: Asynchronous Loss Tolerant Algorithms for Grid Computing

Olivier Aumage (1, 2), Jacques M. Bahi (3), Sylvain Contassot-Vivier (3), Raphaël Couturier (3), Alexandre Denis (4), Raymond Namyst (1, 2), Guillaume Papauré (4), Christian Pérez (4), Marc Sauget (3)
1 RUNTIME - Efficient runtime systems for parallel architectures
Inria Bordeaux - Sud-Ouest, UB - Université de Bordeaux, CNRS - Centre National de la Recherche Scientifique : UMR5800
4 PARIS - Programming distributed parallel systems for large scale numerical simulation
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, ENS Cachan - École normale supérieure - Cachan, Inria Rennes – Bretagne Atlantique
Abstract: This paper describes an environment dedicated to building efficient scientific applications for the Grid on top of unreliable communication networks. Today, scientific computing applications are usually built on top of reliable communication protocols (such as TCP). However, the additional cost introduced by the reliability layer is not negligible in grid environments based on wide area networks. On the other hand, data loss in communications may have a dramatic impact on the performance, if not the correctness, of classical parallel algorithms. A particular class of parallel iterative algorithms nonetheless happens to be tolerant to such losses: asynchronous iterative algorithms, which are commonly used in large scientific applications. They lend themselves particularly well to overlapping communication with computation, since processors are no longer synchronized. In this study, we aim to propose a new architecture suited to the development of asynchronous iterative algorithms tolerant to message losses.
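To illustrate the loss tolerance the abstract refers to, here is a minimal single-process simulation. It is not the ALTA environment or its API: the function name `lossy_async_jacobi`, the parameters `loss_prob` and `sweeps`, and the 3x3 test system are all assumptions made for this sketch. Each simulated process owns one unknown, keeps a possibly stale copy of the others, and sends its updates over a channel that silently drops messages. The matrix is strictly diagonally dominant, a classical sufficient condition for asynchronous iterations to converge.

```python
import random


def lossy_async_jacobi(A, b, loss_prob=0.3, sweeps=200, seed=0):
    """Simulate one process per unknown, each with a possibly stale view of x.

    Hypothetical illustration only: messages are dropped with probability
    loss_prob and are never retransmitted.
    """
    rng = random.Random(seed)
    n = len(b)
    # view[i][j] is process i's last received copy of x[j] (initially 0).
    view = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            # Local Jacobi update using whatever values process i currently holds.
            s = sum(A[i][j] * view[i][j] for j in range(n) if j != i)
            xi = (b[i] - s) / A[i][i]
            view[i][i] = xi
            # Unreliable send: each message to a peer may be lost, with no retry.
            for k in range(n):
                if k != i and rng.random() >= loss_prob:
                    view[k][i] = xi
    return [view[i][i] for i in range(n)]


# Strictly diagonally dominant system whose exact solution is x = (1, 1, 1).
A = [[4.0, 1.0, 1.0], [1.0, 5.0, 2.0], [1.0, 2.0, 6.0]]
b = [6.0, 8.0, 9.0]
x = lossy_async_jacobi(A, b)
x_reliable = lossy_async_jacobi(A, b, loss_prob=0.0)
```

In this sketch a lost message only delays the arrival of fresher data, because a later update of the same unknown supersedes it. That is why no retransmission is needed, whereas a synchronous algorithm waiting on the lost message would stall or compute with missing data.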
Document type:
Conference paper
3rd International workshop on Parallel Matrix Algorithms and Applications (PMAA'04), Oct 2004, Marseille, France
https://hal.inria.fr/hal-01101475
Contributor: Alexandre Denis
Submitted on: Monday, 12 January 2015 - 13:49:23
Last modified on: Friday, 6 July 2018 - 15:06:09

Identifiers

  • HAL Id: hal-01101475, version 1

Citation

Olivier Aumage, Jacques M. Bahi, Sylvain Contassot-Vivier, Raphaël Couturier, Alexandre Denis, et al. ALTA: Asynchronous Loss Tolerant Algorithms for Grid Computing. 3rd International workshop on Parallel Matrix Algorithms and Applications (PMAA'04), Oct 2004, Marseille, France. 〈hal-01101475〉
