ALTA: Asynchronous Loss Tolerant Algorithms for Grid Computing

Olivier Aumage 1, 2 Jacques M. Bahi 3 Sylvain Contassot-Vivier 3 Raphaël Couturier 3 Alexandre Denis 4 Raymond Namyst 1, 2 Guillaume Papauré 4 Christian Pérez 4 Marc Sauget 3
1 RUNTIME - Efficient runtime systems for parallel architectures
Inria Bordeaux - Sud-Ouest, UB - Université de Bordeaux, CNRS - Centre National de la Recherche Scientifique : UMR5800
4 PARIS - Programming distributed parallel systems for large scale numerical simulation
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, ENS Cachan - École normale supérieure - Cachan, Inria Rennes – Bretagne Atlantique
Abstract : This paper describes an environment dedicated to building efficient scientific applications for the Grid on top of unreliable communication networks. Nowadays, scientific computing applications are usually built on top of reliable communication protocols (such as TCP). Nevertheless, the additional cost introduced by the reliability layer is not negligible in wide area network-based grid environments. On the other hand, data loss in communications may have a dramatic impact on the performance, if not the correctness, of classical parallel algorithms. However, a particular class of parallel iterative algorithms happens to be tolerant to such losses. This is the class of asynchronous iterative algorithms, which are commonly used in large scientific applications. They lend themselves particularly well to communication/computation overlap since processors are no longer synchronized. In this study, we aim to propose a new architecture suitable for the development of asynchronous iterative algorithms tolerant to message losses.
Contributor : Alexandre Denis
Submitted on : Monday, January 12, 2015 - 1:49:23 PM
Last modification on : Thursday, January 7, 2021 - 4:20:27 PM



  • HAL Id : hal-01101475, version 1


Olivier Aumage, Jacques M. Bahi, Sylvain Contassot-Vivier, Raphaël Couturier, Alexandre Denis, et al. ALTA: Asynchronous Loss Tolerant Algorithms for Grid Computing. 3rd International Workshop on Parallel Matrix Algorithms and Applications (PMAA'04), Oct 2004, Marseille, France. ⟨hal-01101475⟩



