Conference papers

ALTA: Asynchronous Loss Tolerant Algorithms for Grid Computing

Olivier Aumage (1, 2), Jacques M. Bahi (3), Sylvain Contassot-Vivier (3), Raphaël Couturier (3), Alexandre Denis (4), Raymond Namyst (1, 2), Guillaume Papauré (4), Christian Pérez (4), Marc Sauget (3)
1 RUNTIME - Efficient runtime systems for parallel architectures
CNRS - Centre National de la Recherche Scientifique : UMR5800, UB - Université de Bordeaux, Inria Bordeaux - Sud-Ouest
4 PARIS - Programming distributed parallel systems for large scale numerical simulation
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, ENS Cachan - École normale supérieure - Cachan, Inria Rennes – Bretagne Atlantique
Abstract : This paper describes an environment dedicated to building efficient scientific applications for the Grid on top of unreliable communication networks. Nowadays, scientific computing applications are usually built on top of reliable communication protocols (such as TCP). Nevertheless, the additional cost introduced by the reliability layer is not negligible in wide area network-based grid environments. On the other hand, data loss in communications may have a dramatic impact on the performance, if not on the correctness, of classical parallel algorithms. However, a particular class of parallel iterative algorithms happens to be tolerant to such losses. This is the class of asynchronous iterative algorithms, which are commonly used in large scientific applications. They are particularly well suited to a good communication/computation overlap since processors are no longer synchronized. In this study, we aim at proposing a new architecture suitable for the development of asynchronous iterative algorithms tolerant to message losses.
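To illustrate why asynchronous iterative algorithms can tolerate message losses, here is a minimal sketch in Python. It is not the ALTA environment or the authors' code: it is a sequential simulation of a Jacobi-style relaxation for a linear system Ax = b, assuming one unknown per simulated worker and a strictly diagonally dominant matrix. The function name `async_jacobi_with_loss` and the parameters `loss_prob`, `sweeps`, and `seed` are hypothetical, chosen only for this illustration.

```python
import random


def async_jacobi_with_loss(A, b, loss_prob=0.3, sweeps=200, seed=0):
    """Simulate a Jacobi-style iteration in which update messages may be lost.

    Assumption: A is strictly diagonally dominant, so each local update is a
    contraction and stale data only slows convergence.
    """
    rng = random.Random(seed)
    n = len(b)
    # view[i][j] is worker i's latest received copy of x[j] (possibly stale).
    view = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            # Worker i updates its own component from whatever data it holds,
            # without waiting for fresh values from the other workers.
            s = sum(A[i][j] * view[i][j] for j in range(n) if j != i)
            view[i][i] = (b[i] - s) / A[i][i]
            # Send the new value to every other worker; each message is
            # dropped independently with probability loss_prob, no resend.
            for k in range(n):
                if k != i and rng.random() >= loss_prob:
                    view[k][i] = view[i][i]
    # Each worker owns exactly one component of the solution.
    return [view[i][i] for i in range(n)]
```

A lost message is never retransmitted here. The receiver keeps iterating on its older copy until a later update gets through, which is the behaviour that makes a reliability layer such as TCP optional for this class of algorithms. A classical synchronous algorithm would instead block or compute a wrong result.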
Document type :
Conference papers

https://hal.inria.fr/hal-01101475
Contributor : Alexandre Denis
Submitted on : Monday, January 12, 2015 - 1:49:23 PM
Last modification on : Friday, January 21, 2022 - 3:09:04 AM

Identifiers

  • HAL Id : hal-01101475, version 1

Citation

Olivier Aumage, Jacques M. Bahi, Sylvain Contassot-Vivier, Raphaël Couturier, Alexandre Denis, et al. ALTA: Asynchronous Loss Tolerant Algorithms for Grid Computing. 3rd International workshop on Parallel Matrix Algorithms and Applications (PMAA'04), Oct 2004, Marseille, France. ⟨hal-01101475⟩
