Out-of-Core Wavefront Computations with Reduced Synchronization - Inria - National Institute for Research in Digital Science and Technology
Conference paper, Year: 2008

Out-of-Core Wavefront Computations with Reduced Synchronization

Pierre-Nicolas Clauss
  • Role: Author
  • PersonId: 843073
Jens Gustedt
Frédéric Suter

Abstract

Matrix computation algorithms often exhibit dependencies between neighboring elements inside loop nests, such that the frontier between the elements already computed and those still to be computed moves through the matrix in the form of a 'wave'. Macro-pipelining techniques can parallelize such algorithms efficiently by overlapping communication and computation. These techniques are usually limited to situations where all the data to be processed fits into main memory, whereas for larger data the I/O usage pattern for external storage requires special attention. The work [CDS05] presented a first extension of the wavefront framework to these so-called out-of-core problems. The present paper proposes a redesign of that algorithm that minimizes both the overhead and the perturbations caused by communication. To tackle the issue of non-contiguous I/O, we also propose an optimized data layout. These two major modifications of the original algorithm in turn enable a third improvement: our implementation shortens the transition phase between two consecutive iterations of the wavefront algorithm. Experiments performed with the parXXL library show that we can significantly reduce the time lost during inefficient I/O operations and thus obtain faster computations.
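
As an illustration of the dependency pattern the abstract describes, the following is a minimal in-memory sketch, not the authors' parXXL implementation and not an out-of-core algorithm. It assumes each cell depends only on its north and west neighbors; the names `wavefront_order`, `wavefront_fill`, `top`, `left`, and `combine` are hypothetical and chosen for this example only.

```python
def wavefront_order(n_rows, n_cols):
    """Yield one list of (row, col) cells per anti-diagonal.

    Cells on the same anti-diagonal do not depend on each other,
    so each yielded list could in principle be processed in parallel.
    """
    for d in range(n_rows + n_cols - 1):
        first_row = max(0, d - n_cols + 1)
        last_row = min(d, n_rows - 1)
        yield [(i, d - i) for i in range(first_row, last_row + 1)]


def wavefront_fill(top, left, combine):
    """Fill a grid where cell (i, j) = combine(north, west).

    `top` supplies the boundary values above row 0 and `left` the
    boundary values to the left of column 0 (assumed inputs).
    """
    n_rows, n_cols = len(left), len(top)
    grid = [[None] * n_cols for _ in range(n_rows)]
    for wave in wavefront_order(n_rows, n_cols):
        for i, j in wave:
            north = grid[i - 1][j] if i > 0 else top[j]
            west = grid[i][j - 1] if j > 0 else left[i]
            grid[i][j] = combine(north, west)
    return grid


if __name__ == "__main__":
    for wave in wavefront_order(2, 3):
        print(wave)
    print(wavefront_fill([1, 1, 1], [1, 1], lambda n, w: n + w))
```

In a macro-pipelined or out-of-core setting, the cells would typically be blocks of the matrix rather than single elements, and the boundary values would be exchanged between processes or read from external storage; those aspects are deliberately left out of this sketch.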
Main file
cs07_th.pdf (133.17 KB)
Origin: Files produced by the author(s)

Dates and versions

inria-00176084, version 1 (02-10-2007)

Identifiers

  • HAL Id: inria-00176084, version 1

Cite

Pierre-Nicolas Clauss, Jens Gustedt, Frédéric Suter. Out-of-Core Wavefront Computations with Reduced Synchronization. 16th Euromicro International Conference on Parallel, Distributed and network-based Processing, Feb 2008, Toulouse, France. pp.293-300. ⟨inria-00176084⟩
99 views
145 downloads
