Implementing Wilson-Dirac Operator on the Cell Broadband Engine

Khaled Z. Ibrahim 1, François Bodin 1
1 CAPS - Compilation, parallel architectures and system
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract: Computing the action of the Wilson-Dirac operator consumes most of the CPU time in the grand challenge problem of simulating Lattice Quantum Chromodynamics (Lattice QCD). This routine is difficult to implement efficiently on most computational environments because the same data is accessed through multiple patterns, which makes it hard to align the data efficiently at compile time. Additionally, the low ratio of computation to memory access makes this computation bound by both memory bandwidth and memory latency. In this work, we present an implementation of this routine on the Cell Broadband Engine. We propose runtime data fusion, an approach that aligns at runtime the data that cannot be aligned optimally at compile time, in order to improve SIMDized execution. We also present a DMA optimization technique that reduces the impact of bandwidth limits on performance. Our implementation of this routine achieves 31.2 GFlops for single-precision computations and 8.75 GFlops for double-precision computations.
Document type:
Report
[Research Report] PI 1880, 2007, pp.23
https://hal.inria.fr/inria-00203478
Contributor: Anne Jaigu
Submitted on: Thursday, January 10, 2008 - 11:45:28
Last modified on: Thursday, January 11, 2018 - 06:20:08
Document(s) archived on: Tuesday, April 13, 2010 - 16:55:09

File

PI-1880.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00203478, version 1

Citation

Khaled Z. Ibrahim, François Bodin. Implementing Wilson-Dirac Operator on the Cell Broadband Engine. [Research Report] PI 1880, 2007, pp.23. 〈inria-00203478〉
