A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction

Allen Leung; Nicolas Vasilache; Benoît Meister; Muthu Manikandan; David Wohlford; Cédric Bastoul; Richard Lethin

Conference paper, Year: 2010

Author affiliations: Allen Leung (1), Nicolas Vasilache (1), Benoît Meister (1), Muthu Manikandan (1), David Wohlford (1), Cédric Bastoul (1, 2), Richard Lethin (1)

1 Reservoir Labs
2 Architectures, Languages and Compilers to Harness the End of Moore Years

Abstract

GPGPU programmers face a rapidly changing substrate of programming abstractions, execution models, and hardware implementations. Numerous demonstrations, each for a particular combination of application kernel, programming language, and GPU hardware instance, have established that significant improvements in price/performance and energy/performance over general-purpose processors are achievable. But each of these demonstrations is the result of significant dedicated programmer labor, which is likely to be duplicated for each new GPU hardware architecture to achieve performance portability. This paper discusses the implementation, in the R-Stream compiler, of a source-to-source mapping pathway from a high-level, textbook-style algorithm expression method in ANSI C to multi-GPGPU accelerated computers. The compiler performs hierarchical decomposition and parallelization of the algorithm across the host, across multiple GPGPUs, and within each GPU. The semantic transformations are expressed within the polyhedral model, including optimization of integrated parallelization, locality, and contiguity tradeoffs. Hierarchical tiling is performed. Communication and synchronization operations at multiple levels are generated automatically. The resulting mapping is currently emitted in the CUDA programming language. The GPU backend adds to the range of hardware and accelerator targets for R-Stream and indicates the potential for performance portability of single sources across multiple hardware targets.
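To make the idea of hierarchical tiling concrete, the following is a minimal sketch in C, not output of R-Stream and not taken from the paper. It contrasts a textbook-style matrix multiply with a hand-written two-level tiled version of the same loop nest. The function names, the tile sizes `OUTER_TILE` and `INNER_TILE`, and the choice of matrix multiply as the kernel are all assumptions made for illustration. In a multi-GPGPU mapping, one could imagine the outer tiles being distributed across devices and the inner tiles across thread blocks, but the sketch runs sequentially on the host.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile sizes, chosen only for illustration. */
enum { OUTER_TILE = 8, INNER_TILE = 4 };

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Textbook-style input: C = A * B for n x n row-major matrices. */
void matmul_textbook(size_t n, const double *A, const double *B, double *C)
{
    assert(A && B && C);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double sum = 0.0;
            for (size_t k = 0; k < n; k++)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
    }
}

/* The same computation with the (i, j) iteration space tiled at two levels.
 * Outer tiles stand in for work assigned to one device, inner tiles for
 * work assigned to one thread block (an assumed, illustrative mapping). */
void matmul_two_level_tiled(size_t n, const double *A, const double *B, double *C)
{
    assert(A && B && C);
    for (size_t io = 0; io < n; io += OUTER_TILE) {
        size_t i_outer_end = min_sz(io + OUTER_TILE, n);
        for (size_t jo = 0; jo < n; jo += OUTER_TILE) {
            size_t j_outer_end = min_sz(jo + OUTER_TILE, n);
            /* Inner tiles subdivide the current outer tile. */
            for (size_t ii = io; ii < i_outer_end; ii += INNER_TILE) {
                size_t i_end = min_sz(ii + INNER_TILE, i_outer_end);
                for (size_t ji = jo; ji < j_outer_end; ji += INNER_TILE) {
                    size_t j_end = min_sz(ji + INNER_TILE, j_outer_end);
                    /* Point loops: every (i, j) is visited exactly once. */
                    for (size_t i = ii; i < i_end; i++) {
                        for (size_t j = ji; j < j_end; j++) {
                            double sum = 0.0;
                            for (size_t k = 0; k < n; k++)
                                sum += A[i * n + k] * B[k * n + j];
                            C[i * n + j] = sum;
                        }
                    }
                }
            }
        }
    }
}
```

Both functions take the matrix order `n` and three caller-allocated arrays of `n * n` doubles. The tiled version clamps every tile boundary, so `n` does not need to be a multiple of either tile size. Tiling only reorders the independent (i, j) iterations, which is why the two versions compute the same result.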

Domains

Performance and reliability [cs.PF]; Parallel, distributed, and shared computing [cs.DC]

https://inria.hal.science/inria-00551084

Submitted on: Sunday, January 2, 2011, 15:05:43

Last modified on: Monday, February 12, 2024, 10:38:04

Dates and versions

inria-00551084, version 1 (02-01-2011)

Identifiers

HAL Id: inria-00551084, version 1

Cite

Allen Leung, Nicolas Vasilache, Benoît Meister, Muthu Manikandan, David Wohlford, et al. A mapping path for multi-GPGPU accelerated computers from a portable high level programming abstraction. Proceedings of 3rd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU 2010, Mar 2010, Pittsburgh, Pennsylvania, United States. pp. 51-61. ⟨inria-00551084⟩

Collections

EC-PARIS CNRS INRIA UMR8623 INRIA2 UNIV-PARIS-SACLAY
