Implementing a GPU Programming Model on a non-GPU Accelerator Architecture

Stephen M. Kofsky; Daniel R. Johnson; John A. Stratton; Wen-Mei W. Hwu; Sanjay J. Patel; Steven S. Lumetta

Conference paper, Year: 2010


Stephen M. Kofsky

Role: Author

University of Illinois at Urbana-Champaign [Urbana]

Daniel R. Johnson

Role: Author

University of Illinois at Urbana-Champaign [Urbana]

John A. Stratton

Role: Author

University of Illinois at Urbana-Champaign [Urbana]

Wen-Mei W. Hwu

Role: Author

University of Illinois at Urbana-Champaign [Urbana]

Sanjay J. Patel

Role: Author

University of Illinois at Urbana-Champaign [Urbana]

Steven S. Lumetta

Role: Author
PersonId : 872172

Department of Computer Science [UIUC]

Abstract

Parallel codes are written primarily for the purpose of performance. It is highly desirable that parallel codes be portable between parallel architectures without significant performance degradation or code rewrites. While performance portability and its limits have been studied thoroughly on single-processor systems, this goal has been less extensively studied and is more difficult to achieve for parallel systems. Emerging single-chip parallel platforms are no exception; writing code that obtains good performance across GPUs and other many-core CMPs can be challenging. In this paper, we focus on CUDA codes, noting that programs must obey a number of constraints to achieve high performance on an NVIDIA GPU. Under such constraints, we develop optimizations that improve the performance of CUDA code on a MIMD accelerator architecture that we are developing called Rigel. We demonstrate performance improvements with these optimizations over naïve translations, and final performance results comparable to those of codes that were hand-optimized for Rigel.

Domains

Other [cs.OH]; Hardware Architecture [cs.AR]

Main file

A4MMC-kofsky.pdf (350.14 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/inria-00493905

Submitted on: Monday, 21 June 2010, 15:44:35

Last modified on: Friday, 12 January 2024, 13:50:04

Long-term archiving on: Wednesday, 22 September 2010, 18:11:22

Dates and versions

inria-00493905 , version 1 (21-06-2010)

Identifiers

HAL Id : inria-00493905 , version 1

Cite

Stephen M. Kofsky, Daniel R. Johnson, John A. Stratton, Wen-Mei W. Hwu, Sanjay J. Patel, et al.. Implementing a GPU Programming Model on a non-GPU Accelerator Architecture. A4MMC 2010 - 1st Workshop on Applications for Multi and Many Core Processors, Jun 2010, Saint Malo, France. ⟨inria-00493905⟩

Collections

ISCA2010 A4MMC

32 views

221 downloads
