
A framework for efficient execution on GPU and CPU+GPU systems

Jean-François Dollinger 1
1 CAMUS - Compilation pour les Architectures MUlti-coeurS
Inria Nancy - Grand Est, ICube - Laboratoire des sciences de l'ingénieur, de l'informatique et de l'imagerie
Abstract : Technological limitations faced by semiconductor manufacturers in the early 2000s restricted performance gains in sequential computation units. The current trend is to increase the number of processor cores per socket and to progressively use GPU cards for highly parallel computations. The complexity of recent architectures makes it difficult to statically predict the performance of a program. We describe a reliable and accurate method for predicting the execution time of parallel loop nests on GPUs, based on three stages: static code generation, offline profiling, and online prediction. In addition, we present two techniques to fully exploit the computing resources available on a system. The first technique consists in jointly using the CPU and GPU to execute a code. To achieve higher performance, load balance must be considered, in particular by predicting execution time. The runtime uses the profiling results, and the scheduler computes the execution times and adjusts the load distributed to the processors. The second technique puts the CPU and GPU in competition: instances of the considered code are executed simultaneously on the CPU and GPU. The winner of the competition notifies the other instance of its completion, causing the latter to terminate.
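The load-balancing idea in the first technique can be illustrated with a minimal sketch. It is not the scheduler from the thesis: it assumes a single predicted time per iteration for each device (in practice this would come from profiling), and the function name `split_iterations` and the device labels are hypothetical. It only shows how predicted execution times can drive a split in which both devices are expected to finish at about the same moment.

```python
def split_iterations(total_iters, predicted_time_per_iter):
    """Split total_iters across devices so predicted finish times match.

    predicted_time_per_iter maps a device name to its predicted time
    (in seconds) for one iteration. These values are assumptions here;
    a real system would obtain them from profiling.
    """
    # A device's share is proportional to its predicted speed (1 / time).
    speeds = {dev: 1.0 / t for dev, t in predicted_time_per_iter.items()}
    total_speed = sum(speeds.values())
    shares = {
        dev: int(total_iters * speed / total_speed)
        for dev, speed in speeds.items()
    }
    # Iterations lost to integer truncation go to the fastest device.
    fastest = max(speeds, key=speeds.get)
    shares[fastest] += total_iters - sum(shares.values())
    return shares


# Hypothetical predictions: the GPU is four times faster per iteration.
shares = split_iterations(1000, {"cpu": 0.5, "gpu": 0.125})
print(shares)
```

With these made-up predictions the GPU receives four times as many iterations as the CPU, so both predicted completion times are equal. A scheduler that adjusts the load at run time, as the abstract describes, would recompute such a split as its predictions are refined.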

Cited literature : 134 references
Contributor : Vincent Loechner
Submitted on : Wednesday, January 6, 2016 - 4:12:50 PM
Last modification on : Thursday, December 2, 2021 - 3:16:56 AM
Long-term archiving on : Thursday, April 7, 2016 - 4:07:46 PM


  • HAL Id : tel-01251719, version 1


Jean-François Dollinger. A framework for efficient execution on GPU and CPU+GPU systems. Distributed, Parallel, and Cluster Computing [cs.DC]. Université de Strasbourg, 2015. English. ⟨tel-01251719⟩


