A framework for efficient execution on GPU and CPU+GPU systems

Jean-François Dollinger 1
1 CAMUS - Compilation pour les Architectures MUlti-coeurS
Inria Nancy - Grand Est, ICube - Laboratoire des sciences de l'ingénieur, de l'informatique et de l'imagerie
Abstract : Technological limitations faced by semiconductor manufacturers in the early 2000s restricted further performance gains in sequential computation units. The current trend is to increase the number of processor cores per socket and to progressively use GPU cards for highly parallel computations. The complexity of recent architectures makes it difficult to statically predict the performance of a program. We describe a reliable and accurate method for predicting the execution time of parallel loop nests on GPUs, based on three stages: static code generation, offline profiling, and online prediction. In addition, we present two techniques to fully exploit the computing resources available on a system. The first technique consists of jointly using the CPU and the GPU to execute a code. To achieve higher performance, it is mandatory to consider load balance, in particular by predicting execution time. The runtime uses the profiling results, and the scheduler computes the execution times and adjusts the load distributed to the processors. The second technique puts the CPU and the GPU in competition: instances of the considered code are executed simultaneously on the CPU and the GPU. The winner of the competition notifies the other instance of its completion, which causes the latter to terminate.
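The load-balancing idea in the abstract (predict execution times, then adjust the share of work given to each processor) can be illustrated with a minimal sketch. This is not the thesis's scheduler. It assumes a simple linear cost model in which profiling yields a per-iteration time for each device plus a fixed GPU overhead. The function name `split_iterations` and all of its parameters are hypothetical.

```python
def split_iterations(total, cpu_per_iter, gpu_per_iter, gpu_overhead=0.0):
    """Return (cpu_iters, gpu_iters) that roughly equalizes predicted times.

    Assumed cost model (illustrative only, not taken from the thesis):
      cpu_time(n) = cpu_per_iter * n
      gpu_time(n) = gpu_overhead + gpu_per_iter * n   (0 if n == 0)
    Requires cpu_per_iter + gpu_per_iter > 0.
    """
    # Solve cpu_per_iter * n == gpu_overhead + gpu_per_iter * (total - n) for n.
    ideal = (gpu_overhead + gpu_per_iter * total) / (cpu_per_iter + gpu_per_iter)
    # Clamp the real-valued balance point to a valid integer iteration count.
    base = min(max(int(ideal), 0), total)

    def makespan(cpu_iters):
        # Predicted completion time when both devices run concurrently.
        gpu_iters = total - cpu_iters
        cpu_time = cpu_per_iter * cpu_iters
        gpu_time = gpu_overhead + gpu_per_iter * gpu_iters if gpu_iters else 0.0
        return max(cpu_time, gpu_time)

    # The integer optimum is one of the two neighbours of the balance point.
    candidates = sorted({base, min(base + 1, total)})
    best = min(candidates, key=makespan)
    return best, total - best
```

For example, with 1000 iterations, a predicted 4.0 time units per iteration on the CPU and 1.0 on the GPU, the sketch assigns 200 iterations to the CPU and 800 to the GPU, so both are predicted to finish at the same time. A real scheduler would refresh such predictions at run time rather than rely on fixed constants.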

Cited literature [134 references]

https://hal.inria.fr/tel-01251719
Contributor : Vincent Loechner
Submitted on : Wednesday, January 6, 2016 - 4:12:50 PM
Last modification on : Saturday, October 27, 2018 - 1:23:58 AM
Long-term archiving on : Thursday, April 7, 2016 - 4:07:46 PM

Identifiers

  • HAL Id : tel-01251719, version 1

Citation

Jean-François Dollinger. A framework for efficient execution on GPU and CPU+GPU systems. Distributed, Parallel, and Cluster Computing [cs.DC]. Université de Strasbourg, 2015. English. ⟨tel-01251719⟩
