
Register Optimizations for Stencils on GPUs

Abstract : The recent advent of compute-intensive GPU architecture has allowed application developers to explore high-order 3D stencils for better computational accuracy. A common optimization strategy for such stencils is to expose sufficient data reuse by means such as loop unrolling, with the expectation of register-level reuse. However, the resulting code is often highly constrained by register pressure. While current state-of-the-art register allocators are satisfactory for most applications, they are unable to effectively manage register pressure for such complex high-order stencils, resulting in sub-optimal code with a large number of register spills. In this paper, we develop a statement reordering framework that models stencil computations as a DAG of trees with shared leaves, and adapts an optimal scheduling algorithm for minimizing register usage for expression trees. The effectiveness of the approach is demonstrated through experimental results on a range of stencils extracted from application codes.
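As an illustration of what "optimal scheduling for minimizing register usage for expression trees" can look like, the sketch below implements the classic Sethi-Ullman labeling for a single binary expression tree. The abstract does not name the algorithm the paper adapts, so treating it as Sethi-Ullman is an assumption. The `Node` class and both function names are hypothetical, and the sketch does not cover the paper's DAG of trees with shared leaves.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical binary expression-tree node; a leaf has no children."""
    op: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def registers_needed(n: Node) -> int:
    """Sethi-Ullman label, assuming every leaf is loaded into one register."""
    if n.is_leaf():
        return 1
    l, r = registers_needed(n.left), registers_needed(n.right)
    # Equal demands force one extra register to hold the first result
    # while the second subtree is evaluated.
    return l + 1 if l == r else max(l, r)


def schedule(n: Node, order: List[str]) -> None:
    """Append an evaluation order that visits the costlier subtree first."""
    if n.is_leaf():
        order.append(n.op)
        return
    l, r = registers_needed(n.left), registers_needed(n.right)
    first, second = (n.left, n.right) if l >= r else (n.right, n.left)
    schedule(first, order)
    schedule(second, order)
    order.append(n.op)


# a + (b + (c + d)): evaluating the deep subtree first keeps demand at 2.
chain = Node("+", Node("a"), Node("+", Node("b"), Node("+", Node("c"), Node("d"))))
# (a + b) * (c + d): both subtrees need 2 registers, so the root needs 3.
balanced = Node("*", Node("+", Node("a"), Node("b")), Node("+", Node("c"), Node("d")))

chain_order: List[str] = []
schedule(chain, chain_order)
```

Reordering matters here because a naive left-to-right evaluation of `chain` would hold `a` and `b` in registers while `c + d` is computed, whereas the labeled order starts with `c + d` and never holds more than two values at once.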
Document type :
Conference papers
Contributor : Fabrice Rastello
Submitted on : Friday, December 14, 2018 - 2:15:07 PM
Last modification on : Friday, November 18, 2022 - 9:24:45 AM
Long-term archiving on : Friday, March 15, 2019 - 3:52:19 PM


  • HAL Id : hal-01955542, version 1


Prashant Singh, Aravind Sukumaran-Rajam, Atanas Rountev, Fabrice Rastello, Louis-Noël Pouchet, et al. Register Optimizations for Stencils on GPUs. PPoPP 2018 - 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Feb 2018, Vienna, Austria. pp.1-15. ⟨hal-01955542⟩


