Global Memory Access Modelling for Efficient Implementation of the Lattice Boltzmann Method on Graphics Processing Units

Christian Obrecht 1 Frédéric Kuznik 1 Bernard Tourancheau 2, 3, 4 Jean-Jacques Roux 1
2 SWING - Smart Wireless Networking
Inria Grenoble - Rhône-Alpes, CITI - CITI Centre of Innovation in Telecommunications and Integration of services
4 GRAAL - Algorithms and Scheduling for Distributed Heterogeneous Platforms
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
Abstract : In this work, we investigate the global memory access mechanism on recent GPUs. For the purpose of this study, we created specific benchmark programs, which allowed us to explore the scheduling of global memory transactions. Thus, we formulate a model capable of estimating the execution time for a large class of applications. Our main goal is to facilitate optimisation of regular data-parallel applications on GPUs. As an example, we finally describe our CUDA implementations of LBM flow solvers on which our model was able to estimate performance with less than 5% relative error.


https://hal.inria.fr/inria-00563159
Contributor : Bernard Tourancheau
Submitted on : Friday, February 4, 2011 - 10:23:24 AM
Last modification on : Monday, December 10, 2018 - 10:54:03 AM
Long-term archiving on : Thursday, May 5, 2011 - 2:50:25 AM

File

obrecht11a.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00563159, version 1

Citation

Christian Obrecht, Frédéric Kuznik, Bernard Tourancheau, Jean-Jacques Roux. Global Memory Access Modelling for Efficient Implementation of the Lattice Boltzmann Method on Graphics Processing Units. VECPAR, 2011, Porto, Portugal. pp.151--161. ⟨inria-00563159⟩
