Deriving Efficient Data Movement from Decoupled Access/Execute Specifications

Abstract: On multi-core architectures with software-managed memories, effectively orchestrating data movement is essential to performance, but is tedious and error-prone. In this paper we show that when the programmer can explicitly specify both the memory access pattern and the execution schedule of a computation kernel, the compiler or run-time system can derive efficient data movement, even if analysis of kernel code is difficult or impossible. We have developed a framework of C++ classes for decoupled Access/Execute specifications, allowing for automatic communication optimisations such as software pipelining and data reuse. We demonstrate the ease and efficiency of programming the Cell Broadband Engine architecture using these classes by implementing a set of benchmarks, which exhibit data reuse and non-affine access functions, and by comparing these implementations against alternative implementations, which use hand-written DMA transfers and software-based caching.
Document type: Conference paper
André Seznec and Joel Emer and Mike O'Boyle and Margaret Martonosi and Theo Ungerer. HiPEAC 2009 - High Performance and Embedded Architectures and Compilers, Jan 2009, Paphos, Cyprus. Springer, 2009, 〈10.1007/978-3-540-92990-1_14〉

https://hal.inria.fr/inria-00445952
Contributor: Ist Rennes
Submitted on: Monday, 11 January 2010 - 16:06:07
Last modified on: Monday, 20 June 2016 - 14:10:32

Citation

Lee W. Howes, Anton Lokhmotov, Alastair F. Donaldson, Paul H.J. Kelly. Deriving Efficient Data Movement from Decoupled Access/Execute Specifications. André Seznec and Joel Emer and Mike O'Boyle and Margaret Martonosi and Theo Ungerer. HiPEAC 2009 - High Performance and Embedded Architectures and Compilers, Jan 2009, Paphos, Cyprus. Springer, 2009, 〈10.1007/978-3-540-92990-1_14〉. 〈inria-00445952〉
