Conference papers

On Instruction-Level Method for Reducing Cache Penalties in Embedded VLIW Processors

Abstract : Usual cache optimisation techniques for high performance computing are difficult to apply in embedded VLIW applications. First, embedded applications are not always well structured, and few regular loop nests exist. Real-world applications in embedded computing contain hot loops with pointers, indirect array accesses, function calls, indirect function calls, non-constant stride accesses, etc. Consequently, loop transformations for reducing cache misses are impossible to apply, especially at the back-end level. Second, the strides of memory accesses do not appear to be constant at the source code level, because of indirect accesses. Hence, usual prefetching techniques are not applicable. Third, embedded VLIW processors are "cheap" products: they have limited hardware dynamic mechanisms compared to high performance processors (no out-of-order execution, reduced memory hierarchies, small direct-mapped caches, lower clock frequencies, etc.). Consequently, code optimisation methods must be simple and take care of code size. This article presents a back-end code optimisation for tolerating non-blocking cache effects at the instruction level (not at the loop level). Our method is based on a robust combination of memory pre-loading with data prefetching, allowing us to optimise both regular and irregular applications at the assembly level. Our experiments with the MediaBench and SPEC2000 benchmark suites on the ST231 VLIW processor show a positive performance gain (compared to code generated with the -O3 compiler optimisation flag). Our method induces negligible code size growth (less than 3.9% in the extreme case).
Document type : Conference papers

Cited literature : 14 references
Contributor : Sid Touati
Submitted on : Friday, October 28, 2011 - 2:33:11 PM
Last modification on : Wednesday, October 20, 2021 - 12:24:13 AM
Long-term archiving on : Monday, January 30, 2012 - 11:17:48 AM






Samir Ammenouche, Sid Touati, William Jalby. On Instruction-Level Method for Reducing Cache Penalties in Embedded VLIW Processors. 11th IEEE International Conference on High Performance Computing and Communications, 2009 (HPCC '09), Jun 2009, Seoul, South Korea. pp. 273-279, ⟨10.1109/HPCC.2009.32⟩. ⟨inria-00636852⟩


