Loop Unrolling Minimisation in the Presence of Multiple Register Types: a Viable Alternative to Modulo Variable Expansion
Abstract
Modulo Variable Expansion (MVE) [1], when used with software pipelining (SWP), may sacrifice register optimality (MAXLIVE) and, in general, may lead to unnecessary spills or move operations that negate the benefits of SWP. In contrast, a larger degree of loop unrolling can be applied to meet the MAXLIVE register requirement [2, 3]. However, the degree of unrolling should be minimised to control code size and hence I-cache performance. In our previous work, we designed a post-pass unrolling algorithm that minimises the unrolling degree by adjusting the length of reuse circuits through the use of additional (free) registers [4]. In this paper, we complete our study with an improved algorithm for minimising the kernel loop unrolling that results from cyclic register allocation in the presence of multiple register types. We show that considering all register types jointly yields a lower unrolling degree than considering each register type in isolation. In addition, we integrate our solution into a real-world embedded-system compiler, st200cc, for the ST2xx family of VLIW embedded processors, and compare it to MVE. Our large set of experiments on both high-performance and embedded benchmarks (SPEC2000, SPEC2006, MEDIABENCH and FFMPEG) demonstrates the practical applicability and the benefits of our approach.
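To make the claim about joint versus isolated treatment of register types concrete, here is a minimal brute-force sketch in Python. It is not the algorithm of the paper. It assumes a simplified model in which the unrolling degree is the least common multiple of the reuse-circuit lengths, and in which each free register of a given type can lengthen one circuit of that type by one. The function names and the toy input are hypothetical.

```python
from itertools import product
from math import lcm  # variadic lcm requires Python 3.9+


def candidates(lengths, free):
    """Yield every lengthening of the circuits that uses at most `free` extra registers."""
    for extra in product(range(free + 1), repeat=len(lengths)):
        if sum(extra) <= free:
            yield [w + e for w, e in zip(lengths, extra)]


def min_unroll_isolated(types):
    """Minimise each register type on its own, then combine the results (assumed model)."""
    per_type = [min(lcm(*c) for c in candidates(w, f)) for w, f in types]
    return lcm(*per_type)


def min_unroll_joint(types):
    """Minimise the global unrolling degree over all register types at once."""
    best = None
    for combo in product(*(list(candidates(w, f)) for w, f in types)):
        degree = lcm(*(x for circuit_set in combo for x in circuit_set))
        if best is None or degree < best:
            best = degree
    return best


# Hypothetical input: (reuse-circuit lengths, free registers) per register type.
types = [([2, 3], 1), ([4], 0)]
print(min_unroll_isolated(types), min_unroll_joint(types))
```

In this toy case the isolated strategy picks lengths [3, 3] for the first type (degree 3), which combines with the second type's degree 4 to give 12. The joint search instead picks [2, 4], which is worse for the first type alone (degree 4) but gives a global degree of 4. The exhaustive search is exponential and serves only as an illustration.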
Domains
Other [cs.OH]
Origin: Files produced by the author(s)