Parallel Copy Motion

Abstract : Recent results on the static single assignment (SSA) form open promising directions for the design of new register allocation heuristics for just-in-time (JIT) compilation. In particular, heuris- tics based on tree scans with two decoupled phases, one for spilling, one for splitting/coloring/coalescing, seem good candidates for de- signing memory-friendly, fast, and competitive register allocators. Another class of register allocators, well-suited for JIT compilation, are those based on linear scans. Most of them perform coalesc- ing poorly but also do live-range splitting (mostly on control-flow edges) to avoid spilling. This leads to a large amount of register-to- register copies inside basic blocks but also, implicitly, on critical edges, i.e., edges that flow from a block with several successors to a block with several predecessors. This paper presents a new back-end optimization that we call parallel copy motion. The technique is to move copy instructions in a register-allocated code from a program point, possibly an edge, to another. In contrast with a classical scheduler that must preserve data dependences, our copy motion also permutes register assign- ments so that a copy can "traverse" all instructions of a basic block, except those with conflicting register constraints. Thus, parallel copies can be placed either where the scheduling has some empty slots (for multiple-issues architectures), or where fewer copies are necessary because some variables are dead at this point. Moreover, to the cost of some code compensations (namely, the reverse of the copy), a copy can also be moved out from a critical edge. This pro- vides a simple solution to avoid critical-edge splitting, especially useful when the compiler cannot split it, as it is the case for abnor- mal edges. This compensation technique also enables the schedul- ing/motion of the copy in the successor or predecessor basic block. Experiments with the SPECint benchmarks suite and our own benchmark suite show that we can now apply broadly an SSA-based register allocator: all procedures, even with abnormal edges, can be treated. Simple strategies for moving copies from edges and locally inside basic block show significant average improvements (4% for SPECint and 3% for our suite), with no degradation. It let us believe that the approach is promising, and not only for improving coalescing in fast register allocators.
Type de document :
Article dans une revue
SCOPES, ACM, 2010, pp.0
Liste complète des métadonnées

Littérature citée [18 références]  Voir  Masquer  Télécharger
Contributeur : Florent Bouchez <>
Soumis le : jeudi 25 février 2010 - 07:00:05
Dernière modification le : jeudi 8 février 2018 - 11:09:25
Document(s) archivé(s) le : jeudi 17 juin 2010 - 18:36:20


Fichiers produits par l'(les) auteur(s)


  • HAL Id : inria-00435844, version 1



Florent Bouchez, Quentin Colombet, Alain Darte, Christophe Guillon, Fabrice Rastello. Parallel Copy Motion. SCOPES, ACM, 2010, pp.0. 〈inria-00435844〉



Consultations de la notice


Téléchargements de fichiers