Skip to Main content Skip to Navigation
Conference papers

Parallel Copy Motion

Abstract : Recent results on the static single assignment (SSA) form open promising directions for the design of new register allocation heuristics for just-in-time (JIT) compilation. In particular, heuris- tics based on tree scans with two decoupled phases, one for spilling, one for splitting/coloring/coalescing, seem good candidates for de- signing memory-friendly, fast, and competitive register allocators. Another class of register allocators, well-suited for JIT compilation, are those based on linear scans. Most of them perform coalesc- ing poorly but also do live-range splitting (mostly on control-flow edges) to avoid spilling. This leads to a large amount of register-to- register copies inside basic blocks but also, implicitly, on critical edges, i.e., edges that flow from a block with several successors to a block with several predecessors. This paper presents a new back-end optimization that we call parallel copy motion. The technique is to move copy instructions in a register-allocated code from a program point, possibly an edge, to another. In contrast with a classical scheduler that must preserve data dependences, our copy motion also permutes register assign- ments so that a copy can "traverse" all instructions of a basic block, except those with conflicting register constraints. Thus, parallel copies can be placed either where the scheduling has some empty slots (for multiple-issues architectures), or where fewer copies are necessary because some variables are dead at this point. Moreover, to the cost of some code compensations (namely, the reverse of the copy), a copy can also be moved out from a critical edge. This pro- vides a simple solution to avoid critical-edge splitting, especially useful when the compiler cannot split it, as it is the case for abnor- mal edges. This compensation technique also enables the schedul- ing/motion of the copy in the successor or predecessor basic block. Experiments with the SPECint benchmarks suite and our own benchmark suite show that we can now apply broadly an SSA-based register allocator: all procedures, even with abnormal edges, can be treated. Simple strategies for moving copies from edges and locally inside basic block show significant average improvements (4% for SPECint and 3% for our suite), with no degradation. It let us believe that the approach is promising, and not only for improving coalescing in fast register allocators.
Document type :
Conference papers
Complete list of metadata

Cited literature [18 references]  Display  Hide  Download
Contributor : Florent Bouchez Connect in order to contact the contributor
Submitted on : Thursday, February 25, 2010 - 7:00:05 AM
Last modification on : Thursday, January 20, 2022 - 4:13:48 PM
Long-term archiving on: : Thursday, June 17, 2010 - 6:36:20 PM


Files produced by the author(s)


  • HAL Id : inria-00435844, version 1



Florent Bouchez, Quentin Colombet, Alain Darte, Christophe Guillon, Fabrice Rastello. Parallel Copy Motion. SCOPES 2010 - 13th International Workshop on Software & Compilers for Embedded Systems, Jun 2010, New York, United States. pp.0. ⟨inria-00435844⟩



Record views


Files downloads