Interpreter Register Autolocalisation: Improving the performance of efficient interpreters

Language interpreters are generally slower than (JIT) compiled implementations because they trade performance for simplicity and portability. However, they remain an important part of modern Virtual Machines (VMs) as part of mixed-mode execution schemes, for several reasons. On the one hand, not all code gets hot enough to deserve optimization by a JIT compiler; examples of cold code are tests, command-line applications, and scripts. On the other hand, compilers are more difficult to write and maintain, which makes interpreters attractive for their simplicity and portability. In this paper, we focus on bytecode interpreters.

Interpreter performance has long been a hot topic, and solutions of varying complexity and portability have been proposed. Some work optimizes language-specific features in interpreters, such as type dispatches, using static type predictions, quickening [3], or type specializations [18]. Many other solutions improve general interpreter behavior by reducing branch mispredictions in interpreter dispatch and by stack caching. To reduce branch mispredictions, some propose variants of code threading [1, 4, 6, 7, 10], further improved with selective inlining [14]. Others modify the design of the intermediate code (e.g., bytecode) with super-instructions [15] and register-based instructions [9, 16]. Stack caching [5] optimizes operand access by caching the top of the stack.

Related to stack caching are interpreter registers: interpreter variables that are critical to the efficient execution of the interpreter loop. Examples of such variables are the instruction pointer (IP), the stack pointer (SP), and the frame pointer (FP).
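As a rough illustration of what interpreter registers look like in practice, the following C sketch implements a tiny stack-based bytecode loop in which IP and SP are plain local variables. The three-opcode instruction set, the name `run_bytecode`, and the fixed stack size are assumptions made for this example and are not taken from the paper; FP is omitted because the sketch has no calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical three-opcode instruction set, for illustration only. */
enum { OP_PUSH, OP_ADD, OP_HALT };

#define STACK_SIZE 16

/* Runs `code` and returns the value on top of the stack at OP_HALT. */
static int64_t run_bytecode(const uint8_t *code) {
    int64_t stack[STACK_SIZE];
    const uint8_t *ip = code;  /* instruction pointer (IP) */
    int64_t *sp = stack;       /* stack pointer (SP): next free slot */

    for (;;) {
        switch (*ip++) {       /* every dispatch reads and advances IP */
        case OP_PUSH:          /* operand is the byte after the opcode */
            assert(sp < stack + STACK_SIZE);
            *sp++ = (int64_t)*ip++;
            break;
        case OP_ADD:           /* pop two values, push their sum */
            assert(sp - stack >= 2);
            sp--;
            sp[-1] += sp[0];
            break;
        case OP_HALT:
            assert(sp > stack);
            return sp[-1];
        default:
            assert(0 && "unknown opcode");
            return 0;
        }
    }
}
```

Calling `run_bytecode` on the byte sequence `OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT` evaluates 2 + 3. Every instruction touches `ip`, and most touch `sp`, which is why how these variables are stored matters for the speed of the loop.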
Interpreter registers put pressure on the overall design and implementation of the interpreter:

Req1: Value access outside the interpreter loop. VM routines outside of the interpreter loop may require access to interpreter registers. This is the case, for example, for garbage collectors that traverse the stack to find root objects, for routines that unwind or reify the stack, and for routines that give native methods access to stack values.

Req2: Efficiency. Interpreter registers are used on each instruction to manipulate the instruction stream and the stack. Inefficient implementations have a negative impact on performance.
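The tension between Req1 and Req2 can be sketched in C as follows. This is a hand-written arrangement assumed for illustration only, and it is not presented as the technique of the paper: the registers live in a VM-wide structure so that outside routines can read them, while the loop works on local copies and writes them back before calling out. The names `struct vm`, `count_roots`, `run_vm`, and the `BC_*` opcodes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical instruction set; BC_COUNT_ROOTS calls out of the loop. */
enum { BC_PUSH, BC_ADD, BC_COUNT_ROOTS, BC_HALT };

#define VM_STACK_SIZE 16

/* VM-wide state: routines outside the loop read the registers here (Req1). */
struct vm {
    int64_t stack[VM_STACK_SIZE];
    const uint8_t *ip;
    int64_t *sp;
    size_t roots_seen;  /* result of the last stack walk */
};

/* Stand-in for a GC root scan: it relies on vm->sp being up to date. */
static void count_roots(struct vm *vm) {
    vm->roots_seen = (size_t)(vm->sp - vm->stack);
}

static int64_t run_vm(struct vm *vm, const uint8_t *code) {
    /* Local copies that the C compiler can keep in machine registers (Req2). */
    const uint8_t *ip = code;
    int64_t *sp = vm->stack;

    for (;;) {
        switch (*ip++) {
        case BC_PUSH:
            assert(sp < vm->stack + VM_STACK_SIZE);
            *sp++ = (int64_t)*ip++;
            break;
        case BC_ADD:
            assert(sp - vm->stack >= 2);
            sp--;
            sp[-1] += sp[0];
            break;
        case BC_COUNT_ROOTS:
            vm->ip = ip;        /* write back before leaving the loop */
            vm->sp = sp;
            count_roots(vm);
            ip = vm->ip;        /* reload in case the routine changed them */
            sp = vm->sp;
            break;
        case BC_HALT:
            assert(sp > vm->stack);
            vm->ip = ip;
            vm->sp = sp;
            return sp[-1];
        default:
            assert(0 && "unknown opcode");
            return 0;
        }
    }
}
```

In this sketch only `BC_COUNT_ROOTS` and `BC_HALT` pay for synchronising the shared structure; `BC_PUSH` and `BC_ADD` touch locals only. Forgetting a write-back before a call-out would leave the outside routine with stale values, which is the kind of design pressure the two requirements describe.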

Keywords

virtual machine, interpreter, optimisation, code transformation

Domains

Programming languages [cs.PL]

Main file

Poli22a-MoreVM22-Autolocalisation.pdf (490.4 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-03594766

Submitted on: Wednesday, March 2, 2022, 20:49:08

Last modified on: Thursday, February 1, 2024, 10:04:48

Long-term archiving on: Tuesday, May 31, 2022, 19:23:48

Dates and versions

hal-03594766, version 1 (02-03-2022)

Identifiers

HAL Id: hal-03594766, version 1

Cite

Guillermo Polito, Nahuel Palumbo, Pablo Tesone, Soufyane Labsari, Stéphane Ducasse. Interpreter Register Autolocalisation: Improving the performance of efficient interpreters. MoreVMs 2022, Mar 2022, Porto, Portugal. ⟨hal-03594766⟩


Collections

UNIV-RENNES1 CNRS INRIA IRISA CRISTAL INRIA2 CRISTAL-RMOD UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UNIV-LILLE UR1-MATH-NUM
