Practical Aggregation of Semantical Program Properties for Machine Learning Based Optimization

Mircea Namolaru; Albert Cohen; Grigori Fursin; Ayal Zaks; Ari Freund

Conference paper, Year: 2010


Mircea Namolaru

Role: Author

IBM Research [HAIFA]

Albert Cohen

Role: Author
PersonId : 12723
IdHAL : albert-cohen

Architectures, Languages and Compilers to Harness the End of Moore Years

Grigori Fursin

Role: Author

Architectures, Languages and Compilers to Harness the End of Moore Years

Ayal Zaks

Role: Author

IBM Research [HAIFA]

Ari Freund

Role: Author

IBM Research [HAIFA]

Abstract

Iterative search combined with machine learning is a promising approach to designing optimizing compilers that harness the complexity of modern computing systems. While traversing a program optimization space, we collect characteristic feature vectors of the program and use them to discover correlations across programs, target architectures, data sets, and performance. Predictive models can be derived from such correlations, effectively hiding the time-consuming feedback-directed optimization process from the application programmer. One key task of this approach, naturally assigned to compiler experts, is to design relevant features and implement scalable feature extractors, including statistical models that filter the most relevant information from millions of lines of code. This new task turns out to be very challenging and tedious from a compiler construction perspective. So far, only a limited set of ad hoc, largely syntactical features has been devised. Yet machine learning can only discover correlations in the information it is fed: for this approach to succeed, it is critical to select program features relevant to the optimization problem at hand. We propose a general method for systematically generating numerical features from a program. This method puts no restrictions on how to logically and algebraically aggregate semantical properties into numerical features. We illustrate our method on the difficult problem of selecting the best possible combination of 88 available optimizations in GCC. We achieve 74% of the potential speedup obtained through iterative compilation on a wide range of benchmarks and four different general-purpose and embedded architectures. Our work is particularly relevant to embedded system designers who want to quickly adapt the optimization heuristics of a mainstream compiler to their custom ISA, microarchitecture, benchmark suite, and workload. Our method has been integrated with the publicly released MILEPOST GCC.
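
To make the idea of aggregating program properties into numerical features more concrete, here is a minimal sketch in Python. It is not the authors' implementation and does not use any MILEPOST GCC interface: the toy relations (BLOCKS, EDGES, INSTRS), the opcode names, and the three feature names are all hypothetical and chosen only for illustration. The sketch assumes a program is described by simple relations over its entities, and shows one logical aggregation (counting entities that satisfy a predicate) and one algebraic aggregation (averaging a numeric property).

```python
from statistics import mean

# Hypothetical toy "program" described as relations over its entities.
# These names and values are illustrative assumptions, not compiler output.
BLOCKS = {"bb0", "bb1", "bb2"}
EDGES = {("bb0", "bb1"), ("bb0", "bb2"), ("bb1", "bb2")}  # (source, destination)
INSTRS = {  # (instruction id, containing block, opcode)
    ("i0", "bb0", "load"),
    ("i1", "bb0", "branch"),
    ("i2", "bb1", "add"),
    ("i3", "bb1", "store"),
    ("i4", "bb2", "call"),
}


def successors(block):
    """Blocks reachable from `block` through one control-flow edge."""
    return {dst for src, dst in EDGES if src == block}


def instructions_in(block):
    """Identifiers of the instructions contained in `block`."""
    return {ident for ident, owner, _ in INSTRS if owner == block}


def extract_features():
    """Aggregate the relations into a small numerical feature vector."""
    # Logical aggregation: count the entities that satisfy a predicate.
    two_successors = sum(1 for b in BLOCKS if len(successors(b)) == 2)
    memory_ops = sum(1 for _, _, op in INSTRS if op in {"load", "store"})
    # Algebraic aggregation: average a numeric property over all blocks.
    avg_instrs = mean(len(instructions_in(b)) for b in BLOCKS)
    return {
        "blocks_with_two_successors": two_successors,
        "memory_instructions": memory_ops,
        "avg_instrs_per_block": avg_instrs,
    }


if __name__ == "__main__":
    print(extract_features())
```

In a setting like the one the abstract describes, a vector of this kind would be computed per program and passed to a predictive model. Which properties and aggregations are actually useful depends on the optimization problem, which is the question the paper addresses.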

Domains

Programming Languages [cs.PL]

Main file

mpost.pdf (131.23 KB)

Origin: Files produced by the author(s)

Contributor: Albert Cohen

https://inria.hal.science/inria-00551512

Submitted on: Tuesday, January 4, 2011, 00:14:08

Last modified on: Monday, February 12, 2024, 10:38:04

Long-term archiving on: Tuesday, April 5, 2011, 02:40:21

Dates and versions

inria-00551512, version 1 (2011-01-04)

Identifiers

HAL Id: inria-00551512, version 1

Cite

Mircea Namolaru, Albert Cohen, Grigori Fursin, Ayal Zaks, Ari Freund. Practical Aggregation of Semantical Program Properties for Machine Learning Based Optimization. International Conference on Compilers Architectures and Synthesis for Embedded Systems (CASES'10), Oct 2010, Scottsdale, United States. ⟨inria-00551512⟩

Collections

EC-PARIS CNRS INRIA UMR8623 INRIA2 UNIV-PARIS-SACLAY

152 views

1141 downloads
