Optimization Space Pruning without Regrets

Ulysse Beaugnon; Antoine Pouille; Marc Pouzet; Jacques Pienaar; Albert Cohen

doi:10.1145/3033019.3033023

Conference paper. Year: 2017

Ulysse Beaugnon

Role: Author

Parallélisme de Kahn Synchrone

Département d'informatique - ENS Paris

Antoine Pouille

Role: Author

Parallélisme de Kahn Synchrone

Département d'informatique - ENS Paris

Marc Pouzet

Role: Author
PersonId: 969936

Parallélisme de Kahn Synchrone

Département d'informatique - ENS Paris

Jacques Pienaar

Role: Author
PersonId: 1024402

Google Inc.

Albert Cohen

Role: Author
PersonId: 6894
IdHAL: acohen
ORCID: 0000-0002-8866-5343
IdRef: 067155898

Parallélisme de Kahn Synchrone

Abstract

Many computationally intensive algorithms benefit from the wide parallelism offered by Graphics Processing Units (GPUs). However, the search for a close-to-optimal implementation remains extremely tedious due to the specialization and complexity of GPU architectures. We present a novel approach to automatically discover the best-performing code from a given set of possible implementations. It involves a branch and bound algorithm with two distinctive features: (1) an analytic performance model that gives a lower bound on the execution time, and (2) the ability to estimate such bounds on a partially specified implementation. These features make it possible to aggressively prune the optimization space without eliminating the best-performing implementation. While the space considered in this paper focuses on GPUs, the approach is generic enough to be applied to other architectures. We implemented our algorithm in a tool called Telamon and demonstrate its effectiveness on a huge, architecture-specific and input-sensitive optimization space. The information provided by the performance model also helps to identify ways to enrich the search space to consider better candidates, or to highlight architectural bottlenecks.
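The pruning idea in the abstract can be illustrated with a minimal sketch. It is not Telamon's implementation: the decision names (`tile_size`, `unroll`, `vectorize`), the cost numbers, and the functions `run_time` and `lower_bound` are all hypothetical stand-ins. The only property it relies on is the one the abstract states: a bound computed on a partially specified implementation never exceeds the execution time of any way of completing it, so discarding a branch whose bound is no better than the best time found so far cannot discard the optimum.

```python
# Hypothetical toy search space: each decision maps candidate values
# to a made-up cost contribution (arbitrary time units).
CHOICES = {
    "tile_size": {16: 4.0, 32: 2.5, 64: 3.0},
    "unroll": {1: 3.0, 2: 2.0, 4: 2.2},
    "vectorize": {False: 1.5, True: 1.0},
}


def run_time(candidate):
    """Stand-in for benchmarking a fully specified implementation."""
    base = sum(CHOICES[name][value] for name, value in candidate.items())
    # An interaction cost that the bound below deliberately ignores.
    clash = candidate["tile_size"] == 64 and candidate["unroll"] == 4
    return base + (1.0 if clash else 0.0)


def lower_bound(partial):
    """Optimistic bound for a partial assignment: fixed decisions cost
    their known amount, open decisions cost their cheapest option."""
    total = 0.0
    for name, options in CHOICES.items():
        if name in partial:
            total += options[partial[name]]
        else:
            total += min(options.values())
    return total


def branch_and_bound():
    """Depth-first search that skips branches whose bound cannot beat
    the best implementation evaluated so far."""
    order = list(CHOICES)
    best, best_time, evaluated = None, float("inf"), 0
    stack = [{}]  # start from the fully unspecified implementation
    while stack:
        partial = stack.pop()
        if lower_bound(partial) >= best_time:
            continue  # no completion of this branch can improve on best
        if len(partial) == len(order):
            evaluated += 1
            time = run_time(partial)
            if time < best_time:
                best, best_time = partial, time
            continue
        name = order[len(partial)]
        children = [{**partial, name: value} for value in CHOICES[name]]
        # Push the worst bound first so the most promising child is popped next.
        children.sort(key=lower_bound, reverse=True)
        stack.extend(children)
    return best, best_time, evaluated


if __name__ == "__main__":
    print(branch_and_bound())
```

On this toy space the search returns the same minimum as exhaustive enumeration of all 18 candidates while calling `run_time` on only a small fraction of them. A tighter bound prunes more; a bound that overestimates would break the guarantee.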

Keywords

Compilers; Branch and bound algorithms; GPU optimization; Performance Modeling; Search Space Exploration

Domains

Distributed, Parallel, and Cluster Computing [cs.DC]

Main file

paper.pdf (875.54 KB)

Contributor: Timothy Bourke

https://inria.hal.science/hal-01655602

Submitted on: Tuesday, December 5, 2017, 19:55:22

Last modified on: Monday, December 11, 2023, 11:31:27

Dates and versions

hal-01655602, version 1 (2017-12-05)

Identifiers

HAL Id: hal-01655602, version 1
DOI: 10.1145/3033019.3033023

Cite

Ulysse Beaugnon, Antoine Pouille, Marc Pouzet, Jacques Pienaar, Albert Cohen. Optimization Space Pruning without Regrets. CC 2017 - 26th International Conference on Compiler Construction, Feb 2017, Austin, TX, United States. pp. 34-44, ⟨10.1145/3033019.3033023⟩. ⟨hal-01655602⟩

Collections

ENS-PARIS CNRS INRIA INRIA2 PSL

233 views

592 downloads