Adaptive Algorithms for Shared Cache on Multicore

Abstract: Reordering instructions and data layout can bring significant performance improvements for memory-bound applications. Parallelizing such applications requires careful algorithm design to preserve the locality of the sequential execution. On the one hand, parallel computation tends to create concurrent tasks that work on independent data sets, in order to reduce communication and synchronization. On the other hand, a multicore architecture with a shared cache can bring performance benefits through high-speed communication between cores, provided concurrent tasks process data that are close in memory. In this paper, we aim to find a good parallelization of memory-bound applications on multicore processors that preserves the advantage of a shared cache. We focus on sequential applications that iterate through a sequence of memory references. Our solution relies on an adaptive parallel algorithm with a dynamic sliding window that constrains cores sharing the same cache to process data close in memory. This parallel algorithm induces the same number of cache misses as the sequential algorithm, at the expense of an increased number of synchronizations. We theoretically analyze the synchronization overhead for both static and dynamic load balancing. Experiments with a memory-bound isosurface extraction application confirm that core collaboration for shared cache access can bring significant performance improvements despite the incurred synchronization costs. On a quad-core Nehalem processor, our algorithms are 10% to 30% faster than algorithms not optimized for shared cache, thanks to a reduced number of last-level cache misses.
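
The sliding-window constraint described in the abstract can be illustrated with a minimal sketch. This is not the report's implementation: the names `SlidingWindow`, `acquire`, `release`, and `process_with_window` are hypothetical, and the policy shown (hand out the next index only while it lies within `window` positions of the oldest unfinished index) is an assumed simplification. Python threads are used only to show the synchronization pattern; they do not reproduce the cache behavior of native cores.

```python
import threading


class SlidingWindow:
    """Hands out indices 0..n-1 in order, keeping all unfinished
    indices within `window` positions of each other (assumed policy)."""

    def __init__(self, n, window):
        self.n = n
        self.window = window
        self.next_index = 0        # next index not yet handed out
        self.in_flight = set()     # handed out but not yet finished
        self.max_spread = 0        # largest distance observed between in-flight indices
        self.cond = threading.Condition()

    def acquire(self):
        """Return the next index, or None when the sequence is exhausted."""
        with self.cond:
            while True:
                if self.next_index >= self.n:
                    return None
                low = min(self.in_flight) if self.in_flight else self.next_index
                if self.next_index < low + self.window:
                    i = self.next_index
                    self.next_index += 1
                    self.in_flight.add(i)
                    self.max_spread = max(self.max_spread, i - min(self.in_flight))
                    return i
                # The window is full: wait until the oldest element finishes.
                self.cond.wait()

    def release(self, i):
        """Mark index i as finished so the window can slide forward."""
        with self.cond:
            self.in_flight.discard(i)
            self.cond.notify_all()


def process_with_window(data, window, workers, fn):
    """Apply fn to every element of data with `workers` threads that
    are constrained to stay within the same sliding window."""
    sw = SlidingWindow(len(data), window)
    out = [None] * len(data)

    def worker():
        while True:
            i = sw.acquire()
            if i is None:
                return
            out[i] = fn(data[i])
            sw.release(i)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out, sw.max_spread
```

In this sketch every call to `acquire` and `release` takes a lock, which mirrors the trade-off stated in the abstract: locality is preserved at the cost of additional synchronization. Handing out chunks of several indices per call would reduce that cost.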
Document type:
[Research Report] RR-7256, INRIA. 2010, pp.17

Cited literature: 13 references
Contributor: Marc Tchiboukdjian
Submitted on: Thursday, April 15, 2010 - 21:48:06
Last modified on: Thursday, January 11, 2018 - 06:22:02
Document(s) archived on: Monday, October 22, 2012 - 15:01:33


Files produced by the author(s)


  • HAL Id: inria-00473617, version 1


Marc Tchiboukdjian, Vincent Danjean, Thierry Gautier, Fabien Le Mentec, Bruno Raffin. Adaptive Algorithms for Shared Cache on Multicore. [Research Report] RR-7256, INRIA. 2010, pp.17. 〈inria-00473617〉


