A Band-pass Prefetching: An Effective Prefetch Management Mechanism using Prefetch-fraction Metric in Multi-core Systems

Aswinkumar Sridharan 1 Biswabandan Panda 1 André Seznec 1
1 PACAP - Pushing Architecture and Compilation for Application Performance
Inria Rennes – Bretagne Atlantique , IRISA_D3 - ARCHITECTURE
Abstract: In multi-core systems, an application's prefetcher can interfere with the memory requests of other applications that use shared resources such as the last-level cache and memory bandwidth. To minimize prefetcher-caused interference, prior mechanisms dynamically control prefetcher aggressiveness at run time. These mechanisms use several parameters to capture prefetch usefulness and prefetcher-caused interference, and make aggressiveness control decisions based on them. However, they do not capture the actual interference at the shared resources and most often lead to incorrect aggressiveness control decisions, which leaves scope for performance improvement. To this end, we propose a solution to manage prefetching in multi-core systems. In particular, we make two fundamental observations. First, a strong positive correlation exists between the accuracy of a prefetcher and the number of prefetch requests it generates relative to an application's total (demand and prefetch) requests. Second, a strong positive correlation exists between the ratio of total prefetch to demand requests and the ratio of average last-level cache miss service times of demand to prefetch requests. Building on these two observations, we propose Band-pass prefetching, a simple and low-overhead mechanism to effectively manage prefetchers in multi-core systems. Our solution consists of local and global prefetcher aggressiveness control components, which together keep the flow of prefetch requests within a range of prefetch-to-demand request ratios.
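The two ratios named in the observations above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the "prefetch-fraction" of the title is a core's prefetch requests divided by its total (demand plus prefetch) requests, and it uses a hypothetical gating rule in which a local check and a global check must both pass. The names `CoreCounters`, `allow_prefetch`, `low_fraction`, and `high_ratio`, and the threshold values, are invented for illustration and do not come from the abstract.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CoreCounters:
    """Per-core request counts over one sampling interval (hypothetical structure)."""
    demand_requests: int
    prefetch_requests: int


def prefetch_fraction(core: CoreCounters) -> float:
    """Prefetch requests relative to the core's total (demand + prefetch) requests."""
    total = core.demand_requests + core.prefetch_requests
    return core.prefetch_requests / total if total else 0.0


def prefetch_to_demand_ratio(cores: List[CoreCounters]) -> float:
    """Ratio of total prefetch requests to total demand requests across all cores."""
    total_prefetch = sum(c.prefetch_requests for c in cores)
    total_demand = sum(c.demand_requests for c in cores)
    if total_demand == 0:
        return float("inf") if total_prefetch else 0.0
    return total_prefetch / total_demand


def allow_prefetch(core: CoreCounters, cores: List[CoreCounters],
                   low_fraction: float = 0.25, high_ratio: float = 1.0) -> bool:
    """Hypothetical gating rule, not the policy from the paper.

    Local check: a low prefetch fraction is treated as a sign of low accuracy
    (first observation). Global check: a high system-wide prefetch-to-demand
    ratio is treated as a sign that demand misses are being delayed
    (second observation). Both thresholds are assumed values.
    """
    local_ok = prefetch_fraction(core) >= low_fraction
    global_ok = prefetch_to_demand_ratio(cores) <= high_ratio
    return local_ok and global_ok


cores = [CoreCounters(demand_requests=100, prefetch_requests=100),
         CoreCounters(demand_requests=900, prefetch_requests=100)]
decisions = [allow_prefetch(c, cores) for c in cores]
```

With these made-up counters, the first core has a prefetch fraction of 0.5 and keeps prefetching, while the second has a fraction of 0.1 and is throttled by the local check. The system-wide ratio of 0.2 stays under the assumed global threshold.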
In our experiments with 16-core multi-programmed workloads on systems using stream prefetching, Band-pass prefetching achieves a 12.4% (geometric-mean) improvement in harmonic speedup over a baseline that implements no prefetching, while aggressive prefetching without prefetcher aggressiveness control and the state-of-the-art HPAC, P-FST, and CAFFEINE mechanisms achieve 8.2%, 8.4%, 1.4%, and 9.7%, respectively. Further evaluation of Band-pass prefetching on systems using the AMPM prefetcher shows similar performance trends. For a 16-core system, Band-pass prefetching requires only a modest hardware cost of 239 bytes.
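The abstract does not define harmonic speedup. As a hedged illustration, the sketch below uses the definition commonly found in the multi-core literature, namely the number of applications divided by the sum of their individual slowdowns. The function name and the IPC values are hypothetical, and the paper may compute the metric differently.

```python
from typing import Sequence


def harmonic_speedup(ipc_alone: Sequence[float], ipc_shared: Sequence[float]) -> float:
    """Harmonic speedup under the commonly used definition (an assumption here):
    N / sum_i (IPC_alone_i / IPC_shared_i), where N is the number of applications.
    """
    if len(ipc_alone) != len(ipc_shared) or not ipc_alone:
        raise ValueError("need two non-empty sequences of equal length")
    slowdowns = [alone / shared for alone, shared in zip(ipc_alone, ipc_shared)]
    return len(slowdowns) / sum(slowdowns)


# Made-up IPC values for two applications, each slowed down by 2x when sharing.
hs = harmonic_speedup(ipc_alone=[2.0, 1.0], ipc_shared=[1.0, 0.5])
```

An improvement such as the one reported above would then be the relative change of this value against the no-prefetching baseline, aggregated over workloads with a geometric mean.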
Document type:
Journal article
ACM Transactions on Architecture and Code Optimization, Association for Computing Machinery, 2017

Cited literature [41 references]

https://hal.inria.fr/hal-01519648
Contributor: André Seznec
Submitted on: Tuesday, May 9, 2017 - 08:46:19
Last modified on: Wednesday, May 16, 2018 - 11:24:13
Document(s) archived on: Thursday, August 10, 2017 - 12:22:23

File

Band-passPrefetching_CameraRea...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01519648, version 1

Citation

Aswinkumar Sridharan, Biswabandan Panda, André Seznec. A Band-pass Prefetching: An Effective Prefetch Management Mechanism using Prefetch-fraction Metric in Multi-core Systems. ACM Transactions on Architecture and Code Optimization, Association for Computing Machinery, 2017. 〈hal-01519648〉

Metrics

Record views: 471

File downloads: 245