Accelerating Data Movement on Future Chip Multi-Processors

Abstract: Moving data between cores on hardware-coherent architectures suffers from memory latency and causes cache misses and coherence traffic, which are obstacles to achieving high performance. In this paper, we evaluate the potential for hardware optimization of message data transfer on chip multiprocessors with a combination of NAS parallel MPI benchmarks, Intel IMB MPI benchmarks, and a few microbenchmarks on a full-system simulator based on Simics and FeS2. We show that while passive hardware driven by cores can reduce cache traffic, it provides limited performance gains. We propose a data movement manager (DMM) that uses the on-chip coherence protocols to implement zero-copy message passing between separate address spaces and to remove synchronization and copy overheads from the processors. We also discuss methods for managing data placement in caches to reduce latency. We find that such a design shows substantial promise for both cache traffic reduction and performance improvements.
Document type:
Conference paper
Hisham El-Shishiny and Erven Rohou. IFMT'10 - Second International Forum on Next Generation Multicore/Manycore Technologies, Jun 2010, Saint Malo, France. ACM, 2010

https://hal.inria.fr/inria-00492860
Contributor: Ist Rennes
Submitted on: Thursday, June 17, 2010 - 12:18:15
Last modified on: Monday, June 20, 2016 - 14:10:32

Identifiers

  • HAL Id: inria-00492860, version 1

Citation

Junli Gu, Rakesh Kumar, Steven S. Lumetta, Yihe Sun. Accelerating Data Movement on Future Chip Multi-Processors. Hisham El-Shishiny and Erven Rohou. IFMT'10 - Second International Forum on Next Generation Multicore/Manycore Technologies, Jun 2010, Saint Malo, France. ACM, 2010. 〈inria-00492860〉
