Conference papers

Accelerating Data Movement on Future Chip Multi-Processors

Abstract : Moving data between cores on hardware coherent architectures suffers from memory latency and causes cache misses and coherence traffic, which are obstacles to achieving high performance. In this paper, we evaluate the potential for hardware optimization of message data transfer on chip multiprocessors with a combination of NAS parallel MPI benchmarks, Intel IMB MPI benchmarks, and a few microbenchmarks on a full-system simulator based on Simics and FeS2. We show that while passive hardware driven by cores can reduce cache traffic, it provides limited performance gains. We propose a data movement manager (DMM) that uses the on-chip coherence protocols to implement zero-copy message passing between separate address spaces and to remove synchronization and copy overheads from the processors. We also discuss methods for managing data placement in caches to reduce latency. We show that such a design shows substantial promise for both cache traffic reduction and performance improvements.
Document type :
Conference papers
Contributor : Ist Rennes
Submitted on : Thursday, June 17, 2010 - 12:18:15 PM
Last modification on : Tuesday, June 1, 2021 - 2:34:08 PM


  • HAL Id : inria-00492860, version 1



Junli Gu, Rakesh Kumar, Steven S. Lumetta, Yihe Sun. Accelerating Data Movement on Future Chip Multi-Processors. IFMT'10 - Second International Forum on Next Generation Multicore/Manycore Technologies, Jun 2010, Saint Malo, France. ⟨inria-00492860⟩


