The impact of cache misses on the performance of matrix product algorithms on multicore platforms

Mathias Jacquelin (1), Loris Marchal (1, *), Yves Robert (1)
* Corresponding author
1 GRAAL - Algorithms and Scheduling for Distributed Heterogeneous Platforms
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
Abstract : The multicore revolution is underway, bringing new chips with more complex memory architectures. Classical algorithms must be revisited to take the hierarchical memory layout into account. In this paper, we aim to design cache-aware algorithms that minimize the number of cache misses incurred during the execution of the matrix product kernel on a multicore processor. We show analytically how to achieve the best possible tradeoff between shared and distributed caches. We implement and evaluate several algorithms on two multicore platforms, one equipped with a quad-core Xeon and the other augmented with a GPU. It turns out that the impact of cache misses differs markedly between the two platforms, and we identify the main design parameters that lead to peak performance for each target hardware configuration.

https://hal.inria.fr/inria-00537822
Contributor : Loris Marchal
Submitted on : Friday, November 19, 2010 - 2:32:06 PM
Last modification on : Tuesday, December 11, 2018 - 10:58:05 AM
Long-term archiving on : Friday, October 26, 2012 - 4:02:11 PM

File

RR-7456.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : inria-00537822, version 1

Citation

Mathias Jacquelin, Loris Marchal, Yves Robert. The impact of cache misses on the performance of matrix product algorithms on multicore platforms. [Research Report] RR-7456, INRIA. 2010, pp.32. ⟨inria-00537822⟩
