Thread scheduling and memory coalescing for dynamic vectorization of SPMD workloads

Abstract : Simultaneous Multi-Threading (SMT) is a hardware model in which different threads share the same processing unit. This model is a compromise between high parallelism and low hardware cost. Minimal Multi-Threading (MMT) is a recently proposed architecture that shares instruction decoding and execution between threads running the same program in an SMT processor, thereby generalizing the approach followed by Graphics Processing Units to general-purpose processors. In this paper we propose new ways to expose redundancies in the MMT execution model. First, we propose and evaluate a new thread reconvergence heuristic that handles function calls better than previous approaches. Our heuristic inspects only the program counter and the stack frame to reconverge threads; hence, it is amenable to efficient and inexpensive hardware implementation. Second, we demonstrate that this heuristic reveals substantial regularity in inter-thread memory access patterns. We validate our results on data-parallel applications from the PARSEC and SPLASH suites. Our new reconvergence heuristic increases the throughput of our MMT model by 7% compared to a previous and substantially more complex approach due to Long et al. Moreover, it gives us an effective way to increase regularity in memory accesses. We have observed that over 70% of simultaneous memory accesses are either the same for all threads or affine expressions of the thread identifier. This observation motivates the design of newly proposed hardware that benefits from regularity in inter-thread memory accesses.
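
To make the two regular access patterns named in the abstract concrete, the following minimal sketch classifies one simultaneous memory access across threads as uniform (same address for every thread), affine (address equal to base + tid * stride), or irregular. It is an illustration only, not the hardware mechanism or the measurement method of the paper: the function name classify_access, the list-of-addresses input, and the assumption that thread identifiers are 0, 1, 2, ... in list order are all hypothetical.

```python
from typing import Sequence


def classify_access(addresses: Sequence[int]) -> str:
    """Classify one simultaneous access.

    addresses[tid] is the address issued by thread tid (assumed layout,
    with thread identifiers 0, 1, 2, ... in list order).
    """
    if not addresses:
        raise ValueError("need at least one address")

    base = addresses[0]

    # Uniform: every thread accesses the same address.
    if all(addr == base for addr in addresses):
        return "uniform"

    # Affine: address is base + tid * stride for a single constant stride.
    # At least two threads exist here, otherwise the access would be uniform.
    stride = addresses[1] - base
    if all(addr == base + tid * stride for tid, addr in enumerate(addresses)):
        return "affine"

    return "irregular"


if __name__ == "__main__":
    print(classify_access([0x1000, 0x1000, 0x1000, 0x1000]))
    print(classify_access([0x1000, 0x1004, 0x1008, 0x100C]))
    print(classify_access([0x1000, 0x1040, 0x1008, 0x2000]))
```

A uniform access can be served by a single load shared by all threads, and an affine access can be described by just a base and a stride, which is why these two cases are the ones of interest for coalescing.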
Document type :
Journal articles

https://hal.inria.fr/hal-01087054
Contributor : Caroline Collange
Submitted on : Tuesday, November 25, 2014 - 2:22:47 PM
Last modification on : Tuesday, October 15, 2019 - 6:02:06 PM

Citation

Teo Milanez, Sylvain Collange, Fernando Magno Quintão Pereira, Wagner Meira, Renato A. Ferreira. Thread scheduling and memory coalescing for dynamic vectorization of SPMD workloads. Parallel Computing, Elsevier, 2014, 40 (9), pp.548-558. ⟨10.1016/j.parco.2014.03.006⟩. ⟨hal-01087054⟩
