Profiling Data-Dependence to Assist Parallelization: Framework, Scope, and Optimization

Alain Ketterlin, Philippe Clauss
Abstract: This paper describes a tool that uses one or more executions of a sequential program to detect parallel portions of the program. The tool, called Parwiz, uses dynamic binary instrumentation, targets various forms of parallelism, and suggests distinct parallelization actions, ranging from simple directive tagging to elaborate loop transformations. The first part of the paper details the link between the program's static structures (like routines and loops), the memory accesses performed by the program, and the dependencies that are used to highlight potential parallelism. This part also describes the instrumentation involved and the general architecture of the system. The second part of the paper puts the framework into action. The first study focuses on loop parallelism, targeting OpenMP parallel-for directives, including privatization when necessary. The second study is an adaptation of a well-known vectorization technique based on a slightly richer dependence description, where the tool suggests an elaborate loop transformation. The third study views loops as a graph of (hopefully lightly) dependent iterations. The third part of the paper explains how the overall cost of data-dependence profiling can be reduced. This cost has two major causes: first, instrumenting memory accesses slows down the program, and second, turning memory accesses into dependence graphs consumes processing time. Parwiz uses static analysis of the original (binary) program to provide data at a coarser level, moving from individual accesses to complete loops whenever possible, thereby reducing the impact of both sources of inefficiency.
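To illustrate the general idea of turning a trace of memory accesses into loop-carried dependences, here is a minimal sketch in Python. It is not Parwiz's implementation: the `Access` record, the function names, and the classification labels are all hypothetical, and the trace is a hand-written list rather than the output of binary instrumentation. The sketch assumes each access is tagged with the iteration of a single loop in which it occurred.

```python
from collections import namedtuple

# Hypothetical trace record: loop iteration number, memory address, write flag.
Access = namedtuple("Access", "iteration address is_write")


def loop_carried_dependences(trace):
    """Return the set of (kind, source_iter, sink_iter, address) tuples for
    dependences that cross iteration boundaries in a sequential trace."""
    last_write = {}          # address -> iteration of the most recent write
    reads_since_write = {}   # address -> iterations that read since that write
    deps = set()
    for acc in trace:
        writer = last_write.get(acc.address)
        if acc.is_write:
            if writer is not None and writer != acc.iteration:
                deps.add(("WAW", writer, acc.iteration, acc.address))
            for reader in reads_since_write.get(acc.address, ()):
                if reader != acc.iteration:
                    deps.add(("WAR", reader, acc.iteration, acc.address))
            last_write[acc.address] = acc.iteration
            reads_since_write[acc.address] = set()
        else:
            if writer is not None and writer != acc.iteration:
                deps.add(("RAW", writer, acc.iteration, acc.address))
            reads_since_write.setdefault(acc.address, set()).add(acc.iteration)
    return deps


def classify(deps):
    """Simplified verdict for one loop (an assumption of this sketch)."""
    if not deps:
        return "parallel"
    if all(kind != "RAW" for kind, _, _, _ in deps):
        # Only WAR/WAW cross iterations: the location may be privatizable.
        return "privatization candidate"
    return "sequential"


# Trace of: for i in (0, 1): t = a[i]; b[i] = t   (t is a shared scalar)
temp_trace = [
    Access(0, "a0", False), Access(0, "t", True),
    Access(0, "t", False), Access(0, "b0", True),
    Access(1, "a1", False), Access(1, "t", True),
    Access(1, "t", False), Access(1, "b1", True),
]
print(classify(loop_carried_dependences(temp_trace)))
```

In this example only write-after-write and write-after-read dependences on `t` cross iterations, so the loop is reported as a privatization candidate, which loosely corresponds to the case where a parallel-for directive would need a private clause. A real tool would also have to check values that are live before and after the loop, handle nested loops and calls, and cope with the volume of a full trace, which is the cost the abstract refers to.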
Document type: Conference paper
Contributor: Philippe Clauss
Submitted on: Thursday, January 24, 2013 - 6:03:13 PM
Last modification on: Friday, January 12, 2018 - 1:11:56 AM


HAL Id: hal-00780782, version 1



Alain Ketterlin, Philippe Clauss. Profiling Data-Dependence to Assist Parallelization: Framework, Scope, and Optimization. MICRO-45, The 45th Annual IEEE/ACM International Symposium on Microarchitecture, Dec 2012, Vancouver, Canada. ⟨hal-00780782⟩


