Loop-based Modeling of Parallel Communication Traces

Alain Ketterlin 1, 2, 3 Matthieu Kuhn 1 Stéphane Genaud 1 Philippe Clauss 1, 2, 3
3 CAMUS - Compilation pour les Architectures MUlti-coeurS
Inria Nancy - Grand Est, ICube - Laboratoire des sciences de l'ingénieur, de l'informatique et de l'imagerie
Abstract : This paper describes an algorithm that takes a trace of a distributed program and builds a model of all communications of the program. The model is a set of nested loops representing repeated patterns. Loop bodies collect events representing communication actions performed by the various processes, such as sending or receiving messages and participating in collective operations. The model can be used for compact visualization of full executions, for program understanding and debugging, and for building statistical analyses of various quantitative aspects of the program's behavior. The construction of the communication model is performed in two phases. First, a local model is built for each process, capturing local regularities; this phase is incremental and fast, and can be done on-line, during the execution. The second phase is a reduction process that collects, aligns, and finally merges all local models into a global, system-wide model. This global model is a compact representation of all communications of the original program, capturing patterns across groups of processes. It can be visualized directly and, because it takes the form of a sequence of loop nests, can be used to replay the original program's communication actions. Because the model is based on communication events only, it completely ignores other quantitative aspects such as timestamps or message sizes. Including such data would in most cases break regularities, reducing the usefulness of trace-based modeling. Instead, the paper shows how one can efficiently access quantitative data kept in the original trace(s), by annotating the model and compiling data scanners automatically.
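
To make the idea of a loop-based local model concrete, the following Python sketch folds immediately repeated blocks of a single process's event sequence into nested loops, and unfolds the result again for replay. It is only an illustration and not the algorithm of the report: it is a greedy offline pass (the report's first phase is incremental and on-line), it does not merge models across processes, and the names `Loop`, `fold`, `unfold`, `max_len`, and the event strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Loop:
    """A repeated pattern: `body` is executed `count` times (hypothetical type)."""
    count: int
    body: list


Node = Union[str, Loop]  # an event label, or a nested loop


def fold(trace: List[str], max_len: int = 8) -> List[Node]:
    """Greedily fold immediately repeated blocks into Loop nodes.

    At each position, every block length up to `max_len` is tried, and the
    candidate covering the most events is kept. Loop bodies are folded
    recursively, which yields nested loops.
    """
    out: List[Node] = []
    i, n = 0, len(trace)
    while i < n:
        best_len, best_reps = 0, 1
        for length in range(1, min(max_len, (n - i) // 2) + 1):
            block = trace[i:i + length]
            reps = 1
            # Count how many times `block` repeats back to back.
            while trace[i + reps * length:i + (reps + 1) * length] == block:
                reps += 1
            if reps > 1 and reps * length > best_len * best_reps:
                best_len, best_reps = length, reps
        if best_reps > 1:
            out.append(Loop(best_reps, fold(trace[i:i + best_len], max_len)))
            i += best_len * best_reps
        else:
            out.append(trace[i])
            i += 1
    return out


def unfold(model: List[Node]) -> List[str]:
    """Replay a folded model back into the flat event sequence."""
    out: List[str] = []
    for node in model:
        if isinstance(node, Loop):
            out.extend(unfold(node.body) * node.count)
        else:
            out.append(node)
    return out


# Hypothetical trace of one process: two outer iterations, each with two
# sends followed by a receive, then a final collective operation.
trace = ["send", "send", "recv", "send", "send", "recv", "barrier"]
model = fold(trace)
```

Here `model` is an outer loop of two iterations containing an inner loop of two `send` events and one `recv`, followed by `barrier`, and `unfold(model)` reproduces `trace` exactly. Only event labels are folded in this sketch; in line with the abstract, quantitative data such as timestamps or message sizes is left out because it would break most repetitions.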

https://hal.inria.fr/hal-01044636
Contributor : Alain Ketterlin
Submitted on : Wednesday, July 23, 2014 - 8:42:48 PM
Last modification on : Saturday, October 27, 2018 - 1:24:02 AM
Long-term archiving on : Tuesday, November 25, 2014 - 3:26:54 PM

File

RR-8562.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01044636, version 1

Citation

Alain Ketterlin, Matthieu Kuhn, Stéphane Genaud, Philippe Clauss. Loop-based Modeling of Parallel Communication Traces. [Research Report] RR-8562, INRIA. 2014, pp.10. ⟨hal-01044636⟩
