Automatic Data Filtering for In Situ Workflows

Clément Mommessin (1), Matthieu Dreher (1), Tom Peterka (1), Bruno Raffin (2)
(2) DATAMOVE - Data Aware Large Scale Computing
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
Abstract: In situ workflows contain tasks that exchange messages composed of several data fields. However, a consumer task may not need all the data fields from its producer. For example, a molecular dynamics simulation can produce atom positions, velocities, and forces, but some analyses require only atom positions. The user must then decide whether to specialize the output of a producer task for a particular consumer and get better performance, or to send more data than the consumer requires. The first option limits task portability, while the second wastes resources. In this paper, we introduce contracts for in situ tasks. A contract specifies, for a producer, each data field available for output and, for a consumer, the data fields needed as input. Comparing a producer contract with a consumer contract allows automatic selection of the data fields the producer has to send to that consumer. We integrated our contract mechanism within Decaf, a middleware for building and executing in situ workflows. Contracts make it possible to extract automatically, at the producer, the data the consumer needs. We evaluate the cost and performance of message extraction at runtime with both synthetic examples and a real scientific workflow coupling a molecular dynamics simulation with three different data analytics codes. Our contract-based automatic data extraction removes the need to specialize producers while entailing small overheads.
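The contract comparison described in the abstract can be illustrated with a minimal sketch. It is not Decaf's API: the function names `fields_to_send` and `filter_message`, the representation of a contract as a list of field names, and the sample message are all assumptions made for illustration.

```python
# Hypothetical sketch of contract matching; none of these names come from Decaf.

def fields_to_send(producer_contract, consumer_contract):
    """Return the fields the producer must send to this consumer.

    Both contracts are assumed to be lists of field names. A field the
    consumer needs but the producer does not offer is reported as an error.
    """
    available = set(producer_contract)
    missing = [name for name in consumer_contract if name not in available]
    if missing:
        raise ValueError(f"producer does not provide: {missing}")
    # Keep the consumer's order so the result is deterministic.
    return list(consumer_contract)


def filter_message(message, selected_fields):
    """Keep only the selected fields of an outgoing message (a dict)."""
    return {name: message[name] for name in selected_fields}


# Molecular dynamics example from the abstract: the simulation offers three
# fields, while this analysis needs only the atom positions.
producer_contract = ["positions", "velocities", "forces"]
consumer_contract = ["positions"]

selected = fields_to_send(producer_contract, consumer_contract)

# Illustrative message with made-up values for a single atom.
message = {
    "positions": [[0.0, 1.0, 2.0]],
    "velocities": [[0.1, 0.0, 0.0]],
    "forces": [[0.0, 0.0, -9.8]],
}
filtered = filter_message(message, selected)
print(selected, filtered)
```

In this sketch the producer code itself is never specialized: the same producer contract can be compared against a different consumer contract to obtain a different selection.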
Document type:
Conference paper
IEEE International Conference on Cluster Computing, Sep 2017, Hawaii, United States. 2017

Cited literature [24 references]

https://hal.inria.fr/hal-01581032
Contributor: Clement Mommessin
Submitted on: Monday, September 4, 2017 - 10:32:37
Last modified on: Monday, September 4, 2017 - 11:41:55

File

Automatic Data Filtering for I...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01581032, version 1

Citation

Clément Mommessin, Matthieu Dreher, Tom Peterka, Bruno Raffin. Automatic Data Filtering for In Situ Workflows. IEEE International Conference on Cluster Computing, Sep 2017, Hawaii, United States. 2017. 〈hal-01581032〉
