Instrumental Data Management and Scientific Workflow Execution: the CEA case study

Francieli Zanon Boito (1, 2, 3), Jean-François Méhaut (2), Thierry Deutsch (4), Brice Videau (4), Frédéric Desprez (3, 2)
1 DATAMOVE - Data Aware Large Scale Computing
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
3 CORSE - Compiler Optimization and Run-time Systems
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
4 LSIM - Laboratory of Atomistic Simulation
MEM - Modélisation et Exploration des Matériaux : DRF/INAC/MEM
Abstract : In this paper, we study a typical scenario in research facilities. Instrumental data is generated by lab equipment such as microscopes, collected by researchers onto USB devices, and analyzed on their own computers. In this scenario, an instrumental data management framework could store data in an institution-level storage infrastructure and allow analysis tasks to be executed on available processing nodes. This setup promotes reproducible research and the efficient use of expensive lab equipment, in addition to increasing researchers' productivity. We detail the requirements for such a framework based on the needs of our case study, the CEA, review existing solutions, and recommend Galaxy. We then analyze the performance limitations of the proposed architecture and identify the connection between the centralized storage and the processing nodes as the critical point. We also conduct a performance evaluation on an experimental platform to observe the limitations encountered in practice. We finish by pointing out issues that are not addressed by existing solutions and are therefore future work perspectives for the research field.

https://hal.inria.fr/hal-02076963
Contributor : Francieli Zanon Boito
Submitted on : Friday, March 22, 2019 - 2:53:35 PM
Last modification on : Tuesday, April 16, 2019 - 8:49:52 AM

File

mpp2019 (1).pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02076963, version 1

Citation

Francieli Zanon Boito, Jean-François Méhaut, Thierry Deutsch, Brice Videau, Frédéric Desprez. Instrumental Data Management and Scientific Workflow Execution: the CEA case study. IPDPSW 2019 - International Parallel and Distributed Processing Symposium Workshops (MPP - Parallel Programming Model), May 2019, Rio de Janeiro, Brazil. pp.1-8. ⟨hal-02076963⟩
