Instrumental Data Management and Scientific Workflow Execution: the CEA case study

Francieli Zanon Boito (1, 2, 3), Jean-François Méhaut (2), Thierry Deutsch (4), Brice Videau (4), Frédéric Desprez (3, 2)
1 DATAMOVE - Data Aware Large Scale Computing
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
3 CORSE - Compiler Optimization and Run-time Systems
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
4 LSIM - Laboratory of Atomistic Simulation
MEM - Modélisation et Exploration des Matériaux : DRF/INAC/MEM
Abstract : In this paper, we study a typical scenario in research facilities. Instrumental data is generated by lab equipment such as microscopes, collected by researchers onto USB devices, and analyzed on their own computers. In this scenario, an instrumental data management framework could store data in an institution-level storage infrastructure and allow tasks that analyze this data to be executed on available processing nodes. This setup has the advantages of promoting reproducible research and the efficient usage of expensive lab equipment, in addition to increasing researchers' productivity. We detail the requirements for such a framework based on the needs of our case study, the CEA, review existing solutions, and recommend the choice of Galaxy. We then analyze the performance limitations of the proposed architecture and identify the connection between the centralized storage and the processing nodes as the critical point. We also conduct a performance evaluation on an experimental platform to observe the limitations encountered in practice. We finish by pointing out issues that are not addressed by existing solutions and are therefore future work perspectives for the research field.

https://hal.inria.fr/hal-02076963
Contributor : Francieli Zanon Boito
Submitted on : Friday, March 22, 2019 - 2:53:35 PM
Last modification on : Tuesday, October 15, 2019 - 4:34:04 PM
Long-term archiving on : Sunday, June 23, 2019 - 3:11:22 PM

Citation

Francieli Zanon Boito, Jean-François Méhaut, Thierry Deutsch, Brice Videau, Frédéric Desprez. Instrumental Data Management and Scientific Workflow Execution: the CEA case study. IPDPSW 2019 - International Parallel and Distributed Processing Symposium Workshops, May 2019, Rio de Janeiro, Brazil. pp.850-857, ⟨10.1109/IPDPSW.2019.00139⟩. ⟨hal-02076963⟩
