Conference papers

Layer Decomposition: An Effective Structure-based Approach for Scientific Workflow Similarity

Johannes Starlinger 1 Sarah Cohen-Boulakia 2, 3, 4, 5, 6 Sanjeev Khanna 7 Susan Davidson 7 Ulf Leser 1
3 AMIB - Algorithms and Models for Integrative Biology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France
5 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
6 VIRTUAL PLANTS - Modeling plant morphogenesis at different scales, from genes to phenotype
CRISAM - Inria Sophia Antipolis - Méditerranée, INRA - Institut National de la Recherche Agronomique, UMR AGAP - Amélioration génétique et adaptation des plantes méditerranéennes et tropicales
Abstract: Scientific workflows have become a valuable tool for large-scale data processing and analysis. This has led to the creation of specialized online repositories to facilitate workflow sharing and reuse. Over time, these repositories have grown to sizes that call for advanced methods to support workflow discovery, in particular for effective similarity search. Here, we present a novel and intuitive workflow similarity measure that is based on layer decomposition. Layer decomposition accounts for the directed dataflow underlying scientific workflows, a property which has not been adequately considered in previous methods. We comparatively evaluate our algorithm using a gold standard for 24 query workflows from a repository of almost 1500 scientific workflows, and show that it a) delivers the best results for similarity search, b) has a much lower runtime than other, often highly complex competitors in structure-aware workflow comparison, and c) can be stacked easily with even faster, structure-agnostic approaches to further reduce runtime while retaining result quality.
Contributor: Sarah Cohen-Boulakia
Submitted on: Friday, September 19, 2014 - 10:34:09 AM
Last modification on: Friday, October 22, 2021 - 3:07:18 PM
Long-term archiving on: Saturday, December 20, 2014 - 11:01:27 AM





Johannes Starlinger, Sarah Cohen-Boulakia, Sanjeev Khanna, Susan Davidson, Ulf Leser. Layer Decomposition: An Effective Structure-based Approach for Scientific Workflow Similarity. International Conference on e-Science, Oct 2014, Guarujá, Brazil. pp.169-176, ⟨10.1109/eScience.2014.19⟩. ⟨hal-01066076⟩


