Layer Decomposition: An Effective Structure-based Approach for Scientific Workflow Similarity

Johannes Starlinger (1), Sarah Cohen-Boulakia (2, 3, 4, 5, 6), Sanjeev Khanna (7), Susan Davidson (7), Ulf Leser (1)
3 AMIB - Algorithms and Models for Integrative Biology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France
5 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
6 VIRTUAL PLANTS - Modeling plant morphogenesis at different scales, from genes to phenotype
CRISAM - Inria Sophia Antipolis - Méditerranée, INRA - Institut National de la Recherche Agronomique, Centre de coopération internationale en recherche agronomique pour le développement [CIRAD] : UMR51
Abstract: Scientific workflows have become a valuable tool for large-scale data processing and analysis. This has led to the creation of specialized online repositories to facilitate workflow sharing and reuse. Over time, these repositories have grown to sizes that call for advanced methods to support workflow discovery, in particular for effective similarity search. Here, we present a novel and intuitive workflow similarity measure that is based on layer decomposition. Layer decomposition accounts for the directed dataflow underlying scientific workflows, a property which has not been adequately considered in previous methods. We comparatively evaluate our algorithm using a gold standard for 24 query workflows from a repository of almost 1500 scientific workflows, and show that it a) delivers the best results for similarity search, b) has a much lower runtime than other, often highly complex competitors in structure-aware workflow comparison, and c) can be stacked easily with even faster, structure-agnostic approaches to further reduce runtime while retaining result quality.
Document type:
Conference paper
IEEE e-Science conference, Oct 2014, Guarujá, Brazil. 2014

Cited literature [20 references]

https://hal.inria.fr/hal-01066076
Contributor: Sarah Cohen-Boulakia
Submitted on: Friday, September 19, 2014 - 10:34:09
Last modified on: Wednesday, November 14, 2018 - 16:08:06
Document(s) archived on: Saturday, December 20, 2014 - 11:01:27

File

starlingerEscience.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01066076, version 1

Citation

Johannes Starlinger, Sarah Cohen-Boulakia, Sanjeev Khanna, Susan Davidson, Ulf Leser. Layer Decomposition: An Effective Structure-based Approach for Scientific Workflow Similarity. IEEE e-Science conference, Oct 2014, Guarujá, Brazil. 2014. 〈hal-01066076〉

