DistillFlow: removing redundancy in scientific workflows

Jiuqiang Chen 1, 2, 3 Sarah Cohen-Boulakia 3, 2 Christine Froidevaux 2, 3 Carole Goble 4 Paolo Missier 4 Alan Williams 4
2 AMIB - Algorithms and Models for Integrative Biology
LIX - Laboratoire d'informatique de l'École polytechnique [Palaiseau], LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France
Abstract : Scientific workflow management systems are increasingly used by scientists to specify complex data processing pipelines. Workflows are represented using a graph structure, where nodes represent tasks and links represent the dataflow. However, the complexity of workflow structures is increasing over time, reducing the rate of scientific workflow reuse. Here, we introduce DistillFlow, a tool based on effective methods for workflow design, with a focus on the Taverna model. DistillFlow is able to detect "anti-patterns" in the structure of workflows (idiomatic forms that lead to over-complicated design) and replace them with different patterns to reduce the workflow's overall structural complexity. Rewriting workflows in this way is beneficial both in terms of user experience and workflow maintenance.
Document type :
Conference papers


Contributor : Sarah Cohen-Boulakia
Submitted on : Thursday, December 4, 2014 - 10:06:55 PM
Last modification on : Wednesday, March 27, 2019 - 4:41:29 PM
Long-term archiving on : Monday, March 9, 2015 - 5:59:37 AM






Jiuqiang Chen, Sarah Cohen-Boulakia, Christine Froidevaux, Carole Goble, Paolo Missier, et al. DistillFlow: removing redundancy in scientific workflows. SSDBM '14 Proceedings of the 26th International Conference on Scientific and Statistical Database Management, Jun 2014, Aalborg, Denmark. ⟨10.1145/2618243.2618287⟩. ⟨hal-01091033⟩


