Preprints, Working Papers, ...

On the Optimization of Iterative Programming with Distributed Data Collections

Sarah Chlyah 1 Nils Gesbert 1 Pierre Genevès 1 Nabil Layaïda 1
1 TYREX - Types and Reasoning for the Web
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
Abstract : Big data programming frameworks are becoming increasingly important for the development of applications, for which performance and scalability are critical. In those complex frameworks, optimizing code by hand is hard and time-consuming, making automated optimization particularly necessary. In order to automate optimization, a prerequisite is to find suitable abstractions to represent programs; for instance, algebras based on monads or monoids to represent distributed data collections. Currently, however, such algebras do not represent recursive programs in a way which allows analyzing or rewriting them. In this paper, we extend a monoid algebra with a fixpoint operator for representing recursion as a first-class citizen and show how it allows new optimizations. Experiments with the Spark platform illustrate performance gains brought by these systematic optimizations.

Cited literature [32 references]
Contributor : Tyrex Equipe
Submitted on : Friday, October 16, 2020 - 9:15:43 AM
Last modification on : Tuesday, November 24, 2020 - 4:00:19 PM


paper (2).pdf
Files produced by the author(s)


  • HAL Id : hal-02066649, version 2


Sarah Chlyah, Nils Gesbert, Pierre Genevès, Nabil Layaïda. On the Optimization of Iterative Programming with Distributed Data Collections. 2020. ⟨hal-02066649v2⟩


