Preprints, Working Papers, ...

On the Optimization of Iterative Programming with Distributed Data Collections

Sarah Chlyah 1 Nils Gesbert 1 Pierre Genevès 1 Nabil Layaïda 1
1 TYREX [2020-....] - Types and Reasoning for the Web [2020-....]
LIG [2020-....] - Laboratoire d'Informatique de Grenoble [2020-....], Inria Grenoble - Rhône-Alpes
Abstract : Big data programming frameworks are becoming increasingly important for the development of applications for which performance and scalability are critical. In those complex frameworks, optimizing code by hand is hard and time-consuming, making automated optimization particularly necessary. In order to automate optimization, a prerequisite is to find suitable abstractions to represent programs; for instance, algebras based on monads or monoids to represent distributed data collections. Currently, however, such algebras do not represent recursive programs in a way that allows analyzing or rewriting them. In this paper, we extend a monoid algebra with a fixpoint operator for representing recursion as a first-class citizen and show how it allows new optimizations. Experiments with the Spark platform illustrate the performance gains brought by these systematic optimizations.
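
To make the idea of a fixpoint over a collection monoid concrete, here is a minimal sketch in plain Python. It is not the operator defined in the paper, and it does not use Spark. The names `fixpoint`, `step` and `transitive_closure` are hypothetical and chosen only for illustration. The sketch assumes that collections are sets combined with union, and that `step` distributes over union, which is what makes it safe to apply `step` only to the newly found elements at each iteration.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

Edge = Tuple[int, int]
EdgeSet = FrozenSet[Edge]


def fixpoint(seed: EdgeSet, step: Callable[[EdgeSet], EdgeSet]) -> EdgeSet:
    """Iterate until no new element appears.

    Returns the smallest set X with X = seed | step(X), assuming that
    step distributes over union (an assumption of this sketch).
    """
    result = seed
    delta = seed  # elements discovered during the previous iteration
    while delta:
        # Apply step only to the new elements, then drop what is already known.
        new = step(delta) - result
        result = result | new
        delta = new
    return result


def transitive_closure(edges: Iterable[Edge]) -> EdgeSet:
    """All pairs (a, d) such that d is reachable from a along the given edges."""
    base = frozenset(edges)

    def step(paths: EdgeSet) -> EdgeSet:
        # Extend every known path by one edge (a join on the middle vertex).
        return frozenset((a, d) for (a, b) in paths for (c, d) in base if b == c)

    return fixpoint(base, step)


closure = transitive_closure({(1, 2), (2, 3), (3, 4)})
```

Because recursion is expressed through a single explicit operator rather than an opaque loop, a rewriting engine can in principle inspect `seed` and `step` and transform them, for example by filtering the seed before iterating instead of filtering the final result.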

Cited literature [32 references]

https://hal.inria.fr/hal-02066649
Contributor : Tyrex Equipe
Submitted on : Friday, October 16, 2020 - 10:58:42 AM
Last modification on : Saturday, October 17, 2020 - 3:33:37 AM

File

paper.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02066649, version 3

Collections

LIG | UGA | CNRS | INRIA

Citation

Sarah Chlyah, Nils Gesbert, Pierre Genevès, Nabil Layaïda. On the Optimization of Iterative Programming with Distributed Data Collections. 2020. ⟨hal-02066649v3⟩
