PigReuse: A Reuse-based Optimizer for Pig Latin

Abstract: Pig Latin is a popular language widely used for the parallel processing of massive data sets. Currently, subexpressions that occur repeatedly in Pig Latin scripts are executed as many times as they appear, and the current Pig Latin optimizer does not identify reuse opportunities. We present a novel optimization approach that identifies and reuses repeated subexpressions in Pig Latin scripts. Our optimization algorithm, named PigReuse, operates on a particular algebraic representation of Pig Latin scripts. PigReuse identifies subexpression merging opportunities, selects the best ones to execute based on a cost function, and reuses their results as needed in order to compute exactly the same output as the original scripts. Our experiments demonstrate the effectiveness of our approach.
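To make the idea of reusing repeated subexpressions concrete, here is a minimal Python sketch. It is not PigReuse itself: the `Op` class, the unit cost per operator, and the sample scripts are all assumptions made for illustration, and the report's actual algebraic representation, cost function, and selection procedure are not reproduced here. The sketch only shows how structurally identical subexpressions across two scripts could be detected, and how executing each one once lowers a toy cost.

```python
from collections import Counter
from typing import NamedTuple, Tuple


class Op(NamedTuple):
    """Hypothetical algebraic operator: a name, an argument, and its inputs."""
    name: str
    arg: str
    inputs: Tuple["Op", ...] = ()


def subexpressions(op):
    """Yield the operator itself and every operator beneath it."""
    yield op
    for child in op.inputs:
        yield from subexpressions(child)


def shared_subexpressions(scripts):
    """Map each subexpression that occurs more than once to its occurrence count."""
    counts = Counter()
    for root in scripts:
        counts.update(subexpressions(root))
    return {op: n for op, n in counts.items() if n > 1}


def cost_without_reuse(scripts):
    """Toy cost (assumed: one unit per operator), executing every occurrence."""
    return sum(1 for root in scripts for _ in subexpressions(root))


def cost_with_reuse(scripts):
    """Toy cost when each distinct subexpression is executed only once."""
    return len({op for root in scripts for op in subexpressions(root)})


# Two made-up scripts that share the same LOAD followed by the same FILTER.
load = Op("LOAD", "page_views")
filt = Op("FILTER", "user IS NOT NULL", (load,))
script_a = Op("GROUP", "user", (filt,))
script_b = Op("FOREACH", "GENERATE url", (filt,))

scripts = [script_a, script_b]
print(sorted(op.name for op in shared_subexpressions(scripts)))
print(cost_without_reuse(scripts), cost_with_reuse(scripts))
```

Because `Op` is a named tuple, two operators compare equal exactly when their names, arguments, and inputs are equal, so structural equality stands in for the much richer notion of equivalence a real optimizer would need.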
Document type:
Report
[Technical Report] Inria Saclay. 2016

Cited literature [28 references]

https://hal.inria.fr/hal-01353891
Contributor: Jesús Camacho-Rodríguez
Submitted on: Thursday, August 18, 2016 - 13:24:47
Last modified on: Thursday, January 11, 2018 - 06:19:44
Document(s) archived on: Saturday, November 19, 2016 - 20:31:18

File

pigreuse-technical-report.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01353891, version 1

Citation

Jesús Camacho-Rodríguez, Dario Colazzo, Melanie Herschel, Ioana Manolescu, Soudip Roy Chowdhury. PigReuse: A Reuse-based Optimizer for Pig Latin. [Technical Report] Inria Saclay. 2016. 〈hal-01353891〉

Metrics

Record views

368

File downloads

229