Exploring Shared State in Key-Value Store for Window-Based Multi-Pattern Streaming Analytics

Abstract: We are witnessing unprecedented growth in data that must be processed at ever-increasing rates in order to extract valuable insights. Big Data streaming analytics tools have been developed to cope with the online dimension of data processing: they enable real-time handling of live data sources by means of stateful aggregations (operators). Current state-of-the-art frameworks (e.g., Apache Flink [1]) enable each operator to work in isolation by creating data copies, at the expense of increased memory utilization. In this paper, we explore the feasibility of deduplication techniques to address the challenge of reducing the memory footprint of window-based stream processing without significant impact on performance. We design a deduplication method specifically for window-based operators that rely on key-value stores to hold a shared state. We experiment with a synthetically generated workload while considering several deduplication scenarios and, based on the results, identify several potential areas of improvement. Our key finding is that more fine-grained interactions between streaming engines and (key-value) stores need to be designed in order to better respond to scenarios that have to overcome memory scarcity.
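
The contrast the abstract draws, between per-operator data copies and a shared deduplicated state, can be illustrated with a minimal sketch. This is not the method designed in the paper: `SharedStore` and `CountWindow` are hypothetical names, a plain in-memory dictionary stands in for the key-value store, and reference counting is just one assumed way to decide when a shared entry can be evicted.

```python
from collections import deque


class SharedStore:
    """Hypothetical in-memory stand-in for a key-value store; each event is held once."""

    def __init__(self):
        self._values = {}    # event id -> payload
        self._refcount = {}  # event id -> number of windows referencing it

    def acquire(self, event_id, payload):
        # Store the payload only on first reference; later windows share it.
        if event_id not in self._values:
            self._values[event_id] = payload
            self._refcount[event_id] = 0
        self._refcount[event_id] += 1

    def release(self, event_id):
        # Drop the payload once no window references it any more.
        self._refcount[event_id] -= 1
        if self._refcount[event_id] == 0:
            del self._values[event_id]
            del self._refcount[event_id]

    def get(self, event_id):
        return self._values[event_id]

    def __len__(self):
        return len(self._values)


class CountWindow:
    """Sliding count-based window that keeps only event ids, not payloads."""

    def __init__(self, store, size, aggregate):
        self.store = store
        self.size = size
        self.aggregate = aggregate
        self.ids = deque()

    def add(self, event_id, payload):
        self.store.acquire(event_id, payload)
        self.ids.append(event_id)
        if len(self.ids) > self.size:
            # Evict the oldest event from this window only.
            self.store.release(self.ids.popleft())

    def result(self):
        return self.aggregate(self.store.get(i) for i in self.ids)


# Two operators with different window sizes consume the same stream.
store = SharedStore()
windows = [
    CountWindow(store, size=3, aggregate=sum),
    CountWindow(store, size=5, aggregate=max),
]
for event_id, value in enumerate([4, 8, 15, 16, 23, 42]):
    for w in windows:
        w.add(event_id, value)

sum_last_3 = windows[0].result()
max_last_5 = windows[1].result()
stored_payloads = len(store)                          # payloads held once
references_held = sum(len(w.ids) for w in windows)    # copies an isolated design would hold
```

In this toy run the two windows together reference eight events while the store holds five payloads; with isolated operators each window would keep its own copy of every event it covers.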
Document type:
Conference paper
Workshop on the Integration of Extreme Scale Computing and Big Data Management and Analytics in conjunction with IEEE/ACM CCGrid 2017, May 2017, Madrid, Spain

Cited literature [16 references]

https://hal.inria.fr/hal-01530744
Contributor: Ovidiu-Cristian Marcu
Submitted on: Wednesday, May 31, 2017 - 18:29:39
Last modified on: Wednesday, May 16, 2018 - 11:24:13
Document(s) archived on: Wednesday, September 6, 2017 - 17:28:14

File

PID4664669.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01530744, version 1

Citation

Ovidiu-Cristian Marcu, Radu Tudoran, Bogdan Nicolae, Alexandru Costan, Gabriel Antoniu, et al. Exploring Shared State in Key-Value Store for Window-Based Multi-Pattern Streaming Analytics. Workshop on the Integration of Extreme Scale Computing and Big Data Management and Analytics in conjunction with IEEE/ACM CCGrid 2017, May 2017, Madrid, Spain. 〈hal-01530744〉

