Dual-Paradigm Stream Processing

Song Wu 1, Zhiyi Liu 1, Shadi Ibrahim 2, Lin Gu 1, Hai Jin 1, Fei Chen 1
2 STACK - Software Stack for Massively Geo-Distributed Infrastructures
Inria Rennes – Bretagne Atlantique, LS2N - Laboratoire des Sciences du Numérique de Nantes
Abstract: Existing stream processing frameworks operate either under the data stream paradigm, processing data record by record to favor low latency, or under the operation stream paradigm, processing data in micro-batches to achieve high throughput. For complex and changing data processing requirements, this dilemma makes selecting and deploying a stream processing framework awkward. Moreover, neither the data stream nor the operation stream paradigm handles data bursts efficiently, which can result in noticeable performance degradation. This paper introduces a dual-paradigm stream processing approach, called DO (Data and Operation), that can adapt to stream data volatility. It processes data in micro-batches (i.e., operation stream) when a data burst occurs to achieve high throughput, and processes data record by record (i.e., data stream) the rest of the time to sustain low latency. DO includes a method to detect data bursts, identify the main operations affected by a data burst, and switch paradigms accordingly. The insight behind DO's design is that the trade-off between latency and throughput in stream processing frameworks can be achieved dynamically, according to the data communication among operations, in a fine-grained manner (i.e., at the operation level) instead of at the framework level. We implement a prototype stream processing framework that adopts DO. Our experimental results show that our framework with DO can achieve a 5x speedup over operation stream under low data stream sizes, and outperforms data stream on throughput by 2.1x to 3.2x under data burst.
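The per-operation switching described in the abstract can be illustrated with a minimal sketch. The abstract does not say how DO detects bursts, so this sketch assumes a simple queue-length threshold. The class `DualParadigmOperator` and the parameters `burst_threshold` and `batch_size` are hypothetical names introduced for illustration, not part of the authors' prototype.

```python
from collections import deque


class DualParadigmOperator:
    """Hypothetical operator that switches between record-at-a-time
    and micro-batch processing based on its own input backlog."""

    def __init__(self, fn, burst_threshold=8, batch_size=4):
        self.fn = fn                            # per-record transformation
        self.burst_threshold = burst_threshold  # assumed burst signal: backlog size
        self.batch_size = batch_size            # records taken per micro-batch
        self.queue = deque()                    # pending input records
        self.mode_log = []                      # mode chosen at each step

    def offer(self, records):
        """Enqueue incoming records."""
        self.queue.extend(records)

    def step(self):
        """Process one unit of work (a record or a micro-batch)."""
        if not self.queue:
            return []
        if len(self.queue) >= self.burst_threshold:
            # Backlog is large: treat it as a burst and take a micro-batch.
            n = min(self.batch_size, len(self.queue))
            batch = [self.queue.popleft() for _ in range(n)]
            self.mode_log.append("batch")
            return [self.fn(r) for r in batch]
        # Backlog is small: process a single record to keep latency low.
        self.mode_log.append("record")
        return [self.fn(self.queue.popleft())]

    def drain(self):
        """Run steps until the queue is empty and collect all outputs."""
        out = []
        while self.queue:
            out.extend(self.step())
        return out


op = DualParadigmOperator(lambda x: x * 2, burst_threshold=8, batch_size=4)
op.offer(range(10))   # ten records arrive at once, exceeding the threshold
results = op.drain()
```

Because the decision is made inside each operator, one operator in a pipeline can run in micro-batch mode while the others stay in record-at-a-time mode, which is the operation-level granularity the abstract refers to.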
Document type:
Conference paper
ICPP 2018 - 47th International Conference on Parallel Processing, Aug 2018, Eugene, United States

https://hal.inria.fr/hal-01834668
Contributor: Shadi Ibrahim
Submitted on: Friday, September 21, 2018 - 14:09:14
Last modified on: Friday, September 21, 2018 - 15:07:32

File

ICPP2018-Stream.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01834668, version 1

Citation

Song Wu, Zhiyi Liu, Shadi Ibrahim, Lin Gu, Hai Jin, et al. Dual-Paradigm Stream Processing. ICPP 2018 - 47th International Conference on Parallel Processing, Aug 2018, Eugene, United States. 〈hal-01834668〉
