Dual-Paradigm Stream Processing

Abstract: Existing stream processing frameworks operate either under the data stream paradigm, processing data record by record to favor low latency, or under the operation stream paradigm, processing data in micro-batches to favor high throughput. For complex and changing data processing requirements, this dichotomy makes selecting and deploying a stream processing framework difficult. Moreover, neither paradigm handles data bursts efficiently, which can result in noticeable performance degradation. This paper introduces a dual-paradigm stream processing approach, called DO (Data and Operation), that adapts to the volatility of stream data. It processes data in micro-batches (i.e., operation stream) when a data burst occurs to achieve high throughput, and record by record (i.e., data stream) the rest of the time to sustain low latency. DO includes a method to detect data bursts, identify the main operations affected by a burst, and switch paradigms accordingly. The insight behind DO's design is that the trade-off between latency and throughput can be adjusted dynamically, according to data communication among operations, in a fine-grained manner (i.e., at the operation level) rather than at the framework level. We implement a prototype stream processing framework that adopts DO. Our experimental results show that our framework with DO achieves a 5x speedup over operation stream under low data stream sizes, and outperforms data stream on throughput by 2.1x to 3.2x under data burst.
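
The per-operation switching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `DualParadigmOperator`, the backlog-based burst test, and the `burst_threshold` and `batch_size` parameters are all assumptions made for illustration, since the abstract does not say how DO detects a burst.

```python
from collections import deque


class DualParadigmOperator:
    """Hypothetical operator that switches between record-by-record
    ("data") and micro-batch ("operation") processing."""

    def __init__(self, fn, burst_threshold=8, batch_size=4):
        self.fn = fn                            # per-record transformation
        self.burst_threshold = burst_threshold  # assumed burst signal: backlog size
        self.batch_size = batch_size            # records per micro-batch
        self.queue = deque()                    # pending input records
        self.mode = "data"                      # start in low-latency mode
        self.log = []                           # (mode, records processed) per step

    def offer(self, record):
        """Enqueue one incoming record."""
        self.queue.append(record)

    def step(self):
        """Process pending input once and return the outputs produced."""
        backlog = len(self.queue)
        # Assumed detection rule: a large backlog means a burst, so batch;
        # once the backlog has drained, return to record-by-record mode.
        if backlog >= self.burst_threshold:
            self.mode = "operation"
        elif backlog <= 1:
            self.mode = "data"
        if not self.queue:
            return []
        take = self.batch_size if self.mode == "operation" else 1
        count = min(take, len(self.queue))
        batch = [self.queue.popleft() for _ in range(count)]
        self.log.append((self.mode, len(batch)))
        return [self.fn(record) for record in batch]
```

Each operator keeps its own mode, so one operation hit by a burst can batch while the others continue record by record, which is the operation-level granularity the abstract refers to. The gap between the two thresholds is a design choice of this sketch that keeps the operator from flipping modes on every step.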

https://hal.inria.fr/hal-01834668
Contributor : Shadi Ibrahim
Submitted on : Friday, September 21, 2018 - 2:09:14 PM
Last modification on : Friday, July 12, 2019 - 11:55:09 AM
Long-term archiving on : Saturday, December 22, 2018 - 4:37:20 PM

File

ICPP2018-Stream.pdf
Files produced by the author(s)

Citation

Song Wu, Zhiyi Liu, Shadi Ibrahim, Lin Gu, Hai Jin, et al. Dual-Paradigm Stream Processing. ICPP 2018 - 47th International Conference on Parallel Processing, Aug 2018, Eugene, United States. Article No. 83, ⟨10.1145/3225058.3225120⟩. ⟨hal-01834668⟩
