Conference paper, Year: 2017

DIsCO: DynamIc Data COmpression in Distributed Stream Processing Systems

Abstract

Supporting high throughput in Distributed Stream Processing Systems (DSPSs) has been an important goal in recent years. Current works either focus on automatically increasing the system resources whenever the current setup is inadequate, or apply load shedding techniques that discard some of the incoming data. However, both approaches have significant shortcomings: the former requires on-the-fly application reconfiguration, where the application needs to be stopped and re-uploaded to the cluster with the new configuration, while the latter can lead to significant information loss. One approach that has not yet been considered for improving the throughput of DSPSs is exploiting compression algorithms to minimize the communication overhead between components, especially when the data items are large, such as live CCTV camera reports. This work is the first to provide a framework, built on top of Apache Storm, that enables dynamic compression of incoming streaming data. Our approach uses a profiling algorithm to automatically determine the compression algorithm that should be applied, and supports both lossless and lossy compression techniques. Furthermore, we propose a novel algorithm for determining when profiling should be applied. Finally, our detailed experimental evaluation with commonly used stream processing applications indicates a clear improvement in the applications' throughput when our proposed techniques are applied.
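To make the idea of profiling-based codec selection concrete, here is a minimal sketch. It is not the algorithm from the paper, which the abstract does not specify. It assumes a simple cost model (compression time plus transfer time plus decompression time), a hypothetical `bandwidth_bytes_per_s` parameter, and a candidate set limited to lossless codecs from the Python standard library. Lossy compression and the decision of when to re-profile are left out.

```python
import bz2
import lzma
import time
import zlib

# Hypothetical candidate set: (compress, decompress) pairs, all lossless.
# "none" is the baseline of sending the data uncompressed.
CODECS = {
    "none": (lambda b: b, lambda b: b),
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def profile(sample: bytes, bandwidth_bytes_per_s: float) -> dict:
    """Measure each codec on a sample and estimate its end-to-end cost.

    Assumed cost model: compress time + (compressed size / bandwidth)
    + decompress time. The bandwidth value is supplied by the caller.
    """
    costs = {}
    for name, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        packed = compress(sample)
        mid = time.perf_counter()
        restored = decompress(packed)
        end = time.perf_counter()
        if restored != sample:
            raise ValueError(f"codec {name} did not round-trip the sample")
        transfer_seconds = len(packed) / bandwidth_bytes_per_s
        costs[name] = {
            "size": len(packed),
            "seconds": (mid - start) + transfer_seconds + (end - mid),
        }
    return costs


def choose_codec(sample: bytes, bandwidth_bytes_per_s: float) -> str:
    """Return the name of the codec with the lowest estimated cost."""
    costs = profile(sample, bandwidth_bytes_per_s)
    return min(costs, key=lambda name: costs[name]["seconds"])
```

Under this model, a slow link favours whichever codec shrinks the sample most, while a very fast link favours sending the data uncompressed, since the CPU time spent compressing is no longer repaid by a shorter transfer.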
Main file
450046_1_En_2_Chapter.pdf (502.99 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01800129, version 1 (25-05-2018)

License

Attribution

Identifiers

Cite

Nikos Zacheilas, Vana Kalogeraki. DIsCO: DynamIc Data COmpression in Distributed Stream Processing Systems. 17th IFIP International Conference on Distributed Applications and Interoperable Systems (DAIS), Jun 2017, Neuchâtel, Switzerland. pp.19-33, ⟨10.1007/978-3-319-59665-5_2⟩. ⟨hal-01800129⟩