Experimental Study on the Performance and Resource Utilization of Data Streaming Frameworks

Subarna Chatterjee; Christine Morin

doi:10.1109/CCGRID.2018.00029

Conference paper, Year: 2018

Subarna Chatterjee

Role: Author

Design and Implementation of Autonomous Distributed Systems

Christine Morin

Role: Author
PersonId : 1557
IdHAL : christine-morin
IdRef : 059647485

Design and Implementation of Autonomous Distributed Systems

Abstract

With the advent of the Internet of Things (IoT), data stream processing has gained increased attention due to the ever-increasing need to process heterogeneous and voluminous data streams. This work addresses the problem of selecting an appropriate stream processing framework for a given application to be executed within a specific physical infrastructure. For this purpose, we present a thorough comparative analysis of three data stream processing platforms, Apache Flink, Apache Storm, and Twitter Heron (the enhanced version of Apache Storm), chosen for their potential to process both streams and batches in real time. The goal of the work is to inform cloud clients and cloud providers about which streaming platform is resource-efficient and adapts to the requirements of a given application, so that they can plan accordingly when allocating or assigning Virtual Machines for application execution. For the comparative performance analysis of the chosen platforms, we ran experiments on 8-node clusters on the Grid5000 experimentation testbed and selected a wide variety of applications, ranging from a conventional benchmark to a sensor-based IoT application and a statistical batch processing application. In addition to various performance metrics related to the elasticity and resource usage of the platforms, this work presents a comparative study of the "green-ness" of the streaming platforms by analyzing their power consumption, one of the first attempts of its kind. The obtained results are thoroughly analyzed to illustrate the functional behavior of these platforms under different computing scenarios.

Keywords

Apache Flink; Apache Spark; Internet of Things; Stream processing; Twitter Heron

Domains

Operating Systems [cs.OS]; Distributed, Parallel, and Cluster Computing [cs.DC]

Main file

1st.pdf (2.39 MB)

Origin: Files produced by the author(s)

Contributor: Guillaume Pierre

https://inria.hal.science/hal-01823697

Submitted on: Tuesday, June 26, 2018, 12:56:25

Last modified on: Friday, March 24, 2023, 14:53:07

Long-term archiving on: Wednesday, September 26, 2018, 22:10:56

Dates and versions

hal-01823697 , version 1 (26-06-2018)

Identifiers

HAL Id : hal-01823697 , version 1
DOI : 10.1109/CCGRID.2018.00029

Cite

Subarna Chatterjee, Christine Morin. Experimental Study on the Performance and Resource Utilization of Data Streaming Frameworks. CCGrid 2018 - 18th IEEE/ACM Symposium on Cluster, Cloud and Grid Computing, May 2018, Washington, DC, United States. pp.143-152, ⟨10.1109/CCGRID.2018.00029⟩. ⟨hal-01823697⟩

Collections

UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA GRID5000 CENTRALESUPELEC INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES SILECS UR1-MATH-NUM

424 views

465 downloads
