Storage and Ingestion Systems in Support of Stream Processing: A Survey

Abstract: Under the pressure of massive, exponentially growing volumes of heterogeneous data generated at ever-increasing rates, Big Data analytics applications have shifted from batch processing to stream processing, which can dramatically reduce the time needed to obtain meaningful insight. Stream processing is particularly well suited to the challenges of fog/edge computing: much of this data comes from Internet of Things (IoT) devices and must be continuously funneled through an edge infrastructure towards centralized clouds. It is therefore natural to process data along the way as much as possible rather than wait for streams to accumulate in the cloud. Unfortunately, state-of-the-art stream processing systems are not well suited for this role: data are accumulated (ingested), processed, and persisted (stored) separately, often using different services hosted on different physical machines or clusters. Furthermore, support for advanced data manipulations is limited, which often forces application developers to introduce custom solutions and workarounds. In this survey article, we characterize the main state-of-the-art stream storage and ingestion systems. We identify their key aspects and discuss limitations and missing features in the context of stream processing for fog/edge and cloud computing. The goal is to help practitioners understand and prepare for potential bottlenecks when using such systems. In particular, we discuss both functional aspects (partitioning, metadata, search support, message routing, backpressure support) and non-functional aspects (high availability, durability, scalability, latency vs. throughput). As a conclusion of our study, we advocate for a unified stream storage and ingestion system to speed up data management and reduce I/O redundancy (both in terms of storage space and network utilization).
Document type:
Report
[Technical Report] RT-0501, INRIA Rennes - Bretagne Atlantique and University of Rennes 1, France. 2018, pp.1-33

https://hal.inria.fr/hal-01939280
Contributor: Ovidiu-Cristian Marcu
Submitted on: Friday, December 14, 2018 - 15:22:16
Last modified on: Thursday, February 7, 2019 - 15:38:47

File

RT-0501v2.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01939280, version 2

Citation

Ovidiu-Cristian Marcu, Alexandru Costan, Gabriel Antoniu, María Pérez-Hernández, Radu Tudoran, et al. Storage and Ingestion Systems in Support of Stream Processing: A Survey. [Technical Report] RT-0501, INRIA Rennes - Bretagne Atlantique and University of Rennes 1, France. 2018, pp.1-33. 〈hal-01939280v2〉
