F. "apache,

M. Zaharia, T. Das, H. Li, T. Hunter, S. Shenker et al., Discretized streams: Fault-tolerant streaming computation at scale, Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, ser. SOSP '13, pp.423-438, 2013.

. Inria,

J. Dean and S. Ghemawat, MapReduce: Simplified data processing on large clusters, Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI), USENIX Association, 2004.

Apache Hadoop

D. Maier, J. Li, P. Tucker, K. Tufte, and V. Papadimos, Semantics of data streams and operators, Proceedings of the 10th International Conference on Database Theory, ser. ICDT'05, pp.37-52, 2005.

T. Akidau, The world beyond batch: Streaming 101

T. Akidau, The world beyond batch: Streaming 102

A. Arasu, S. Babu, and J. Widom, The CQL continuous query language: Semantic foundations and query execution, The VLDB Journal, vol.15, issue.2, pp.121-142, 2006.
DOI : 10.1007/s00778-004-0147-z

G. J. Chen, J. L. Wiener, S. Iyer, A. Jaiswal, R. Lei et al., Realtime data processing at Facebook, Proceedings of the 2016 International Conference on Management of Data, ser. SIGMOD '16, pp.1087-1098, 2016.
DOI : 10.1145/2882903.2904441

G. Fox, S. Jha, and L. Ramakrishnan, STREAM2016: Streaming Requirements, Experience, Applications and Middleware Workshop, 2016.
DOI : 10.2172/1344785

S. Banerjee and D. O. Wu, Final report from the NSF workshop on future directions in wireless networking, USA, Tech. Rep, 2013.

K. Shvachko, H. Kuang, S. Radia, and R. Chansler, The Hadoop distributed file system, Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), ser. MSST '10, pp.1-10, 2010.

J. Hwang, M. Balazinska, A. Rasin, U. Cetintemel, M. Stonebraker et al., High-availability algorithms for distributed stream processing, Proceedings of the 21st International Conference on Data Engineering, ser. ICDE '05, pp.779-790, 2005.

T. Akidau, R. Bradshaw, C. Chambers, S. Chernyak, R. J. Fernández-Moctezuma et al., The dataflow model: A practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing, Proc. VLDB Endow, vol.8, issue.12, pp.1792-1803, 2015.

P. J. Desnoyers and P. Shenoy, Hyperion: High volume stream archival for retrospective querying, Proceedings of the 2007 USENIX Annual Technical Conference, ser. ATC'07, vol.4, pp.1-4, 2007.

C. Cranor, T. Johnson, O. Spatscheck, and V. Shkapenyuk, Gigascope: A stream database for network applications, Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, ser. SIGMOD '03, pp.647-651, 2003.

C. C. Aggarwal, J. Han, J. Wang, and P. S. Yu, A framework for clustering evolving data streams, Proceedings of the 29th International Conference on Very Large Data Bases, vol.29, pp.81-92, 2003.
DOI : 10.1016/b978-012722442-8/50016-1

P. Bailis, E. Gan, S. Madden, D. Narayanan, K. Rong et al., MacroBase: Prioritizing attention in fast data, Proceedings of the 2017 ACM International Conference on Management of Data, ser. SIGMOD '17, pp.541-556, 2017.

B. Atikoglu, Y. Xu, E. Frachtenberg, S. Jiang, and M. Paleczny, Workload analysis of a large-scale key-value store, Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, ser. SIGMETRICS '12, pp.53-64, 2012.

J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, Internet of things (IoT): A vision, architectural elements, and future directions, Future Gener. Comput. Syst, vol.29, issue.7, pp.1645-1660, 2013.
DOI : 10.1016/j.future.2013.01.010

URL : http://arxiv.org/pdf/1207.0203

B. Nicolae, C. Costa, C. Misale, K. Katrinis, and Y. Park, Leveraging adaptive I/O to optimize collective data shuffling patterns for big data analytics, IEEE Transactions on Parallel and Distributed Systems, vol.28, issue.6, pp.1663-1674, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01531374

G. Cugola and A. Margara, Processing flows of information: From data stream to complex event processing, ACM Comput. Surv, vol.44, issue.3, pp.1-15, 2012.

D. J. Abadi, S. R. Madden, and N. Hachem, Column-stores vs. row-stores: How different are they really?, Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, ser. SIGMOD '08, pp.967-980, 2008.

S. Melnik, A. Gubarev, J. J. Long, G. Romer, S. Shivakumar et al., Dremel: Interactive analysis of web-scale datasets, Proc. VLDB Endow, vol.3, pp.330-339, 2010.

A. "apache,

B. Gedik, Partitioning functions for stateful data parallelism in stream processing, The VLDB Journal, vol.23, issue.4, pp.517-539, 2014.

L. Yang, J. Cao, Y. Yuan, T. Li, A. Han et al., A framework for partitioning and execution of data stream applications in mobile cloud computing, SIGMETRICS Perform. Eval. Rev, vol.40, issue.4, pp.23-32, 2013.

. Inria,

A. Roy, I. Mihailovic, and W. Zwaenepoel, X-Stream: Edge-centric graph processing using streaming partitions, Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles, ser. SOSP '13, pp.472-488, 2013.

C. Balkesen and N. Tatbul, Scalable data partitioning techniques for parallel sliding window processing over data streams, 8th International Workshop on Data Management for Sensor Networks, 2011.

L. Cao and E. A. Rundensteiner, High performance stream query processing with correlation-aware partitioning, Proc. VLDB Endow, vol.7, issue.4, pp.265-276, 2013.
DOI : 10.14778/2732240.2732245

URL : http://www.vldb.org/pvldb/vol7/p265-cao.pdf

S. Chandrasekaran and M. J. Franklin, Streaming queries over streaming data, Proceedings of the 28th International Conference on Very Large Data Bases, ser. VLDB '02. VLDB Endowment, pp.203-214, 2002.
DOI : 10.1016/b978-155860869-6/50026-3

URL : http://www.cs.berkeley.edu/~franklin/Papers/psoupVLDB02.pdf

M. Cherniack, H. Balakrishnan, M. Balazinska, D. Carney, U. Çetintemel et al., Scalable distributed stream processing, First Biennial Conference on Innovative Data Systems Research, 2003.

M. A. Shah, J. M. Hellerstein, and E. Brewer, Highly available, fault-tolerant, parallel dataflows, Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, ser. SIGMOD '04, pp.827-838, 2004.
DOI : 10.1145/1007568.1007662

URL : http://db.cs.berkeley.edu/papers/sigmod04-fluxft.pdf

J. Li, D. Maier, K. Tufte, V. Papadimos, and P. A. Tucker, Semantics and evaluation techniques for window aggregates in data streams, Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, ser. SIGMOD '05, pp.311-322, 2005.

B. Lohrmann, D. Warneke, and O. Kao, Nephele streaming: Stream processing under QoS constraints at scale, Cluster Computing, vol.17, issue.1, pp.61-78, 2014.
DOI : 10.1007/s10586-013-0281-8

URL : http://arxiv.org/pdf/1308.1031

Apache Kafka

J. Kreps, N. Narkhede, and J. Rao, Kafka: A distributed messaging system for log processing, Proceedings of 6th International Workshop on Networking Meets Databases, ser. NetDB'11, 2011.

A. "apache,

RabbitMQ

AMQP

ZeroMQ

HornetQ

Apache Flume

L. Qiao, Y. Li, S. Takiar, Z. Liu, N. Veeramreddy et al., Gobblin: Unifying data ingestion for Hadoop, Proc. VLDB Endow, vol.8, issue.12, pp.1764-1769, 2015.

Gobblin Documentation

Elasticsearch

Apache Sqoop

D. J. Abadi, D. Carney, U. Çetintemel, M. Cherniack, C. Convey et al., Aurora: A new model and architecture for data stream management, The VLDB Journal, vol.12, issue.2, pp.120-139, 2003.

Redis

P. Viotti and M. Vukolić, Consistency in non-transactional distributed storage systems, ACM Comput. Surv, vol.49, issue.1, pp.1-19, 2016.

J. Ousterhout, A. Gopalan, A. Gupta, A. Kejriwal, C. Lee et al., The RAMCloud storage system, ACM Trans. Comput. Syst, vol.33, issue.3, pp.1-7, 2015.

A. Kejriwal, A. Gopalan, A. Gupta, Z. Jia, S. Yang et al., SLIK: Scalable low-latency indexes for a key-value store, Proceedings of the 2016 USENIX Conference on Usenix Annual Technical Conference, ser. USENIX ATC '16, pp.57-70, 2016.

H. Lim, D. Han, D. G. Andersen, and M. Kaminsky, MICA: A holistic approach to fast in-memory key-value storage, Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation, ser. NSDI'14, pp.429-444, 2014.

S. Li, H. Lim, V. W. Lee, J. H. Ahn, A. Kalia et al., Architecting to achieve a billion requests per second throughput on a single key-value store server platform, Proceedings of the 42nd Annual International Symposium on Computer Architecture, ser. ISCA '15, pp.476-488, 2015.

R. Escriva, B. Wong, and E. G. Sirer, HyperDex: A distributed, searchable key-value store, Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication, ser. SIGCOMM '12, pp.25-36, 2012.

Memcached

R. Nishtala, H. Fugal, S. Grimm, M. Kwiatkowski, H. Lee et al., Scaling Memcache at Facebook, Proceedings of the 10th USENIX Conference on Networked Systems Design and Implementation, ser. NSDI'13, pp.385-398, 2013.

RocksDB

LMDB

F. Klein and M. Schöttner, DXRAM: A persistent in-memory storage for billions of small objects, 2013.

F. Yang, E. Tschetter, X. Léauté, N. Ray, G. Merlino et al., Druid: A real-time analytical data store, Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, ser. SIGMOD '14, pp.157-168, 2014.

Druid

Z. Sebepou and K. Magoutis, Scalable storage support for data stream processing, 26th Symposium on Mass Storage Systems and Technologies, ser. MSST'10, 2010.

S. Babu and J. Widom, Continuous queries over data streams, SIGMOD Rec, vol.30, issue.3, pp.109-120, 2001.
DOI : 10.1145/603867.603884

URL : http://pages.cs.wisc.edu/~jhuang/qual/continuous-query-data-stream-01.pdf

P. H. Carns, W. B. Ligon III, R. B. Ross, and R. Thakur, PVFS: A parallel file system for Linux clusters, Proceedings of the 4th Annual Linux Showcase & Conference, vol.4, pp.28-28, 2000.

I. Botan, G. Alonso, P. M. Fischer, D. Kossmann, and N. Tatbul, Flexible and scalable storage management for data-intensive stream processing, Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, ser. EDBT '09, pp.934-945, 2009.
DOI : 10.1145/1516360.1516467

A. Arasu, M. Cherniack, E. Galvez, D. Maier, A. S. Maskey et al., Linear Road: A stream data management benchmark, Proceedings of the Thirtieth International Conference on Very Large Data Bases, vol.30, pp.480-491, 2004.

F. Dehne, D. E. Robillard, A. Rau-Chaplin, and N. Burke, VOLAP: A scalable distributed system for real-time OLAP with high velocity data, 2016 IEEE International Conference on Cluster Computing, pp.354-363, 2016.

D. Hilley and U. Ramachandran, Persistent temporal streams, Proceedings of the 10th ACM/IFIP/USENIX International Conference on Middleware, ser. Middleware '09, vol.17, 2009.
DOI : 10.1007/978-3-642-10445-9_17

URL : https://link.springer.com/content/pdf/10.1007%2F978-3-642-10445-9_17.pdf

R. C. Fernandez, P. R. Pietzuch, J. Kreps, N. Narkhede, J. Rao et al., Liquid: Unifying nearline and offline big data integration, CIDR 2015, Seventh Biennial Conference on Innovative Data Systems Research, 2015.

Apache Samza

W. Song, T. Gkountouvas, K. Birman, Q. Chen, and Z. Xiao, The freeze-frame file system, Proceedings of the Seventh ACM Symposium on Cloud Computing, ser. SoCC '16, pp.307-320, 2016.

TimescaleDB

InfluxDB

Riak TS

OpenTSDB

Apache Kudu

Apache HBase

F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach et al., Bigtable: A distributed storage system for structured data, ACM Trans. Comput. Syst, vol.26, issue.2, pp.1-4, 2008.

T. Harter, D. Borthakur, S. Dong, A. Aiyer, L. Tang et al., Analysis of HDFS under HBase: A Facebook Messages case study, Proceedings of the 12th USENIX Conference on File and Storage Technologies, ser. FAST'14, pp.199-212, 2014.

A. Lakshman and P. Malik, Cassandra: A decentralized structured storage system, SIGOPS Oper. Syst. Rev, vol.44, issue.2, pp.35-40, 2010.

C. "apache,

P. R. Wilson, M. S. Johnstone, M. Neely, and D. Boles, Dynamic storage allocation: A survey and critical review, Proceedings of the International Workshop on Memory Management, ser. IWMM '95, pp.1-116, 1995.

Apache CarbonData

Apache Parquet

Apache ORC

H. Zhang, G. Chen, B. C. Ooi, K. Tan, and M. Zhang, In-memory big data management and processing: A survey, IEEE Trans. on Knowl. and Data Eng, vol.27, issue.7, 2015.

N. Jain, S. Mishra, A. Srinivasan, J. Gehrke, J. Widom et al., Towards a streaming SQL standard, Proc. VLDB Endow, vol.1, pp.1379-1390, 2008.

M. Danelutto, P. Kilpatrick, G. Mencagli, and M. Torquati, State access patterns in stream parallel computations, International Journal of High Performance Computing Applications, 2017.

L. O'Callaghan, N. Mishra, A. Meyerson, S. Guha, and R. Motwani, Streaming-data algorithms for high-quality clustering, Proceedings of the 18th International Conference on Data Engineering, ser. ICDE '02, p.685, 2002.

J. A. Silva, E. R. Faria, R. C. Barros, E. R. Hruschka, A. C. Carvalho et al., Data stream clustering: A survey, ACM Comput. Surv, vol.46, issue.1, pp.1-13, 2013.

A. McGregor, Graph stream algorithms: A survey, SIGMOD Rec, vol.43, issue.1, pp.9-20, 2014.

S. Guo, R. Dhamankar, and L. Stewart, DistributedLog: A high performance replicated log service, IEEE 33rd International Conference on Data Engineering, ser. ICDE'17. IEEE, 2017.

J. Meehan, C. Aslantas, S. Zdonik, N. Tatbul, and J. Du, Data ingestion for the connected world, CIDR, Online Proceedings, 2017.

S. Ghemawat, H. Gobioff, and S. Leung, The Google file system, Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles, ser. SOSP '03, pp.29-43, 2003.
DOI : 10.1145/945445.945450

S. Niazi, M. Ismail, S. Haridi, J. Dowling, S. Grohsschmiedt et al., HopsFS: Scaling hierarchical file system metadata using NewSQL databases, Proceedings of the 15th USENIX Conference on File and Storage Technologies, ser. FAST'17, pp.89-103, 2017.
DOI : 10.1007/978-3-319-77525-8_146

URL : http://arxiv.org/pdf/1606.01588

B. Dong, Q. Zheng, F. Tian, K. Chao, R. Ma et al., An optimized approach for storing and accessing small files on cloud storage, J. Netw. Comput. Appl, vol.35, issue.6, pp.1847-1862, 2012.

S. Yang, IoT stream processing and analytics in the fog, 2017.

C. Perera, Y. Qin, J. C. Estrella, S. Reiff-Marganiec, and A. V. Vasilakos, Fog computing for sustainable smart cities: A survey, ACM Comput. Surv, vol.50, issue.3, 2017.
DOI : 10.1145/3057266

URL : http://eprints.hud.ac.uk/id/eprint/31927/8/__nas01_librhome_librsh3_Desktop_acmsmall-sample.pdf