A Best of Both Worlds Approach to Complex, Efficient, Time Series Data Delivery

Abstract: Point time series are a key data type for describing real or modelled environmental phenomena. Delivering this data in useful ways can be challenging when the data volume is large, when computational work (such as aggregation, subsetting, or re-sampling) needs to be performed, or when complex metadata is needed to place data in context for understanding. Some aspects of these problems are especially relevant to the environmental domain: large sensor networks that sample continuous environmental phenomena frequently over long periods of time generate very large datasets, and rich metadata is often required to understand the context of observations. Nevertheless, time series data, and most of these challenges, are prevalent beyond the environmental domain, for example in the financial and industrial domains.

A review of recent technologies illustrates an emerging trend toward high-performance, lightweight databases specialized for time series data. These databases tend to have non-existent or minimal formal metadata capacities. In contrast, the environmental domain boasts standards such as the Sensor Observation Service (SOS) that have mature and comprehensive metadata models, but existing implementations have suffered from slow performance.

In this paper we describe our hybrid approach to achieving efficient delivery of large time series datasets with complex metadata. We use three subsystems within a single system-of-systems: a proxy (Python), an efficient time series database (InfluxDB) and a SOS implementation (52 North SOS). Together these present a regular SOS interface. The proxy processes standard SOS queries and issues them to either 52 North SOS or InfluxDB for processing. Responses are returned directly from 52 North SOS, or indirectly from InfluxDB via the Python proxy, where they are processed into WaterML. This enables the scalability and performance advantages of the time series database to be married with the sophisticated metadata handling of SOS. Testing indicates that a recent version of 52 North SOS configured with a Postgres/PostGIS database performs well, but an implementation incorporating InfluxDB and 52 North SOS in a hybrid architecture performs approximately 12 times faster.
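
The routing step performed by the proxy can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which SOS operations go to which backend, so the sketch assumes that GetObservation requests are served by InfluxDB and all other requests by 52 North SOS. The function name and backend labels are hypothetical, and the conversion of InfluxDB results into WaterML is omitted.

```python
from urllib.parse import parse_qs

# Hypothetical backend labels; the abstract names the systems but not how
# the proxy addresses them.
METADATA_BACKEND = "52north-sos"
TIMESERIES_BACKEND = "influxdb"

# Assumption: only data-heavy observation requests are sent to the
# time series database; everything else goes to the SOS implementation.
TIMESERIES_REQUESTS = {"getobservation"}


def choose_backend(query_string: str) -> str:
    """Return the backend label for an SOS key-value-pair query string."""
    # Parameter names and the request value are compared case-insensitively.
    params = {
        key.lower(): values[0]
        for key, values in parse_qs(query_string).items()
    }
    request_type = params.get("request", "").lower()
    if request_type in TIMESERIES_REQUESTS:
        return TIMESERIES_BACKEND
    return METADATA_BACKEND


if __name__ == "__main__":
    print(choose_backend("service=SOS&version=2.0.0&request=GetObservation"))
    print(choose_backend("service=SOS&request=GetCapabilities"))
```

Under these assumptions, metadata-oriented requests keep the full SOS metadata model, while bulk observation retrieval takes the faster path through the time series database.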
Document type:
Conference paper
Ralf Denzer; Robert M. Argent; Gerald Schimak; Jiří Hřebíček. 11th International Symposium on Environmental Software Systems (ISESS), Mar 2015, Melbourne, Australia. Springer, IFIP Advances in Information and Communication Technology, AICT-448, pp.371-379, 2015, Environmental Software Systems. Infrastructures, Services and Applications. 〈10.1007/978-3-319-15994-2_37〉

Cited literature [16 references]

https://hal.inria.fr/hal-01328577
Contributor: Hal Ifip
Submitted on: Wednesday, 8 June 2016 - 11:15:21
Last modified on: Thursday, 11 January 2018 - 17:22:02

File

978-3-319-15994-2_37_Chapter.p...
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Citation

Benjamin Leighton, Simon Cox, Nicholas Car, Matthew Stenson, Jamie Vleeshouwer, et al. A Best of Both Worlds Approach to Complex, Efficient, Time Series Data Delivery. Ralf Denzer; Robert M. Argent; Gerald Schimak; Jiří Hřebíček. 11th International Symposium on Environmental Software Systems (ISESS), Mar 2015, Melbourne, Australia. Springer, IFIP Advances in Information and Communication Technology, AICT-448, pp.371-379, 2015, Environmental Software Systems. Infrastructures, Services and Applications. 〈10.1007/978-3-319-15994-2_37〉. 〈hal-01328577〉
