A Best of Both Worlds Approach to Complex, Efficient, Time Series Data Delivery

Abstract : Point time series are a key data-type for the description of real or modelled environmental phenomena. Delivering this data in useful ways can be challenging when the data volume is large, when computational work (such as aggregation, subsetting, or re-sampling) needs to be performed, or when complex metadata is needed to place data in context for understanding. Some aspects of these problems are especially relevant to the environmental domain: large sensor networks measuring continuous environmental phenomena sampling frequently over long periods of time generate very large datasets, and rich metadata is often required to understand the context of observations. Nevertheless, timeseries data, and most of these challenges, are prevalent beyond the environmental domain, for example in financial and industrial domains.A review of recent technologies illustrates an emerging trend toward high performance, lightweight, databases specialized for time series data. These databases tend to have non-existent or minimalistic formal metadata capacities. In contrast, the environmental domain boasts standards such as the Sensor Observation Service (SOS) that have mature and comprehensive metadata models but existing implementations have had problems with slow performance.In this paper we describe our hybrid approach to achieve efficient delivery of large time series datasets with complex metadata. We use three subsystems within a single system-of-systems: a proxy (Python), an efficient time series database (InfluxDB) and a SOS implementation (52 North SOS). Together these present a regular SOS interface. The proxy processes standard SOS queries and issues them to the either 52 North SOS or to InfluxDB for processing. Responses are returned directly from 52 North SOS or indirectly from InfluxDB via Python proxy where they are processed into WaterML. This enables the scalability and performance advantages of the time series database to be married with the sophisticated metadata handling of SOS. Testing indicates that a recent version of 52 North SOS configured with a Postgres/PostGIS database performs well but an implementation incorporating InfluxDB and 52 North SOS in a hybrid architecture performs approximately 12 times faster.
Document type :
Conference papers
Complete list of metadatas

Cited literature [16 references]  Display  Hide  Download

https://hal.inria.fr/hal-01328577
Contributor : Hal Ifip <>
Submitted on : Wednesday, June 8, 2016 - 11:15:21 AM
Last modification on : Thursday, January 11, 2018 - 5:22:02 PM

File

978-3-319-15994-2_37_Chapter.p...
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Identifiers

Citation

Benjamin Leighton, Simon Cox, Nicholas Car, Matthew Stenson, Jamie Vleeshouwer, et al.. A Best of Both Worlds Approach to Complex, Efficient, Time Series Data Delivery. 11th International Symposium on Environmental Software Systems (ISESS), Mar 2015, Melbourne, Australia. pp.371-379, ⟨10.1007/978-3-319-15994-2_37⟩. ⟨hal-01328577⟩

Share

Metrics

Record views

86

Files downloads

279