
Inferring Context of Mobile Data Crowdsensed in the Wild

Abstract : Understanding the sensing context of raw data is crucial for assessing the quality of large crowdsourced spatio-temporal datasets. An accelerometer's precision can vary considerably depending on whether the phone is in-pocket or out-pocket, i.e., held in hand. GPS accuracy can be very low in places such as underground metro stations. Further, jump-lengths are shorter and more frequent when a person is indoors. Hence, we focus on contexts such as in/out-pocket, under/over-ground, and in/out-door, which can be essential for reliably inferring human mobility attributes and properties (e.g., location, jump-length, and mobility activity such as walking or driving) from crowdsensed data. Our work is motivated by the fact that most publicly available crowdsensing datasets (e.g., PRIVA'MOV and the Beijing taxi dataset) do not include data from specialized sensors, such as light sensors and barometers, that state-of-the-art algorithms rely on to detect the above-mentioned contexts. Therefore, we focus on mining context from the limited features available in publicly available mobility-related crowdsensing datasets. Moreover, as ground truth is typically not available in these datasets, we pay special attention to minimizing the training and tuning effort required by the introduced algorithms. Our algorithms are unsupervised binary classifiers with a small memory footprint and short execution time. As the lack of certain features prevents us from using state-of-the-art algorithms as baselines, we compare the performance of our heuristic algorithms against Machine Learning (ML) models built by an AutoML tool using the same set of features. Our experimental evaluation on a segment of the Ambiciti dataset demonstrates that, compared to the best baseline ML model w.r.t. balanced accuracy (see Table I), our algorithm for the in/out-pocket context performs equally well, while for the under/over-ground and in/out-door contexts, for a specific hyper-parameter, our corresponding algorithms are within 4.3% and 1% of it, respectively. In terms of memory, our three algorithms require 0 kB, 4 kB, and 0 kB, respectively, and they execute in 0.08 s, 0.17 s, and 0.003 s, respectively. Our algorithms are lightweight enough to be integrated into smartphone applications. Context information mined onboard thus remains private and can be used to annotate users' personal trajectories and to incentivize them to participate in crowd-measurement campaigns.
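The abstract compares classifiers by balanced accuracy. For readers unfamiliar with the metric, the following is a minimal sketch of its standard binary definition (the mean of sensitivity and specificity). It is not the authors' evaluation code: the function name `balanced_accuracy`, the 0/1 label encoding (e.g. 1 for in-pocket, 0 for out-pocket), and the sample labels are all assumptions made for illustration.

```python
def balanced_accuracy(y_true, y_pred):
    """Balanced accuracy for binary labels encoded as 0/1 (assumed encoding).

    Computed as the mean of sensitivity (recall on class 1) and
    specificity (recall on class 0), so a majority-class guesser
    scores 0.5 even on an imbalanced dataset.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical labels, not taken from the Ambiciti dataset.
truth = [1, 1, 1, 1, 0, 0]
guess = [1, 1, 1, 0, 0, 1]
score = balanced_accuracy(truth, guess)  # sensitivity 0.75, specificity 0.5
```

Unlike plain accuracy, this metric weights both contexts equally, which matters when one context (for example, over-ground) dominates the recorded samples.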
Document type :
Poster communications

Contributor : Agarwal Rachit
Submitted on : Thursday, May 16, 2019 - 8:39:15 PM
Last modification on : Monday, May 20, 2019 - 2:41:55 PM




  • HAL Id : hal-02132194, version 1




Rachit Agarwal, Shaan Chopra, Vassilis Christophides, Nikolaos Georgantas, Valérie Issarny. Inferring Context of Mobile Data Crowdsensed in the Wild. NetMob 2019 - Conference on the scientific analysis of mobile phone datasets, Jul 2019, Oxford, United Kingdom. ⟨hal-02132194⟩


