On Privacy of Multidimensional Data Against Aggregate Knowledge Attacks
Abstract
In this paper, we study the problem of protecting individual privacy when publishing data cubes that answer SUM queries, where a malicious user is assumed to have aggregate knowledge (e.g., average information) over data ranges. We propose an efficient solution that maximizes the utility of SUM queries while mitigating inference attacks based on such aggregate knowledge. Our solution combines cube compression (i.e., suppression of data cells) with data perturbation. First, we give a formal statement of privacy against aggregate knowledge based on data suppression. Next, we develop a Linear Programming (LP) model to determine the number of data cells to be removed, together with a heuristic method to suppress data cells effectively. To overcome the limitations of data suppression, we complement it with suitable data perturbation. Through empirical evaluation on benchmark data cubes, we show that our solution gives the best performance in terms of both utility and privacy.
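To make the threat concrete, the following is a minimal sketch of how aggregate knowledge can undo suppression. It is an illustration under our own assumptions, not the paper's formal model: the function name `infer_suppressed_sum`, the one-dimensional range, and all cell values are hypothetical, and the adversary is assumed to know the exact average over the range.

```python
from typing import Optional, Sequence


def infer_suppressed_sum(published: Sequence[Optional[float]],
                         known_average: float) -> float:
    """Return the total of the suppressed cells implied by a known range average.

    `published` is a range of cube cells in which None marks a suppressed cell.
    This helper is hypothetical and only illustrates the inference step.
    """
    implied_total = known_average * len(published)
    visible_total = sum(v for v in published if v is not None)
    return implied_total - visible_total


# Toy range with a single suppressed cell (assumed values).
one_hidden = [12.0, None, 7.0, 9.0]
# The adversary knows the average over the range is 10.0, so the hidden
# cell is recovered exactly: 10.0 * 4 - (12.0 + 7.0 + 9.0).
leak_one = infer_suppressed_sum(one_hidden, 10.0)

# With two suppressed cells, only their sum is exposed, so each individual
# value is no longer uniquely determined by the average alone.
two_hidden = [12.0, None, None, 9.0]
leak_two = infer_suppressed_sum(two_hidden, 10.0)
```

In this toy setting, suppressing a single cell gives no protection against an adversary who knows the range average, while suppressing more cells leaves only their combined total exposed. This is consistent with the need to decide how many cells to remove in a range, and with complementing suppression by perturbation.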