Abstract: Frequent itemset mining is a fundamental data analytics task. In many cases, due to privacy concerns, only the frequent itemsets are released instead of the underlying data. However, it is not clear how to evaluate the privacy implications of disclosing the frequent itemsets. To this end, we define the k-distant-IFM-solutions problem, which aims to find k transaction datasets whose pairwise distance is maximized. The degree of difference between the reconstructed datasets provides a way to evaluate the privacy risk. Since the problem is NP-hard, we propose a 2-approximate solution as well as faster heuristics, and evaluate them on real data.
Sabrina De Capitani di Vimercati; Fabio Martinelli. 32nd IFIP International Conference on ICT Systems Security and Privacy Protection (SEC), May 2017, Rome, Italy. Springer International Publishing, IFIP Advances in Information and Communication Technology, AICT-502, pp.506-519, 2017, ICT Systems Security and Privacy Protection. 〈10.1007/978-3-319-58469-0_34〉
https://hal.inria.fr/hal-01649007
Contributor: Hal Ifip
Submitted on: Monday, 27 November 2017 - 10:31:49
Last modified on: Monday, 27 November 2017 - 10:34:05
Edoardo Serra, Jaideep Vaidya, Haritha Akella, Ashish Sharma. Evaluating the Privacy Implications of Frequent Itemset Disclosure. Sabrina De Capitani di Vimercati; Fabio Martinelli. 32nd IFIP International Conference on ICT Systems Security and Privacy Protection (SEC), May 2017, Rome, Italy. Springer International Publishing, IFIP Advances in Information and Communication Technology, AICT-502, pp.506-519, 2017, ICT Systems Security and Privacy Protection. 〈10.1007/978-3-319-58469-0_34〉. 〈hal-01649007〉