Probabilistic k$^m$-anonymity

Acs Gergely; Jagdish Prasad Achara; Claude Castelluccia

Conference paper. Year: 2015


Affiliation (all authors): Privacy Models, Architectures and Tools for the Information Society

Abstract

A set-valued dataset contains different types of items or values per individual, for example visited locations, purchased goods, watched movies, or search queries. Because it is relatively easy to re-identify individuals in such datasets, their release poses significant privacy threats. Hence, organizations aiming to share such datasets must adhere to personal data regulations. To fall outside the scope of these regulations and still benefit from sharing, these datasets should be anonymized before release. In this paper, we revisit the problem of anonymizing set-valued data. We argue that anonymization techniques targeting the traditional k$^m$-anonymity model, which limits the adversarial background knowledge to at most $m$ items per individual, are impractical for large real-world datasets. Hence, we propose a probabilistic relaxation of k$^m$-anonymity and present an anonymization technique to achieve it. This relaxation also improves the utility of the anonymized data. We also demonstrate the effectiveness of our scalable anonymization technique on a real-world location dataset consisting of more than 4 million subscribers of a large European telecom operator. We believe that our technique can be very appealing for practitioners willing to share such large datasets.
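To make the model concrete, the following minimal Python sketch checks k$^m$-anonymity in its usual sense: every combination of at most $m$ items that occurs in some record must be contained in at least $k$ records. The second function is only an illustrative assumption about what a probabilistic relaxation could look like (estimating, by sampling, how often a random $m$-item background knowledge matches fewer than $k$ records); the abstract does not give the paper's exact definition or algorithm, and the names `is_km_anonymous` and `estimate_violation_rate` are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def is_km_anonymous(records, k, m):
    """Exhaustive check: every itemset of size <= m that occurs in some
    record must be contained in at least k records."""
    support = Counter()
    for record in records:
        items = sorted(set(record))
        for size in range(1, m + 1):
            # Each itemset is counted at most once per record.
            for itemset in combinations(items, size):
                support[itemset] += 1
    return all(count >= k for count in support.values())


def estimate_violation_rate(records, k, m, trials=1000, seed=0):
    """Assumed illustration of a probabilistic view (not the paper's method):
    sample a record and m of its items uniformly at random, and report the
    fraction of samples whose itemset is contained in fewer than k records."""
    rng = random.Random(seed)
    sets = [frozenset(r) for r in records]
    eligible = [s for s in sets if len(s) >= m]
    if not eligible:
        return 0.0
    violations = 0
    for _ in range(trials):
        record = rng.choice(eligible)
        known = frozenset(rng.sample(sorted(record), m))
        matching = sum(1 for s in sets if known <= s)
        if matching < k:
            violations += 1
    return violations / trials


# Toy dataset: each record is the set of items of one individual.
records = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "C"},
]

print(is_km_anonymous(records, k=2, m=2))
print(is_km_anonymous(records, k=3, m=2))
print(estimate_violation_rate(records, k=3, m=2, trials=500))
```

The exhaustive check enumerates all itemsets of size up to $m$ per record, so its cost grows combinatorially with record length. This is one way to see why exact approaches become impractical on large datasets, whereas the sampling-based estimate costs one pass over the data per trial.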

Keywords

Data privacy, set-valued data, k$^m$-anonymity, sampling, MCMC

Domains

Other [cs.OH]

Main file

paper.pdf (945.34 KB)

Origin: Files produced by the author(s)

https://inria.hal.science/hal-01205533

Submitted on: Friday, 25 September 2015, 16:10:55

Last modified on: Wednesday, 15 March 2023, 08:53:38

Long-term archiving on: Tuesday, 29 December 2015, 09:22:16

Dates and versions

hal-01205533, version 1 (25-09-2015)

Identifiers

HAL Id: hal-01205533, version 1

Cite

Acs Gergely, Jagdish Prasad Achara, Claude Castelluccia. Probabilistic k$^m$-anonymity: Efficient Anonymization of Large Set-Valued Datasets. IEEE International Conference on Big Data (BigData) 2015, Oct 2015, Santa Clara, United States. ⟨hal-01205533⟩


Collections

INRIA INSA-LYON INRIA2 INSA-GROUPE UDL

