The Pitfalls of Hashing for Privacy - Inria - Institut national de recherche en sciences et technologies du numérique Accéder directement au contenu
Article Dans Une Revue Communications Surveys and Tutorials, IEEE Communications Society Année : 2018

The Pitfalls of Hashing for Privacy

Résumé

Boosted by recent legislations, data anonymizationis fast becoming a norm. However, as of yet no generic solutionhas been found to safely release data. As a consequence, datacustodians often resort to ad-hoc means to anonymize datasets.Both past and current practices indicate that hashing is oftenbelieved to be an effective way to anonymize data. Unfortunately,in practice it is only rarely effective. This paper is a tutorialto explain the limits of cryptographic hash functions as ananonymization technique. Anonymity set is the best privacymodel that can be achieved by hash functions. However, thismodel has several shortcomings. We provide three case studiesto illustrate how hashing only yields a weakly anonymized data.The case studies include MAC and email address anonymizationas well as the analysis of Google Safe Browsing.Boosted by recent legislations, data anonymizationis fast becoming a norm. However, as of yet no generic solutionhas been found to safely release data. As a consequence, datacustodians often resort to ad-hoc means to anonymize datasets.Both past and current practices indicate that hashing is oftenbelieved to be an effective way to anonymize data. Unfortunately,in practice it is only rarely effective. This paper is a tutorialto explain the limits of cryptographic hash functions as ananonymization technique. Anonymity set is the best privacymodel that can be achieved by hash functions. However, thismodel has several shortcomings. We provide three case studiesto illustrate how hashing only yields a weakly anonymized data.The case studies include MAC and email address anonymizationas well as the analysis of Google Safe Browsing.
Fichier principal
Vignette du fichier
pitfalls.pdf (311.59 Ko) Télécharger le fichier
Origine : Fichiers produits par l'(les) auteur(s)
Loading...

Dates et versions

hal-01589210 , version 1 (01-05-2019)

Identifiants

Citer

Levent Demir, Amrit Kumar, Mathieu Cunche, Cédric Lauradoux. The Pitfalls of Hashing for Privacy. Communications Surveys and Tutorials, IEEE Communications Society, 2018, 20 (1), pp.551-565. ⟨10.1109/COMST.2017.2747598⟩. ⟨hal-01589210⟩
1258 Consultations
2072 Téléchargements

Altmetric

Partager

Gmail Facebook X LinkedIn More