The Pitfalls of Hashing for Privacy - Archive ouverte HAL
Journal Articles: Communications Surveys and Tutorials, IEEE Communications Society, Year: 2018

The Pitfalls of Hashing for Privacy

Abstract

Boosted by recent legislation, data anonymization is fast becoming a norm. However, as of yet no generic solution has been found to safely release data. As a consequence, data custodians often resort to ad-hoc means to anonymize datasets. Both past and current practices indicate that hashing is often believed to be an effective way to anonymize data. Unfortunately, in practice it is only rarely effective. This paper is a tutorial that explains the limits of cryptographic hash functions as an anonymization technique. The anonymity set is the best privacy model that can be achieved by hash functions. However, this model has several shortcomings. We provide three case studies to illustrate how hashing only yields weakly anonymized data. The case studies include MAC and email address anonymization as well as the analysis of Google Safe Browsing.
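
The core weakness named in the abstract can be illustrated with a short sketch. It is not taken from the paper: the function names `hash_mac` and `recover_mac`, the choice of SHA-256, and the example address are all hypothetical, and the code assumes an attacker already knows the first four bytes of the address so that the search stays small enough to run instantly. The point is only that when the space of possible inputs is small, an unkeyed hash can be inverted by enumeration.

```python
import hashlib
from typing import Optional


def hash_mac(mac: str) -> str:
    """Hash a MAC address as a naive anonymization scheme might (assumed: unkeyed SHA-256)."""
    return hashlib.sha256(mac.lower().encode("ascii")).hexdigest()


def recover_mac(digest: str, known_prefix: str) -> Optional[str]:
    """Enumerate the last two bytes (65,536 candidates) and return the matching address.

    known_prefix is the first four bytes, e.g. "00:1a:2b:3c" (an assumption made
    to keep the search tiny; it is not a claim about any real dataset).
    """
    for suffix in range(1 << 16):
        candidate = f"{known_prefix}:{suffix >> 8:02x}:{suffix & 0xFF:02x}"
        if hash_mac(candidate) == digest:
            return candidate
    return None  # no address with this prefix produces the digest


if __name__ == "__main__":
    # Made-up address used purely for illustration.
    pseudonym = hash_mac("00:1a:2b:3c:4d:5e")
    print(recover_mac(pseudonym, "00:1a:2b:3c"))
```

Widening the loop to more unknown bytes increases the running time but does not change the principle; a keyed construction or a different privacy model would be needed to resist this kind of enumeration.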
Main file
pitfalls.pdf (311.59 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01589210, version 1 (01-05-2019)

Cite

Levent Demir, Amrit Kumar, Mathieu Cunche, Cédric Lauradoux. The Pitfalls of Hashing for Privacy. Communications Surveys and Tutorials, IEEE Communications Society, 2018, 20 (1), pp.551-565. ⟨10.1109/COMST.2017.2747598⟩. ⟨hal-01589210⟩