The Pitfalls of Hashing for Privacy

Abstract : Boosted by recent legislation, data anonymization is fast becoming a norm. However, as of yet no generic solution has been found to safely release data. As a consequence, data custodians often resort to ad-hoc means to anonymize datasets. Both past and current practices indicate that hashing is often believed to be an effective way to anonymize data. Unfortunately, in practice it is only rarely effective. This paper is a tutorial to explain the limits of cryptographic hash functions as an anonymization technique. Anonymity set is the best privacy model that can be achieved by hash functions. However, this model has several shortcomings. We provide three case studies to illustrate how hashing only yields weakly anonymized data. The case studies include MAC and email address anonymization as well as the analysis of Google Safe Browsing.
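
To make the abstract's central claim concrete, here is a minimal sketch of why hashing a small input space gives little protection. It is not taken from the paper: it assumes an unsalted SHA-256 over the lowercase colon-separated text form of a MAC address, and the sample address, the helper names (`hash_mac`, `format_mac`, `recover_mac`) and the known prefix are all hypothetical. A MAC address has only 48 bits, and the first 24 identify the vendor, so an attacker who knows or guesses the prefix can enumerate the remaining bits and compare digests.

```python
import hashlib
from typing import Optional


def hash_mac(mac: str) -> str:
    """'Anonymize' a MAC address with an unsalted SHA-256 (assumed scheme)."""
    return hashlib.sha256(mac.lower().encode("ascii")).hexdigest()


def format_mac(value: int) -> str:
    """Render a 48-bit integer as aa:bb:cc:dd:ee:ff."""
    return ":".join(f"{b:02x}" for b in value.to_bytes(6, "big"))


def recover_mac(digest: str, prefix: int, unknown_bits: int) -> Optional[str]:
    """Try every address whose high bits equal `prefix`; return the match, if any."""
    base = prefix << unknown_bits
    for suffix in range(1 << unknown_bits):
        candidate = format_mac(base | suffix)
        if hash_mac(candidate) == digest:
            return candidate
    return None


# Hypothetical device address and its published "anonymized" digest.
secret = "00:1a:2b:3c:4d:5e"
digest = hash_mac(secret)

# To keep the demo fast, assume the first 4 bytes are known (16 unknown bits).
# With only the 3-byte vendor prefix known, the search grows to 2**24 candidates,
# which is still a small space for an exhaustive search.
recovered = recover_mac(digest, prefix=0x001A2B3C, unknown_bits=16)
print(recovered == secret)
```

The same enumeration idea applies whenever the set of plausible inputs is small or highly structured; a salt that is published alongside the digests does not change this, since the attacker can include it in each trial hash.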
Document type :
Journal articles

https://hal.inria.fr/hal-01589210
Contributor : Cédric Lauradoux
Submitted on : Monday, September 18, 2017 - 12:30:17 PM
Last modification on : Tuesday, October 30, 2018 - 1:30:05 PM

Citation

Levent Demir, Amrit Kumar, Mathieu Cunche, Cédric Lauradoux. The Pitfalls of Hashing for Privacy. Communications Surveys and Tutorials, IEEE Communications Society, Institute of Electrical and Electronics Engineers, 2018, 20 (1), pp.551-565. ⟨10.1109/COMST.2017.2747598⟩. ⟨hal-01589210⟩
