The Pitfalls of Hashing for Privacy

Abstract: Boosted by recent legislation, data anonymization is fast becoming a norm. However, as of yet no generic solution has been found to safely release data. As a consequence, data custodians often resort to ad hoc means to anonymize datasets. Both past and current practices indicate that hashing is often believed to be an effective way to anonymize data. Unfortunately, in practice it is only rarely effective. This paper is a tutorial that explains the limits of cryptographic hash functions as an anonymization technique. The anonymity set is the best privacy model that can be achieved by hash functions. However, this model has several shortcomings. We provide three case studies to illustrate how hashing yields only weakly anonymized data. The case studies include MAC and email address anonymization as well as the analysis of Google Safe Browsing.
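To make the abstract's claim about MAC addresses concrete, the following is a minimal sketch, not the authors' method, of why hashing a small identifier space offers little protection. It assumes the identifier is hashed unsalted with SHA-256 as a lowercase colon-separated string, and that the attacker knows the 24-bit vendor prefix (OUI); the prefix `00:11:22`, the function names, and the sample address are all hypothetical. Under those assumptions at most 2^24 candidates remain, which can be enumerated exhaustively.

```python
import hashlib
from typing import Optional


def hash_mac(mac: str) -> str:
    """Return the hex SHA-256 digest of a MAC address string (assumed unsalted)."""
    return hashlib.sha256(mac.encode("ascii")).hexdigest()


def recover_mac(target_digest: str, oui: str, suffix_bits: int = 24) -> Optional[str]:
    """Try every address under a known vendor prefix until one hashes to the target.

    suffix_bits limits the search to the lowest 2**suffix_bits suffixes so that
    the demonstration below finishes quickly; 24 covers the whole prefix.
    """
    for n in range(1 << suffix_bits):
        # Split the counter into the three device-specific bytes.
        suffix = ":".join(f"{(n >> shift) & 0xFF:02x}" for shift in (16, 8, 0))
        candidate = f"{oui}:{suffix}"
        if hash_mac(candidate) == target_digest:
            return candidate
    return None


# Hypothetical address and prefix, chosen only for illustration.
secret = "00:11:22:00:ab:cd"
digest = hash_mac(secret)

# Search restricted to 16 bits to keep the run short.
recovered = recover_mac(digest, "00:11:22", suffix_bits=16)
print(recovered == secret)
```

The search cost depends only on the size of the input domain, not on the strength of the hash function, which is why a stronger hash alone would not change the outcome in this sketch.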
Document type:
Journal article
Communications Surveys and Tutorials, IEEE Communications Society, Institute of Electrical and Electronics Engineers, 2018, 〈10.1109/COMST.2017.2747598〉

https://hal.inria.fr/hal-01589210
Contributor: Cédric Lauradoux
Submitted on: Monday, September 18, 2017 - 12:30:17
Last modified on: Wednesday, July 18, 2018 - 09:16:57

Citation

Levent Demir, Amrit Kumar, Mathieu Cunche, Cédric Lauradoux. The Pitfalls of Hashing for Privacy. Communications Surveys and Tutorials, IEEE Communications Society, Institute of Electrical and Electronics Engineers, 2018, 〈10.1109/COMST.2017.2747598〉. 〈hal-01589210〉
