Data Anonymization as a Vector Quantization Problem: Control Over Privacy for Health Data

Abstract : This paper approaches data anonymization from a vector quantization point of view. The stated goal of this work is to provide means of anonymizing data so as to avoid the re-identification of single individuals or groups from a data set, while maintaining data integrity and structure as much as possible (and in a very specific sense). The structure of the data is first captured by clustering (with a vector quantization approach), and we propose to use the properties of this vector quantization to anonymize the data. Under some assumptions about the computations that may be performed on the data, we give a framework for identifying and “pushing back outliers in the crowd”, in this clustering sense, as well as for anonymizing cluster members while preserving cluster-level statistics and structure as defined by the assumptions (density, pairwise distances, cluster shape and members...).
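
The general idea in the abstract can be illustrated with a minimal sketch. It is not the procedure from the paper, which the abstract does not specify. It assumes plain Lloyd's k-means as the vector quantizer, treats clusters with fewer than a hypothetical `min_size` members as outliers, reassigns their members to the nearest sufficiently large cluster, and then releases the mean of each merged cluster in place of the original records (a microaggregation-style stand-in). All function and parameter names are illustrative.

```python
import math
import random


def nearest(point, centroids, allowed=None):
    """Index of the closest centroid, optionally restricted to `allowed`."""
    candidates = range(len(centroids)) if allowed is None else allowed
    return min(candidates, key=lambda j: math.dist(point, centroids[j]))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means, assumed here as the vector quantizer."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = mean(members)
    return centroids, labels


def anonymize(points, k, min_size=3, seed=0):
    """Quantize, merge undersized clusters, release cluster means."""
    centroids, labels = kmeans(points, k, seed=seed)
    sizes = [labels.count(j) for j in range(k)]
    large = [j for j in range(k) if sizes[j] >= min_size]
    if not large:
        raise ValueError("no cluster reaches min_size; lower k or min_size")
    # Outliers: members of undersized clusters are reassigned to the
    # nearest cluster that has at least min_size members.
    labels = [l if l in large else nearest(p, centroids, large)
              for p, l in zip(points, labels)]
    # Each record is replaced by the mean of its (merged) cluster, so the
    # per-cluster count and mean survive while individual values do not.
    released = {j: mean([p for p, l in zip(points, labels) if l == j])
                for j in large}
    return [released[l] for l in labels], labels
```

In this sketch every released value is shared by at least `min_size` records, and the overall mean of the data set is unchanged. Other cluster-level properties named in the abstract (density, pairwise distances, cluster shape) are not preserved by replacing records with a single mean and would need a different release step.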

https://hal.inria.fr/hal-01635008
Contributor : Hal Ifip
Submitted on : Tuesday, November 14, 2017 - 4:06:34 PM
Last modification on : Friday, August 23, 2019 - 11:12:02 AM
Long-term archiving on : Thursday, February 15, 2018 - 1:58:07 PM

File

430962_1_En_13_Chapter.pdf
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Citation

Yoan Miche, Ian Oliver, Silke Holtmanns, Aapo Kalliola, Anton Akusok, et al. Data Anonymization as a Vector Quantization Problem: Control Over Privacy for Health Data. International Conference on Availability, Reliability, and Security (CD-ARES), Aug 2016, Salzburg, Austria. pp.193-203, ⟨10.1007/978-3-319-45507-5_13⟩. ⟨hal-01635008⟩