Image Pixelization with Differential Privacy

Ubiquitous surveillance cameras and personal devices have given rise to the vast generation of image data. While sharing the image data can benefit various applications, including intelligent transportation systems and social science research, those images may capture sensitive individual information, such as license plates, identities, etc. Existing image privacy preservation techniques adopt deterministic obfuscation, e.g., pixelization, which can lead to re-identification with well-trained neural networks. In this study, we propose sharing pixelized images with rigorous privacy guarantees. We extend the standard differential privacy notion to image data, which protects individuals, objects, or their features. Empirical evaluation with real-world datasets demonstrates the utility and efficiency of our method; despite its simplicity, our method is shown to effectively reduce the success rate of re-identification attacks.


Introduction
There is a massive amount of image data captured by personal and commercial cameras nowadays. Every second, 835 photos are uploaded to Instagram [1]. Over 18,000 traffic cameras spanning more than 200 cities in the US are reported on TrafficLand [2]. Sharing image data widely would benefit various research communities. For instance, traffic images can be shared with third-party researchers to study vehicle behaviors toward intelligent transportation systems [3]; images uploaded on social media can be utilized by computer vision researchers to test their algorithms for social relation recognition [4] and early screening of mental illnesses [5]. However, publishing the aforementioned image data would raise privacy concerns: traffic cameras can capture vehicle license plates, and personal images may capture objects or text that indicate religious belief, health, habits, and location [6].
A number of studies proposed cryptography-based solutions for image sharing [7,8], retrieval [9,10], and feature extraction [11,12] using untrusted service providers. While those solutions secure the image data with encryption, they exhibit a few drawbacks which make them inapplicable in our setting. Firstly, crypto-based image sharing explicitly trusts the data recipients, i.e., does not account for malicious recipients, and usually requires a secure channel to exchange secrets/keys. It can be challenging in both efficiency and security for sharing data with a wide range of recipients. Secondly, the features computed by the untrusted server also need to be protected, such as shape positions and scale-invariant feature transform (SIFT), as those features often disclose sensitive information. Existing studies resort to more expensive cryptographic tools, such as homomorphic encryption and garbled circuit [11], or multiple independent servers [12], which potentially limit the feasibility of extracting complex features and enabling time-critical applications.
The sanitization of private content in image data has been studied in computer vision. Standard image obfuscation techniques, such as pixelization and blurring, are used by most privacy-enhancing approaches to obscure the regions-of-interest (ROIs), including faces and text. However, recent studies have shown that pixelization [13], blurring [13], and the P3 system [7] are not effective in privacy preservation. Given sufficient training data and knowledge of the obfuscation technique, various models can be built to associate the obfuscated images with the ground truth, which can be used to decode redacted documents [13] and to re-identify faces and handwritten digits [14]. Therefore, there is a need for image obfuscation methods that can provide rigorous privacy guarantees.
The goal of this study is to ensure a rigorous privacy notion, differential privacy [15], for image data sharing. By definition, the adversary cannot effectively distinguish between secrets by observing the output of a differentially private mechanism, thus privacy is protected. To the best of our knowledge, our study is the first attempt at providing differential privacy guarantees for multimedia data publication. The specific contributions of the paper are as follows:
(1) To extend the standard differential privacy notion to image data, we propose the m-neighborhood notion, which allows for the protection of any sensitive information represented by up to m pixels.
(2) Given the high sensitivity of direct image publication, we propose a pixelization-based method with grid cells of b x b pixels, to achieve a utility-privacy trade-off. We show that it provides differential privacy guarantees.
(3) We empirically evaluate the utility and efficiency of the differentially private pixelization with real-world image datasets with different resolutions. Two utility metrics are adopted to measure the absolute error and the perceptual quality, respectively. We show that our private method can yield similar output to the non-private pixelization.
(4) We simulate the re-identification attacks via deep learning and the results show that the differentially private pixelization significantly reduces the re-identification risk, even with low privacy requirements, i.e., ϵ ≥ 0.1 and m = 16.
The rest of the paper is organized as follows: Section 2 reviews recent and related literature; Sections 3 and 4 provide the preliminaries and technical details of the differentially private pixelization; Section 5 presents the empirical evaluation; Section 6 concludes the paper and states future directions.

Related Work
Image Privacy Classification. Several studies (e.g., [16,17,6]) utilized image content features to predict the privacy settings for image sharing on online social networks (OSN). In particular, those studies explored classification models to predict whether an image is private or public: private images or ROIs should not be shared publicly or with OSN providers so as to stop the flow of information. While those studies show promise to understand the sensitivity of image data, the selected features often lack interpretability, e.g., after PCA projection or deep neural network features. Moreover, the classification models may not be perfectly accurate and images classified as private will not be shared with the public, preventing further utilization.
Image Obfuscation. Two popular image obfuscation techniques are pixelization (also referred to as mosaicing) and blurring. Pixelization [13] can be achieved by superposing a rectangular grid over the original image and averaging the color values of the pixels within each grid cell. On the other hand, blurring, i.e., Gaussian blur, removes details from an image by convolving the 2D Gaussian distribution function with the image. YouTube provides its own face blur implementation [18] for video uploads. McPherson et al. [14] studied pixelization and YouTube face blur and concluded the obfuscated images using those methods can be re-identified. In addition, a secure image sharing method named P3 [7] was also studied in [14] which encrypts the significant Discrete Cosine Transform (DCT) coefficients of the image. As YouTube face blur and P3 are not available/applicable in our study, we will focus on the pixelization technique and design a quantifiable privacy model for obfuscating image data.
Differential Privacy. Differential privacy [15] has become the state-of-the-art privacy paradigm for sanitizing statistical databases. While it provides rigorous privacy guarantees for each individual data record in the database, it is challenging to apply the standard differential privacy notion to non-aggregated data. Several variants of the privacy notion have been proposed. For instance, event-level privacy [19] aims to protect the presence of individual events in one person's data when releasing aggregated data. Local privacy [20] enables answering aggregate queries without a trusted data curator. Geo-indistinguishability [21] was proposed to release anonymized locations in a trajectory by sampling according to geo-distance in a randomized fashion. Although the idea is briefly mentioned in [22], there have not been any studies on ensuring differential privacy for image data. The goal of our work is to study the feasibility of differential privacy in image data sanitization by proposing an extended privacy model and an efficient mechanism to achieve it.

Preliminaries
Setting. We consider the problem setting where a data owner wishes to share one or more images with a wide range of untrusted recipients, e.g., researchers or the greater public. The data owner must sanitize the image data prior to its publication, in order to protect the privacy of individuals or objects captured in the images.
Image Data. In the paper we focus on grayscale images: an input image I is regarded as an M x N matrix with integer values between 0 and 255 (0 is black and 255 is white). I(x, y) denotes the "pixel" value at position (x, y) in the matrix. We note that the proposed privacy model and algorithm can be extended to RGB (red-green-blue) and HSV (hue-saturation-value) representations by considering each channel separately. We assume the sensitivity of each image is independent of other images to sanitize. Therefore we defer the extension of our study to inter-dependent images, such as a sequence of video frames, to future work in Section 6.
Pixelization. The pixelization technique renders the source image using larger blocks. It is achieved by partitioning the image using a two-dimensional grid, and the average pixel value is released for each grid cell. Similar to [13], we adopt a "square" grid where the pixel width is equal to the pixel height in the grid cells, i.e., each grid cell contains b x b pixels. In general, a smaller b value yields better approximation and visual quality, as is shown in Figure 1.
Standard Differential Privacy. The widely adopted differential privacy [15] definition operates on statistical databases.
Definition 1. [ϵ-Differential Privacy] A randomized mechanism $A$ gives ϵ-differential privacy if for any neighboring databases $D_1$ and $D_2$ differing on at most one record, and for any possible output $\tilde{D} \in \mathrm{Range}(A)$,
$$\Pr[A(D_1) = \tilde{D}] \le e^{\epsilon} \cdot \Pr[A(D_2) = \tilde{D}],$$
where the probability is taken over the randomness of $A$.
The parameter ϵ specifies the degree of privacy offered by $A$, i.e., a smaller ϵ implies stronger privacy and vice versa. It has been shown [15] that ϵ-differential privacy can be achieved with the Laplace mechanism, by adding i.i.d. noise $\tilde{N}$ to a function $f$, i.e., $\tilde{f}(D) = f(D) + \tilde{N}$. Specifically, $\tilde{N}$ is drawn from a Laplace distribution with mean 0 and scale $\Delta f / \epsilon$, where $\Delta f$ denotes the global sensitivity [15], which captures the maximum difference of $f$ between any neighboring databases.
In this study, we extend the above definition to images, e.g., $I_1$ and $I_2$, and define neighboring images in the next section.
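As a minimal sketch of the Laplace mechanism described above, the following Python snippet perturbs a counting query. The function names, the example database, and the choice to sample Laplace noise as the difference of two exponential variates are illustrative assumptions rather than details taken from [15].

```python
import random

def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponential variates with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Return true_value plus Laplace noise of scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_value + laplace_noise(sensitivity / epsilon, rng)

# Hypothetical example: a counting query has global sensitivity 1, because
# adding or removing one record changes the count by at most 1.
rng = random.Random(0)
database = [23, 35, 41, 58, 62]  # hypothetical ages
true_count = sum(1 for age in database if age >= 40)
noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5, rng=rng)
```

A smaller ϵ enlarges the noise scale, which is the privacy-utility trade-off noted above.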

Differentially Private Pixelization
In this section we first propose the notion of neighborhood for image data, and then describe an effective privacy-preserving image publication algorithm.
Privacy Model. The concept of "neighboring images" is the key to the differential privacy notion, which should clearly define the private content under the protection of differential privacy. In this paper, we propose the following notion of image neighborhood.
Definition 2. [m-Neighborhood] Two images $I_1$ and $I_2$ are neighboring images if they have the same dimension and they differ by at most m pixels.
Allowing up to m pixels to differ enables us to protect the presence or absence of any object, text, or person, represented by those pixels in an image. For instance, each red rectangle in Figure 2a illustrates sensitive information which can be represented by ∼ 360 pixels, such as a pedestrian, a van, an object on grass, and a signage. One example neighboring image is shown in Figure 2b, differing only at the left-most pedestrian. By differential privacy, an adversary cannot distinguish between any pair of neighboring images by observing the output image. The privacy of the pedestrian, and any other sensitive information represented by at most m pixels, can thus be protected. The m-Neighborhood notion can also be applied to protect features of an object or person. For instance, the rectangle in Figure 2c contains ∼ 120 pixels and encloses the area of the eyes which is reportedly the optimal feature for a range of face recognition tasks [24].
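The m-neighborhood relation can be checked mechanically. The sketch below is one hypothetical way to do so for grayscale images stored as lists of rows; the function names and the toy images are assumptions made for illustration.

```python
def count_differing_pixels(img1, img2):
    """Count the positions at which two equally sized images differ."""
    return sum(p != q for row1, row2 in zip(img1, img2) for p, q in zip(row1, row2))

def are_m_neighbors(img1, img2, m):
    """Return True if the images have the same dimension and differ in at most m pixels."""
    same_shape = len(img1) == len(img2) and all(
        len(row1) == len(row2) for row1, row2 in zip(img1, img2)
    )
    if not same_shape:
        return False
    return count_differing_pixels(img1, img2) <= m

# Toy example: a 4 x 6 black image and a copy in which a 2 x 3 patch
# (6 pixels) has been painted white.
base = [[0] * 6 for _ in range(4)]
edited = [row[:] for row in base]
for x in range(2):
    for y in range(3):
        edited[x][y] = 255
```

Under this check, `base` and `edited` are neighbors for any m of at least 6, so a mechanism satisfying the privacy model with such an m must produce nearly indistinguishable outputs for the two images.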
When adopting the above definition, the data owner can choose an appropriate m value in order to customize the level of privacy protection, i.e., achieving indistinguishability in a smaller or larger range of neighboring images. We assume that removing those pixels is sufficient to protect the privacy of the underlying information, by definition of differential privacy [15]. Another advantage of our proposed privacy model is that it does not require annotated or detected sensitive regions-of-interest (ROIs); rather, we sanitize the given image to protect any ROI of at most m pixels.
A straightforward application of differential privacy is to apply Laplace perturbation to each pixel. As up to m pixels can change and each pixel can change by at most 255, the global sensitivity of direct image perturbation is very high, i.e., $\Delta I = 255m$, leading to high perturbation noise. Therefore, we propose differentially private pixelization, which achieves differential privacy while reducing the amount of perturbation noise added to the image.
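To illustrate why per-pixel perturbation is impractical, the following back-of-the-envelope sketch computes the Laplace scale for direct image perturbation, i.e., the sensitivity 255m divided by ϵ. The values m = 16 and ϵ = 0.5 are illustrative choices, and the function name is hypothetical.

```python
def direct_perturbation_scale(m, epsilon, max_pixel=255):
    """Laplace scale when noise is added to every pixel directly (sensitivity 255 * m)."""
    return max_pixel * m / epsilon

scale = direct_perturbation_scale(m=16, epsilon=0.5)
ratio_to_pixel_range = scale / 255

print(scale)                 # 8160.0
print(ratio_to_pixel_range)  # 32.0
```

Since the expected magnitude of a Laplace sample equals its scale, the typical noise in this setting would be about 32 times the full pixel range of 255.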
Differentially Private Pixelization (Pix). In a nutshell, our algorithm first performs pixelization on an input image, and then applies Laplace perturbation to the pixelized image. Specifically, let $c_k$ denote the $k$-th grid cell over an M x N image, and let $K$ be the total number of grid cells, as shown in Figure 3. The pixelization of an image $I$ can be denoted as a vector of length $K$, i.e.,
$$P_b(I) = \Big( \frac{1}{b^2} \sum_{(x,y) \in c_k} I(x, y) \Big)_{k=1}^{K}.$$
The global sensitivity of $P_b$ is thus
$$\Delta P_b = \max_{I_1, I_2} \lVert P_b(I_1) - P_b(I_2) \rVert_1 = \frac{255m}{b^2},$$
as the difference between any two pixels is at most 255 and up to m pixels can differ between any neighboring images $I_1$ and $I_2$.
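Proof. Since $\Delta P_b = \frac{255m}{b^2}$, applying the Laplace mechanism to $P_b$, i.e., adding i.i.d. Laplace noise with scale $\frac{255m}{b^2 \epsilon}$ to each grid cell value, achieves ϵ-differential privacy by definition [15].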
Note that each pixel in the perturbed output $\tilde{P}_b(I)$ is truncated to the range [0, 255]. This post-processing of $\tilde{P}_b$ does not affect its privacy guarantee.
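The steps of this section (pixelize, perturb, truncate) can be sketched as follows. This is a simplified illustration, not the implementation evaluated in the paper: it assumes both image dimensions are multiples of b, represents a grayscale image as a list of rows, returns one value per grid cell rather than an upsampled image, and samples Laplace noise as the difference of two exponential variates. All function names are hypothetical.

```python
import random

def pixelize(image, b):
    """Average the pixel values within each b x b grid cell (non-private)."""
    rows, cols = len(image), len(image[0])
    if rows % b or cols % b:
        raise ValueError("this sketch assumes both dimensions are multiples of b")
    cells = []
    for top in range(0, rows, b):
        cell_row = []
        for left in range(0, cols, b):
            total = sum(image[x][y]
                        for x in range(top, top + b)
                        for y in range(left, left + b))
            cell_row.append(total / (b * b))
        cells.append(cell_row)
    return cells

def dp_pixelize(image, b, m, epsilon, rng):
    """Pixelize, add Laplace noise of scale 255m / (b^2 * epsilon), clip to [0, 255]."""
    scale = 255.0 * m / (b * b * epsilon)
    noisy = []
    for cell_row in pixelize(image, b):
        noisy_row = []
        for value in cell_row:
            # Difference of two exponentials with mean `scale` is Laplace(0, scale).
            noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
            # Truncation is post-processing and keeps the privacy guarantee.
            noisy_row.append(min(255.0, max(0.0, value + noise)))
        noisy.append(noisy_row)
    return noisy

rng = random.Random(0)
image = [[(x * 16 + y) % 256 for y in range(32)] for x in range(32)]  # synthetic 32 x 32 image
private = dp_pixelize(image, b=16, m=16, epsilon=0.5, rng=rng)        # 2 x 2 grid of cell values
```

To display the result at the original resolution, each cell value would be repeated over its b x b block; that step is omitted here for brevity.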

Experiments
Below we present the empirical evaluation of differentially private pixelization.
Datasets: We considered the Multiple Object Tracking Benchmark [25], which contains video frame sequences widely used in the MOT community. Among those, two datasets were adopted in this study: the PETS dataset, i.e., PETS09-S2L1, showing walking pedestrians on a university campus with 795 images at 768 x 576 resolution; and the Venice dataset, i.e., Venice-2, showing walking pedestrians ... Table 1, unless specified otherwise.
The utility of our method can be measured by the standard Mean Square Error (MSE), which is defined between the input image and the sanitized image. We also adopted a widely used perceptual quality measure named Structural Similarity (SSIM) [27], which considers the perceived similarity in structural information in addition to luminance and contrast. One example of SSIM's advantage over MSE is that an image derived by subtracting a constant value from every pixel of the input image would exhibit high structural similarity to the input despite a significant absolute error. For this reason, both utility measures were evaluated. In each experiment, we report the average result over all the images in each dataset.
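The constant-shift example above can be made concrete for MSE with a small sketch. The function name and the 2 x 2 toy image are assumptions for illustration; SSIM is not implemented here, since it requires local statistics that are better taken from an established image-processing library.

```python
def mean_square_error(original, sanitized):
    """Mean of squared pixel differences between two equally sized grayscale images."""
    n = sum(len(row) for row in original)
    total = sum((p - q) ** 2
                for row1, row2 in zip(original, sanitized)
                for p, q in zip(row1, row2))
    return total / n

original = [[100, 150], [200, 250]]
shifted = [[p - 50 for p in row] for row in original]  # subtract a constant from every pixel

mse = mean_square_error(original, shifted)
print(mse)  # 2500.0
```

The shifted image preserves every edge and gradient of the original, yet its MSE is as large as the squared shift, which is why a structural measure is reported alongside MSE.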

Impact of b
We first varied the grid cell length b to empirically evaluate its impact on the utility of the sanitized image. Note that in addition to our differentially private method Pix, we included the non-private pixelization method, i.e., Pix np, which is parameterized with the same b value, as a reference for utility. As can be seen, by increasing b, the non-private baseline yields a higher MSE and a lower SSIM in each dataset, as a result of the coarser approximation by pixelization. SSIM drops significantly from b = 2 to b = 6. On the other hand, our private method generates higher-utility images when b increases, approaching the utility of the non-private baseline. This is due to a lower Laplace perturbation error, the magnitude of which is governed by $\frac{255m}{b^2}$. As shown in Figure 4, our private method outputs an image that closely resembles the non-private pixelization, except for a few grid cells. Note that in Figure 5d Pix shows an increasing trend in MSE for 2 ≤ b ≤ 12. The reason is that MNIST depicts white (255) digits on a black (0) background, and when b is small the large Laplace noise does not significantly affect those extreme pixel values.
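The dependence of the perturbation on b can be illustrated numerically. The sketch below evaluates the Laplace scale 255m / (b²ϵ) for a few grid cell lengths; m = 16 and ϵ = 0.5 are illustrative values, and the function name is hypothetical.

```python
def pix_noise_scale(b, m, epsilon):
    """Laplace scale applied to each grid cell value: 255 * m / (b^2 * epsilon)."""
    return 255.0 * m / (b * b * epsilon)

scales = {b: pix_noise_scale(b, m=16, epsilon=0.5) for b in (2, 4, 8, 16)}
for b, scale in scales.items():
    print(b, scale)
# 2 2040.0
# 4 510.0
# 8 127.5
# 16 31.875
```

Doubling b reduces the noise scale by a factor of four, so the perturbation error shrinks quickly while the pixelization itself becomes coarser.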

Impact of m
In the following experiment, we varied m, the number of pixels allowed to change between any pair of neighboring images, characterizing the indistinguishability requirements of the differentially private method. Intuitively, a larger m value ensures indistinguishability on a wider range of images, hence stronger privacy. The utility results are depicted in Figures 7 and 8. Note that the non-private method Pix np is not affected by the variation of m values. As m increases, the utility of our private method Pix drops, as the Laplace perturbation noise is larger. This shows the trade-off between utility and privacy. For the MNIST dataset, we observe a lower MSE when m > 32 in Figure 7d. The increased Laplace perturbation noise "helped" with sharing images that are composed of black and white pixels. However, the increased privacy requirement manifests more clearly in the perceptual quality, i.e., a steady decreasing trend in Figure 10d, as SSIM captures the image structural information in addition to pixel values. To further illustrate the utility of the differentially private pixelization, sample images generated under the default parameter setting are provided in Table 4. As can be seen, for images of larger size, e.g., the PETS and Venice datasets, setting b = 16 and m = 16 would allow the viewer to recognize the street scene and the number of pedestrians in the sanitized images. For smaller images, e.g., the AT&T and MNIST datasets, the pixelization grid size b = 16 yields a very coarse approximation, and with m = 16 the private perturbation mechanism inflicts a higher visual quality loss, due to the smaller image sizes. Therefore, m can be adjusted by the user of our private method depending on the input image size and the privacy requirement.
However, we note that when any obfuscation is applied to faces and digits, the goal is usually to reduce the identifiability of the resulting image; the example AT&T and MNIST images show promising visual results of our method.

Impact of ϵ
We also studied the impact on utility of varying the privacy parameter ϵ. Intuitively, a lower ϵ value ensures stronger privacy and yields lower utility. As can be seen in Figures 9 and 10, our private method Pix shows a lower MSE and a higher SSIM when increasing ϵ. An expected exception is observed for the MNIST dataset in Figure 9d, where smaller ϵ values, e.g., 0.1, can benefit sharing extreme pixel values. Again, the SSIM measure is shown to be more robust than MSE, exhibiting a consistently increasing trend when ϵ increases in Figure 10d.

Runtime
Another important performance index is the efficiency of the proposed method. To this end, we summarized the average runtime to process one image in each dataset in Table 2. As can be seen, our private method is very efficient, taking only 66 milliseconds to sanitize a 1920 x 1080 image. In every dataset, the processing time per pixel is on the order of $10^{-5}$ milliseconds.
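As a quick check of the per-pixel figure, the reported runtime for the 1920 x 1080 image can be divided by its pixel count; only the numbers stated above are used.

```latex
\frac{66\ \text{ms}}{1920 \times 1080\ \text{pixels}}
  = \frac{66}{2{,}073{,}600}\ \text{ms/pixel}
  \approx 3.2 \times 10^{-5}\ \text{ms/pixel}
```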

Mitigation of CNN Attacks
While differential privacy provides a rigorous indistinguishability guarantee, we conducted a study similar to [14] in order to understand whether the differentially private pixelization can mitigate intelligent re-identification attacks. For this study, we partitioned the 10 images for each individual in the AT&T dataset (40 individuals in total) by randomly selecting 8 images for training and using the remaining 2 for testing, as in [14]. The MNIST dataset is pre-partitioned with 50,000 images for training and 10,000 for testing. We assume the adversary has access to the training set obfuscated by a given method, as well as the label of each training image, i.e., individual identity (1-40) and digit (0-9). The goal of the re-identification attack is to breach the privacy of the testing set, i.e., to predict the label of each testing image produced by the same obfuscation method. In this study, we compared our differentially private pixelization with a random guessing baseline and the non-private pixelization method, i.e., mosaicing. The random guessing method predicts the label of a testing image by randomly picking a label, without considering the training set. Our method was applied with the default parameter values, i.e., b = 16 and m = 16, when varying ϵ. We generated the training set and testing set for each ϵ value. A convolutional neural network (CNN) was trained for each dataset with the architecture suggested in [14]. We reported the classification results of our differentially private method in Table 3. The results for "Mosaicing" were taken from the original study [14]. As can be seen, with the same grid cell length b = 16, our differentially private method significantly reduces the attack success rate compared to the non-private method. For the AT&T dataset, recall that with ϵ = 0.5 the differentially private pixelization yields similar output to that of the non-private method, as illustrated in Figure 4. However, the re-identification risk is lowered by more than 52 percentage points, from 96.25% to 43.75%, thanks to the randomized mechanism. As for the MNIST dataset, our private method also significantly reduces the success rate of the attack. Dominated by black and white pixels and at a lower resolution, the re-identification risk of MNIST images is less sensitive to the privacy parameter ϵ. It is worth mentioning that when ϵ = 0.1, our private method is very hard to breach, and the risk is close to that of random guessing.
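The reported rates can be put in context with a short calculation. The sketch below assumes that random guessing picks labels uniformly, which gives an expected accuracy of one over the number of labels, and converts the two AT&T success rates into counts over the 80 test images (40 individuals with 2 test images each). The function and variable names are hypothetical.

```python
def random_guess_accuracy(num_labels):
    """Expected accuracy (in percent) of uniform random guessing over num_labels labels."""
    return 100.0 / num_labels

att_baseline = random_guess_accuracy(40)    # AT&T: 40 identities
mnist_baseline = random_guess_accuracy(10)  # MNIST: 10 digits

mosaic_rate, pix_rate = 96.25, 43.75        # AT&T success rates quoted in the text (percent)
absolute_drop = mosaic_rate - pix_rate      # difference in percentage points

test_images = 40 * 2
mosaic_hits = round(mosaic_rate / 100 * test_images)
pix_hits = round(pix_rate / 100 * test_images)

print(att_baseline, mnist_baseline)  # 2.5 10.0
print(absolute_drop)                 # 52.5
print(mosaic_hits, pix_hits)         # 77 35
```

In other words, the attack correctly re-identifies 77 of 80 mosaiced AT&T test images but 35 of 80 under the private method with ϵ = 0.5, while uniform guessing would be expected to get 2 of 80.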

Conclusion and Discussion
We have presented a private image pixelization method, which is the first attempt at extending differential privacy to image data publication. We proposed the m-neighborhood notion to define the indistinguishability requirement, i.e., roughly the same output for any images differing in up to m pixels. Given the high sensitivity of direct image perturbation, pixelization with grid cells of b x b pixels was adopted to achieve a utility-privacy trade-off. We empirically evaluated the utility and efficiency of differentially private pixelization with multiple real-world image datasets, and showed that our private method can yield similar output to that of the non-private pixelization. In addition, an intelligent re-identification attack was simulated and the results showed that differentially private pixelization significantly reduces the attack success rate even at low privacy requirements, i.e., ϵ ≥ 0.1 and m = 16. Therefore, we conclude that our method is simple yet powerful.
As a new research endeavor, a number of directions can be explored for future work: 1) the design of post-processing techniques to further improve the utility of the differentially private method, e.g., removing sharp differences; 2) the study of application-specific utility such as crowd and vehicle counting; 3) the evaluation of human users on the perceived privacy and utility; 4) the extension to correlated images, e.g., video frame sequences.

Table 4: First row lists sample images in each dataset and second row is the corresponding differentially private pixelization, under the default parameter setting. Note that when obfuscation is applied to faces in AT&T and digits in MNIST, the desired outcome is to reduce identifiability.