Using Neural Networks for Fake Colorized Image Detection

Modern colorization techniques can create artificially-colorized images that are indistinguishable from natural color images. As a result, the detection of fake colorized images is attracting the interest of the digital forensics research community. This chapter tackles the challenge by introducing a detection approach that leverages neural networks. It analyzes the statistical differences between fake colorized images and their corresponding natural images, and shows that significant differences exist. A simple, but effective, feature extraction technique is proposed that utilizes cosine similarity to measure the overall similarity of normalized histogram distributions of various channels for natural and fake images. A special neural network with a simple structure but good performance is trained to detect fake colorized images. Experiments with datasets containing fake colorized images generated by three state-of-the-art colorization techniques demonstrate the performance and robustness of the proposed approach.


Introduction
Digital image forensics is the process of collecting, identifying, analyzing and presenting evidence derived from digital image resources [2,6]. Rapid advancements in image tampering techniques have made it increasingly difficult to distinguish between natural and fake images. Farid [6] divides image tampering techniques into six categories: (i) compositing; (ii) morphing; (iii) re-touching; (iv) enhancing; (v) computer generating; and (vi) painting. While these categories cover most image tampering techniques, other more specific image tampering techniques such as colorization [13] and splicing [1,5] have been proposed. Colorization is the process of transforming grayscale images to colorized images by adding color features. Colorization techniques are frequently used to add color to grayscale photographs or black-and-white films to restore historical scenes. These techniques are also used to colorize black-and-white CT, X-ray and MRI images to enhance medical diagnosis and treatment. However, colorization can also be used for malicious purposes, for example, to create doctored photographs and videos that appear legitimate to the naked eye.
Guo et al. [8] were among the first researchers to focus on fake colorized image detection. They proposed two classification techniques, FCID-HIST and FCID-FE, that rely on support vector machines [3]. Difficulties in choosing appropriate kernel functions for the support vector machines limit the performance of the techniques. Additionally, the computing resources required by support vector machines render them infeasible for large datasets.
To address these challenges, this research employs the ColorDet-NN neural network [15] to detect fake colorized images. Figure 1 shows the ColorDet-NN approach. An initial feature analysis step compares the statistical differences in the color distributions of natural images and fake colorized images. A feature extraction step then captures contributing features from raw image data for pre-processing. The final training step utilizes the extracted features to create a ColorDet-NN neural network that detects fake colorized images.

Background
Colorization is the process of transforming grayscale images to colorized images by adding color features. Several colorization techniques have been proposed over the past two decades. The colorization techniques differ in how they obtain and handle the data used to model the correspondence between grayscale and colorized images. As a result, colorization techniques are broadly divided into three categories: (i) scribble-based; (ii) transfer-based; and (iii) fully automated.
Scribble-based methods require users to specify the colors in grayscale images in advance based on their experience. The first scribble-based method, developed by Levin et al. [16], utilizes a quadratic cost function of the differences between a pixel and its neighboring pixels under the assumption that adjacent pixels with similar intensities should have similar colors. Several researchers have developed more effective techniques. For example, Luan et al. [17] have developed an interactive system for colorizing natural images that uses texture similarity to obtain effective color propagation. Sykora et al. [20] have created a flexible, interactive tool for painting hand-drawn cartoons. However, these techniques rely on, and are therefore limited by, the user's experience, and require a large number of experiments to achieve good performance.
Transfer-based colorizing techniques establish mappings between reference colorized images and grayscale images, following which they transfer colors to the target grayscale images from analogous regions of the reference colorized images. Reinhard et al. [18] have done pioneering research on transferring colors between images. Ironi et al. [12] have presented a novel color transfer technique that analyzes the low-level feature space using a robust supervised classification scheme. However, in transfer-based colorization, the choice of appropriate reference colorized images is crucial to obtaining good performance.
Several researchers have applied deep learning techniques [14] to colorization. These fully-automated techniques have better performance than scribble-based and transfer-based methods. Larsson et al. [13] have developed a fully-automated image colorization technique that predicts per-pixel color histograms utilizing low-level and semantic representations. Iizuka et al. [11] have employed a neural network that combines global priors and local image features to automatically colorize grayscale images. Zhang et al. [22] have proposed a fully-automated technique that increases the diversity of colors in images by posing colorization as a classification problem.
Guo et al. [8] were among the first researchers to leverage machine learning to detect fake colorized images. They proposed two classification methods, FCID-HIST and FCID-FE, that compute the statistical differences in the hue, saturation, dark and bright channels in different ways; they then employ support vector machines to distinguish between natural and fake colorized images.

Detection Methodology
Research in deep learning has significantly enhanced colorization techniques. It has become very difficult for humans to distinguish fake colorized images from natural images. The proposed ColorDet-NN approach for detecting fake colorized images effectively analyzes the statistical differences between natural images and fake colorized images generated by three state-of-the-art techniques developed by: (i) Larsson et al. [13]; (ii) Iizuka et al. [11]; and (iii) Zhang et al. [22]. The ColorDet-NN neural network is then trained to detect fake colorized images.

Statistical Analysis and Testing
Statistical differences exist in the color distributions of natural images and fake colorized images. The RGB color space is defined by three chromaticities of the red, green and blue primary color channels (range is from 0 to 255), which can produce any chromaticity in the triangle defined by the primary colors. The HSV color space is an alternative representation of the RGB color space, which has hue, saturation and value channels. The RGB color space has more redundant information, which leads to insufficient feature differentiation. Therefore, the HSV color space is employed to obtain more features.
Normalized histograms were computed for the red, green, blue, hue, saturation and value channels in 10,000 natural images from the ImageNet LSVRC 2012 Validation Set [19]. The corresponding fake colorized images were generated using the colorization techniques of Larsson et al. [13], Iizuka et al. [11] and Zhang et al. [22].
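The per-channel histogram computation can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments: the function names are hypothetical, the pixels are assumed to be available as (r, g, b) tuples, and quantizing the hue, saturation and value channels into 256 bins is an assumption made here so that all six channels share one histogram length.

```python
import colorsys

BINS = 256  # assumed number of bins for every channel


def normalized_histogram(values):
    """Return a histogram of integer values in 0..BINS-1 that sums to 1."""
    counts = [0] * BINS
    for v in values:
        counts[v] += 1
    total = float(len(values))
    return [c / total for c in counts]


def channel_histograms(pixels):
    """Compute normalized histograms for the R, G, B, H, S and V channels.

    `pixels` is a list of (r, g, b) tuples with integer components in 0..255.
    """
    channels = {name: [] for name in "rgbhsv"}
    for r, g, b in pixels:
        # colorsys expects and returns floats in the range [0, 1].
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        for name, value in zip("rgb", (r, g, b)):
            channels[name].append(value)
        for name, value in zip("hsv", (h, s, v)):
            # Assumption: map [0, 1] onto the bin indices 0..BINS-1.
            channels[name].append(min(BINS - 1, int(value * BINS)))
    return {name: normalized_histogram(vals) for name, vals in channels.items()}
```

Pooling the pixels of all the natural images (or of all the fake images from one colorization technique) before calling `channel_histograms` yields one distribution per channel for that image set.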
The absolute differences between the distribution values of natural images and those of fake colorized images divided by the distribution values of the natural images were computed for the red, green, blue, hue, saturation and value channels. Table 1 shows the maximum values of the percentages obtained for the six channels. Clearly, a statistical difference exists in each channel between the natural and fake colorized images generated by each of the three colorization techniques.

Table 3. Two-sample Kolmogorov-Smirnov test results for the HSV channels.

Colorization Technique    Hue    Saturation    Value
Larsson et al. [13]       1      1             1
Iizuka et al. [11]        1      1             1
Zhang et al. [22]         0      1             0

Note that significant differences exist in the saturation channel. In this channel, all the percentages are more than 1,600%, which means that significant color biases exist at some channel values between the natural and fake colorized images. In addition, the minimum percentage reached 32%, which suggests that there are statistical differences that can be utilized for detection.
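The percentage measure described above (the absolute difference between the natural and fake distribution values, divided by the natural value, maximized over a channel) can be sketched as follows. The function name is hypothetical, and skipping bins where the natural histogram is zero is an assumption made here to avoid division by zero.

```python
def max_relative_difference(hist_natural, hist_fake):
    """Largest value of |natural - fake| / natural over all bins, in percent.

    Bins where the natural histogram is zero are skipped (an assumption
    made for this sketch).
    """
    ratios = [
        abs(n - f) / n
        for n, f in zip(hist_natural, hist_fake)
        if n > 0
    ]
    return 100.0 * max(ratios)
```

A value far above 100% arises when a bin that is rare in natural images is common in fake images. For example, a bin holding 1% of the natural distribution and 18% of the fake distribution gives a relative difference of 1,700%.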
The two-sample Kolmogorov-Smirnov test [7] is employed to determine whether the distributions of natural images and fake colorized images are different. The test checks whether the two data samples have the same distributions in order to measure their differences.
The null hypothesis $H_0$ is defined as: The two data samples satisfy the same distribution.
Let $KSTest_m^c$ be the two-sample Kolmogorov-Smirnov test result between the distribution of natural images and the distribution of fake colorized images generated by a colorization method $m$ in channel $c$. Then, the null hypothesis is rejected at the 0.05 level of significance if $KSTest_m^c = 1$. Tables 2 and 3 show that at least one channel will reject the null hypothesis for each colorization method in each color space. In the case of fake colorized images generated using the technique of Iizuka et al. [11], the red, green, blue, hue, saturation and value channels all reject the null hypothesis. On the other hand, for fake colorized images generated using the technique of Zhang et al. [22], only the blue and saturation channels reject the null hypothesis, but this still means that the features of at least two channels can be used to distinguish between natural and fake colorized images. Simply put, there are statistical differences in the color distributions of natural and fake colorized images.
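A two-sample Kolmogorov-Smirnov test that returns 1 when the null hypothesis is rejected can be sketched as follows. The text does not spell out how the samples are formed, so this sketch assumes that each sample is a list of channel values; the function name is hypothetical and the decision uses the standard large-sample critical value. Libraries such as SciPy offer the same test as `scipy.stats.ks_2samp`.

```python
import math
from bisect import bisect_right


def ks_two_sample(sample_a, sample_b, alpha=0.05):
    """Two-sample Kolmogorov-Smirnov test with the asymptotic critical value.

    Returns (statistic, reject), where reject is 1 if the null hypothesis of
    a common distribution is rejected at level `alpha` and 0 otherwise.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    # Largest gap between the two empirical CDFs; it is attained at a data point.
    d = max(
        abs(bisect_right(a, x) / n - bisect_right(b, x) / m)
        for x in a + b
    )
    # Critical value: c(alpha) * sqrt((n + m) / (n * m)), c(0.05) is about 1.358.
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return d, int(d > critical)
```

Running the function once per channel and colorization method would produce a grid of 0/1 results of the kind reported for the RGB and HSV channels.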

Feature Extraction
The statistical differences in the red, green, blue, hue, saturation and value channels are used for feature extraction.
Specifically, to distinguish between natural and fake colorized images, the following six features are employed: (i) red channel feature $F_r$; (ii) green channel feature $F_g$; (iii) blue channel feature $F_b$; (iv) hue channel feature $F_h$; (v) saturation channel feature $F_s$; and (vi) value channel feature $F_v$.
Each channel feature is computed in a similar manner. For each feature $F_{ch}$, let $Hist_{n,ch}^{total}$ denote the normalized histogram distribution for all the natural images for channel $ch$. Let $Hist_{ch}^{\alpha}$ denote the $ch$ channel histogram distribution for an input image $\alpha$. The feature computation also leverages the first-order derivatives of the normalized channel histogram distributions: $Deri_{n,ch}^{total}$ for the natural images and $Deri_{ch}^{\alpha}$ for an input image $\alpha$, which is computed from $Hist_{ch}^{\alpha}$. Here, $Hist_{ch}^{\alpha}(i)$ and $Deri_{ch}^{\alpha}(i)$ denote the components of the vectors $Hist_{ch}^{\alpha}$ and $Deri_{ch}^{\alpha}$, respectively.

Since a natural image has a closer similarity to the natural image distributions than a fake colorized image, the cosine similarity $\cos$ is used to measure the overall similarity between $Hist_{ch}^{\alpha}$ and $Hist_{n,ch}^{total}$ and between $Deri_{ch}^{\alpha}$ and $Deri_{n,ch}^{total}$:

$$\cos(A, B) = \frac{\sum_{i} A_i B_i}{\sqrt{\sum_{i} A_i^2}\,\sqrt{\sum_{i} B_i^2}}$$

where $A_i$ and $B_i$ are components of vectors $A$ and $B$, respectively. The features $F_r^{\alpha}$, $F_g^{\alpha}$, $F_b^{\alpha}$, $F_h^{\alpha}$, $F_s^{\alpha}$ and $F_v^{\alpha}$ for input image $\alpha$ are computed from these cosine similarities. After all the features are obtained, the feature vector $F_{HIST}^{\alpha}$ for an input image $\alpha$ is:

$$F_{HIST}^{\alpha} = [F_r^{\alpha}, F_g^{\alpha}, F_b^{\alpha}, F_h^{\alpha}, F_s^{\alpha}, F_v^{\alpha}]$$

Let $L_{HIST}^{\alpha}$ denote the binary label of $F_{HIST}^{\alpha}$. $L_{HIST}^{\alpha}$ has a value of one if input image $\alpha$ is a fake colorized image and a value of zero if input image $\alpha$ is a natural image.

Thus, the final detection data $D_{HIST}^{\alpha}$ is:

$$D_{HIST}^{\alpha} = (F_{HIST}^{\alpha}, L_{HIST}^{\alpha})$$
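The similarity computations behind each channel feature can be sketched as follows. Two points are assumptions rather than details taken from the text: the first-order derivative is approximated by a forward difference between adjacent histogram bins, and the two similarities are returned separately because the exact way they are combined into a single channel feature is not reproduced here. All function names are hypothetical.

```python
import math


def first_difference(hist):
    """Discrete first-order derivative (forward difference; an assumption)."""
    return [hist[i + 1] - hist[i] for i in range(len(hist) - 1)]


def cosine_similarity(a, b):
    """cos(A, B) = sum(A_i * B_i) / (||A|| * ||B||)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def channel_similarities(hist_image, hist_natural_total):
    """Return the histogram and derivative similarities for one channel."""
    sim_hist = cosine_similarity(hist_image, hist_natural_total)
    sim_deri = cosine_similarity(
        first_difference(hist_image), first_difference(hist_natural_total)
    )
    return sim_hist, sim_deri
```

An image whose channel histogram matches the natural reference distribution yields similarities close to one, whereas a color-biased image yields smaller values.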

Neural Network Construction
An artificial neural network is an algorithm that models computations using graphs of artificial neurons, mimicking how neurons work in the brain. Artificial neural networks are well-suited to solving complex nonlinear problems. Unlike traditional machine learning algorithms such as support vector machines, artificial neural networks have flexible structures that can be adapted according to the problem that is to be solved. This work uses an artificial neural network to differentiate natural images from fake colorized images. The artificial neural network employed for detecting fake colorized images is based on the dense convolutional network (DenseNet) model [10]. DenseNet has a relatively simple structure, in which every layer of the network is connected to every other layer in a feed-forward manner. Compared with other neural network models, DenseNet strengthens feature propagation while reducing the number of parameters.

Figure 2 shows the structure of the neural network used for fake colorized image detection. The neural network has six layers: an input layer, an output layer and four hidden layers. Each hidden layer is fully connected to the previous layers. For each hidden layer, the input of the layer is the sum of the outputs of the other hidden layers.
In the relationships among the hidden layers, $X_i$ and $Y_i$ denote the input and output of layer $i$, respectively. The selection of an appropriate activation function is an important aspect when designing a neural network. The proposed technique employs a parametric rectified linear unit (PReLU) [9], an activation function with parameters that can be trained. This activation function is used in the hidden layers of the network. Table 4 shows the details of the neural network. Hidden layers 1 through 3 have 32 neurons each whereas hidden layer 4 has 128 neurons.
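The hidden-layer structure can be sketched as a forward pass in plain Python. The layer sizes (32, 32, 32 and 128) and the six input features come from the text; everything else is an assumption made for illustration: the class and function names are hypothetical, the weights are random and untrained, the first hidden layer is assumed to receive the feature vector, each later hidden layer is assumed to receive the element-wise sum of the outputs of all earlier hidden layers, and the output layer is omitted.

```python
import random


def prelu(x, a):
    """Parametric ReLU: x if x > 0, otherwise a * x (a is a trainable slope)."""
    return x if x > 0 else a * x


class DenseLayer:
    """Fully connected layer followed by PReLU with one slope per neuron."""

    def __init__(self, n_in, n_out, rng):
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)
        ]
        self.biases = [0.0] * n_out
        self.slopes = [0.25] * n_out  # common initial value for PReLU slopes

    def forward(self, x):
        return [
            prelu(sum(w * v for w, v in zip(row, x)) + b, a)
            for row, b, a in zip(self.weights, self.biases, self.slopes)
        ]


def forward_hidden(features, layers):
    """Assumed wiring: layer 1 sees the features; layer i > 1 sees the
    element-wise sum of the outputs of hidden layers 1..i-1."""
    outputs = []
    for i, layer in enumerate(layers):
        x = features if i == 0 else [sum(vals) for vals in zip(*outputs)]
        outputs.append(layer.forward(x))
    return outputs[-1]


rng = random.Random(0)
hidden = [
    DenseLayer(6, 32, rng),
    DenseLayer(32, 32, rng),
    DenseLayer(32, 32, rng),
    DenseLayer(32, 128, rng),
]
deep_feature = forward_hidden([0.9, 0.8, 0.95, 0.7, 0.6, 0.85], hidden)
```

Summing the earlier outputs keeps the input width of hidden layers 2 through 4 at 32, which is why the first three hidden layers share the same size in this sketch.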
The joint supervision of the softmax loss function and center loss function [21] was used to train the neural network. The softmax loss function is one of the most widely used loss functions. The center loss function has been demonstrated to minimize intra-class variations while keeping the features of different classes separable.
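The softmax loss function $L_S$ is:

$$L_S = -\sum_{i=1}^{m} \log \frac{e^{W_{y_i}^{T} x_i + b_{y_i}}}{\sum_{j=1}^{n} e^{W_j^{T} x_i + b_j}}$$

where $x_i$ is the $i$th deep feature, which belongs to the class $y_i$; $W_j$ and $b_j$ are the weights and bias of the last fully connected layer for class $j$; $m$ is the mini-batch size; and $n$ is the number of classes. The center loss function $L_C$ is:

$$L_C = \frac{1}{2} \sum_{i=1}^{m} \left\| x_i - c_{y_i} \right\|_2^2$$

where $c_{y_i}$ is the center of the deep features of class $y_i$ and is updated as the deep features change.

The final loss function, which combines the two loss functions under joint supervision, is:

$$L = L_S + \lambda L_C$$

where $\lambda$ is a scalar that balances the two loss functions.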
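The joint supervision can be sketched numerically as follows, assuming the standard definitions from the center loss formulation [21]: the softmax loss is the summed negative log-probability of the true class, the center loss is half the summed squared distance between each deep feature and its class center, and the two are combined with a balancing weight. The function names and the value of `lam` are illustrative assumptions, and the center update step is omitted.

```python
import math


def softmax_loss(logits_batch, labels):
    """Sum over the mini-batch of -log softmax probability of the true class."""
    total = 0.0
    for logits, y in zip(logits_batch, labels):
        peak = max(logits)  # subtract the maximum for numerical stability
        log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum - logits[y]
    return total


def center_loss(features_batch, labels, centers):
    """Half the summed squared distance of each deep feature to its class center."""
    total = 0.0
    for x, y in zip(features_batch, labels):
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
    return 0.5 * total


def joint_loss(logits_batch, features_batch, labels, centers, lam=0.5):
    """Softmax loss plus lam times center loss; lam = 0.5 is only illustrative."""
    return softmax_loss(logits_batch, labels) + lam * center_loss(
        features_batch, labels, centers
    )
```

With two classes (natural and fake), `centers` holds two vectors of the same length as the deep feature, and a larger `lam` pulls the features of each class more tightly around its center.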

Experiments and Results
This section describes the datasets used in the experiments, the experimental measurements and the performance evaluation results.

Datasets
Six benchmark datasets based on the ImageNet LSVRC 2012 Validation Set [19] were employed in the experiments. The datasets, which are widely used in image colorization and fake image detection research, contain many categories of images, including images of people, animals, buildings and landscapes.
The D1 dataset corresponds to the ctest10k dataset [13], which has 10,000 fake colorized images and their corresponding 10,000 natural images from the ImageNet LSVRC 2012 Validation Set. Datasets D2 and D3 each contain the 10,000 natural images in dataset D1 as well as 10,000 fake colorized images generated from the natural images using the colorization techniques of Iizuka et al. [11] and Zhang et al. [22], respectively. Thus, datasets D1, D2 and D3 each have 20,000 images.
The D4 dataset contains 2,000 fake colorized images randomly selected from the ctest10k dataset [13] and their corresponding 2,000 natural images from the ImageNet LSVRC 2012 Validation Set, resulting in a total of 4,000 images. The D5 dataset also has 4,000 images: 2,000 natural images selected randomly from dataset D1 and their corresponding fake colorized images generated by the colorization technique of Iizuka et al. [11]. The D6 dataset also has 4,000 images: 2,000 natural images selected randomly from dataset D1 and their corresponding fake images generated by the colorization technique of Zhang et al. [22].

Measurements
The accuracy, precision, recall and F1 score were used to evaluate the performance of ColorDet-NN. In addition, the half total error rate (HTER) and area under the curve (AUC) measurements were used to compare the performance of ColorDet-NN against the performance of FCID-HIST and FCID-FE developed by Guo et al. [8].
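These measurements can be sketched as follows, assuming the usual definitions: a label of one marks a fake colorized image, the half total error rate is the mean of the false positive rate and the false negative rate, and the area under the curve is the probability that a fake image receives a higher score than a natural image (ties counted as one half). The function names are hypothetical.

```python
def detection_metrics(labels, predictions):
    """Accuracy, precision, recall, F1 and HTER for binary labels
    (1 = fake colorized image, 0 = natural image)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0  # natural images flagged as fake
    fnr = fn / (fn + tp) if fn + tp else 0.0  # fake images missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "hter": (fpr + fnr) / 2.0,
    }


def auc(labels, scores):
    """Rank-based AUC: fraction of (fake, natural) pairs ordered correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Lower HTER values and higher AUC values both indicate better detection, which is how the comparison against FCID-HIST and FCID-FE is read.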

Performance Evaluation
Several experiments were designed to evaluate the performance of ColorDet-NN. The experiments use all six datasets, D1 through D6.
The first set of experiments evaluated the ability of ColorDet-NN to detect fake colorized images. Datasets D1, D2 and D3 were used to assess the performance of ColorDet-NN at detecting fake colorized images generated using the colorization techniques of Larsson et al. [13], Iizuka et al. [11] and Zhang et al. [22]. Each dataset D1, D2 and D3 was randomly divided into a training set corresponding to 75% of the dataset and a testing set corresponding to 25% of the dataset.
Nine cross-validation experiments were conducted using the three training sets and three testing sets. The results in Table 5 demonstrate that ColorDet-NN can effectively distinguish between natural images and the fake colorized images generated by the three colorization techniques. All the accuracy values are greater than 88% when the training and testing sets come from the same original dataset. However, the accuracy values fall when the training and testing sets come from different datasets. Most of the experiments have accuracy values greater than 73%, except for the third experiment; this is likely due to large differences in the image features for fake images generated by the colorization techniques. Table 6 shows the area under the curve results in the cross-validation experiments. All the area under the curve results are greater than 95% when the training and testing sets come from the same dataset. The results imply that ColorDet-NN is effective at detecting the fake colorized images.
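The experimental layout described above (a random 75%/25% split of each dataset, followed by every pairing of a training set with a testing set) can be sketched as follows. The function names and the fixed seed are assumptions for illustration, and the integer ranges stand in for the actual image records.

```python
import random
from itertools import product


def split_dataset(samples, train_fraction=0.75, seed=0):
    """Randomly split samples into training and testing lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder datasets: 20,000 sample indices each, as for D1, D2 and D3.
datasets = {name: range(20000) for name in ("D1", "D2", "D3")}
splits = {name: split_dataset(data) for name, data in datasets.items()}

# Nine experiments: train on one dataset, test on the same or another one.
experiments = list(product(splits, repeat=2))
```

Each pair in `experiments` names the dataset that supplies the training set followed by the dataset that supplies the testing set; the three pairs with matching names are the same-dataset cases.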
The next set of experiments was conducted to compare the detection performance of ColorDet-NN against state-of-the-art techniques for detecting fake colorized images. The FCID-HIST and FCID-FE fake colorized image detection techniques developed by Guo et al. [8] were used in the comparisons. Datasets D4, D5 and D6 were divided equally into training sets and testing sets in order to evaluate the performance of ColorDet-NN versus FCID-HIST and FCID-FE.
Nine experiments were performed using testing and training sets drawn from the same and different datasets. Table 7 compares the area under the curve results for ColorDet-NN, FCID-HIST and FCID-FE. ColorDet-NN has better performance than FCID-HIST and FCID-FE in most situations, especially when the training and testing sets are drawn from the same dataset (area under the curve values greater than 93%). A small decline in performance is seen when the training and testing sets are drawn from different datasets. Table 8 compares the half total error rate results for ColorDet-NN, FCID-HIST and FCID-FE. ColorDet-NN has lower values than those of FCID-HIST and FCID-FE, which implies that ColorDet-NN outperforms FCID-HIST and FCID-FE in detecting fake colorized images.
In summary, the experiments demonstrate that ColorDet-NN has better performance than FCID-HIST and FCID-FE in distinguishing natural images from fake colorized images.

Conclusions
The ColorDet-NN neural-network-based technique for detecting fake colorized images has three steps. The first step analyzes and validates the statistical differences existing between fake colorized images and their corresponding natural counterparts. The second step employs the cosine similarity of normalized histogram distributions between fake and natural images in various channels to extract features for detection. The third step designs and trains ColorDet-NN to detect fake colorized images. Experiments with six datasets containing fake colorized images generated by three state-of-the-art colorization techniques demonstrate that ColorDet-NN significantly outperforms existing detection methods.
The ColorDet-NN technique exhibits reduced performance when its training and testing sets are drawn from different datasets. This occurs because different colorization techniques with large differences in the statistical information of color distributions significantly impact the extraction of features used for fake image detection. Future research will focus on the common features of colorization techniques and leveraging auxiliary features such as texture to enhance detection. Additionally, efficient neural network structures will be investigated as a means to improve performance.

Table 1. Maximum absolute differences for natural and fake image distributions.

Table 2. Two-sample Kolmogorov-Smirnov test results for the RGB channels.

Table 5. Detection results in the cross-validation experiments.

Table 6. Area under curve results in the cross-validation experiments.

Table 7. Comparison of area under the curve results.

ColorDet-NN outperforms FCID-HIST and FCID-FE, except in the third and seventh experiments. The negative results in these two cases arise because dataset D6 has more complex features than dataset D4 and the features extracted by FCID-HIST and FCID-FE are more sensitive than the features extracted by ColorDet-NN.

Table 8. Comparison of half total error rate results.