Delegating Biometric Authentication with the Sumcheck Protocol

In this paper, we apply the sumcheck protocol to verify the Euclidean (resp. Hamming) distance computation used in facial (resp. iris) recognition. In particular, we consider a border-crossing use case in which, thanks to an interactive protocol, the authentication is delegated to the traveller. Verifiable computation aims to return the result of a computation together with a proof of its correctness. In our case, the traveller takes over the authentication process and produces a proof that he did it correctly, leaving the authorities to check its validity. We integrate privacy-preserving techniques to prevent an eavesdropper from learning information about the traveller's biometric data during his interactions with the authorities. We provide implementation figures for our proposal showing that it is practical.


Motivation
To increase throughput at border crossings, controls operated by officers could be replaced with automated systems. Such systems often use biometrics to authenticate travellers: a comparison is made between an official document, such as a biometric passport, and the traveller who needs to prove his identity. However, biometric data must be collected from the traveller to be compared with the data stored on the official document, and this step of the process can be time-consuming. Delegating part of the process to the traveller can save time but raises a confidence problem: how can the authority be sure that the traveller really ran the computation?
We use verifiable computing as a tool to address this problem. A verifiable computation system allows a verifier to delegate the computation of a function to a prover. Upon completion of the computation, the prover returns the result and a proof of the computation. In our use case, the traveller's smart device plays the role of the prover and thus has restricted computational power and storage capacity. We stress that this reverses the classical roles played by the verifier and the prover in most verifiable computing scenarios, where a weak verifier usually delegates computations to a powerful but untrusted prover. The choice of the underlying verification system has therefore been driven by this configuration. In particular, the requirements on the prover and the targeted computation led us to choose an interactive proof protocol, namely the sumcheck protocol [15].

Background on Biometrics
A biometric system is a pattern recognition system that acquires biometric data from an individual and then extracts a feature set from the acquired data, which gives a biometric template. In an authentication scheme, the template is then compared against a reference template; in an identification scheme, it is compared against a database of templates. Due to external conditions such as light, moisture or the sensor used for the capture, two templates computed from the same individual can vary. However, the variation is expected to be small enough to discriminate two templates coming from the same person from two templates coming from different individuals. This is why the comparison of two templates usually yields a matching score, reflecting a similarity rate between the two data. A matching threshold has to be defined to decide whether the templates belong to the same individual or not. Ideally, if the score of two templates is lower than the threshold, they belong to the same individual. However, in biometric systems, two different individuals can have a matching score lower than the threshold and, conversely, two templates from the same individual can score above it, which leads to the definition of the false acceptance rate (FAR) and the false rejection rate (FRR); see [13] for details.
In our scenario, we need an automated face recognition system. Today, many systems performing face recognition use machine learning techniques to transform a face picture into a biometric template. Convolutional neural networks (CNNs) [14] have shown excellent results [18,22]. CNNs have millions of parameters that are tuned in a learning phase, using a face database for the training. Once the training phase is over, the CNN can embed a picture in a Euclidean space where two vectors representing the same face are closer than two vectors that come from different faces, enabling face recognition.

Background on Verifiable Computation
Although the problem of verifying computations has been theoretically solved with tools from complexity theory and cryptography [1,16], new challenges raised by verifiability in the setting of cloud computing have recently attracted the interest of researchers. Substantial progress has been made and several research teams have succeeded in implementing verifiable computing systems. All these systems start by turning the function to verify into a circuit composed of multiplication and addition gates and then perform verification on the circuit.
A first line of work built on a refined version of probabilistically checkable proofs (PCPs) [12] and resulted in a verifiable computing system called Pepper [20], which has since been refined [19,24]. The second line was opened by Gennaro et al. [9], who achieved a breakthrough by building efficient objects for verifying computations, called quadratic arithmetic programs (QAPs), resulting in an efficient system called Pinocchio [17]. Pinocchio and its refined version [7] allow public verifiability: anyone who has access to the public verification key can verify proofs. Moreover, the prover can make his proof zero-knowledge: he supplies a private input to the computation and builds a proof of the correctness of the result without revealing his input to the verifier. Finally, a system called TinyRAM, designed by Ben-Sasson et al. [3], uses QAPs and can verify a larger class of computations by modelling programs that use RAM. The third line of work relies on the notion of interactive proofs, introduced by Goldwasser et al. [11]. In the verifiable computing setting, the verifier checks that the result of the computation is correct during a sequence of interactions with the prover. The more queries the verifier asks, the smaller the prover's chance of cheating successfully. Goldwasser et al. [10] introduced an efficient protocol, later optimized and implemented by Cormode et al. [5]. The latest version of this protocol, due to Thaler [23], is currently one of the fastest schemes for verifiable computing. Furthermore, Thaler proposed an implementation of matrix multiplication and showed that the main tool of interactive proof protocols, namely the sumcheck protocol [15], can be used to design an efficient protocol for verifying matrix multiplication. However, all the systems described above are only nearly practical for generic computations. The different systems all have advantages and drawbacks, depending on the type of computation to be verified.
Importantly, all systems building upon PCPs and QAPs need precomputations and amortize their costs by using the verified function several times. The fastest system needs no precomputation and uses the CMT protocol, but it cannot handle general computations. Systems based on QAPs and on PCPs have better expressiveness and allow fast verification, but the prover's overhead compared to native execution of the same computation is substantial. See [25] for comparisons between the different existing systems.
Cormode et al. [6] suggested that the sumcheck protocol could be used to verify an inner product in the setting of data streaming, where the verifier cannot store the inputs and has to update his computations while he is parsing the data. The recent work of [4] studies the use of verifiable computing for biometric verification in a non-interactive setting, i.e. where the prover computes a proof without interacting with the verifier. In contrast, we focus on interactive proofs and adapt the sumcheck protocol [15] to design and implement a protocol that verifies several distances used in biometric matching.

Use Case: Fast Border Control
In many places, people living close to a border frequently cross it by car to go to work. We want to design an automated system that reduces the waiting time by taking advantage of the time spent in the car queue. Our idea is to let the driver perform his own face verification against his passport photo while crossing the border. Such operations could be performed by a dedicated application installed on his smartphone. At the end, the customs authority gets from this application a fresh photograph, the official one, and a proof that both belong to the same person (Figure 2).

Fig. 1. The biometric matching process

A high-level description of our solution (see also Figure 1):
- The traveller (who plays the role of the prover) uses a wireless communication interface of his mobile to read the picture stored in his biometric passport.
- The picture is turned into a reference template using a CNN.
- The traveller takes a fresh picture of his face and uses the same CNN to create a biometric template.
- A biometric matching is performed on the traveller's mobile, and interactions between the traveller and the automated border control device lead to a proof that the matching was correctly computed. The interaction begins with the prover sending the two templates and the result of the distance computation to the verifier. The proof is stored on the mobile for later examination.
We emphasize that our contribution is limited to the application of verifiable computation to the distance computations involved in biometric matching. This is only a part of what is needed to address the whole problem. For instance, liveness detection or verification of the CNN output seem necessary, but those topics are outside the scope of this paper. Our purpose here is to deal with a realistic use case for the delegation of a verifiable face matching algorithm. Since a CNN embeds a picture in a Euclidean space, the verifiable biometric matching involves a distance computation which is compared to a threshold. We first show how to verify an inner product and then extend the verification to the Euclidean distance computation.
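The sumcheck protocol [15] lets a prover convince a verifier of the value of a sum $\sum_{x \in \{0,1\}^n} g(x)$, where g is a multivariate polynomial defined over a finite field. If g has n variables, the protocol has n rounds. In each round, the verifier picks a random value ...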
To increase the chance of catching a cheating prover, the sumcheck protocol uses polynomials defined over a large finite field that agree with a and b on $\{0,1\}^d$; these are called low-degree extensions and are denoted $\tilde{a}$ and $\tilde{b}$. The above relation still holds with the low-degree extensions of a and b.
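The rounds described above can be sketched as follows for an inner product. This is a minimal illustration, not the authors' C++ implementation: it assumes the multilinear extension as the low-degree extension, so that the inner product equals the sum of $\tilde{a}(x)\tilde{b}(x)$ over $\{0,1\}^d$, and it runs prover and verifier in a single process. All function names are hypothetical. The prover here always sends honest round messages, so only the claimed result can be wrong.

```python
import secrets

P = 2**61 - 1  # prime modulus, the field used in the paper's experiments


def fold(table, r):
    """Fix the first variable of a multilinear table to r (halves the table)."""
    half = len(table) // 2
    return [(table[i] + r * (table[half + i] - table[i])) % P for i in range(half)]


def mle_eval(table, point):
    """Evaluate the multilinear extension of `table` at `point`."""
    table = [v % P for v in table]
    for r in point:
        table = fold(table, r)
    return table[0]


def round_message(a, b):
    """Prover: values at 0, 1, 2 of the degree-2 round polynomial."""
    half = len(a) // 2
    evals = []
    for t in (0, 1, 2):
        s = 0
        for i in range(half):
            at = a[i] + t * (a[half + i] - a[i])
            bt = b[i] + t * (b[half + i] - b[i])
            s += at * bt
        evals.append(s % P)
    return evals


def interpolate_deg2(evals, r):
    """Evaluate at r the degree-2 polynomial given by its values at 0, 1, 2."""
    y0, y1, y2 = evals
    inv2 = pow(2, P - 2, P)  # inverse of 2 modulo P (Fermat)
    l0 = (r - 1) * (r - 2) % P * inv2 % P
    l1 = (-r * (r - 2)) % P
    l2 = r * (r - 1) % P * inv2 % P
    return (y0 * l0 + y1 * l1 + y2 * l2) % P


def sumcheck_inner_product(a, b, claimed):
    """Return True if the verifier accepts `claimed` as the inner product of a and b.

    The vectors must have the same length, a power of two.
    """
    a_cur, b_cur = [x % P for x in a], [x % P for x in b]
    claim = claimed % P
    challenges = []
    while len(a_cur) > 1:
        evals = round_message(a_cur, b_cur)      # prover -> verifier: 3 elements
        if (evals[0] + evals[1]) % P != claim:   # verifier's consistency check
            return False
        r = secrets.randbelow(P)                 # verifier -> prover: 1 element
        challenges.append(r)
        claim = interpolate_deg2(evals, r)
        a_cur, b_cur = fold(a_cur, r), fold(b_cur, r)
    # Final check: the verifier evaluates both extensions itself from the inputs.
    return claim == mle_eval(a, challenges) * mle_eval(b, challenges) % P
```

For instance, `sumcheck_inner_product([1, 2, 3, 4], [5, 6, 7, 8], 70)` is accepted, while a claimed value of 71 is rejected in the first round.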
Squared Euclidean distance. The protocol described in Section 3 can be adapted straightforwardly to verify the Euclidean distance. Indeed, given two n-component biometric templates a and b, their squared Euclidean distance is:
$$\|a - b\|^2 = \sum_{i=1}^{n} (a_i - b_i)^2.$$
Denoting $d = \log_2 n$, we have to verify with the sumcheck protocol the sum over $\{0,1\}^d$ of the polynomial g:
$$g(x_1, \dots, x_d) = \left(\tilde{a}(x_1, \dots, x_d) - \tilde{b}(x_1, \dots, x_d)\right)^2.$$
The same ideas can be adapted to verify the distance involved in iris recognition, which is a weighted Hamming distance [8].
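As a sanity check of this reduction, the sketch below evaluates $g$ on every point of $\{0,1\}^d$ and sums the results, which should give the squared Euclidean distance modulo p. It assumes the multilinear extension as the low-degree extension. The helper names are hypothetical, and the brute-force sum is only meant to illustrate the identity that the sumcheck protocol verifies, not to replace the protocol.

```python
from itertools import product

P = 2**61 - 1  # field modulus from the paper's experiments


def mle_eval(values, point):
    """Multilinear extension of `values` (length 2**d) evaluated at `point`."""
    table = [v % P for v in values]
    for r in point:  # fix one variable at a time, most significant bit first
        half = len(table) // 2
        table = [(table[i] + r * (table[half + i] - table[i])) % P
                 for i in range(half)]
    return table[0]


def g(a, b, point):
    """g(x) = (a~(x) - b~(x))^2 over F_P."""
    diff = (mle_eval(a, point) - mle_eval(b, point)) % P
    return diff * diff % P


def squared_distance_via_hypercube(a, b):
    """Sum g over {0,1}^d, where len(a) == len(b) == 2**d."""
    d = len(a).bit_length() - 1
    return sum(g(a, b, x) for x in product((0, 1), repeat=d)) % P
```

With `a = [3, 1, 4, 1]` and `b = [2, 7, 1, 8]`, the coordinate differences are 1, -6, 3 and -7, so the function returns 1 + 36 + 9 + 49 = 95.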

Adding Data Privacy to the Protocol
At the beginning of the protocol described in Section 2, the driver has to send his reference and fresh templates to the authorities for the verification process. Since biometric templates cannot be revoked, we propose to add masking techniques for the templates [2]. In our context, this means that the driver has to pick a random permutation of the template coordinates and a random vector of the same size as the template. More precisely, a template $t = (t_1, \dots, t_n)$ becomes, once masked, $t_{\text{masked}} = \pi(t) + r$, where $\pi$ is a random permutation of the n coordinates and $r = (r_1, \dots, r_n)$ is an n-component vector of $\mathbb{F}_p^n$. So if $t_{\text{ref}}$ and $t$ are masked with the same permutation and the same random vector, computing their distance involves computing their difference:
$$t_{\text{masked}} - t_{\text{ref,masked}} = (\pi(t) + r) - (\pi(t_{\text{ref}}) + r) = \pi(t - t_{\text{ref}}).$$
The scalar product of this difference with itself has the same value as the scalar product computed on the vectors without masks: since $\pi$ permutes the same coordinates of $t$ and $t_{\text{ref}}$, the masked difference vector is a permutation of the original difference vector, and computing the scalar product on this masked vector gives the same result.
The distance computation with masked templates reveals the distance between the templates and the coordinate-wise differences between them. However, linking these differences to the coordinates of the unmasked templates is hard because of the number of possible permutations and vectors.
We also stress that the driver has to store the permutation and the random vector. Therefore, if the authorities have doubts about the identity of the driver, the driver has everything on his phone to unmask the templates and compute the distance between them. Similar techniques can be used for iris recognition.
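The masking step can be illustrated with a short sketch. It assumes templates are vectors of integers reduced modulo $p = 2^{61} - 1$. The function names are hypothetical, and the way randomness is drawn is an illustrative choice rather than the paper's.

```python
import random
import secrets

P = 2**61 - 1  # field modulus from the paper's experiments


def make_mask(n):
    """Draw a random permutation of n coordinates and a random vector in F_P^n."""
    perm = list(range(n))
    random.SystemRandom().shuffle(perm)
    r = [secrets.randbelow(P) for _ in range(n)]
    return perm, r


def mask(t, perm, r):
    """t_masked = pi(t) + r, computed coordinate-wise in F_P."""
    return [(t[perm[i]] + r[i]) % P for i in range(len(t))]


def squared_distance(u, v):
    """Squared Euclidean distance in F_P."""
    return sum(((x - y) % P) ** 2 for x, y in zip(u, v)) % P
```

Masking both templates with the same `perm` and `r` cancels the random vector in the difference and only reorders the terms of the sum, so `squared_distance(mask(t, perm, r), mask(t_ref, perm, r))` equals `squared_distance(t, t_ref)`.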

Experimental Results
We implemented a verified inner product using the sumcheck protocol, with computations over the prime finite field $\mathbb{F}_p$ where $p = 2^{61} - 1$. The size of a field element is thus smaller than the machine word size, and the probability of being fooled by a dishonest prover is small; see Table 1. Note that optimizations are possible for the verifier, but since the verifier has computational power in our use case, we did not implement them.
We ran our benchmarks on random vectors of different sizes composed of natural numbers. Dealing with negative numbers or with floating-point rationals is possible with an additional step; e.g., computations over negative numbers are mapped to computations over a finite field large enough for the mapping to be one-to-one [21]. This step is done before the prove and verify steps. The protocol therefore has to be implemented in a larger field, at the cost of reduced performance.
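One common way to realize such a mapping for signed integers is sketched below. It is an assumption for illustration, not necessarily the construction of [21]: a signed value x is sent to x mod p, and field elements above p/2 are read back as negative. The mapping is one-to-one only as long as every value of interest, including the final result, has absolute value below p/2. The function names are hypothetical.

```python
P = 2**61 - 1  # field modulus; results must stay within (-P/2, P/2)


def encode(x):
    """Map a signed integer to a field element (Python's % is non-negative here)."""
    return x % P


def decode(y):
    """Map a field element back to a signed integer in (-P/2, P/2)."""
    return y if y <= P // 2 else y - P


def field_inner_product(a, b):
    """Inner product computed entirely on encoded values."""
    return sum(encode(x) * encode(y) for x, y in zip(a, b)) % P
```

For example, `decode(field_inner_product([-3, 2], [4, -5]))` recovers -22, the inner product over the integers.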
Communication costs and security. For input vectors of size n, the sumcheck protocol has $\log_2 n$ rounds; in each round the verifier sends one field element (the random challenge, see Section 3) and the prover sends three (the three values needed to interpolate the round-k polynomial). Excluding the transmission of the input values, the total communication during the protocol is $4 \log_2 n + 1$ field elements.
The security of the sumcheck protocol is measured by the probability that a cheating prover builds a proof of a false result that is accepted by the verifier. This value is given in Table 1 for different input sizes.
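The count above can be turned into a small helper. The 8 bytes per field element is an assumption (a 61-bit value stored in a 64-bit word), and the function name is hypothetical.

```python
def sumcheck_communication(n, bytes_per_element=8):
    """Return (field elements, bytes) exchanged for vectors of size n = 2**d.

    Counts 4 elements per round plus 1, excluding the input vectors.
    bytes_per_element=8 is an assumption, not a figure stated in the text.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    rounds = n.bit_length() - 1  # log2(n)
    elements = 4 * rounds + 1
    return elements, elements * bytes_per_element
```

For $n = 2^{20}$ this gives 81 field elements, i.e. 648 bytes under the 8-byte assumption, which is consistent with the 648 B figure that appears with Table 1.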

Benchmarks
We ran experiments on a laptop with a 2 GHz Intel Core i5 processor and 8 GB of RAM. The implementation is written in C++. Table 1 gives the average times over 1000 computations for each vector size. We note that this technique does not need the notion of arithmetic circuits. Using the optimized version of the CMT protocol (see Section 1) would lead to a slower protocol with twice the communication cost.

20 3 122 600 $2^{-51}$ 648 B
Table 1. Benchmark of the verified inner product of two n-component vectors