Reasoning About Privacy Properties of Architectures Supporting Group Authentication and Application to Biometric Systems

This paper follows a recent line of work that advocates the use of formal methods to reason about privacy properties of system architectures. We propose an extension of an existing formal framework, motivated by the need to reason about properties of architectures including group authentication functionalities. By group authentication, we mean that a user can authenticate on behalf of a group of users, thereby keeping a form of anonymity within this set. Then we show that this extended framework can be used to reason about privacy properties of a biometric system in which users are authenticated through the use of group signatures.


Introduction
The privacy-by-design approach promotes the consideration of privacy requirements from the early design stage of a system. As an illustration of the importance of this topic, the General Data Protection Regulation adopted by the European trilogue (the European Commission, the European Parliament and the Council) in December 2015 [7] introduces privacy-by-design and privacy-by-default as legal obligations. Architectural choices have a strong effect on the privacy properties provided by a system. For this reason, the authors of [1] argue that key decisions regarding the design of a system should be taken at the architecture level. They introduce a formal framework for reasoning about privacy properties of architectures. The description of an architecture within this framework specifies the capacities of each component, the communications between them, the location of the computations and the data, and the trust relationships between the stakeholders. A dedicated privacy logic is used to express the privacy properties of the architectures. The use of formal methods enables precise definitions of properties and comparisons between architectures. It also makes it possible to provide a rigorous justification for the design choices.
As a first contribution of this paper, we propose an extension of this formal framework and show that it can be used to reason about properties of architectures supporting group authentication. By group authentication, we mean that a user can authenticate on behalf of a group of users. Several cryptographic primitives have been designed to achieve this goal. Our work provides the formal tools needed to reason about the properties of architectures involving these primitives, especially the guarantees that are provided in terms of privacy.
As a second contribution of this paper, we apply our extended framework to biometric systems. In a biometric system, users are authenticated with their biometric traits. The work of [3] uses the formal framework of [1] to reason about privacy properties of biometric architectures but it cannot deal with group signatures. We show that the extended framework can be used to reason about privacy properties of a biometric system in which users are authenticated by group signatures.
The value of group signatures in the context of biometrics has been shown in different settings. For example, the biometric system architecture analysed in this paper was proposed in TURBINE [16], a European project which aimed at solving privacy concerns regarding the use of fingerprint biometrics for ID management. The application of this architecture was a pharmacy product research system. Pharmacists, for instance working at their selling desks, authenticate themselves to a pharmacy administration system. Authentication is based on a card owned by the employee, as well as on his or her fingerprint. Thanks to the use of group signatures, a remote server (which does not get the fingerprint) is convinced that a valid enrolled user authenticates, without knowing precisely who this user is among the set of valid users (that is, the employees).
Organization of the paper. Section 2 supplies an overview of the formal framework of [1]. Section 3 introduces our extension of this model. Section 4 presents the biometric architecture we are interested in, describes it within the architecture language of the formal framework, and analyses its privacy properties. Finally, we discuss in Section 5 some variants of the biometric architecture, before concluding in Section 6.

Reasoning about privacy properties of architectures
In this section, we provide an overview of the framework introduced in [1] which is the foundation for our work. The interested reader can refer to [1] for a more complete description of the framework.
This framework relies on a dedicated epistemic logic for expressing privacy properties. Epistemic logics are good candidates to express privacy properties since they deal with the notion of knowledge. However, the standard possible worlds semantics for these logics leads to a well-known issue called the logical omniscience problem [9]. In a nutshell, any agent knows all the logical consequences of his knowledge. To get around this issue, the authors of [1] adopt an approach based on deductive algorithmic knowledge [13]. In this context, each component of an architecture is endowed with its own deductive capabilities.
Architectures are described with a dedicated architecture language. Then the semantics of a privacy property is defined as the set of architectures in which the property holds.

A privacy architecture language
First of all, the functionality of a system is described by a set $\Omega = \{X = T\}$ of equations over the following term language.
A term $T$ may be a variable $X$ ($X \in Var$), a constant $c$ ($c \in Const$), or a function $F$ applied to some variables ($F \in Fun$).
Then the architecture of a system is described by the following architecture language.
$$
\begin{aligned}
St  &::= Pro \mid Att \\
Att &::= Attest_i(\{Eq\}) \\
Pro &::= Proof_i(\{P\}) \\
Eq  &::= Pred(T_1, \ldots, T_m) \\
P   &::= Att \mid Eq
\end{aligned}
$$
An architecture $A$ is associated with a set of components $C = \{C_1, \ldots, C_{|C|}\}$. In the architectural primitives, $i$ and $j$ stand respectively for $C_i$ and $C_j$, and $G \subseteq C$ denotes a set of components.
In the above syntax, $\{Z\}$ denotes a set of elements of category $Z$. $Pred$ denotes a predicate, the set of predicates depending on the architectures to be considered. $Has_i(X)$ denotes the fact that component $C_i$ possesses (or is the origin of) the value of $X$, which may correspond to situations in which $X$ is stored on $C_i$ or $C_i$ is a sensor collecting the value of $X$. $Receive_{i,j}(\{St\}, \{X\})$ means that $C_i$ can receive the values of the variables in $\{X\}$ together with the statements in $\{St\}$ from $C_j$.
$Attest_i(\{Eq\})$ is the declaration by $C_i$ that the properties in $\{Eq\}$ hold, and $Proof_i(\{P\})$ is the delivery by $C_i$ of a set of proofs of properties. $Verify_i$ is the verification by component $C_i$ of the corresponding statements (proof or authenticity). $Compute_G(X = T)$ means that the set of components $G$ can compute the term $T$ and assign its value to $X$, and $Trust_{i,j}$ represents the fact that component $C_i$ trusts component $C_j$.
Graphical data flow representations can be derived from architectures expressed in this language. For the sake of readability, we use both notations in the next sections.
All architectures are assumed to satisfy minimal consistency assumptions, in order to restrict the analysis to those which make sense. For instance, if a component sends a variable, we assume that this variable can be sent, computed or received by the component.
Events are instantiations of the architectural primitives (trust relations excepted). Traces are sequences of events, defined according to the following trace language.
$\theta ::= Seq(\epsilon)$, where an event $\epsilon$ is an instantiated primitive such as $Has_i(X : V)$. $Seq(\epsilon)$ denotes an ordered sequence of events $\epsilon$. When instantiating a primitive containing a variable $X$, the notation $X : V$ means that the variable $X$ receives the value $V$. Let $Val$ be the set of values that the variables can take. In an event, $T$ is a term where values have been assigned to variables. The set $Val_\bot$ is defined as $Val \cup \{\bot\}$, where $\bot \notin Val$ is a specific symbol used to denote that a variable has not been assigned.
As for architectures, only traces satisfying consistency assumptions are considered. The empty trace is the trace with no event.
A trace $\theta$ of events is said to be compatible with an architecture $A$ if each event in $\theta$ (except the computations) can be obtained by instantiation of an element of $A$ (Receive, Verify, etc.). Let $T(A)$ be the set of traces which are compatible with an architecture $A$.
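The compatibility condition can be illustrated with a minimal sketch. It assumes a deliberately simplified encoding that is not part of the framework: a primitive is reduced to a kind, the components involved and a set of variable names, and an event is the same data with a value attached to each variable. The class names `Primitive` and `Event`, the function `is_compatible` and the example architecture are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set, Tuple


@dataclass(frozen=True)
class Primitive:
    """An architectural primitive, reduced to its kind, components and variables."""
    kind: str                      # e.g. "Has", "Receive", "Verify", "Compute"
    components: Tuple[str, ...]    # e.g. ("S",) or ("S", "TS")
    variables: FrozenSet[str]


@dataclass(frozen=True)
class Event:
    """An instantiated primitive: each variable X comes with a value V (X : V)."""
    kind: str
    components: Tuple[str, ...]
    assignment: Tuple[Tuple[str, object], ...]

    def erase_values(self) -> Primitive:
        # Forget the values to recover the primitive this event instantiates.
        return Primitive(self.kind, self.components,
                         frozenset(name for name, _ in self.assignment))


def is_compatible(trace: Iterable[Event], architecture: Set[Primitive]) -> bool:
    """Every event, computations excepted, must instantiate an element of the architecture."""
    return all(e.kind == "Compute" or e.erase_values() in architecture
               for e in trace)


# Hypothetical architecture: component S holds db and may receive dec from TS.
arch = {
    Primitive("Has", ("S",), frozenset({"db"})),
    Primitive("Receive", ("S", "TS"), frozenset({"dec"})),
}
good_trace = [
    Event("Has", ("S",), (("db", "db-value"),)),
    Event("Receive", ("S", "TS"), (("dec", True),)),
]
# S receiving a template is not allowed by this architecture.
bad_trace = [Event("Receive", ("S", "TS"), (("br1", "template"),))]
```

In this encoding, `is_compatible(good_trace, arch)` holds while `is_compatible(bad_trace, arch)` does not, since no primitive of `arch` lets S receive `br1`.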
Each component $C_i$ is associated with a dependence relation $Dep_i$. For a variable $Y$ and a set $\mathcal{X}$ of variables, $Dep_i(Y, \mathcal{X})$, or equivalently $(Y, \mathcal{X}) \in Dep_i$, means that the value of $Y$ can be obtained by the component $C_i$ if it gets access to the value of $X$ for each $X \in \mathcal{X}$.
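As an illustration of how a dependence relation extends what a component can access, the following sketch computes the set of variables reachable from an initial set. It assumes that dependences can be chained (a closure computation), which is a modelling choice made here for illustration; the function name `accessible_variables` and the example relation are hypothetical.

```python
from typing import FrozenSet, Iterable, Set, Tuple

# A dependence (Y, Xs) reads: Y can be obtained once every variable in Xs is accessible.
Dependence = Tuple[str, FrozenSet[str]]


def accessible_variables(initial: Iterable[str], dep: Iterable[Dependence]) -> Set[str]:
    """Close the set `initial` under the dependence relation `dep`."""
    known = set(initial)
    deps = list(dep)
    changed = True
    while changed:
        changed = False
        for target, sources in deps:
            # Y becomes accessible as soon as all the variables it depends on are.
            if target not in known and sources <= known:
                known.add(target)
                changed = True
    return known


# Hypothetical relation: a reference template br1 can be recovered from the database db.
dep_example = [("br1", frozenset({"db"}))]
```

With this relation, a component that accesses `db` also accesses `br1`, whereas a component that only accesses `dec` gains nothing.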
Each component $C_i$ is also associated with a deductive system, noted $\triangleright_i$, allowing it to derive new knowledge. $\triangleright_i$ is defined as a relation between equations, $\{Eq_1, \ldots, Eq_n\} \triangleright_i Eq_0$, where equations over terms are defined according to the following syntax.
$$Eq ::= Pred(T_1, \ldots, T_m) \mid Eq \wedge Eq$$
By a slight abuse of notation, $Eq$ here overloads the $Eq$ category of the architecture language, with conjunctions of equations also allowed.
Finally, the semantics of an architecture is defined from the traces of events. Each component is associated with a state. Each event in a trace of events affects the state of each component involved in the event. The semantics S(A) of an architecture A is defined as the set of states reachable by compatible traces.

A privacy logic
Privacy properties of architectures are expressed with the following language.
The knowledge operator $K_i$ represents the knowledge of the component $C_i$. The formula $Has_i(X)$ represents the fact that $C_i$ can get access to the variable $X$.
The semantics S(φ) of a property φ is defined as the set of architectures where φ is satisfied. The fact that a property φ is satisfied by a (consistent) architecture A is defined for each property as follows.
Based on the semantics of properties, [1] introduces a set of deductive rules which can be used to reason about privacy properties of architectures. This deductive system is shown to be correct and complete with respect to the semantics of the properties.
$A \vdash \varphi$ denotes that $\varphi$ can be derived from $A$, in other words, that there exists a derivation tree such that each step belongs to the axiomatics and the leaf is $A \vdash \varphi$. A subset of this axiomatics, useful for this paper, is presented in Figure (

Adding a group attestation to the formal model
As a first step to extend the architecture language of [1], we introduce the primitive $Attest_G(E)$, where $G$ is a group of components and $E$ a set of equations. This primitive generalizes $Attest_i(E)$, which involves a single component $C_i$. Section 3.1 defines the semantics of the traces containing these events and Section 3.2 extends the set of deductive rules.

Semantics of traces
The semantics of a trace is defined by specifying, for each event, its effect on the states of the components.
The state of a component is either the Error state or a pair consisting of: (i) a variable state assigning values to variables, and (ii) a property state defining the current knowledge of a component. In the initial state of an architecture $A$, denoted $Init^A = \langle Init^A_1, \ldots, Init^A_{|C|} \rangle$, the variables are undefined and the knowledge state only contains the trust primitives.
Let $\sigma$ denote the global state, and $\sigma_i$ denote the state of component $C_i$. The semantics of traces, denoted $S_T$, is defined recursively over sequences of events.
The function $S_E$, which defines the effect of the events, is defined for each type of event. The modification of a state is noted $\sigma[\sigma_i/(v, pk)]$, with $v$ the variable state and $pk$ the property state of component $C_i$. Restricting our attention to the events which contain a group attestation leads us to consider the events $Verify_i(Attest_G(E))$ and $Verify_i(Proof_j(E))$. The semantics of the verification events are defined according to the (implicit) semantics of the underlying verification procedures. In both cases, the knowledge state of the component is updated if the verification passes, otherwise the component reaches an Error state. The variable state is not affected. Informally, a verification event containing a generalized attestation statement generates new knowledge only if all possible authors of the attestation are trusted by the verifying component $C_i$.
Here the new knowledge $new^{pk}_{Proof}$ is defined by Equation (1) and the new knowledge $new^{pk}_{Attest}$ is defined by Equation (2).
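As an informal illustration of the behaviour just described (and not of the formal definitions of Equations (1) and (2)), the effect of a verification event on the verifier's state can be sketched as follows. The encoding is an assumption made for this example only: equations are plain strings, trust facts are tuples `("Trust", i, k)` stored in the knowledge state, and the outcome of the underlying verification procedure is passed in as the boolean `check_passes`.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set, Tuple, Union

ERROR = "Error"  # the distinguished Error state


@dataclass(frozen=True)
class Attest:
    """Attest_G(E): an attestation whose author is some member of the group G."""
    group: FrozenSet[str]
    equations: FrozenSet[str]


@dataclass
class State:
    """State of one component: a variable state and a knowledge (property) state."""
    variables: Dict[str, object] = field(default_factory=dict)
    knowledge: Set[Union[str, Tuple[str, str, str]]] = field(default_factory=set)


def verify_attest(state, verifier: str, attest: Attest, check_passes: bool):
    """Effect of Verify_i(Attest_G(E)) on the state of the verifying component."""
    if state == ERROR or not check_passes:
        return ERROR  # a failed verification leads to the Error state
    # New knowledge is generated only if every possible author is trusted.
    trusts_all = all(("Trust", verifier, k) in state.knowledge for k in attest.group)
    new_pk = set(attest.equations) if trusts_all else set()
    # The variable state is copied unchanged; only the knowledge state grows.
    return State(dict(state.variables), state.knowledge | new_pk)
```

In this sketch, an attestation by a group containing a single untrusted member passes verification but adds nothing to the verifier's knowledge.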

Axiomatics
The next challenge in dealing with group attestation is to extend the set of deductive rules and to prove that the correctness and completeness properties still hold. Our axioms for group attestation are presented in Figure (1b). In the remainder of this section, we show that the correctness and the completeness of the axiomatics still hold with these new axioms.
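For reference, the two rules can be read off from the hypotheses and conclusions used in the correctness proof. The following LaTeX rendering is a reconstruction under that assumption; the exact layout and side conditions of the original figure may differ.

```latex
\[
\frac{Verify_i(Proof_j(E)) \in A \qquad Attest_G(E') \in E \qquad
      \forall k \in G:\ Trust_{i,k} \in A \qquad Eq \in E'}
     {A \vdash K_i(Eq)}\ (\mathrm{K4}^{+})
\]
\[
\frac{Verify_i(Attest_G(E)) \in A \qquad
      \forall k \in G:\ Trust_{i,k} \in A \qquad Eq \in E}
     {A \vdash K_i(Eq)}\ (\mathrm{K5}^{+})
\]
```

Both rules require trust in every member of $G$, which mirrors the informal reading that all possible authors of the attestation must be trusted.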
Correctness. Let $A$ be a consistent architecture and $\varphi$ a property. The correctness theorem states that if there exists a derivation tree for this property ($A \vdash \varphi$), then this property holds in the architecture ($A \in S(\varphi)$). The proof is made by induction on the depth of the tree $A \vdash \varphi$. Let us restrict our attention to the cases where (K4$^+$) and (K5$^+$) are used. That is, let us assume that $A \vdash K_i(Eq)$, and that the derivation tree is of depth 1. By definition of the set of axioms, such a proof is obtained by application of (K1), (K3), (K4$^+$) or (K5$^+$). Let us focus on the (K4$^+$) and (K5$^+$) cases.
(K4$^+$). Let us assume that $Verify_i(Proof_j(E)) \in A$, $Attest_G(E') \in E$ and $\forall k \in G: Trust_{i,k} \in A$ for some $i$, $j$ and $G$. Our goal is to prove that $\forall Eq \in E': A \in S(K_i(Eq))$.
Let us consider a given state $\sigma' \in S_i(A)$. By the architecture semantics, there exists a consistent trace $\theta'$, compatible with $A$, such that $\sigma' = S_T(\theta', Init^A)$. Two cases may happen. Either $\theta'$ contains an event $Verify_i(Proof_j(E))$ such that $Attest_G(E') \in E$, and we let $\theta := \theta'$, or it does not. In the latter case, we extend $\theta'$ into a trace $\theta$ that contains such an event, without breaking the consistency of the trace.
In either case, there exists a trace $\theta$ which extends $\theta'$ and contains an event $Verify_i(Proof_j(E))$ such that $Attest_G(E') \in E$. Let $\sigma = S_T(\theta, Init^A)$. Since an Error state has not been reached (we have $\sigma' \in S_i(A)$), and since $\forall k \in G: Trust_{i,k} \in \sigma^{pk}_i$ by definition of the initial state, then by the semantics of the group attestation (Equation (1)) we have $\forall Eq \in E': Eq \in \sigma^{pk}_i$. By the definition of the architecture semantics, we deduce that $\sigma \in S(A)$. The prefix order over the traces, together with the definition of the semantics of the traces, induces a prefix order over the states, hence $\sigma \geq_i \sigma'$. By the reflexivity of the deductive algorithmic knowledge, we have $\forall Eq \in E': \sigma^{pk}_i \triangleright_i Eq$. By the semantics of the properties, we conclude that $\forall Eq \in E': A \in S(K_i(Eq))$.
(K5$^+$). Let us assume that $Verify_i(Attest_G(E)) \in A$ and $\forall k \in G: Trust_{i,k} \in A$. We must show that $\forall Eq \in E: A \in S(K_i(Eq))$. The adaptation of the (K4$^+$) case to the (K5$^+$) case is straightforward, invoking Equation (2) of the trace semantics instead of Equation (1).
Completeness. Let $A$ be a consistent architecture and $\varphi$ a property. The completeness theorem states that if the property holds in the architecture ($A \in S(\varphi)$), then there exists a derivation tree for this property ($A \vdash \varphi$).
The proof is made by induction over the definition of the property $\varphi$. We restrict our attention here to the knowledge operator $K_i$. Let us assume that $A \in S(K_i(Eq))$ for a given component $C_i$ and equation $Eq$. We must show that $A \vdash K_i(Eq)$.
By the semantics of properties, $A \in S(K_i(Eq))$ means that $\forall \sigma' \in S_i(A): \exists \sigma \in S_i(A): \sigma^{pk}_i \triangleright_i Eq$. By the semantics of architectures, $\exists \theta \in T(A)$ such that $\sigma = S_T(\theta, Init^A)$ and $\sigma^{pk}_i \triangleright_i Eq$. By the semantics of the traces, this implies one of the following statements: either there exists $Compute_G(X = T') \in \theta$ where $Eq := (X = T)$, $C_i \in G$ and $T'$ is obtained from $T$ (by assigning values to variables); or there exists $Verify_i(Proof_j(E)) \in \theta$ where $Eq \in E$; or there exists $Verify_i(Proof_j(E)) \in \theta$ where $Attest_G(E') \in E$, $\forall k \in G: Trust_{i,k} \in \sigma^{pk}_i$ and $Eq \in E'$; or there exists $Verify_i(Attest_G(E)) \in \theta$ where $Eq \in E$ and $\forall k \in G: Trust_{i,k} \in \sigma^{pk}_i$. By the compatibility of the traces, we deduce that: either $Compute_G(X = T) \in A$ where $Eq := (X = T)$ and $C_i \in G$; or $Verify_i(Proof_j(E)) \in A$ where $Eq \in E$; or $Verify_i(Proof_j(E)) \in A$ where $Attest_G(E') \in E$, $\forall k \in G: Trust_{i,k} \in A$ and $Eq \in E'$; or $Verify_i(Attest_G(E)) \in A$ where $Eq \in E$ and $\forall k \in G: Trust_{i,k} \in A$. We conclude that $A \vdash K_i(Eq)$ by applying (respectively) (K1), (K3), (K4$^+$) or (K5$^+$).
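The four cases distinguished in the completeness argument suggest a direct way to enumerate the equations a component can be shown to know. The sketch below follows that case analysis; it is an illustration under simplifying assumptions (equations are opaque strings, an architecture is a plain set of primitives, and the component's own deductive system is ignored), and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Union


@dataclass(frozen=True)
class Trust:
    truster: str
    trustee: str


@dataclass(frozen=True)
class Compute:
    group: FrozenSet[str]
    equation: str


@dataclass(frozen=True)
class Attest:
    group: FrozenSet[str]
    equations: FrozenSet[str]


@dataclass(frozen=True)
class Proof:
    prover: str
    statements: FrozenSet[Union[str, Attest]]


@dataclass(frozen=True)
class Verify:
    verifier: str
    statement: Union[Attest, Proof]


def trusts_all(arch: Set[object], i: str, group: FrozenSet[str]) -> bool:
    """Check that Trust_{i,k} belongs to the architecture for every k in the group."""
    return all(Trust(i, k) in arch for k in group)


def known_equations(arch: Set[object], i: str) -> Set[str]:
    """Equations Eq for which K_i(Eq) follows from (K1), (K3), (K4+) or (K5+)."""
    known: Set[str] = set()
    for p in arch:
        if isinstance(p, Compute) and i in p.group:                      # (K1)
            known.add(p.equation)
        elif isinstance(p, Verify) and p.verifier == i:
            s = p.statement
            if isinstance(s, Attest):
                if trusts_all(arch, i, s.group):                         # (K5+)
                    known |= s.equations
            else:
                for item in s.statements:
                    if isinstance(item, Attest):
                        if trusts_all(arch, i, item.group):              # (K4+)
                            known |= item.equations
                    else:                                                # (K3)
                        known.add(item)
    return known


# Hypothetical example: a server S verifies an attestation issued by a group of two users.
users = frozenset({"U1", "U2"})
claim = "br_i in db"
a_direct = {Verify("S", Attest(users, frozenset({claim}))),
            Trust("S", "U1"), Trust("S", "U2")}
a_proof = {Verify("S", Proof("TS", frozenset({Attest(users, frozenset({claim}))}))),
           Trust("S", "U1"), Trust("S", "U2")}
```

Both example architectures let S derive `claim`; dropping either trust primitive leaves S with no derivable knowledge, which reflects the requirement that the whole group be trusted.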

A biometric architecture using group signatures
Biometric systems involve two main phases: enrolment and verification (either authentication or identification) [10]. Enrolment is the registration phase, in which the biometric traits of a person are collected and recorded within the system. In the authentication mode, a fresh biometric trait is collected and compared with the registered one by the system to check that it corresponds to the claimed identity. In the identification mode, a fresh biometric data is collected and the corresponding identity is searched in a database of enrolled biometric references.
A group signature scheme [2] is an advanced cryptographic mechanism. It enables a user to sign messages on behalf of a group of users while staying anonymous inside this group. With a (public) verification algorithm, anyone can be convinced, given a group public key, a message, and a signature, that a certain member of the group authenticates the message.
The biometric system introduced in [4] aims at achieving some anonymity from the server's point of view. The server is convinced that the authentication was successful for a certain enrolled user, but has no information about which one. During the enrolment, a biometric reference is registered by the issuer. The issuer derives a user secret key from the biometric template and computes a group secret key, that is, a certificate attesting the enrolment inside the group. The user gets a card containing his or her biometric reference and the group certificate.
During the verification phase, the terminal gets a fresh capture of the biometric trait and computes a fresh template. A match between the fresh template and the reference is performed by the terminal. In case of success, the terminal derives the user secret key from the reference, produces a group signature thanks to the user secret key and the certificate (both are needed to produce a valid signature), and sends the signature to the server. The server checks the signature attesting that a registered user authenticates. If the signature is valid, the server is convinced of the correctness of the matching. However, it has no access to the biometric templates, nor to the identity of the user who authenticates.

Description within the formal framework
For the sake of clarity, let us distinguish the biometric system and its formalization. We denote by $B_{gs}$ the biometric system introduced in [4] and by $A_{gs}$ its definition within the formal framework, which we present below.
Upper case sans serif letters in $A_{gs}$ denote components. Components of the $A_{gs}$ architecture are a set of $N$ enrolled users $U := \{U_1, \ldots, U_N\}$ (each user $U_i$ owning a card $C_i$), a server S, an issuer I and a terminal modelled by two components TM and TS. The issuer I enrols the users. The server S manages a database containing the enrolled templates. The terminal is equipped with a sensor used to acquire biometric traits. Formally, the terminal is split into two components TM and TS, corresponding respectively to its two functionalities. The matcher TM acquires the fresh template and performs the comparison, and the signer TS authenticates on behalf of the group of users. As shown by the variants below, this distinction is motivated by the different trust assumptions a designer may consider.
Type letters denote variables. $br_i$ denotes the biometric reference template of the user $U_i$ built during the enrolment phase. rd denotes a raw biometric data provided by the user during the verification phase. bs denotes a fresh template derived from rd during the verification phase. A threshold thr is used during the verification phase as a closeness criterion for the biometric templates. The output dec of the verification is the result of the matching between the fresh template bs and the enrolled templates br, considering the threshold thr. db denotes the database of the registered biometric templates.
As in [3], we focus on the verification phase and assume that enrolment has already been done. The database db is computed by the issuer from all the references, using the function $DB \in Fun$. A verification process is initiated by the terminal receiving as input a raw biometric data rd from the user. The terminal, more precisely the TM component, extracts the fresh biometric template bs from rd using the function $Extract \in Fun$. The matching is expressed by the function $\mu \in Fun$, which takes as arguments two biometric templates and the threshold thr. The terminal reads the biometric template br from the card. The user receives the final decision dec of the matching from the terminal TM. Then the terminal, here the TS component, attests that the fresh template belongs to the set of enrolled templates.
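The data flow of the verification phase can be sketched as follows. The framework treats Extract, µ and DB as abstract functions, so everything concrete here is an assumption made for illustration only: templates are encoded as bit tuples, `extract` simply binarises the raw data, `mu` accepts when the Hamming distance is at most the threshold, and `build_db` collects the references in a set.

```python
from typing import FrozenSet, Iterable, Sequence, Tuple

Template = Tuple[int, ...]  # hypothetical encoding: a fixed-length bit vector


def extract(rd: Sequence[float]) -> Template:
    """Stand-in for Extract: binarise the raw data rd into a fresh template bs."""
    return tuple(1 if x > 0 else 0 for x in rd)


def mu(br: Template, bs: Template, thr: int) -> bool:
    """Stand-in for the matching function: Hamming distance at most thr."""
    if len(br) != len(bs):
        raise ValueError("templates must have the same length")
    return sum(a != b for a, b in zip(br, bs)) <= thr


def build_db(references: Iterable[Template]) -> FrozenSet[Template]:
    """Stand-in for DB: the database is the set of enrolled references."""
    return frozenset(references)


def verify(rd: Sequence[float], br: Template, thr: int) -> Tuple[Template, bool]:
    """Terminal side (TM): compute bs from rd, then dec from br, bs and thr."""
    bs = extract(rd)
    dec = mu(br, bs, thr)
    return bs, dec
```

For example, with the reference `(1, 0, 1, 1)` and the raw data `[5, -1, 3, -2]`, the fresh template differs from the reference in one position, so the decision is positive for `thr = 1` and negative for `thr = 0`.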
The complete description of $A_{gs}$ within the architecture language is as follows. Figure 2 sketches this description. When indices $i$ are used, it is assumed that the corresponding primitive exists in $A_{gs}$ for all users. For instance, $Has_I(br_i) \in A_{gs}$ implicitly means that $\forall U_i \in U: Has_I(br_i) \in A_{gs}$.

Trusting a group of users
In the biometric system architecture $A_{gs}$, the group of users is trusted by the server, which is denoted $\forall U_i \in U: Trust_{S,U_i}$. However, the formalization does not define which cryptographic primitive is used in the concrete $B_{gs}$ system. Let us discuss this point in more detail.
In a group signature scheme, users are typically not trusted, but a group manager, called the issuer, is trusted. When it enrols a user, the issuer provides a group secret key, also known as a membership certificate (concretely, a signature of some secret user-specific data). In other words, it attests that the user is enrolled. Then the untrusted user proves that he or she is enrolled (by supplying a zero-knowledge proof of his or her user secret data and the corresponding membership certificate). In our case, the server does not trust the card, but trusts the issuer of the card. The card contains an attestation that the user was indeed enrolled by the issuer, here a certificate for a group signature, i.e., a group secret key.
The point to be noticed is that we do not model the internal machinery of the group signature in our formal architecture. We only express the fact that the group is trusted. Whether this trust assumption is justified or not in practice is not part of the reasoning about architecture: it rather regards the justification of the choice of certain primitives to achieve the functionality. With the same trust assumption (all users are trusted), other primitives can be used, such as ring signatures [14], where a member authenticates on behalf of a group without a group manager.
The use of group signatures is a choice made at the protocol level. Checking the conformity between the protocols and the architecture is out of scope of this paper. This line of work has been initiated in [15].

Application of the axiomatics
We now reason about the privacy properties of the $A_{gs}$ architecture from the server's point of view. $A_{gs}$ should enable the server to be sure that a certain enrolled user authenticates, while the authenticated user remains anonymous from the server's point of view: $A_{gs} \vdash K_S(br_i \in db)$. But the server should have no access to the templates: $A_{gs} \vdash Has^{none}_S(br_i)$. Regarding the template protection, the statement $A_{gs} \vdash Has^{none}_S(br_i)$ is shown using rule HN. A subtlety here is the presence of the dependence between the biometric template $br_i$ and the database $db$. Therefore we first need to show $A_{gs} \nvdash Has_S(db)$.
Now HN can be applied.
$A_{gs} \vdash Has^{none}_S(bs)$ is also shown by an application of HN. Since the server trusts the users, an application of (K5$^+$) shows that the server is assured that some enrolled user authenticates.
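The last step can be written out as a derivation, by analogy with the (K5$^+$) instance used for the match-on-card variant. This is a hedged sketch: it assumes that $A_{gs}$ contains a primitive of the form $Verify_S(Attest_U(br_i \in db))$, which is an assumption about the architecture listing rather than something stated explicitly in the text.

```latex
\[
\frac{Verify_S(Attest_U(br_i \in db)) \in A_{gs} \qquad
      \forall U_k \in U:\ Trust_{S,U_k} \in A_{gs}}
     {A_{gs} \vdash K_S(br_i \in db)}\ (\mathrm{K5}^{+})
\]
```

The second premise is the trust assumption on the whole group of users discussed in the previous subsection.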

Variants
Several variants [4] of the biometric system B gs can be expressed and analyzed in our formal framework.

Lowering the trust on the group signing functionality
If the server trusts the matching functionality TM of the terminal but does not trust its signer functionality TS, then the component TS must supply a proof that some user is enrolled. The resulting architecture is denoted $A^p_{gs}$. An application of the new rule (K4$^+$) proves that the server is assured that some enrolled user authenticates.

Combination with match-on-card
In the $A_{gs}$ architecture, the card is a plastic card. The biometric reference is just printed on it, together with a group secret key. To enhance the protection of the reference, a smart-card can be used instead of a plastic card, as in the Match-On-Card (MOC) technology [12,11,8]. The card stores the reference template, and the reference never leaves the card. During a verification, the card receives the fresh biometric template, carries out the comparison with its reference, and sends the decision back. The terminal trusts the smart card for the correctness of the matching. This trust is justified by the fact that the card is a tamper-resistant hardware element. The $A_{gs}$ architecture in which the plastic card is replaced by a smart-card performing a MOC is modelled as follows. In addition to the comparison, the card also computes the group authentication. Using rule HN, it is easy to show that no component apart from I and $C_i$ gets access to $br_i$. The terminal should be convinced that the matching is correct: $A^{moc}_{gs} \vdash K_{TM}(dec = \mu(br_i, bs, thr))$. The proof relies on the trust placed by the matching component TM of the terminal in the card $C_i$.
$$\frac{Verify_{TM}(Attest_{C_i}(dec = \mu(br_i, bs, thr))) \in A^{moc}_{gs} \qquad Trust_{TM,C_i} \in A^{moc}_{gs}}{A^{moc}_{gs} \vdash K_{TM}(dec = \mu(br_i, bs, thr))}\ (\mathrm{K5}^{+})$$
Regarding the group authentication, an application of (K5$^+$) shows that the server is assured that some enrolled user authenticates.

Anonymity revocation
As shown in [4], an additional mechanism can be used to revoke the anonymity of a group authentication if there is any legal need to do so. After the matching phase, the terminal has to encrypt the fresh template under the public key of a specific tracing authority, to sign all messages together, and to send the authentication result to the server. Then, at a later stage, the tracing authority may decrypt the template and check, with access to the database of the issuer, that the templates were indeed close. This a posteriori check ensures a form of accountability which can be requested in certain contexts. The formal model introduced in [1] includes an additional architectural primitive, called SpotCheck, which can be used to carry out a posteriori checks and therefore to describe the above variant. However, the model including the SpotCheck primitive is proven complete only when all the functions of the term language are at most unary. Since the comparison between templates, an essential operation of biometric systems, is inherently binary, we would then obtain a correct but incomplete system. We leave for future work the definition of a formal model with a posteriori verifications which would be both correct and complete and would not suffer from this arity restriction in the term language.

Conclusion
In this paper, we have analysed the privacy properties of a biometric system in which users can remain anonymous from the point of view of a remote server, while the server is still convinced that a valid user authenticates. Table 1 sums up the properties of the different architectures considered here. Architecture $A^{moc}_{gs}$ provides the best guarantees in terms of privacy. However, its deployment has a cost, since it requires that each user owns a card with powerful capabilities. Although quite demanding, these assumptions are not out of reach of the current technology [5]. The main variant $A_{gs}$ is more realistic. The choice between $A_{gs}$ and $A^p_{gs}$ depends on the trust placed in each component in a specific deployment. The possibility to express these trust assumptions in a formal way and to study their consequences is one of the main benefits of the framework presented here, because it provides rigorous justifications for making well-informed design choices for the architecture of a system.