Designing a GDPR-Compliant and Usable Privacy Dashboard



Introduction
In the age of digitalization, the data privacy of an individual can be severely violated by technology. Cases like Google Spain v AEPD and Mario Costeja González 1 highlight the extent of harm technology can do to an individual simply by providing inaccurate (in this case outdated) information about the data subject. The controversy eventually had to be decided by the European Court of Justice (ECJ), the highest court of the EU. While the case was resolved with a verdict in favor of individuals' data privacy, doubts remained, which were fueled by the revelations of Edward Snowden in 2013, also called the Snowden Effect 2, and underlined by the invalidation of the Safe Harbor Privacy Principles by the ECJ 3 in 2015. The EU addresses these concerns with the General Data Protection Regulation (GDPR) 4, which comes into force in May 2018. The GDPR replaces the Data Protection Directive 5 of 1995 by extending the data privacy rights of data subjects in the EU with the goal of adapting to modern data privacy challenges.
A major change of the GDPR, among others, is the explicit requirement of transparency when processing personal information. 6 In the recitals of the GDPR the lawmakers explain that "[t]he principle of transparency requires that any information and communication relating to the processing of those personal data be easily accessible and easy to understand, and that clear and plain language be used". 7 Taken literally, this would mean data subjects should be able to obtain any information they want, including the time a controller (i.e. a legal entity that processes personal information) accessed their personal data, from which source, to which processors (i.e. legal entities that process personal information on behalf of the controller) it has been forwarded, which data has been derived from it, and so on. However, in times of Big Data and Cloud Computing, providing this information can be very complex, considering the sheer amount of data a controller might process about a single data subject. Moreover, the processing often involves external third parties, since controllers might use the infrastructure of one or multiple service providers.
The personal data in question is mostly processed digitally, thus it is accessed and assessed by technical means. Granting the privacy rights of the GDPR should be realized by the same means. For this reason, we propose a privacy dashboard that aims to offer and manage these data privacy rights. To tackle the complexity of the task and achieve a user-friendly result, a usability engineering methodology is applied.
The remainder of the paper is structured as follows. Section 2 discusses requirements for the privacy dashboard imposed by the GDPR. In Section 3, we give an overview of related work in the field of transparency-enhancing tools (TETs), as which privacy dashboards are classified. Section 4 presents the methodology that is adapted to design the privacy dashboard. In Section 5, we analyze the potential users of the dashboard and the tasks they are supposed to fulfill with it. Based on the analysis, a design is derived that is presented and discussed in Section 6. The development of a prototype and its evaluation are presented in Sections 7 and 8, respectively. Finally, we conclude our work in Section 9.

GDPR
The GDPR will be law in 28 countries, but more will be affected by it due to its territorial scope. Controllers from abroad will be subject to it if they offer goods or services to European data subjects or monitor behavior that takes place within the Union. 8 The GDPR consists of 99 articles and 173 recitals. It is a comprehensive regulation covering the multiple scenarios in which personal data is processed. This can be seen in Article 6 of the GDPR, which defines the conditions for lawful processing of personal data. Informed consent given by the data subject 9 is only one out of a number of legal bases, which include processing personal data to fulfill legal obligations 10 or for tasks carried out in the public interest 11. To narrow the scope, we focus only on the processing of personal data based on consent given by the data subject.
To access, review, and manage personal data in a digital format, technological means are necessary. Thus, compliance with the GDPR requires technology to adapt to it. Furthermore, new means must be introduced to grant and use the data privacy rights of the GDPR. Bier et al. [2] draw the same conclusion.
As stated above, the explicit requirement of transparency is one of the major changes of the GDPR compared to its predecessor, the Data Protection Directive of 1995. The Directive required personal information to be "processed fairly and lawfully" 12, which the GDPR extends by adding the expression "and in a transparent manner" 13. As mentioned in the previous section, the recitals attempt to narrow the transparency principle down; however, it remains debatable which information has to be provided to the data subject to meet the transparency requirement. The data subject could be provided with an overwhelming amount of meta information that is recorded whenever personal data is processed. This meta data could answer questions such as: When was the data collected? From which device was it obtained? To whom was it forwarded? What is the physical location of the processing servers? A first step towards transparency is to grant the right of access 14. Siljee's [14] Personal Data Table fulfills all requirements to realize the execution of this right. The Personal Data Table should be extended by an element to depict data flows to involved processors.
Articles 16 and 17 of the GDPR grant data subjects the right to request rectification 15 and erasure 16 of data without undue delay. Moreover, the controller is obliged to respond to these requests within one month. This time period is extendable by two additional months with regard to the complexity of the task and the number of requests. 17 For our design of the dashboard, this means the Personal Data Table must offer the possibility, for each data item, to request rectification or erasure of the corresponding information.
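The response-time rule above translates directly into a deadline computation the dashboard could use to display the status of pending requests. The following is a minimal sketch; the function and parameter names are our own assumptions, not part of the prototype.

```javascript
// Compute the controller's response deadline for a rectification or erasure
// request: one month by default, extendable by up to two further months for
// complex or numerous requests (GDPR art. 12(3)). Illustrative sketch only.
function responseDeadline(requestDate, extensionMonths = 0) {
  if (extensionMonths < 0 || extensionMonths > 2) {
    throw new RangeError("extension is limited to two additional months");
  }
  const deadline = new Date(requestDate.getTime());
  deadline.setMonth(deadline.getMonth() + 1 + extensionMonths);
  return deadline;
}
```

A message section could compare this deadline against the current date to flag overdue requests to the data subject.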
The Data Protection Directive required consent to be given unambiguously 18, while the GDPR now requires informed consent to be given for one or more specific purposes 19. The recitals advise that if data is used for multiple purposes, consent shall be given for each purpose separately. 20 Furthermore, the data subject shall have the right to withdraw consent at any time, and as easily as it was to give consent. 21 The dashboard must therefore include a possibility to review the consents given and the purposes they were given for, as well as a functionality to withdraw them at any time.
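Per-purpose consent can be modeled as a list of purpose records with a boolean state, so that withdrawal is a single state change, mirroring the requirement that it be as easy as giving consent. The data shape and names below are illustrative assumptions, not the prototype's actual model.

```javascript
// Hypothetical per-purpose consent records.
const consents = [
  { purpose: "newsletter", description: "Monthly product news", given: true },
  { purpose: "analytics",  description: "Usage statistics",     given: true },
];

// Withdrawing consent flips one flag; the original list is left untouched
// so the UI can render the change and queue a request to the controller.
function withdrawConsent(list, purpose) {
  return list.map(c => (c.purpose === purpose ? { ...c, given: false } : c));
}
```

In the dashboard, each record would correspond to one labeled toggle in the consent review section.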
The dashboard is supposed to work as an interface between data subject and controller. Requests for rectification, erasure, or withdrawal of consent cannot be expected to be responded to immediately. Thus, a message section to obtain status information about pending requests is reasonable. The controller may also approach the data subject via the dashboard to ask for consent to process personal data for additional purposes. This way the privacy dashboard may be extended by ex ante capabilities, while being mainly designed as an ex post TET.

Related work
For decades there have been numerous and manifold tools that address data privacy issues. Hedbom [5] provided a classification of TETs in 2008. The criteria used to classify the tools include the possibilities of control and verification, the target audience and the scope of the tool, the information it presents, the technologies it uses, and its trust and security requirements. Hedbom discusses his classification by applying it to examples. For this reason, the Transparent Accountable Data Mining (TAMI) system [16], the Privacy Bird 22, the PRIME project [4], the approach by Sackmann et al. [12] to obtain privacy evidence in case of privacy violations, and Amazon's book recommendation service [17] are presented and explained.
Based on his work, Janic et al. [6] further develop the classification and extend its definitions of TETs by identifying and discussing 13 tools. According to them, tools like the Mozilla Privacy Icons 23 and Privacy Bird address the complexity of privacy policies of websites. The PrimeLife Privacy Dashboard 24 and the Google Dashboard 25 are ex post TETs, which provide information on data collected and stored by service providers. Lightbeam 26 and Netograph 27 visualize user tracking that is realized via third-party cookies. The tool Web of Trust 28 ranks websites according to their trustworthiness, based on a reputation system. Janic et al. classify Me & My Shadow 29, Firesheep 30, Panopticlick 31 and Creepy 32 as tools that aim to raise privacy awareness by informing the user about techniques commonly used to violate their data privacy. The tool Privacy Bucket 33 and the Online Interactive Privacy Feature Tool by Kani et al. [8] were released after the paper of Janic et al. was published, but fit into the previously described category.
To the best of our knowledge, the most recent privacy dashboards under development are GenomSynlig, which was merged into the Data Track project 34 by Angulo et al. [1] published in 2015, and the tool PrivacyInsight by Bier et al. [2] presented in 2016. While Data Track visualizes data disclosure in a so-called trace view and thus realizes the transparency principle of the GDPR, PrivacyInsight aims to address the GDPR as a whole, including the transparency principle, the rights to rectification and erasure, and the withdrawal of consent. Bier et al. identify legal and usability requirements for a privacy dashboard. In total they present 13 constraints, eight of which are legal and five usability requirements. A brief summary of the legal prerequisites is given below, while the usability requirements are left out due to page limitations.

R1 The right to access must not be formally or technically constrained.
R2 A privacy dashboard must be accessible by every data subject.
R3 Access to all data must be provided.
R4 All data must be downloadable in machine-readable format.
R5 Data flows to all processors and internal data flows must be visualized.
R6 All sources of personal data must be named.
R7 For all processing steps a purpose must be given.
R8 Means to request rectification, erasure, or restriction must be provided.
Requirement R2 includes in particular design strategies that enable access for data subjects with disabilities, such as visually impaired people. The privacy dashboard must implement accessibility interfaces like the WAI-ARIA 35 standard by the World Wide Web Consortium. Requirements R3, R5, R6, R7, and R8 impose a usability challenge with respect to the sheer amount of data taken into consideration. Internal and external data flows, as demanded by R5, can be complex to visualize depending on the number of internal entities and external processors. Designing these data flows as a comprehensible graph can be challenging. However, the information the graph depicts is fundamental to enabling transparency. To support the data subject and to improve the intelligibility of this graph, it is reasonable to categorize and label personal data. A data subject might not be able to review each data flow to all processors in detail, but may be interested in certain data categories.

Methodology
For the design and implementation of the dashboard, we adapt Nielsen's Usability Engineering Lifecycle [11]. It is considered fundamental in the field of usability engineering. In addition, it is well suited to the design of systems that address inexperienced users who want to solve complex tasks [15]. For the following summary of the Usability Engineering Lifecycle, Möller's notation [10] is used.
The development process starts with the Analysis phase, which examines the users, the tasks to be solved with the system, and the context of use. In the Design phase, the system is designed iteratively; however, there may be parallel design versions, which are tested separately. In the Prototyping phase, the system is partly implemented. In this phase a distinction is made between horizontal, vertical, and scenario-based prototypes. Horizontal prototypes present all functional capabilities of the system to the user, but do not provide the actual functionality. Vertical prototypes implement a certain feature of the system in depth, but do not include and present all planned functionalities to the user. The presentation, but not full implementation, of a certain feature is called a scenario-based prototype.
The resulting prototype is evaluated in the Expert Evaluation phase by so-called usability experts, in contrast to the Empirical Testing phase, which involves real users of the system, who are invited to test the tool under laboratory conditions. In the context of software engineering, this means a specific environment is set up, including a predefined and tested device, a certain network connection, specific input tools, and so on. Various user studies can be conducted in both phases to either measure the overall quality of the system or to identify flaws in the design. One of them is the cognitive walkthrough, which was first introduced by Lewis et al. [9] in 1990. After this phase, the next iteration starts, beginning with the Design phase. If the system is eventually deployed, feedback from real users in real-life scenarios can be collected and evaluated to further improve the system.

Analysis
Users of the privacy dashboard are potentially all natural persons in the EU.
According to the statistics provider Eurostat of the European Commission, over 500 million people lived in the Union in 2016. 36 These millions of people live in 28 countries, speak 24 official languages and almost as many migrant languages, and use three different writing systems. In 2016, 15.6% of the European population were younger than 14 years, 11.1% were between the age of 15 and 24, 34.1% between 25 and 49, 20.1% between 50 and 64, 13.8% between 65 and 79, and 5.4% older than 80 years. These numbers highlight the challenge that a uniform interface for this user base will be; however, it is further reasonable to investigate the user base's affiliation with information and communication technology. In 2016, about 71% of all individuals in the EU, and 92% of those between the age of 16 and 24, accessed the Internet on a daily basis. Moreover, 8 out of 10 users use a mobile device to access the Internet. In 2012, 80% of individuals between the age of 16 and 24 used the mobile Internet to participate in social networks. Consequently, it can be inferred that a technological means like a privacy dashboard reaches the majority of the user base, since it is rather familiar with technology and with the Internet. Web applications optimized for mobile devices are well suited as a platform.

The privacy dashboard is intended to be used to execute the data privacy rights granted by the GDPR. These rights are identified as the following tasks the tool should be used for:

T1 Execute the right of access
T2 Obtain information about involved processors
T3 Request rectification or erasure of data
T4 Consent review and withdrawal

The Analysis phase also includes the investigation of how the identified tasks would be, or currently are, solved without the tool. To the best of our knowledge, there is no dedicated tool to exercise any of these data privacy rights. Consequently, the execution of these rights heavily depends on the context of the controller. If the controller processes personal information digitally and offers the data subject a user interface, then the rights to access, rectify, and erase data can be expressed or realized via this user interface. However, to inform themselves about involved processors or to review and withdraw previously given informed consent, data subjects have to revert to written correspondence with the controller or to long privacy policies that nobody reads [3], but which may give all required information on how data is forwarded to external third parties or on the formal procedure to withdraw consent. It often remains uncertain how and whether controllers respond to these written requests of data subjects. In cases of severe privacy violations with social or economic damage, legal action needs to be taken. 42

Design
This section discusses two possible architectures to deploy and operate the privacy dashboard and presents a first design approach, which serves as a basis for the development of the prototype.

Architecture
We ideally envision one privacy dashboard to manage all privacy rights with regard to all controllers a data subject is concerned with. As Figure 1 shows, Approach 1 requires each controller to deploy and operate their own instance of the tool, which data subjects access individually, while Approach 2 allows data subjects to access one instance of the dashboard to manage all controllers they deal with. A controller-operated instance of the privacy dashboard is easier to integrate into the data processing infrastructure of the controller. Consequently, no conversion of the personal data in question is necessary to adapt to an interface of an external third party. The controller would be able to modify and extend the privacy dashboard, for instance to implement the visualization of customized or proprietary data formats. Security vulnerabilities are avoided, since the personal data in its entirety does not leave the boundaries of the controller; only queried chunks of it are transmitted to the data subject. The proximity of the privacy dashboard to the infrastructure of the controller eases the immediate and automated application of requests to rectify or erase inaccurate personal data. Requests made by the data subject could directly trigger internal processes, providing all necessary parameters to take instant action. If the controller uses authentication mechanisms to authenticate data subjects in order to provide a service, the same technique can be used by the privacy dashboard to authenticate a data subject before delivering personal data.
While the data subject might benefit from a single end point to address all privacy concerns to, Approach 2 also implies a series of challenges. This approach is more challenging from an architectural perspective, since personal data from all controllers needs to be aggregated and served by a dedicated component. This would require either the standardization of a common data format or an agreement on an existing one. Interestingly, the right to data portability 43 granted by the GDPR may force controllers to develop or agree upon a common data format to exchange personal data. Still, a transformation of the personal data is necessary to adapt to the visualization logic of the externally operated privacy dashboard. A single machine that stores personal data of one or more individuals from multiple controllers is a security and privacy risk in itself. Therefore, programmable interfaces should be defined by each controller to allow querying certain chunks of data. These interfaces require an authentication mechanism to ensure that personal data is transmitted to the right data subject. In this architecture, distributed authentication techniques have to be used to solve the task. Consequently, the dashboard is ideally executed on the data subject's device, so no third party has to be involved; however, this comes along with hardware requirements that could violate Requirement R1 (the right to access must not be formally or technically constrained) of Section 3.
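The "querying certain chunks of data" idea can be sketched as a controller-side query function that returns only the records matching a requested category, rather than exporting all personal data at once. The record shape and function name are illustrative assumptions; a real interface would additionally authenticate the data subject before returning anything.

```javascript
// Hypothetical personal data records held by one controller.
const records = [
  { category: "master data", item: "name",  value: "Jane Doe" },
  { category: "usage data",  item: "login", value: "2017-07-01T09:12:00Z" },
  { category: "master data", item: "email", value: "jane@example.org" },
];

// Return only the chunk of data belonging to the requested category, so the
// externally operated dashboard never receives the full data set at once.
function queryChunk(data, category) {
  return data.filter(r => r.category === category);
}
```

In Approach 2, each controller would expose such a query behind an authenticated endpoint, and the dashboard would aggregate the responses per category.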
In general, the adoption of the privacy dashboard by all controllers appears more likely if it saves controllers the development of an individual privacy dashboard from scratch. Again, the assumption is made here that compliance with the GDPR implies the introduction of a privacy dashboard (see Bier et al. [2], R2).

Data taxonomy
The GDPR's explicit requirement that personal data be processed transparently highlights the significance of the right to access. In order to execute T1, the personal data is organized according to a data taxonomy. If a data category does not apply to the domain of the controller, a simple notice can be given that no data for this category is available. This might also confirm the expectations of the data subject with regard to the data collection practices of certain controllers. By applying the data taxonomy and offering separate views for each data category, the dashboard allows the data subject to easily find out whether a controller collects behavioral data about him or her or whether another user disclosed information about him or her.

Prototype
A prototype was developed with the JavaScript framework React 44 and the library Material-UI 45 to comply with Google's design standard Material Design 46. The prototype has been made publicly available online 47. With respect to the chosen methodology, a horizontal prototype has been developed that implements and presents all features to the user but provides reduced or no actual functionality. In practice, this means the scenario of our prototype is completely artificial.
We therefore define an online social network provider as our made-up controller that processes personal data of its users similarly to popular services like Facebook or Twitter. All data presented in the dashboard is fake and does not belong to a natural person. However, to simulate a person's personal profile as accurately as possible with regard to the amount of data, we adapt an existing model from a study of the advertising agency Jung von Matt 48. Furthermore, requests for rectification, erasure, or withdrawal of consent are not processed by a controller's backend. The filtering of data according to its processing context, data type, or time of its processing is implemented.
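The implemented filtering can be illustrated with a minimal sketch: entries are narrowed down by processing context, data type, and time range, with omitted criteria matching everything. The entry shape and field names are our own assumptions, not the prototype's actual code.

```javascript
// Hypothetical personal data entries as shown in the dashboard's center column.
const entries = [
  { context: "registration", type: "text",  date: "2017-01-05", value: "Jane Doe" },
  { context: "posting",      type: "photo", date: "2017-03-12", value: "img_001.jpg" },
  { context: "posting",      type: "text",  date: "2017-06-20", value: "Hello world" },
];

// Keep an entry only if it matches every criterion that was supplied.
// ISO date strings compare correctly with plain string comparison.
function filterEntries(list, { context, type, from, to } = {}) {
  return list.filter(e =>
    (context === undefined || e.context === context) &&
    (type === undefined || e.type === type) &&
    (from === undefined || e.date >= from) &&
    (to === undefined || e.date <= to)
  );
}
```

In the prototype, the filter options in the left column would feed such a predicate and the center column would render the filtered result.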
As can be seen in Figure 2, we designed a three-column layout for the dashboard. General functionalities like reviewing given consent, displaying the privacy policy, and obtaining information about involved third parties are presented in the left column. Also in the left column, under the general functionalities, filter options are provided that allow the user to display personal data processed in a specific context, of a certain data type, and in a defined time range. The meaning of each processing category and each data type is visually supported by an icon, which is used in other components of the dashboard as well. In the center of the layout, the queried personal data is listed vertically in chronological order, beginning with the oldest entry. Each entry is furnished with an icon that gives information on its processing context. Under the date of when the processing took place, a short descriptive text about it is presented in the header of an entry, which is displayed above the actual personal data. On the right-hand side, general information about the controller is given, such as name, physical address, and an email address to directly contact the controller.
In order to use the dashboard to execute task T2, a graph is displayed that shows the user the data flows between controllers and involved processors (see Figure 3). In real-life scenarios, many processors are often involved in the processing of personal data. There can be multiple controllers as well (so-called joint controllers 49). Depending on the number of processors involved in the processing of the data subject's personal data, the complete graph can be shown as a whole, or processors can be clustered into groups, for instance according to their business domain. Edges are annotated with data categories giving information on which data is exchanged. The arrows denote the direction of the data flow to clarify whether parties are just provided with data or are actively exchanging data with each other. For the implementation of this graph, the JavaScript library vis.js 50 has been used. Angulo et al. [1] propose a similar but more detailed approach with the trace view. To reduce complexity, data categories instead of specific data items are used in our approach. Task T3 requires the privacy dashboard to offer a possibility to request rectification or erasure of the data item in question (see Figure 4). Additionally, for each data item, information on the purpose of its collection and processing is given (see Figure 4). Multiple purposes can be listed here if data is processed for more than one purpose. With the help of this component the data subject can answer the question: For what reason does the controller collect and process this data? A redirection to a separate section allows the user to review given consent and offers the possibility to withdraw it (see Figure 5). Since consent is supposed to be bound to a specific purpose, a label and a short description text give more details about the purpose in question. With a simple interaction, like a click, it is possible to withdraw consent as easily as it was to give it. 51

Fig. 5. A list of purposes for which consent has been given by the data subject. For each purpose a label and a short descriptive text is given. Consent can be withdrawn by simply clicking the toggle on the right.
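The data-flow graph described above can be assembled as a set of labeled, directed edges before handing it to a rendering library such as vis.js. The flow records and the helper function below are illustrative assumptions; the prototype's actual data model may differ.

```javascript
// Hypothetical data-flow records between a controller and its processors.
const flows = [
  { from: "Controller",  to: "Processor A", categories: ["usage data"] },
  { from: "Controller",  to: "Processor B", categories: ["master data"] },
  { from: "Processor A", to: "Controller",  categories: ["derived data"] },
];

// Build node and edge lists in the shape a graph library expects: one node
// per party, one directed edge per flow, labeled with the data categories.
function toGraph(flowList) {
  const names = new Set();
  flowList.forEach(f => { names.add(f.from); names.add(f.to); });
  const ordered = [...names];
  return {
    nodes: ordered.map((name, i) => ({ id: i, label: name })),
    edges: flowList.map(f => ({
      from: ordered.indexOf(f.from),
      to: ordered.indexOf(f.to),
      label: f.categories.join(", "),
      arrows: "to", // direction of the data flow
    })),
  };
}
```

Clustering processors by business domain, as discussed above, would amount to merging several party names into one node before building the edge list.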

Evaluation
To evaluate the design approach presented in this paper, an expert evaluation has been carried out according to Nielsen's Usability Engineering Lifecycle. The usability of the data categories is the focus of this evaluation. Möller [10] proposes a formative analysis consisting of a so-called Thinking Aloud test [7] with three to five participants to identify design flaws in a system. In the test, participants are asked to solve one or more specific tasks by interacting with the system while thinking aloud. An analysis of the participants' thoughts and remarks is conducted subsequently. In an expert evaluation, so-called usability experts instead of real users are involved, since the system might be at too early a stage to present it to external users. For this reason, three fellow researchers were given the following task, consisting of multiple questions that have definitive answers.
"European law gives you the right to request from any entity that processes your personal data access to it. Imagine you requested access to your personal data from a company and you're confronted with the tool in front of you. Please answer the following questions:"
- Which data did you have to provide when creating an account for this service?
- Did you provide any voice recordings to the service?
- Have you disclosed your location voluntarily?
- Has anyone provided the controller with photos of you?
- Does this service provider track your location?
- Has the service provider knowledge about your gender?
- Does the service provider know your income?
- Does the service provider know which websites you visit?
All participants struggled to answer the questions at the beginning, but improved quickly, answering the last questions rather fast and confidently. All participants answered the first question using the chronological order instead of using the respective data category, assuming the data provided first is the data required for the registration. This is a clear indicator that the data category Service data is redundant and can be subsumed under intentional data ("Data I provided"). The so-called AppBar at the top also contributed to confusion. The participants understood the privacy dashboard as a service itself and therefore tried to answer the first question with regard to the information required in order to use the privacy dashboard itself. The participants found that the filter options were not visible enough and should be placed more prominently, considering that they are an essential part of the task-solving process. Another concern of the participants was the technical feasibility of the data categories. This applies to incidental data ("Data of me provided by others") and derived data ("Inferred data about me") in particular. Generally, the scenario of the privacy dashboard is important. The participants were interested in whether the system is operated by the controller or as a separate service, and whether it can be used offline or requires an Internet connection. The evaluation reveals that refining the data categories is necessary in order to improve the usability of the dashboard. However, it also shows that the developed prototype can be used by data subjects to answer questions relating to their data privacy.

Conclusion
This work presents the design and implementation of a privacy dashboard that addresses the requirements of the GDPR and enables the data subject to execute data privacy rights with the tool. To substantiate the dashboard's design, its potential users and the tasks they are supposed to fulfill with it were analyzed and discussed. A prototype has been developed and evaluated. The results of the evaluation indicate that our design approach is reasonable and worth pursuing, yet needs further improvements and user tests. The redefinition of the data categories and their technical feasibility will be researched in future work. Furthermore, architectures for the deployment of the privacy dashboard need more investigation. Comprehensive user studies are necessary to refine the current design of the dashboard and to develop alternative approaches.

Fig. 1. Architectural alternatives for the deployment of the privacy dashboard: either as a single point to manage all controllers, or as a data privacy management tool for every controller separately.

Fig. 2. The layout of the developed prototype. General functionalities and filter options are presented on the left-hand side. The queried data is in the center, sorted chronologically beginning with the oldest entry. General information about the controller is presented on the right-hand side.

Fig. 3. A graph visualizing internal and external data flows between controller and processors. Edges are labeled with data categories indicating which data is exchanged with whom.

Fig. 4. For each specific data item, the user is given information on the purpose of its processing, where applicable the possibility to withdraw consent, and the possibility to request rectification or erasure of the data.
36 Eurostat - Population. http://ec.europa.eu/eurostat/tgm/table.do?tab=table&init=1&language=en&pcode=tps00001&plugin=1, last accessed: 07/18/2017.