Group Development Stages in Open Government Data Engagement Initiatives: A Comparative Case Studies Analysis

Abstract. Citizens are increasingly using Open Government Data (OGD) and engaging with OGD by designing and developing applications. They often do so by collaborating in groups, for example through self-organized groups or government-induced open data engagement initiatives, such as hackathons. Successful use of and engagement with OGD by groups of citizens can greatly contribute to the uptake and adoption of OGD in general. However, little is known regarding how groups of citizens develop in OGD engagement. This study aims at exploring and understanding the development stages of citizen groups in OGD engagement. To attain this objective, we conducted a comparative case study of group development stages in two different types of OGD engagement. Our cases show that leadership and diversity of capabilities significantly contribute to the success of citizen groups in OGD engagement. These findings suggest that connecting citizens who have diverse expertise prior to the OGD engagement event helps to improve its effectiveness. This research is among the first to apply a group development stages model in open data engagement studies and thus opens up new research opportunities concerning group development in the open data literature.


Introduction
Governments at different administrative levels (e.g., national, regional, local) are progressively opening up data to the public in the hope that citizens will use it [1]. Indeed, successful and sustainable use of Open Government Data (OGD) that contributes to solving societal problems hinges on citizen engagement [2]. We argue that citizen engagement goes one step further than OGD use. Such engagement requires not only OGD use, e.g., locating, downloading, distilling, scrutinizing, and refining data [3], but also designing and developing OGD-based applications. The development of OGD-based applications by citizens is often done by groups of people who collaborate [4]. Such groups can be self-organized, where the content and processes of engagement are determined by citizens who organize themselves and engage in forms of collective action [5]. Examples of self-organized engagement include the Dutch Open Spending [6] and the Indonesian Kawal Pemilu [7] initiatives. In contrast, citizen engagement in OGD can also be government-induced. An example of such a government-induced initiative is a hackathon. In a hackathon initiative, governments determine when and where engagement takes place, and under which conditions citizens can engage [8].
In the open data literature, research on the socio-technical conditions of OGD utilization, covering both enabling and disabling factors, is widely available [1]. However, studies in the area of OGD engagement are lacking [1], especially regarding the development of groups of citizens who engage in these initiatives. Although individual citizens engaged in a group are motivated by different drivers [9], they strive to achieve the group's shared objectives. For example, in a hackathon, groups may not only attempt to win a competition and earn a prize but also indirectly contribute to solving problems raised by the hackathon organizers. In a self-organized OGD engagement initiative, by contrast, groups may aim to contribute to solving a real-life problem they face in daily life. However, the OGD engagement literature offers scant knowledge of group development and the underlying factors that contribute to a group's success.
This study aims at exploring and understanding the development stages of citizen groups in OGD engagement. To attain this objective, we formulate the following research question: "How do citizen groups develop in open government data engagement initiatives?" We conduct a comparative case study that involves two cases of OGD engagement in different settings. This study is among the first to apply a group development stages model to comprehend citizen engagement in OGD initiatives. The results of our study advance the understanding of how policymakers should prepare and precondition engagement initiatives to stimulate more engagement groups.

Open Government Data Engagement
Open data researchers usually define citizen engagement as open data use by citizens [10], which concerns various processes carried out to convert data into other outputs such as facts, information, data, interfaces, and services [11]. However, we argue that citizen engagement goes one step further, involving not only OGD use but also designing and developing OGD-based applications.
In public administration studies, researchers distinguish between self-organized and government-induced initiatives of citizen engagement with government policy [5]. We argue that this distinction also applies to open data engagement because governments may operate using different models of data provision [8]. In the government-as-a-platform model, the government limits its role to that of a provider of open data infrastructure, consisting of a website or portal that offers access to data and tools for previewing, visualizing, or downloading data [8]. The government acts passively in this model and presumes that others will use OGD, create applications on top of it, and generate value [12]. This model seems to breed self-organized OGD engagement initiatives. On the other hand, government-induced OGD engagement concerns the government-as-open-data-activist model, in which the government not only provides the open data infrastructure but also promotes its use to citizens, the private sector, or the government itself [8]. In promoting open data use, such governments frequently organize supportive activities framed as a hackathon contest where citizens and businesses compete with each other to pitch an idea, an application design, or an application prototype.

Self-organized open data initiatives. The current open data literature substantially lacks an overview of self-organized initiatives, and little is known about this type of engagement. Self-organized engagement is to some extent a reaction to government-led processes or structures, but it utilizes the state's instruments (e.g., OGD portals and services) to attain citizens' objectives [5]. Organizing and sustaining such engagement requires the availability of two primary resources, time and money [13]. Therefore, only organized civil society that has access to sponsorships or donations can initiate and maintain self-organized engagement. Citizen-initiated engagement such as Kawal Pemilu moved forward successfully because the initiators could radically minimize the costs incurred by using free open source software and platforms, utilizing social media platforms, and applying a crowdsourcing strategy [7,14].
Government-induced open data events. This type of engagement typically takes the form of hackathon events and aims to deliver economic value [15]. Since there is no agreement on the definition of an open data hackathon, we synthesize one based on selected literature [16][17][18][19] as follows. An open data hackathon refers to an offline/face-to-face ideation competition sponsored by government agencies in a centralized location that brings together citizens with different backgrounds (e.g., programmers, designers, others) to work intensively and collaboratively in small teams for a short amount of time (e.g., 12 hours, 24 hours, 2 days) to create artifacts (e.g., mockups, designs, prototypes, applications) using OGD. Typically, at the end of the contest, each team presents or pitches its final idea in front of juries and sponsors, and a winning team earns a prize (e.g., money, investment).
In an open data hackathon, organizers and sponsors provide nearly all resources and support needed by the teams to work efficiently [16,19,20], including catering services, sleeping bags or sleeping areas, comfortable facilities (gaming devices, sports hall), internet connection, electricity (cables), and stationery. Technical support from open data providers, event organizers, or sponsors is also commonly provided. These amenities are intentionally provided to support group development during the hackathon event.

Group Development Stages
In both self-organized initiatives and hackathon events, the development process of a citizen group or team determines how its members conceptualize a problem, brainstorm potential solutions, collaboratively develop the preferred solution, and ultimately deliver it. Self-organized initiatives might produce a ready-to-use application for society, whereas hackathons might offer various outputs depending on the event's objectives (e.g., mockup, design, prototype, application, visualization). The current literature does not address group development in self-organized initiatives, whereas a small number of hackathon studies have started discussing the theme [21,22]. However, neither of these works focuses specifically on how teams progress throughout the hackathon.
Studies on group development investigate group activities and how these activities evolve over the life of a group [23]. Stages or phases of group development are defined as the categorization of "the periods of time during which an identifiable set of activities occurs" [23, p. 122]. Although numerous models of group development have been proposed, Tuckman's [24] classical sequential stages model is one of the most influential models recognized in human resource development studies [25]. In this model, Tuckman [24], focusing on interpersonal relationships and task activity, postulated four stages of group development, namely forming, storming, norming, and performing (see Figure 1). Tuckman [24] further posited that effective group functioning requires the successful completion of each stage and the transition from one stage to the next.

Fig. 1. Tuckman [24] model (adapted from Bonebright [25])
Forming. Tuckman [24] described the first stage as testing and dependence of interpersonal relationships (group structure) among group members and orientation to the task activity. Group members attempt to discover acceptable behaviors based on the reactions of the group leader and other members. Once the boundaries are discovered, a member becomes dependent on the guidance and support from the leader(s) and preexisting norms. Group members attempt to identify relevant tasks and ways to accomplish the tasks by determining information required to deal with the tasks and how the information can be acquired.
Storming. The second stage is characterized by intragroup conflict related to group structure and emotional response to task demands that lead to the lack of unity. Group members express their individuality and oppose the formation of group structure by becoming resistant toward one another and group leader(s). The discrepancy between individual's interest and orientation demanded by the tasks leads to emotional reactions and resistance to the tasks. However, Tuckman [24] considers that this stage would be less visible in groups working on intellectual tasks.
Norming. In the third stage, the development of group structure is identified as the development of group cohesion, and the development of task activity is characterized as the open exchange of relevant interpretations. A member accepts the group structure and the individuality of fellow members. This acceptance results in new group-generated norms that endorse harmony to ensure the group's existence. Group members openly discuss themselves, one another, and their opinions to generate alternative interpretations of tasks.
Performing. In the fourth stage, the development of group structure is labelled as functional role-relatedness, and the development of task activity is identified as the emergence of solutions. Members adopt and play roles after learning from one another socially in the preceding stage. Role structure becomes an instrument that can direct the group as a problem-solving entity. Constructive actions that lead to successful task accomplishment (solutions) are seen in this stage.

Research Methodology

Case Study Design
This research aims at exploring and understanding the developmental stages of citizen groups, which are presently little understood in the contemporary open data engagement context. Given this exploratory aim, qualitative approaches are better suited than quantitative inquiries such as surveys. Although the study was informed by a prior model of group development stages [24], it is unclear whether this seminal model applies to different types of OGD engagement. Case studies are appropriate for research trying to answer "how" or "why" questions about contemporary events over which the researcher has little or no control [26]. The case selection therefore aimed at finding cases that concern citizen groups developed in OGD engagement initiatives and that vary in contextual factors (self-organized and government-induced), yielding polar cases. We selected cases that concern OGD and groups of citizens engaging in OGD initiatives. The cases had to involve groups representing different types of OGD engagement and groups that accomplished a set of contextual objectives. To enable comparison and contrast between cases, we selected two cases that differ contextually: the Kawal Pemilu group, which exemplifies self-organized engagement, and the PacMan team, which epitomizes government-induced OGD engagement (hackathon).
The first case involves a group of citizens who voluntarily developed an OGD-based application and used it to digitize the results of Indonesia's 2014 presidential election. The group comprises two teams, a developer team of five technologists who built the application and a volunteer team of 700 persons who used the application. The successful digitization of election results, covering 97.91% of 478,829 votes, in only six days made Kawal Pemilu a prominent example of citizen engagement [7].
The second case concerns a team of citizens who participated in a Dutch open education data hackathon, Hack de Valse Start, held on 3 March 2018 for twelve hours (from 8 AM until 8 PM). PacMan comprises five persons with diverse backgrounds and capabilities who worked together in a room of a high school building situated on the outskirts of Amsterdam. The group, competing with six other teams, won the second prize for visualizing averaged national exam scores against averaged teacher advice data at the school level and providing an analysis of the visualization.

Data Collection and Analysis
We collected various types of qualitative data from multiple sources of evidence at several points in time to enhance construct validity as much as possible [26]. In both groups, the first author conducted participant observation by actually participating in the engagement: as a volunteer in the Kawal Pemilu group and as a member of the PacMan team. Gaining actual access to these teams provides a distinctive opportunity to understand group development from the perspective of an insider, since post-factum comprehension of interpersonal relationships and task activities is non-trivial [26]. The researcher used online observation through the Facebook (FB) platform because the Kawal Pemilu group was developed entirely on that platform. Table 2 provides an overview of the case information sources, including documents, interviews, participant observations, and tangible artifacts. Fifteen semi-structured interviews were conducted with Kawal Pemilu group members from October 2017 until February 2018. All interview sessions were recorded with the interviewees' agreement and transcribed. The author also conducted four unstructured, informal interviews with PacMan team members during the hackathon. Since the data collected from both groups include personal data that raise privacy and confidentiality concerns, the first author obtained approval only from the group leaders to disclose their data.
We divided our analysis into two phases. First, we analyzed the data using provisional manual coding to capture the development process of both groups based on Tuckman's [24] stages. Second, we categorized the codes into two groups: 1) interpersonal relationships and 2) task activities associated with the developmental stages as indicated by the model.
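The two-phase categorization described above can be pictured as filling a stage-by-dimension matrix. The following is only a minimal sketch of that idea, not the procedure actually used in the study, which was manual. The function names, the tuple layout, and the example excerpts are hypothetical and chosen purely for illustration.

```python
# Hypothetical sketch: organize manually coded excerpts by Tuckman stage and
# by the two dimensions of the model. All names here are assumptions.
STAGES = ("forming", "storming", "norming", "performing")
DIMENSIONS = ("interpersonal relationships", "task activities")


def categorize(coded_excerpts):
    """Group (stage, dimension, excerpt) triples into a nested dictionary."""
    matrix = {stage: {dim: [] for dim in DIMENSIONS} for stage in STAGES}
    for stage, dimension, excerpt in coded_excerpts:
        if stage not in matrix or dimension not in DIMENSIONS:
            raise ValueError(f"unknown code: {stage!r}, {dimension!r}")
        matrix[stage][dimension].append(excerpt)
    return matrix


def observed_stages(matrix):
    """Return, in model order, the stages that have at least one excerpt."""
    return [s for s in STAGES if any(matrix[s][d] for d in DIMENSIONS)]


# Illustrative input only; the stage and dimension labels are assumed.
example = [
    ("forming", "interpersonal relationships", "Members introduced their backgrounds."),
    ("performing", "task activities", "Members acted according to their roles."),
]
matrix = categorize(example)
stages_seen = observed_stages(matrix)
```

A structure of this kind makes it easy to see whether a group shows evidence for every stage or skips one, which is the comparison the case analysis relies on.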

The Development of the Kawal Pemilu Group
The Kawal Pemilu group was initiated by Ainun, an Indonesian data scientist living in Singapore, on 9 July 2014, a few hours after the competing presidential candidates declared their victories. Ainun recruited four Indonesian developers living in different countries (i.e., Australia, the Netherlands, and the United States) to build the digitization application and 700 Indonesian volunteers around the world to digitize election results using the application. Since two teams were involved in Kawal Pemilu, we present the results as separate but connected processes of both teams. Pre-existing norms were still in place.

Performing (developer team). Roles were established and adopted. Efforts based on roles were taken to develop, sustain, and maintain the application until most ballots were digitized.
Performing (volunteer team). Two roles (inputter and verifier) were established and adopted. Volunteers strove to digitize all ballots, and verifiers validated the digitization results and reported errors.
Two of the developers were Ainun's close friends, and both trusted Ainun's integrity. The other two developers were invited by one of Ainun's friends. Social relationships were well developed among members of this group. In contrast, a volunteer might know several other volunteers but rarely knew all of them due to the large number of persons involved. No guidance was determined other than the due date, 22 July 2014, set by the Election Commission of Indonesia to officially announce the election winner.
Conflicts among volunteers arose from distrust towards each other's political stances and interests. Some volunteers, siding with one of the election candidates, suspected that other volunteers who supported the opposing candidate would damage the digitization initiative by deliberately inputting incorrect ballot numbers. Volunteers also resisted the task distribution because of two issues. First, some volunteers prioritized inputting the results from the region where they, their families, or their friends were living. Second, the number of voting booths varied across regions and could lead to an imbalanced task distribution. A densely populated area was likely to have more booths and thus more ballots to be digitized.
On 9 July 2014, the developer team started brainstorming and discussing the idea and design of the application using an online collaboration tool. An external expert was invited to the discussion sessions. The discussions continued until 14 July 2014 and were entirely positive and technical, focusing on choosing the right algorithms for verifying errors, incentivizing volunteers, and preventing incorrect data service invocation. Although at some point members disagreed with others' opinions, the disagreements were seen as intellectual dialogs, not interpersonal conflicts. The preexisting norms evolved into new norms as a result of the discussions: the due date was relaxed, and new technological decisions were made. A new role, verifier, was established and followed up by recruitment among volunteers. Verifiers were grouped into small teams and tasked to examine input made by volunteers and correct erroneous inputs. A verifier team's results were further re-examined by another team to improve data reliability.
Team members (developers, volunteers, and verifiers) quickly understood their respective roles and performed tasks accordingly. All efforts were made to sustain the Kawal Pemilu website until the digitization of election results finished on 18 July 2014.
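The two-tier verification described above, in which one verifier team corrects volunteer input and a second team re-examines the result, can be sketched roughly as follows. This is an illustration under stated assumptions and not the actual Kawal Pemilu implementation: the `Entry` fields, the `make_verifier` helper, and the idea that each team checks an entry against its own reading of the tally form are all hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entry:
    """One digitized tally (hypothetical fields, for illustration only)."""
    booth_id: str
    votes_a: int
    votes_b: int


def make_verifier(form_reading):
    """Build a verifier that corrects an entry against a team's own reading
    of the tally form, given as {booth_id: (votes_a, votes_b)}."""
    def verify(entry):
        votes_a, votes_b = form_reading[entry.booth_id]
        return replace(entry, votes_a=votes_a, votes_b=votes_b)
    return verify


def two_pass_verification(entry, first_team, second_team):
    """Run an entry through two verifier teams in sequence.

    Returns the final entry and a flag telling whether it differs from
    what the volunteer originally typed in.
    """
    checked = first_team(entry)
    rechecked = second_team(checked)
    return rechecked, rechecked != entry


# Assumed scenario: the volunteer typed 210 instead of 120.
typed = Entry("booth-001", votes_a=210, votes_b=85)
team_one = make_verifier({"booth-001": (120, 85)})
team_two = make_verifier({"booth-001": (120, 85)})
final, was_corrected = two_pass_verification(typed, team_one, team_two)
```

Running the second pass on the first team's output rather than on the raw input mirrors the description that a verifier team's results were re-examined by another team.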

The Development of the PacMan Team
The PacMan team was initiated by Johannes, an educational journalist working for De Correspondent, a Dutch news website. Johannes randomly asked nearby participants to join his team and then asked interested participants to get to know each other's strengths by explaining their backgrounds and specializations. Four participants, including the first author, agreed to form a team with Johannes. The first member was a data scientist from Russia, working for a Dutch travel aggregator company, who had participated in numerous hackathon events. Another member was a Dutch employee of a municipality in the Netherlands who worked in the education field. The third member was a workshop organizer from Romania who promoted open data use through "maker" arts. Johannes, henceforth the team leader, initiated the brainstorming of interesting societal problems that could be explored and exploited for the team's final product. Although three members were not Dutch, they contributed to the discussions. The data scientist viewed the topic proposals from a technical viewpoint and sometimes disagreed with the leader because the topics were not supported by available data. The first author clarified the current government's educational policy and the data visualization to be pitched. The municipality employee added several local social issues to consider in the visualization. At the end of the discussion sessions, new team norms were added: a visualization comparing national exam scores against school advice, and a preliminary indication of the causes of deviation between scores and advice. Members tried to understand the team goal, informed by the pitch session scheduled at the deadline.

Norming. Members expressed their opinions in intellective discussions guided by the leader. New norms were agreed.
Performing. Members understood their and others' respective roles. Members acted according to their roles to meet the deadline.
Roles were understood and performed accordingly. Johannes searched for relevant data and handed them over to the first author. The first author examined the data and supplied relevant data (e.g., statistical socioeconomic data) to the data scientist, who coded the visualization. The municipality employee helped translate the metadata from Dutch to English and explained their meaning to the data scientist. The workshop organizer prepared online collaboration tools and designed the presentation for the pitch session. Fifteen minutes before the pitch started, the final presentation file was completed and submitted to the hackathon organizers. The leader delivered the final presentation in Dutch to provide contextual meaning to the team's output.

Discussion and Conclusion
As indicated by Tuckman [24], the Kawal Pemilu developer team and the PacMan team, both working on intellective tasks (developing an application and designing a visualization, respectively), progressed through the forming, norming, and performing stages. In contrast, the Kawal Pemilu volunteer team went through all four stages, including the storming stage. Digitizing election results seemed to be a personal task because some volunteers preferred to digitize specific regions and tended to take sides in the election. Despite the different durations of the engagements under study, these results signal the relevance of the stages for the development of both virtual and face-to-face groups.
Different interaction factors appeared in the two cases. While the impact of duration on these interactions needs to be studied further, the presence of verbal and tangible nonverbal cues in the PacMan team seemed to enhance communication among its members. In addition to the conflicts of personal interest seen in the storming stage, communicating virtually with strangers could hamper the interactions of the Kawal Pemilu group members. The use of emoticons on the FB platform might help improve participants' perceptions of others' emotion, attitude, and attention intention [27], thus lowering the communication barrier. Nevertheless, further studies are needed to test these propositions, since the literature suggests that computer-mediated interaction lacks cues to reduce communication perception problems [27].
We propose three non-exhaustive underlying factors that appear to contribute to successful progress through the group development stages. First, leadership roles, naturally played by Ainun and Johannes, who actively sought personnel that might help them achieve the group's objectives, help speed up group formation. Second, prior interaction among participants may help reduce communication issues in forming the group and help identify the roles needed to perform tasks. Third, diverse capabilities, both technical (e.g., programming) and domain-related (e.g., election and education systems) skills and knowledge, enhance progress in task performance within the OGD engagement and its context.
Policymakers should consider the above factors in promoting OGD engagement. Although locating a leader is non-trivial since open data users are commonly unknown to policymakers [10], surveying open data communities may reveal potential champions and make it possible to inform them early about OGD initiatives. Providing an online platform that connects open data user groups and enables them to interact with each other may facilitate interactions and help them learn each other's profiles before the OGD engagement is actually carried out.
In addition to the discussion above, we are aware of the limitations of this study concerning the use of a participant-observation strategy. In the Kawal Pemilu case, the first author was able to be an external observer since nearly all activities were performed virtually and involved a large number of participants. In contrast, the author's participation in the PacMan case might have led to advocacy roles that contradict the practice of good social science [26]. However, the researcher was able to play an observer role until the performing stage, which requires more technical activities than social relationships.
This study is an initial step in understanding how citizens engage in OGD initiatives from a linear group development stages perspective. Although Gersick [28]