Policy Informatics in the Social Media Era: Analyzing Opinions for Policy Making

In order to address the complexity of modern social problems and needs through effective public policies, government agencies have started experimenting with policy informatics methods, adopting various approaches that increase citizens’ and stakeholders’ participation in public policy formulation processes. Such approaches allow the exploitation of their opinions, which incorporate valuable perceptions, as well as knowledge, proposals and ideas. This paper outlines three advanced methods of social media (SM) exploitation in public policy making processes for citizen-sourcing, which are based on the concepts of active citizen-sourcing, passive citizen-sourcing and passive expert-sourcing respectively, as well as the conclusions from some first applications of them. Based on these, a comparison of the methods is conducted, and then a maturity model is developed concerning the use of SM for citizen-sourcing in order to support policy making.


Introduction
With our society becoming more and more heterogeneous and pluralistic in terms of culture, values, concerns and lifestyles, social needs and problems become more complex and 'wicked', creating a need for new approaches in order to cope with them [1][2]. These approaches require government agencies to collect and process a large amount of external information concerning the different issues perceived by different stakeholder groups for the specific social problem under investigation, as well as the different solutions they propose and the arguments in favor of and against them, and in general their different concerns. Contemporary governments are responding to these challenges by moving away from the 'elitist model' of public policy development, in which managers and experts are the basic source of policies, towards a new, more 'democratic model', in which citizens also have an active role and voice in the formulation of public policies. This has resulted in the adoption of 'participative democracy' ideas, which are based on the extensive involvement of stakeholder groups in the formulation of public policies [3][4]. In this landscape, policy informatics has emerged as a field studying how information and communication technologies (ICT) can be leveraged in order to understand complex social problems and needs better, develop public policies for addressing them, and realize innovations in governance processes and institutions [5][6]. Policy informatics uses modern computational methods to process vast quantities of data, mine data from single and multiple sources, seek patterns in multidimensional data, and develop models of various phenomena.
In parallel, the increasing availability of online user-generated content and the new ICT-based means of interaction between decision-makers and citizens have brought new potential for collecting and analyzing citizens' opinions, which incorporate valuable perceptions, as well as knowledge, proposals and ideas. Web 2.0 and Social Media (SM) constitute a 'paradigm shift in communication', which lowers the barriers of communication for individuals and groups, and brings new potential to foster and support e-participation. This has led to the emergence of new opportunities for the 'policy informatics' field, based on approaches, methods and processes that incorporate Web 2.0 functionalities and architectures, and social networking tools, in combination with advanced text processing techniques for analyzing the huge amount of collected policy-related textual content. However, there is limited knowledge on how these ideas can be applied efficiently and effectively in the special context of the public sector, and supported by appropriate ICT platforms. This necessitates extensive research for the development of methods for the effective exploitation of SM in government, in combination with advanced text processing techniques, for supporting problem solving and policy making.
This paper makes a contribution in this direction, by outlining and comparing three advanced methods of SM exploitation in public policy making processes, developed as part of European projects, and synthesizing the results of their application and evaluation from various perspectives in order to develop new knowledge in the 'Policy Informatics' area. Finally, based on our conclusions, a maturity model is developed concerning the exploitation of SM by government agencies for policy-oriented citizen-sourcing.
The paper is structured in six sections. In the following section 2 the background of our research is presented. Then, the three SM exploitation methods and their underlying ICT platforms are briefly presented in section 3, while their pilot applications are outlined in section 4. A comparison of the proposed methods is presented in section 5. Finally, in section 6 the conclusions are summarised.

Background
The great potential of 'collective intelligence', defined as a 'form of universally distributed intelligence, constantly enhanced, coordinated in real time, and resulting in the effective mobilization of skills' [7], to contribute to difficult problem solving and design activities has triggered interest in the adoption of crowdsourcing in the public sector. While many government organizations do not explicitly use the term, they increasingly attempt to use crowdsourcing ideas and practices in order to encourage collective problem solving in co-operation with external stakeholders (e.g. citizens, professional and sectoral associations, etc.). However, much less research has been conducted on the application of crowdsourcing in the public sector, focusing mainly on 'citizen-sourcing', than on private sector crowdsourcing [8][9][10]. Citizen-sourcing can lead to the application of open innovation ideas in the public sector, as it changes government's perspective from viewing citizens as "users and choosers" of government services to "makers and shapers" of them.
The first citizen-sourcing initiatives aimed at the collection of policy-related information, knowledge and ideas from the general public, in order to support the development of better, more effective and more acceptable public policies. So most of the initial government citizen-sourcing research focused on the 'active citizen-sourcing' paradigm, which uses government agencies' web-sites or social media accounts in order to pose 'actively' a particular social problem or public policy (existing or under development), and solicit relevant information, knowledge, opinions and ideas from the citizens (the general public) [11,12].
Later, research interest turned to the 'passive citizen-sourcing' paradigm, which aims to exploit 'passively' policy-related content that has been generated by citizens freely, without any direct stimulation or direction by government, in various external (i.e. not belonging to government agencies) web-sites or social media, such as political fora, news web-sites, political blogs, and Facebook or Twitter accounts; the analysis of this content can provide useful information, knowledge and ideas concerning important social problems and public policies [13][14][15].
The assessment of the first citizen-sourcing initiatives revealed that they can provide useful insights about the perceptions of the general public concerning important societal problems and existing or prospective public policies for addressing them. However, it also indicated that, due to the high complexity of modern social problems and needs that have to be addressed through effective public policies, it would be highly beneficial if this general-public-oriented citizen-sourcing could be combined with the collection of information, knowledge and ideas from experts as well. This led to the emergence of the 'expert-sourcing' paradigm, which is in line with previous political sciences research on the role and importance of both 'democracy' (democratic processes and consultation with stakeholder groups) and 'technocracy' (specialized knowledge of experts) for the development of effective public policies [16][17].
However, these different types of citizen-sourcing and expert-sourcing practices, aiming at the collection and analysis of public policy related information, public opinion, knowledge and ideas from experts' and citizens' communities, constitute innovations in the Policy Informatics field, and there is limited knowledge concerning their advantages, disadvantages and application in policy formulation processes in general. So, extensive further research is required in this area, in order to improve existing and develop new citizen-sourcing and expert-sourcing paradigms. The following sections outline some research that has been conducted in this direction, and attempt to synthesize its findings.

Three SM-based Citizen-sourcing Methods
For reasons of completeness, the following three subsections provide an outline of three SM-based methods that have been developed as part of European projects: an active citizen-sourcing method (3.1), a passive citizen-sourcing method (3.2), and a passive expert-sourcing method (3.3). Each subsection also provides references that describe the corresponding method in more detail.

An Active Citizen-sourcing Method
The first method aims to conduct centrally managed online consultations on public policies, or social problems/needs, which are defined by the organizing government agency (so it performs 'active' citizen-sourcing), in multiple SM accounts of that agency. A central ICT platform is used in order to initiate, manage and monitor a policy consultation in multiple SM accounts of a government agency: initially, relevant messages are published on them, which define the topic/question of the consultation (it can be a public policy, existing or under development, or a social problem/need), and then the citizens interact with these messages through their accounts in the underlying SM [18][19]. Both the posting of messages/content in these multiple SM accounts and the continuous retrieval of citizens' interactions with them (e.g. comments, likes, shares, etc.) are performed in an automated manner using the APIs of these SM from the above central ICT platform, in which the processing of these interactions (using advanced text analysis techniques) and the presentation of results also take place. The results include advanced analytics, based on the processing of citizens' textual inputs (e.g. blog postings, comments, opinions, etc.) using text analysis and opinion mining techniques. In particular, the following tasks are performed: (i) sentiment analysis, which classifies opinionated texts (e.g. blog posts, comments) as expressing positive, negative or neutral opinions, and also estimates the overall sentiment of citizens' comments submitted within a policy consultation, and (ii) issues detection, which identifies specific issues frequently raised by the citizens. This advanced processing is used to discover the public stance on the various issues of a policy topic. Another sub-component (the Decision Support Engine) performs simulation modelling, having mainly two objectives: estimation of the outcomes of various citizens' proposals on the public policies under discussion, and forecasting of the future levels of citizens' interest in and awareness of these policies. This method has been developed as part of the PADGETS project (www.padgets.eu).
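The two text processing tasks named above can be illustrated with a minimal sketch. It does not reproduce the PADGETS platform, whose actual algorithms are described in [18][19]; it assumes a toy word-list approach, and the lexicons, stop-word list and all function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy lexicons; a production platform would rely on trained models.
POSITIVE = {"good", "support", "benefit", "great", "agree"}
NEGATIVE = {"bad", "oppose", "cost", "risk", "disagree"}
STOPWORDS = {"the", "a", "is", "of", "to", "and", "this", "i", "for",
             "too", "in", "it", "are", "at", "will"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def classify_sentiment(text):
    """Task (i): label one comment as positive, negative or neutral."""
    tokens = tokenize(text)
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def overall_sentiment(comments):
    """Task (i), aggregated: distribution of labels over a whole consultation."""
    return Counter(classify_sentiment(c) for c in comments)


def detect_issues(comments, top_n=3):
    """Task (ii): terms mentioned in the largest number of comments."""
    counts = Counter()
    for comment in comments:
        # Count each term once per comment and ignore sentiment and stop words.
        counts.update(set(tokenize(comment)) - STOPWORDS - POSITIVE - NEGATIVE)
    return [term for term, _ in counts.most_common(top_n)]
```

Counting each term once per comment means that an issue is ranked by how many citizens raised it rather than by how often a single citizen repeated it; this is a design choice of the sketch, not a documented property of the platform.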

A Passive Citizen-sourcing Method
The 'passive citizen-sourcing' method aims to exploit the vast amount of citizen-generated content beyond the SM accounts of government agencies, in 'external' Web 2.0 sources (i.e. not owned by government agencies, such as various political blogs, newspaper discussion forums, etc.), in order to provide governments with a better understanding of the needs, wishes and perceptions of citizens, as well as their ideas, to be taken into account in the policy making process [14,20]. An ICT platform has been designed for supporting the application of this method within the NOMAD project (www.nomad.eu), which consists of services that: (i) create and maintain domain models, i.e. graphical representations incorporating the main entities-terms of the domain of government activity in which the specific policy aims to intervene (e.g. energy, education), as well as policy models incorporating the main elements of the public policies under investigation (policy modelling); (ii) use such policy models in order to mine relevant citizen-generated data from a variety of pre-defined online external sources (through crawling services); (iii) perform linguistic analysis of these data to transform free text into a set of structured data; (iv) discover and extract the main issues discussed, as well as arguments, from free text (argument extraction); (v) perform sentiment analysis to classify text segments according to their "tone" (positive, neutral, negative); (vi) cluster arguments, based on calculated similarities, and present automatically generated summaries (argument summarization); and (vii) visualize a structured view of citizens' opinions on a policy related topic (through word-clouds and other kinds of charts), providing insights on what citizens are discussing concerning this topic, how much and when (visual analytics). In this approach government does not define topics/questions of consultations; it remains passive, and just 'listens' to what citizens discuss on a specific policy, and analyzes the content they freely produce in order to extract relevant knowledge (so it performs 'passive' citizen-sourcing).
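Step (vi), the clustering of arguments based on calculated similarities, can be sketched as follows. The source does not state which similarity measure or clustering algorithm the NOMAD platform uses; this sketch assumes Jaccard similarity over word sets and a greedy single-pass grouping, and the function names and the threshold value are hypothetical.

```python
import re


def token_set(text):
    """Represent an argument as the set of its lower-cased words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two word sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_arguments(arguments, threshold=0.5):
    """Greedy clustering: attach each argument to the first cluster whose
    first member is similar enough, otherwise open a new cluster."""
    clusters = []  # each cluster is a list of (text, token set) pairs
    for text in arguments:
        tokens = token_set(text)
        for cluster in clusters:
            representative_tokens = cluster[0][1]
            if jaccard(tokens, representative_tokens) >= threshold:
                cluster.append((text, tokens))
                break
        else:
            clusters.append([(text, tokens)])
    return [[text for text, _ in cluster] for cluster in clusters]
```

In such a scheme the first member of each cluster could also serve as a crude summary of that cluster, which is far simpler than the automatically generated summaries the platform is described as producing.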

A Passive Expert-sourcing Method
This third method provides the main capabilities of the previous one (outlined in section 3.2), but combined with filtering of the retrieved content, based on the creator's reputation (enabling a focus on more reliable content created by high-reputation authors) as well as its relevance to a pre-defined topic of interest. In particular, it is a 'passive expert-sourcing' method, based on the automated retrieval from multiple online sources at regular time intervals of information about experts on various policy related topics, as well as relevant online texts, documents and postings already published by such experts in multiple social media and web-sites. Data about individuals possessing high levels of knowledge, expertise and credibility in one or more predefined topics are collected and included in the corresponding database automatically, or can be entered manually by interested individuals through self-registration. In addition, rankings of the expert profiles on one or more topics, based on their relevant expertise, are produced through 'reputation scores', which are calculated by a reputation management algorithm based on several criteria with different weights. Another component of the ICT platform supporting this method crawls relevant documents (blog posts, social media content, online comments, word/pdf documents, web pages, etc.) concerning the above predefined topics of interest. These documents are associated with the most relevant policy topic and subtopics, and possibly linked to one or more of the above individual experts as their authors. Next, each document is rated for its quality with respect to the above policy topic/subtopic(s) and undergoes sophisticated processing using text/opinion mining and sentiment classification techniques, in order to assess its sentiment (positive, negative or neutral). By storing the above data in a common database, enabling users to search it and presenting the results visually, public policy stakeholders are able to identify useful expert knowledge on complex policy debates, i.e. the most reputable/credible experts or the most relevant documents on a specific topic. A comprehensive description of this method is provided in [21].
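The reputation-based ranking and filtering described above can be sketched as a weighted average of normalized criteria. The source states only that several criteria with different weights are used (see [21] for the actual algorithm); the criterion names, the weights, the thresholds and the function names below are all hypothetical.

```python
# Hypothetical criteria and weights; each criterion value is assumed to be
# normalized to the range [0, 1].
WEIGHTS = {"publications": 0.5, "peer_rating": 0.3, "social_reach": 0.2}


def reputation_score(criteria, weights=WEIGHTS):
    """Weighted average of an expert's criterion values (missing ones count as 0)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(w * criteria.get(name, 0.0) for name, w in weights.items())
    return weighted_sum / total_weight


def rank_experts(experts):
    """Return expert names ordered from highest to lowest reputation score."""
    return sorted(experts, key=lambda name: reputation_score(experts[name]),
                  reverse=True)


def filter_documents(documents, experts, min_reputation=0.6, min_relevance=0.5):
    """Keep titles of documents that are relevant enough and written by a
    known author whose reputation score reaches the threshold."""
    kept = []
    for doc in documents:
        author_criteria = experts.get(doc["author"])
        if author_criteria is None:
            continue  # author is not in the expert database
        if (reputation_score(author_criteria) >= min_reputation
                and doc["relevance"] >= min_relevance):
            kept.append(doc["title"])
    return kept
```

Dividing by the sum of the weights keeps the score in [0, 1] even if the weights are later changed and no longer add up to one.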

Applications
The proposed citizen-sourcing methods have been applied in real policy scenarios and evaluated through pilot applications organized in cooperation with governmental actors (government agencies, members of national and European parliaments, public officials, etc.) in order to identify their strengths, weaknesses, barriers and limitations, as well as appropriate improvements and adaptations that will favor their practical usefulness and integration in policy making processes. In order to build multi-perspective frameworks for the evaluation of the proposed methods, we drew elements from previous research in management science (concerning risks of crowdsourcing [22][23] and diffusion of innovation theory [24]), political science (concerning wicked problems theory [2]), and IS research (TAM [25]) (see [14], [26]-[27] for more details). In order to combine the advantages of qualitative and quantitative techniques [28] we used mixed methods of data collection, i.e. focus-group discussions, one-to-one interviews, and surveys. The active citizen-sourcing method outlined in 3.1 has been evaluated through three pilot applications, in cooperation with members of the European Parliament. At the end of each pilot application the following data have been collected and analyzed: (i) Social Media Metrics as provided by the SM accounts of the consultation initiators and the Google analytics engine, and (ii) the textual input of the participants, which was retrieved and analyzed using the opinion mining capabilities of the ICT platform in order to extract the main topics mentioned and the corresponding sentiments. All textual inputs by citizens were examined in more detail, in order to be classified into issues/concerns, solutions/activities, advantages and disadvantages/barriers. Fig. 1 shows an example of such classification in one of the pilot applications.

Fig. 1. Examples from the textual input of citizens in one of the active citizen-sourcing pilot applications
From this evaluation it has been concluded that this active citizen-sourcing method enables interaction and consultation concerning specific social problems/needs and public policies with wider and more heterogeneous audiences than other alternatives used by government agencies for this purpose, in shorter time and at lower cost. Furthermore, it assists in the analysis and elaboration of the particular problem/policy under discussion, through the identification of a wide range of particular issues and dimensions perceived by the citizens with respect to it, leveraging relevant collective knowledge and experience. However, the method seems to be less efficient in the generation of solutions and the facilitation of convergence among stakeholders' views.
With respect to the passive citizen-sourcing method outlined in 3.2, three pilot applications have been conducted, in co-operation with the Greek and the Austrian Parliament, and the European Academy of Allergy and Clinical Immunology (EAACI), on topics that reflect important current debates and interests of these organizations. In Fig. 2 we can see a visualization of the results derived in one of these pilot applications, concerning energy policy.

Fig. 2. Results visualization of the "Energy" pilot application of the passive citizen-sourcing method
In particular, the upper left visualization provides a word cloud of the most frequently detected issues in the accumulated content concerning energy policy, while the upper right visualization provides charts on the volume of textual content found that is relevant to specific elements of the constructed policy models explained in Section 3.2 (policy statements or arguments). Then, the visualization in the middle of Fig. 2 shows examples of text excerpts that have been found in the crawled Web 2.0 sources and characterized as positive or negative arguments by the opinion mining analysis (indicated with green or orange color respectively). Finally, the visualizations in the lower part of Fig. 2 indicate the overall sentiment distribution in the retrieved content, the distribution of the volume of content found per type of source, and the evolution of content over time.
From the evaluation of these pilot applications it has been concluded that this passive citizen-sourcing method can provide considerable support for public policy making, by enabling the low cost and fast assessment of citizens' feelings/attitudes concerning a prospective or existing policy, and also the identification of particular issues raised by society concerning this policy. Furthermore, it allows, to a lesser extent, the collection of proposals concerning possible problem solutions and policy interventions. However, this method has some inherent risks, associated: a) with its misuse for promoting individual interests (by selectively reporting only a sub-set of its results, which points in the directions desired and supported by specific stakeholders, and hiding others); and b) with the possible intrusion into citizens' private sphere (so it is necessary to avoid content sources in which contributors perceive their postings and discussions as private). A critical success factor of this method is the selection of an extensive, diverse and representative set of media sources of high reliability and quality to be monitored.
Finally, for the evaluation of the passive expert-sourcing method outlined in 3.3, three pilot applications have been conducted, concerning three important EU policy related topics agreed among the 'EU-Community' project partners: Innovation and Entrepreneurship, Energy Union and Future of the EU. In Fig. 3 we can see some typical results visualizations. In the upper part we can see the detailed information about a specific document retrieved on a policy of interest. This information includes the results of the sentiment classification provided by the opinion mining algorithm regarding its polarity, as well as ratings and comments on it provided as input by other users. The lower part of the figure presents a visualization of the sentiment classification of all documents retrieved within the application on the topic "Innovation & Entrepreneurship", in the temporal order of their appearance.

Fig. 3. Results visualizations of the passive expert-sourcing method
From this evaluation it has been concluded that this passive expert-sourcing method has high levels of usefulness for the collection of high quality information and knowledge concerning all main elements of important social problems that have to be addressed through public policies: particular issues, proposed actions/interventions, and their advantages and disadvantages. Therefore it can make a significant contribution, and a more multi-dimensional one than the other two above-mentioned citizen-sourcing methods, towards addressing the fundamental difficulty of modern policy-making: highly complex and 'wicked' social problems [1][2], with many issues and proposed actions/interventions, each of them having various advantages as well as disadvantages, and also multiple stakeholder groups with differing views and perceptions about them. Furthermore, this method has medium to high levels of usefulness for identifying existing attitudes/sentiments in society towards the above main elements of important social problems under discussion, as well as their change over time.

Comparison of Citizen-sourcing Methods
In the following Table 1 we can see a detailed comparison among the three citizen-sourcing methods discussed in sections 3 and 4, taking into account the capabilities they provide, as well as the outcomes of their pilot applications. Some of the comparison criteria have been taken from the e-participation domain model proposed in [29].
The main differences among the proposed methods lie in the type of citizen-sourcing they perform (active or passive) and their targeted audience (citizens/general public or experts), while each of them also employs different but overlapping sets of technologies. All methods exploit multiple Web 2.0 SM simultaneously as content sources, in a centrally managed manner, based on a central ICT platform. The acquisition of data from them is automated by using their APIs; however, for some of the selected data sources that did not provide such APIs, the use of specialized crawlers is essential. Then all methods perform sophisticated processing of the collected content, in order to extract the most significant points from it, reduce the 'information overload' of government decision makers and provide meaningful insights for the policy formulation process. For instance, they all employ opinion mining and sentiment analysis techniques in order to extract target groups' opinions from the collected SM content, as well as advanced visualized presentation of the results. However, in the case of the two passive methods the quantity of the accumulated content is much bigger than in the active citizen-sourcing one, so much more sophisticated processing has to be performed. A major difference is that in the first two methods content analysis is conducted at an aggregated level, and not at individual author level, while in the third method results are collected and presented on the basis of individuals recognized as experts. For this reason, the third method includes techniques of policy experts' profiling and reputation assessment and management, used for filtering collected content. With regard to their application models, each method demands effort in different phases. In particular, the application of the passive citizen-sourcing method needs more extensive work in the initial preparation, where domain and policy models have to be built by policy makers and domain experts. On the other hand, active citizen-sourcing needs content posting by policy makers and their associates (defining the question/topic of the consultation, and providing some base information about it, e.g. relevant text, images, video, etc.); also, this SM consultation has to be advertised, both initially and throughout the period it is active, in order to attract large groups of citizens. Finally, in the passive expert-sourcing method less effort is needed, which is mainly concentrated in the interpretation and filtering of the results.
In order to examine and compare the stages of policy making that each of the proposed methods can be used for, we have used the model of policy-making lifecycle stages proposed in [30], which includes five stages: agenda setting, analysis, policy creation, policy implementation and monitoring. Since passive citizen-sourcing is an unstructured idea collection process, without any definition of a specific problem statement, it can be launched in the agenda setting stage in order to bring social problems or issues to the attention of governments and administrations. When the definition of the social problem is structured, and the targeted policy area is defined, active citizen-sourcing can be launched to trigger citizens' reactions to them and gather their perspectives. In the subsequent stages (policy creation and implementation), expert-sourcing plays a more substantial role, since expertise and specialized knowledge are essential for these stages. Finally, in the monitoring and evaluation stage it is crucial to collect citizens' views on the implemented policies, therefore either passive or active citizen-sourcing methods (posing questions on particular aspects of the policies) can be employed.
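The assignment of methods to lifecycle stages described above can be summarized as a simple lookup. This is only a reading aid under stated assumptions: the text does not name the stage in which active citizen-sourcing is launched, so placing it in the 'analysis' stage is an interpretation, and the names `STAGE_METHODS` and `methods_for_stage` are hypothetical.

```python
# Mapping of the five lifecycle stages of [30] to the methods suggested
# in the text for each of them.
STAGE_METHODS = {
    "agenda setting": ["passive citizen-sourcing"],
    # Assumption: the text says only "when the definition of the social
    # problem is structured"; it is mapped here to the analysis stage.
    "analysis": ["active citizen-sourcing"],
    "policy creation": ["passive expert-sourcing"],
    "policy implementation": ["passive expert-sourcing"],
    "monitoring": ["passive citizen-sourcing", "active citizen-sourcing"],
}


def methods_for_stage(stage):
    """Return the methods suggested for a policy-making lifecycle stage."""
    key = stage.strip().lower()
    if key not in STAGE_METHODS:
        raise ValueError(f"unknown policy-making stage: {stage!r}")
    return STAGE_METHODS[key]
```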
The evaluation results have revealed two major advantages of the 'passive' approaches over the 'active' ones: (i) they enable government agencies to access, retrieve and exploit much larger quantities of more diverse policy relevant content from a wide variety of social media sources of different political orientations; and (ii) this content already exists, so government agencies do not have to find ways to attract large numbers of citizens to participate in citizen-sourcing and generate new content.

Conclusions
In the previous sections of this paper a set of different approaches and methods for the exploitation of SM in government for supporting public policy making has been presented. The paper therefore provides some contributions that can be useful to both researchers in the policy informatics domain and government practitioners dealing with public policy making. The findings from this research indicate that all the above approaches can contribute to the timely collection of citizens' as well as experts' knowledge about social problems/needs and about actions/interventions/policies for addressing them, taking advantage of the continuously growing Web 2.0 SM. So, they constitute valuable tools that can increase the quality, quantity and diversity of public opinion integrated and taken into account in public policy making. In general, the results revealed that although there are a number of risks associated with the application of these approaches (e.g. credibility and quality of the collected information, manipulation of the crowd), they are considered effective and efficient methods for reaching wider and more diverse audiences at lower cost. Furthermore, the proposed approaches allow overcoming the usual 'information overload' problems of the traditional approaches, as they incorporate advanced content processing techniques, which are capable of extracting the main points of the collected content.
Based on the evaluation and analysis of these three methods we can distinguish a maturity model concerning the use of SM for citizen-sourcing by government agencies in order to support policy making. It includes the following five maturity stages:
I. Set-up and manual operation of multiple SM accounts: In this initial stage a government agency sets up accounts in the most popular SM (e.g. Facebook, Twitter, YouTube), and operates them manually: content concerning its current services and activities, as well as its policies (current and future), is posted manually in each SM account, while citizens' comments are read by public servants, and then summarized, and conclusions are drawn from them and sent to the appropriate interested units.
II. Centrally managed operation of multiple SM accounts: In this stage the posting of content on each particular topic is conducted from a central ICT platform automatically to all SM accounts of the government agency; this ICT platform also automatically retrieves citizens' interactions (e.g. likes, shares, comments) for each posting, and performs advanced processing of them to facilitate summarization and conclusion drawing.
III. External SM accounts central monitoring: In this stage, in addition to the centralized operation of the SM accounts of the government agency, there is centralized monitoring of 'external' SM accounts and Internet sources in general, which have high quality content of interest, related to its activities and competences: interesting content is automatically retrieved, and then undergoes advanced processing, in order to facilitate summarization, main points extraction, sense making and conclusion drawing.
IV. External SM accounts monitoring with quality filtering: This stage combines the characteristics of the previous ones with quality filtering of the collected policy related content, based on the reputation of the author and/or the sources, aiming to provide information, knowledge and opinions from highly knowledgeable experts, and promote a 'democracy-technocracy' balance [16][17] in the formulation of public policies.
V. Internal dissemination and consultation: This final stage includes the characteristics of the above stages II, III and IV, combined with ICT-based internal dissemination of the collected information, knowledge and opinions from the general public and the experts, and also internal consultation on them (e.g. through 'internal' SM); this facilitates collective sense making, assimilation, conclusion drawing, and better exploitation of them for taking action, making innovations and designing better policies.
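Because each stage of this maturity model builds on the previous ones, an agency's stage can be read as the highest level whose capability, and every capability below it, is in place. The sketch below illustrates this cumulative reading; the capability flag names and the function name are hypothetical and are not part of the model itself.

```python
# Ordered maturity stages, each paired with a hypothetical capability flag.
STAGES = [
    ("I", "has_sm_accounts"),            # manual operation of own SM accounts
    ("II", "central_platform"),          # centrally managed posting and retrieval
    ("III", "external_monitoring"),      # monitoring of external SM sources
    ("IV", "quality_filtering"),         # reputation-based content filtering
    ("V", "internal_dissemination"),     # internal dissemination and consultation
]


def maturity_stage(capabilities):
    """Return the highest stage reached without skipping a lower one,
    or None if not even stage I is satisfied."""
    reached = None
    for label, flag in STAGES:
        if not capabilities.get(flag, False):
            break  # stages are cumulative, so stop at the first gap
        reached = label
    return reached
```

For example, an agency that filters content by reputation but does not yet monitor external sources would still be assessed at stage II under this reading.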
It should be noted that the three SM-based citizen-sourcing methods are not mutually exclusive, but can be combined. Further research is required concerning the combination and 'interoperation' of different methods along the policy formulation stages for providing more substantial decision support to policy makers and social actors.

Table 1. Comparison among the three methods for SM-based citizen-sourcing