The Untold Story of USA Presidential Elections in 2016-Insights from Twitter Analytics

Elections are the most critical events for any nation and pave the path for the future growth and prosperity of its economy. Because of their high impact, they generate a lot of discussion among all stakeholders on social media. In this study, we examine the discussions surrounding the 2016 USA election on Twitter. We further highlight some of the domains influencing voter behaviour by applying the outcome of Twitter analytics to Newman and Sheth's model of voter choice. Through the analysis of 784,153 tweets from 287,838 users over 18 weeks, we present findings on what may have affected the polarization of the USA elections.


Introduction
Social media plays a pivotal role in shaping the outcome of national elections (Bruns and Stieglitz, 2013). The United States presidential election of 2016 was held on Tuesday, November 8, 2016. The two candidates for the presidential election of 2016 were Republican Donald Trump and Democrat Hillary Clinton. To the best of our knowledge, this study is the first in the political domain in which the buzz created by a presidential candidate's tweets is mapped to citizen responses in order to explore the drivers of polarization and acculturation of voting behaviour. This paper evaluates how the sentiments and topics evolving among voters change over the course of the election. In the subsequent subsections, we present the importance of social media, followed by the evolution of social media analytics, geographical acculturation and polarization among voters. We apply different social media analytics methods to the 784,153 unique tweets from 287,838 users to better understand how sentiments changed over the election and which topics Twitter users shared and discussed among themselves. For each tweet, around 46 parameters focusing on user demographics and tweet characteristics were extracted. The variables capturing user demographics include name, location, description, followers, following, likes, lists and moments. The variables capturing tweet characteristics include tweet content, language, retweet count, favourite count and various status updates.
The total size of the Twitter data collected and analyzed was 2.23 GB. Around 36,071,038 data points were analyzed (784,153 tweets with 46 fields each). The results from the analysis of tweets are also used to compare and assess the drivers of the final outcome of the election, in which Donald Trump won over Hillary Clinton on 8 November 2016.
The remaining sections are organized as follows. Section 2 briefly reviews the literature on political communication, social media, acculturation in social media and the use of social media platforms for political communication, along with the research gaps (RG) identified and the contribution of the study. Section 3 presents the key references from the literature that helped in developing the hypotheses. Section 4 describes the methodology adopted for the study. Section 5 presents the analysis of the tweets surrounding the USA election. Subsequently, we discuss the contribution of the study, the implications for practice and policy, existing limitations and future research directions.

Literature Review
The literature review is divided into four sections: political communication, social media, acculturation in social media, and how political actors use social media for public communication. The last section of the literature review presents the research gaps identified from the literature and highlights the contribution of the study.

Political communication
Traditional media follows a unidirectional model and offers asynchronous communication. In contrast, social media is multi-directional and offers interactive communication along with the ability to broadcast messages to a large number of users (Ross and Bürger, 2014; Kruikemeier et al., 2016). This enables political discourse to shift from traditional mass media to social media platforms such as Facebook and Twitter (Heo et al., 2016).
Both ideology and language constrain political conversation (Borondo et al., 2014). The use of social media platforms for political communication is very high in western democracies (Mosca and Quaranta, 2016). The use of social media platforms across countries depends on factors such as broadband facilities, internet penetration, and media literacy (Klinger, 2013).
Through online interaction, politicians and journalists are emerging as both actors and sources of information (Ekman and Widholm, 2015). The literature highlights that social media plays a significant role in the modern media environment (Bode, 2016). Politicians have used social media for distributing information (Klinger, 2013; Ross and Bürger, 2014), for campaigning (Jungherr, 2014), and for mobilizing voters by attracting their attention to party agendas (Skogerbø and Krumsvik, 2015). Social media sites are emerging as journalistic sources (Ogola, 2015; Skogerbø and Krumsvik, 2015) and help connect actively involved citizens with citizens who are not active in political discourse (Mosca and Quaranta, 2016).
The literature indicates that the reach of protest messages increases when they are posted on social media platforms, which can lead to crowd-enabled mobilization (Ems, 2014; Theocharis et al., 2015). Activist communication on social media platforms is accelerated and takes on a more visual character (Poell, 2014; Ernst et al., 2017). User-generated content on social media is transferred quickly to the mass media (Heo et al., 2016).

Social Media
Social media platforms are important for various domains such as marketing (Thackeray et al., 2008), customer engagement (Heller Baird and Parasnis, 2011), brand management (Kim and Ko, 2012), product and service promotion (Neiger et al., 2012) and recruitment (Henderson and Bowley, 2010). More and more people are joining these platforms and using them for social interaction, self-expression and information exchange (Scott et al., 2017) within virtual communities in specific interest domains. Domain-specific understanding of market dynamics may be developed by analyzing user-generated content (Joseph et al., 2017; Utsuro et al., 2016) using big data analytics (Grover and Kar, 2017).
Social media data (i.e. user-generated content) has been extensively used for analysing real-life problems such as electoral forecasting (Burnap et al., 2016), engaging with voters (Adams and McCorkindale, 2013), identifying social tensions, analysing gross national happiness (Durahim and Coşkun, 2015), evaluating voting intentions (Ceron et al., 2014) and measuring transitions in organizational behaviour (Lakhiwal and Kar, 2016). The literature highlights that, for democratic engagement, a hybrid mix of television and social media can lead to positive outcomes in elections (Chadwick et al., 2017). Online engagement on social media has an effect on users' sentiments (Ibrahim et al., 2017). Users who followed the hashtags in a discussion indicated the influence of Twitter discussions in their comments (Chadwick et al., 2017). The literature indicates that high-frequency social media users tend to be women and that highly engaged users tend to be highly educated (Scott et al., 2017).
Twitter has been used in various public policy contexts, such as campaigns on electronic cigarettes (Harris et al., 2014), early warning about natural hazards (Chatfield et al., 2013), understanding social sensitivity towards the environment (Cody et al., 2015) and emergency management (Panagiotopoulos et al., 2016). The evidence and potential of using Twitter to uncover unbiased information from user-generated content were the drivers for choosing Twitter data for our study.

Acculturation in the social media
Acculturation refers to the changes that occur within an individual of one group when the person comes into contact with another group of a different cultural background (Redfield et al., 1936). The literature suggests four strategies for acculturation: assimilation, separation, integration and marginalisation (Berry, 1997). Assimilation occurs when individuals of the non-dominant group do not wish to maintain their cultural identity and interact often with the dominant group. In contrast, in the separation strategy an individual wants to hold on to his or her values and tries not to interact with other cultures. When both groups want to retain their cultural values but also want to interact with other groups, the integration strategy is followed. Groups that have little interest in maintaining their cultural preferences and little interest in maintaining relationships with other groups follow the marginalisation strategy.
Acculturation theories have been applied in the political domain in an experiment on a native majority and immigrant minorities (Hindriks et al., 2016). The results indicate that (a) in a political assimilation strategy, only the interests of the majority group advance; (b) in a political integration strategy, the interests of the majority group and the minority group both advance; and (c) in a political separation strategy, only the interests of the minority group advance.
The literature indicates that communication on social media has the potential to strengthen or weaken cultural values among users (Croucher, 2011; Li and Tsai, 2015; Mao and Yuxia, 2015). Various studies have examined the acculturation process brought about by social media platforms in different ethnic groups: (a) Chinese professionals overseas (Mao and Yuxia, 2015); (b) Hispanics in the US (Li and Tsai, 2015); (c) international students (Cao and Zhang, 2012; Forbush et al., 2016); (d) Lebanese residing in French-speaking urban areas (Cleveland et al., 2009). In the context of US elections, the geographical divergence among communities presents a potential acculturation of ideas and may thereby lead to polarization of the voting outcome.

Political Communication and Social Media
Politicians use social media platforms such as Facebook and Twitter for professional communication (Kelm et al., 2017). Social media campaigning can follow one of two styles: party-centric or individualized (Karlsen and Enjolras, 2016). Political information shared and discussed on social media engages young people in connective action (Vromen et al., 2015). There is evidence that the degree of social media buzz created by political parties positively impacted the outcome of general elections in emerging economies such as India (Safiullah et al., 2017).
The literature highlights that microblogging services give politicians opportunities for disseminating information, engaging with voters, monitoring public opinion and conducting public relations (LaMarre and Suzuki-Lambrecht, 2013; Frame and Brachotte, 2015). The literature indicates that voters who acquire political information via social media channels and respond to it are more likely to contact politicians and attend offline events. Officials who are active on social media have more contacts than less active officials (Djerf-Pierre and Pierre, 2016). Politicians therefore use social media platforms for communication, engagement with voters and marketing. For marketing purposes, Facebook is often the preferred tool, whereas for continuous dialogue Twitter is often preferred (Enli and Skogerbø, 2013). National Assembly members of Korea used Twitter to communicate with fellow politicians rather than with their constituents (Hsu and Park, 2012). Twitter can also be used by politicians as a tool for political opposition (Van Kessel and Castelein, 2016).
Political actors in western democracies use Twitter and Facebook for populist communication. Social media platforms give populist actors the freedom to circulate their messages and ideology. A political leader using Twitter and Facebook receives a lot of attention on these platforms (Larsson, 2017).
Twitter has been used by politicians for broadcasting (Hutchins, 2016; Theocharis et al., 2016), advertisement (Domingo and Martos, 2015; Hutchins, 2016) and engaging citizens (Ahmed et al., 2016). The literature indicates that Twitter usage by politicians increases their chances of winning an election (LaMarre and Suzuki-Lambrecht, 2013). Politicians have created Twitter accounts because doing so is the trend, but rarely use them (Rauchfleisch and Metag, 2016). The adoption of Twitter is conditioned at a personal level (Scherpereel et al., 2017) and driven by a politician's age (Rauchfleisch and Metag, 2016).
Twitter is used by established parties as well as new and upcoming political parties for political communication. Established parties use Twitter to supplement offline strategies, whereas new and upcoming political parties use it for self-promotion and media validation (Ahmed et al., 2016). Politicians who maintain synergy between social media platforms and traditional media channels can act as influencers on social media platforms (Conway et al., 2015; Karlsen and Enjolras, 2016). The more active a politician is on social media, the more journalists will follow the politician (Rauchfleisch and Metag, 2016).

Research gaps and major contributions
The main focus of this study is to explore how specific topics discussed on social media among specific communities can polarize the outcome of an election. The frequency of tweets posted on Twitter has an impact on voter engagement (Scherpereel et al., 2017). Tweet influence can be measured in terms of the number of followers the author has within his or her egocentric network (Moya-Sánchez and Herrera-Damas, 2016). The reach metric given in Table 1 helps in computing the reach of a message (Ganis and Kohirkar, 2015). It also indicates the number of accounts that can participate in disseminating the information contained within the tweet. Some of the research gaps (RG1, RG2 and RG3) identified are given below:
RG1: Does a high frequency of social media activity lead to popularity and higher engagement? Are the topics discussed by Trump more popular than the topics discussed by Clinton on Twitter? To answer these questions, the study analyses tweets using social media analytics such as descriptive analysis, content analysis and network analysis (Chae, 2015), along with data mining approaches such as regression analysis and community detection (Fortunato, 2010), the details of which are provided in subsequent sections.

RG2: How are drivers of voter's behavior choice being discussed by the voters in
The study showcases how engagement takes place on the social media platform during the election period among the different stakeholders in the virtual communities. The study also highlights the role of Twitter features such as hashtags, @mentions, retweets and likes, and the ways users employ these features for communication. The results of the study can be used by political actors in future for planning digital campaigns on Twitter. The results indicate that more communication on Twitter during the election may lead to negative buzz on the platform.

Proposition
The literature highlights that a high frequency of tweets and interactive communication on Twitter lead to higher visibility, which in turn leads to more social discussion about the candidate among other users. These social discussions can polarize users towards the candidate, which can help the candidate win the election (Larsson and Moe, 2012; Kruikemeier et al., 2016). The literature also highlights that candidates' facial expressions and physical gestures are predictors of the volume and valence of Twitter expression. A candidate who engages a lot with people on social media platforms is likely to be more exposed to criticism and harassment (Theocharis et al., 2016).

H1: Reinvestigating if higher frequency of social media activity always leads to higher popularity and engagement among followers.
The literature highlights that campaigns can empower communication operations on Twitter in three ways: responding, retweeting and engaging others (Jensen, 2017). Political engagement through hashtags has been regarded as showing the strongest and most consistent associations. Communicative exchanges can be easily tracked using hashtags. Free text on Twitter has a larger correlation with candidates' vote tallies than @mentions (McKelvey et al., 2014).

H2: Less variation in time (greater nexus) between consecutive campaigns increases popularity and engagement.
Some accounts (influencers) play a larger role than others in disseminating information in the social network. The literature highlights that information on Twitter can be received from a decentralized network as well (Theocharis, 2013). Thus there is a need to handle a Twitter account responsibly. The periodicity of tweets has been indicated to be important in sustaining interest in a social media campaign (Mills, 2012); however, its importance has not been established empirically. This study therefore explores whether the periodicity of tweets during the election period matters, along with the issues and topics discussed by the candidate. Our hypothesis investigates this for the US election of 2016.

H3: Higher thresholds of sentiment (polarity) within tweets create greater popularity and engagement among followers.
Emotionally charged tweets may be retweeted more than neutral tweets. For this, we adopted Newman and Sheth's model of voter choice, which identifies seven factors that drive voter behavior in the physical world: issues and policies, social imagery, emotional feelings, candidate image, current events, personal events and epistemic issues (Newman and Sheth, 1985).
This model of voter choice has been widely applied in examining voter choice behavior in empirical surveys. In this case study, we therefore map the model factors to the virtual environment to determine whether the discussions surrounding these factors initiate polarization and acculturation processes among users.
Twitter has been used by candidates to interact with voters (Graham et al., 2013), and voters also creatively participate in election discussions on Twitter (Raynauld and Greenberg, 2014). The discussion surrounding these domains can highlight how voters and Twitter users are affected in the virtual world. The drivers of voter choice behavior can be explained through Twitter analytics methods.

H4: Greater coverage in social discussions of the different factors of Newman and Sheth's Voter's Choice Behavior increases the engagement with voters, actively or passively.
Twitter has been used by candidates for mobilizing their campaigns and for interacting directly with voters (Graham et al., 2013). Greater coverage of the different factors of voter choice behavior would address the concerns of more diverse groups within the voting communities. The literature indicates that social media are useful platforms for the acculturation process (Li and Tsai, 2015). Chinese professionals overseas regarded Facebook as a useful acculturation tool for acquiring information on trending topics in their host countries (Mao and Yuxia, 2015). The next hypothesis explores how hashtags or campaigns (Borondo et al., 2014; Bode et al., 2015; Chae et al., 2015) contribute to the acculturation process among Twitter users located in different geographical locations.

H5: Popular Hashtags or campaigns can initiate acculturation process of ideologies among Twitter users located in different geographical locations.
The literature indicates that hashtags or campaigns have led to polarized voter choice within the virtual community (Larsson and Moe, 2012; Bode et al., 2015; Kruikemeier et al., 2016). There is evidence that the social media buzz created by political parties on social media platforms worked in their favour (Safiullah et al., 2017), whereas other researchers point out that a candidate's likelihood of being elected is negatively related to an engaging style (Theocharis et al., 2016) and that some election campaigning resulted in minimal public attention (Hong and Nadler, 2012). Taken together, this evidence suggests that people are becoming polarized in the virtual environment. Users may be polarized by campaigns, tweets or the discussions surrounding the candidate.

H6: Discussions on social media platforms demonstrate the occurrence of polarization among voter groups based on participation in political discussions such as elections.
Persuasive campaigning may have less impact on citizens (Hosch-Dayican et al., 2016). The literature indicates that men tend to be neutral whereas women tend to be more opinionated on social media platforms, and that young people express more negative opinions and emotions (Volkova and Bachrach, 2015). Protestors and non-protestors on Twitter can be clearly demarcated (Lysenko and Desouza, 2011; Mosca and Quaranta, 2016). Through this hypothesis we explore how polarisation arises from social discussions among supporters and non-supporters of ideologies presented through social media.

H7: Communities are formed among the groups which are polarized during social media discussions during political events like elections.
The literature indicates that users try to cluster themselves in politically homogeneous networks (Borondo et al., 2014). The theory of homophily in online political discourse indicates that individuals try to associate with similar users on the social network (Himelboim et al., 2016). This leads to the formation of clusters within virtual communities. Users within these communities are unlikely to be exposed to cross-ideological content from different clusters (Himelboim et al., 2013). However, social media opens up the potential for cross-cultural interaction.

Research Methodology
A social media analytics framework for the political domain has been proposed in the literature; it consists of two parts: data tracking and monitoring, and data analysis. Data on social media can be tracked through user timelines, keywords, topics, hashtags and URLs. The data can be extracted from social media through APIs such as the "Search API" and the "Streaming API". The framework highlights that social media data can be analyzed using content analysis, opinion mining, social network analysis and sentiment analysis. Twitter allows users to download data posted or discussed around a search term within a particular period. This data can subsequently be analyzed to derive metrics and develop deeper insights.
Some of the metrics for comparing communicative patterns on Twitter have been highlighted in the literature (Bruns and Stieglitz, 2013; Chae, 2015). An indicative list of methods for Twitter analytics is given in Table 1. This overview of Twitter analytics methods is a scientific contribution of this study; to the best of our knowledge, such a list has not been introduced in the academic literature.
The methods within Twitter analytics are divided into four broad categories: descriptive analytics, content analysis, network analysis, and geospatial analysis. Descriptive analysis focuses on descriptive statistics, such as the number of tweets and their types, the number of unique users, the hashtags, @mentions and hyperlinks included in the tweets with their frequencies, the word cloud and the reach metric. Word clouds help to visualize the popular words and topics in the tweets (Nooralahzadeh et al., 2013). The "reach" metric can be used to measure the reach of messages (Ganis and Kohirkar, 2015). Similarly, the reply and retweet features of Twitter help in assessing two-way interaction and engagement (Purohit et al., 2013). Hashtags are used in tweets so that the opinion expressed can be associated with a wider community of similar interest (Chae et al., 2015). Similarly, @mentions analysis helps in identifying the influencers who have influenced users to the extent that they want to discuss the tweet topic with the influencer (Shuai et al., 2012).
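As an illustration of the descriptive statistics described above, the following minimal Python sketch counts hashtags, @mentions, hyperlinks and retweets in a handful of tweet texts. The sample tweets, the regular expressions and the function name `describe` are assumptions made for this example; they are not the scripts used in the study.

```python
import re
from collections import Counter

# Simple patterns for illustration; real tweets may need more careful handling.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")
URL = re.compile(r"https?://\S+")


def describe(tweets):
    """Return basic descriptive statistics for a list of tweet texts."""
    hashtags, mentions = Counter(), Counter()
    url_count = 0
    retweet_count = 0
    for text in tweets:
        url_count += len(URL.findall(text))
        # Remove URLs first so that '#' or '@' inside a link is not counted.
        stripped = URL.sub(" ", text)
        hashtags.update(tag.lower() for tag in HASHTAG.findall(stripped))
        mentions.update(name.lower() for name in MENTION.findall(stripped))
        # Assumption: retweets are marked by the conventional "RT @" prefix.
        if text.startswith("RT @"):
            retweet_count += 1
    return {
        "tweets": len(tweets),
        "retweets": retweet_count,
        "urls": url_count,
        "hashtags": hashtags,
        "mentions": mentions,
    }


# Made-up sample tweets, used only to exercise the function.
sample = [
    "RT @alice: Make sure to vote! #Election2016 https://example.com/vote",
    "Watching the debate with @bob #election2016 #debate",
    "No tags here",
]
stats = describe(sample)
print(stats["hashtags"].most_common(2))
print(stats["mentions"], stats["urls"], stats["retweets"])
```

Hashtags and user names are lower-cased so that variants such as #Election2016 and #election2016 are counted together; whether the study normalized case in this way is not stated.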
Content analysis is used to extract semantic intelligence from text data. It leverages natural language processing (NLP) and text mining (Kayser and Blind, 2017) to retrieve information from large amounts of text data (Kassarjian, 1977). Sentiment analysis is the process of computationally identifying and categorizing the opinions expressed in text (Zhang et al., 2016); it includes two types of analysis, polarity analysis and emotion analysis. For this study, the sentiment analysis of the tweets was done in R using the syuzhet, lubridate and dplyr libraries. Polarity analysis is one of the most widely used techniques in Twitter data analysis for measuring the opinions of users. Emotion analysis is a sentiment analysis technique in which user-generated content is grouped into eight emotion categories: anger, anticipation, disgust, fear, joy, sadness, surprise and trust. The literature highlights that the emotions expressed on social media reveal insights about the user (Volkova and Bachrach, 2015). Similarly, topic modeling identifies the key themes among the tweets (Llewellyn et al., 2015). Topic modeling can be done using the tm and topicmodels libraries of R.
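To make the idea of polarity analysis concrete, the sketch below scores a text against a small word list and labels it positive, negative or neutral. It is a simplified lexicon-based stand-in written in Python, not the syuzhet-based R analysis used in the study; the lexicon entries and function names are hypothetical and chosen only for illustration.

```python
import re

# Tiny illustrative lexicon (hypothetical); real sentiment lexicons
# contain thousands of scored entries.
LEXICON = {
    "great": 1, "win": 1, "love": 1, "good": 1,
    "bad": -1, "lose": -1, "terrible": -1, "sad": -1,
}


def polarity_score(text):
    """Sum the lexicon scores of the words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(word, 0) for word in words)


def polarity_label(text):
    """Map the score to one of three polarity classes."""
    score = polarity_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Made-up example texts.
for example in ["What a great win", "Terrible, bad debate", "The election is on Tuesday"]:
    print(example, "->", polarity_label(example))
```

Emotion analysis follows the same pattern, except that each lexicon entry would map to one or more of the eight emotion categories rather than to a signed score.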
The connections among users on Twitter can be visually depicted using networks (Herdağdelen et al., 2013). Network analysis can help in identifying communities and clustering users on the basis of their opinions and thoughts on social networks (Abascal-Mena et al., 2015). The information flow on social media can be visually represented through information flow networks (Park et al., 2015).
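The following Python sketch shows one simple way to group users from interaction data: it builds an undirected graph from user pairs (for example, who mentioned whom) and extracts its connected components. Connected components are used here only as an easy-to-follow stand-in; they are not the community detection method applied in the study, and the user names and edges are made up.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map from (user, user) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def connected_components(graph):
    """Group users that are linked directly or indirectly (breadth-first search)."""
    seen = set()
    components = []
    for start in sorted(graph):
        if start in seen:
            continue
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Hypothetical mention edges between users.
edges = [("u1", "u2"), ("u2", "u3"), ("u4", "u5")]
groups = connected_components(build_graph(edges))
print(groups)
```

Dedicated community detection algorithms go further by splitting a single connected component into densely linked sub-groups, which matters when most users are connected through a few popular accounts.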
The geospatial analysis is segregated into two broad categories: geographic location specific analysis and time-trend specific analysis. Time-trend analysis helps in analysing the evolution of topics and trends over time (Saboo et al., 2016). Geospatial analysis helps in mining opinions by geographical location (Stephens and Poorthuis, 2015; Attu et al., 2017).

Table 1. Indicative list of methods for Twitter analytics

Descriptive Analytics
Retweet analysis (Bode et al., 2015): Allows one follower to share someone else's tweet.
URL analysis: Allows users to disseminate information by giving a URL within the 140-character tweet.
Hashtags analysis (Borondo et al., 2014; Bode et al., 2015; Chae et al., 2015): Hashtags are user-generated keywords preceded by the # symbol. They allow users to cluster their opinions.
@mentions analysis (Shuai et al., 2012; Larsson and Ihlen, 2015; Borondo et al., 2014): Helps in promoting one-to-one discussions on Twitter.
Word cloud (Nooralahzadeh et al., 2013): Pictorially represents the most frequent words in the discussions.
Reach metric (Ganis and Kohirkar, 2015): Measures the reach of the tweets.

Content Analysis
Sentiment analysis: Identifies and categorizes the text.
I. Polarity analysis: Categorizes the text into three sets: positive, negative and neutral.
II. Emotion analysis: Categorizes the tweets on the basis of the emotions expressed within them.
Topic modelling (Llewellyn et al., 2015): Identifies the key themes within the text.

Network Analysis
Network analysis (Herdağdelen et al., 2013): Depicts the connections among users on the basis of commonality.
Cluster/community detection (Abascal-Mena et al., 2015): Identifies different communities among the users.
Information flow networks (Park et al., 2015): Depicts the flow of information across the network.

Geo Spatial Analysis
Time-trend analysis (Saboo et al., 2016): Pictorial representation of the trends or topics changing with time.
Geospatial analysis (Stephens and Poorthuis, 2015; Attu et al., 2017): Analyzes the data on the basis of geographical location.
To test our hypotheses, we retrieved data from Twitter in two ways over 120 days: first, by extracting data from Twitter on a daily basis using the search terms "USA election", "Hillary Clinton" and "Donald Trump" combined with "OR"; second, by extracting the Twitter timeline data of Hillary Clinton and Donald Trump for 120 days.
For the first part of the data extraction, the methodology was divided into five phases. Phase 1 identifies the search terms used to extract data from Twitter. For this study, election-related search terms such as "USA election", "Hillary Clinton" and "Donald Trump" were identified based on their listing in Twitter trends. Phase 2 extracts the data from Twitter. The unstructured data collected through the Twitter API using Python scripts was in JSON format. Phase 3 converts the unstructured data to structured data, i.e. from JSON to a structured Excel format. The steps in phases 2 and 3 were repeated daily over the 18 weeks, because the literature indicates that smaller online samples do not give an accurate picture of the activities happening on Twitter (Gonzalez-Bailon et al., 2014). Phase 4 derives insights from the data through Twitter analysis methodologies such as descriptive, content, network and time-space analysis. Table 1 presents an indicative list of methods for Twitter analytics. Phase 5 explains the impact of the findings through the Newman and Sheth model of voter behavior using seven concepts: issues and policies, social imagery, emotional feelings, candidate image, current events, personal events and epistemic issues.
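Phase 3, the conversion of JSON records into a tabular layout, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the input is taken to be one JSON tweet object per line, the selected fields are a small hypothetical subset of the 46 parameters, and CSV is used as a simple stand-in for the Excel format mentioned in the text.

```python
import csv
import io
import json

# Assumed subset of fields; the study extracted around 46 parameters per tweet.
COLUMNS = [
    "screen_name", "location", "followers_count",
    "text", "lang", "retweet_count", "favorite_count",
]


def flatten(tweet):
    """Flatten one nested tweet object into a single row."""
    user = tweet.get("user", {})
    return {
        "screen_name": user.get("screen_name", ""),
        "location": user.get("location", ""),
        "followers_count": user.get("followers_count", 0),
        "text": tweet.get("text", ""),
        "lang": tweet.get("lang", ""),
        "retweet_count": tweet.get("retweet_count", 0),
        "favorite_count": tweet.get("favorite_count", 0),
    }


def json_lines_to_csv(lines):
    """Convert an iterable of JSON strings into CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for line in lines:
        if line.strip():  # skip blank lines
            writer.writerow(flatten(json.loads(line)))
    return buffer.getvalue()


# A made-up record shaped like a tweet object.
raw = [
    '{"text": "Vote!", "lang": "en", "retweet_count": 3, "favorite_count": 5, '
    '"user": {"screen_name": "alice", "location": "Ohio", "followers_count": 10}}'
]
table = json_lines_to_csv(raw)
print(table)
```

Missing fields fall back to an empty string or zero so that every row has the same columns, which keeps the daily extracts easy to append to one another.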

Findings and Interpretation
This section is divided into three subsections. Section 5.1 illustrates how the presidential candidates handled their Twitter accounts. Section 5.2 examines the impact of Twitter users on the topics discussed by the presidential candidates with the help of Newman and Sheth's Voter's Choice Behavior. Section 5.3 shows the communities formed by users with the help of hashtags.

Tweets frequency lead to popularity and higher engagement
To investigate hypotheses 1, 2 and 3, the tweets from both presidential candidates' Twitter timelines were extracted. To give an overview of the candidates' activities during the election period between August 13, 2016 and December 10, 2016, the tweets were analysed in terms of the number of tweets and hashtags posted by each candidate and the way Twitter users reacted to the tweets through the "retweet" and "like" features offered by Twitter. The insights derived from the tweets posted by the presidential candidates can be explained using the SPIN framework (Mills, 2012). The SPIN framework indicates the spreadability and propagativity of tweets among Twitter users.

H1: Reinvestigating if higher frequency of social media activity always leads to higher popularity and engagement among followers.
Spreadability refers to the ease with which campaigns can be spread across Twitter ecosystem. Likes and retweets help tweet to spread across the various networks (Mills, 2012). A descriptive overview of the Twitter activity of Clinton and Trump is presented in Table 2, which illustrates the degree of spreadability of both candidates among Twitter users/voters. From the table 2, it may be inferred that higher frequency of the tweets may lead to higher visibility and social presence (from fig. 11) which is in line with the literature. Clinton was tweeting twice as Trump but lost the election although literature indicates that high frequency of tweets leads to a positive outcome in elections (Larsson and Moe, 2012;Kruikemeier et al., 2016). Clinton was exposed to lots of criticisms (Annexure -URL Analysis), which may be an outcome of the high frequency of tweets. Literature also contains the evidences of negative fallout of high activity in social media (Karlsen and Enjolras, 2016;Theocharis et al., 2016). Interestingly, the mean retweet count of Trump is almost twice time of Clinton whereas mean like count of Trump is almost 3.8 times of Clinton. In subsequent sections, we attempt to explore why this inverse outcome may have happened.
Propagativity refers to the ease with which tweets can be redistributed by voters among other voters, taking into account cycle time, network size (i.e. number of followers), content richness and content proximity (Mills, 2012). As illustrated in Figure 2 and in the data collected during the election period, it can be inferred that USA citizens during this period discussed the USA election most, followed by Hillary Clinton and then Donald Trump. Around 441,261 tweets were collected on the search term "USA Election", around 258,212 tweets on the search term "Hillary Clinton" and around 84,680 tweets on "Donald Trump". The difference in the number of tweets collected for Clinton and Trump may be because Clinton had posted approximately twice the number of tweets posted by Trump. From fig. 2, it can be inferred that Trump was more regular on Twitter than Clinton, though the buzz created by Clinton was higher.

Fig 2. Tweeting frequency vs social media buzz
The primary axis of Fig. 2 shows the buzz on the candidate, while the secondary axis shows the number of tweets on the candidate's screen on each individual day. Trump has 17.6 million followers on Twitter with 34,160 tweets, whereas Clinton has 11.7 million followers with 9,838 tweets. A regression analysis suggests that the buzz (Y) may be modelled as a linear function of the user activity (X) as follows: (a) for Clinton, Y = 3.122*X + 2089; (b) for Trump, Y = 1.989*X + 685.3. It appears that Donald Trump had more reach than Hillary Clinton.
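To make the two fitted lines easier to read, the following minimal sketch evaluates the reported equations and shows how a least-squares line of the same form could be fitted. The coefficients are the ones reported above; the function names and the toy data are hypothetical and are not the data of this study.

```python
# Coefficients reported in the text for buzz (Y) as a linear function
# of candidate activity (X). The dictionary layout is illustrative.
REPORTED = {
    "Clinton": (3.122, 2089.0),  # (slope, intercept)
    "Trump": (1.989, 685.3),
}


def predicted_buzz(candidate, activity):
    """Evaluate the reported line Y = slope * X + intercept."""
    slope, intercept = REPORTED[candidate]
    return slope * activity + intercept


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical activity level of 20 tweets, evaluated for both candidates.
for name in REPORTED:
    print(name, round(predicted_buzz(name, 20), 2))

# Toy points lying exactly on y = 2x + 1, used only to exercise fit_line.
print(fit_line([1, 2, 3, 4], [3, 5, 7, 9]))
```

Under these equations, the same amount of candidate activity corresponds to a larger predicted buzz for Clinton than for Trump, since both her slope and her intercept are larger.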

H2: Less variation in time (greater nexus) between consecutive campaigns leads to higher popularity and engagement.
Twitter campaigns are launched with the help of hashtags. Online campaigns using hashtags are cost-effective for the presidential candidates. The hashtags provide metadata about the campaigns (Abascal-Mena et al., 2015). Here, we explore how the campaigns were used by both presidential candidates. Fig. 3 presents the frequency of the hashtag campaigns used by the presidential candidates along with the periodicity mean, periodicity standard deviation, retweet sum (10K), retweet mean (10K), retweet standard deviation (K), favorite sum (10K), favorite mean (10K) and favorite standard deviation (K). Trump effectively incorporated his campaign hashtags (#maga; #draintheswamp; #bigleaguetruth) in his tweets, whereas Clinton did not make much use of the hashtags of her dominant campaigns. The usage of campaign hashtags in Trump's tweets may have led to higher campaign polarity among users, and voters participated using these hashtags, which further propagated the core message of his campaigns.
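One way to read the periodicity statistics is as the gap between consecutive tweets that carry a given campaign hashtag. The sketch below assumes that reading, which is our interpretation rather than a definition stated in the study, and uses hypothetical posting dates.

```python
from datetime import date
from statistics import mean, pstdev


def periodicity(dates):
    """Return (mean, standard deviation) of the gaps, in days, between
    consecutive uses of a hashtag. Assumes periodicity means this gap."""
    ordered = sorted(dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("need at least two dated tweets")
    return mean(gaps), pstdev(gaps)


# Hypothetical posting dates for two campaigns (not from the dataset).
regular = [date(2016, 10, d) for d in (1, 3, 5, 7, 9)]
irregular = [date(2016, 10, d) for d in (1, 2, 9, 10, 20)]

print(periodicity(regular))
print(periodicity(irregular))
```

A campaign posted at even intervals has a standard deviation of zero, which is the "lesser variation of time" that H2 associates with a stronger nexus.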

H3: Higher thresholds of sentiment (polarity) within tweets create greater popularity and engagement among followers.
Subsequently, we wanted to explore whether greater levels of polarity and emotion in tweets have a positive impact in terms of buzz. From Fig. 4, it may be inferred that for all the emotions Clinton scored more than Trump in absolute numbers, but when these statistics are compared in percentage terms there is a very large difference in the surprise emotion. Hillary Clinton scored around 49.88%, whereas Donald Trump scored around 25.51%, in the surprise emotion of the tweets. It may be inferred from these graphs that, through her tweets, Clinton was highlighting more surprises for the voters, which may have increased the social buzz as indicated in Fig. 2, in line with existing literature (Berger and Milkman, 2012).

Twitter discussions surrounding the drivers of the voter choice
To explain these trends, we use a model for analyzing the discussions surrounding the drivers of voter choice on Twitter, as illustrated in figure 5. The model maps the Twitter analytics methods to the drivers of voter choice. The literature highlights that various features of Twitter such as @mention, reply and retweet have been used by candidates to engage voters (Borondo et al., 2014; Hosch-Dayican et al., 2016; Jensen, 2017). To engage citizens in communication more frequently, the @message functionality has been used by Norwegian party leaders (Larsson and Ihlen, 2015).
In the subsequent section, we attempt to explain the insights derived from the "USA Election Twitter data" by applying Twitter analytics methods through Newman and Sheth's model of voter choice, which comprises seven distinct and separate cognitive domains that drive voter behavior. These factors are issues and policies, social imagery, emotional feelings, candidate image, current events, personal events and epistemic issues (Newman and Sheth, 1985).

Issues and policies
Issues and policies cover the economic policy, foreign policy and social policy raised by the candidate during the election period, along with the leadership characteristics possessed by the candidate. The literature highlights that issues and policies are important components in influencing voter behavior (Newman and Sheth, 1985). In general, it is assumed that voters will vote for the candidate who will provide them with a higher level of utility. Economic policy refers to policies focusing on reducing inflation and balancing the budget. Foreign policy includes policies such as increasing defense spending. The tweets from both presidential candidates' Twitter screens were extracted and classified into four areas (economy, foreign policy, social issues and leadership) with the help of content analysis. The content analysis procedure was applied to the tweets by two judges working independently. There were 14,508 decision points (2,400 tweets of Hillary Clinton and 1,227 tweets of Donald Trump, each assessed against four areas). The two judges agreed on 13,293 decisions and disagreed on 1,215 decisions, giving a coefficient of reliability of 91.62%, which satisfies the threshold of 85% (Kassarjian, 1977). Fig. 6 illustrates the counts of the tweets posted by the presidential candidates regarding policies and issues.
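The reliability figures above can be checked with simple arithmetic. The sketch below assumes that the coefficient of reliability is plain percent agreement (agreements divided by total decisions), which is consistent with the reported numbers; the function names are illustrative.

```python
def decision_points(tweets_a, tweets_b, areas):
    """Each tweet is judged once against each area."""
    return (tweets_a + tweets_b) * areas


def coefficient_of_reliability(agreements, total_decisions):
    """Percent agreement between two judges (assumed definition)."""
    return agreements / total_decisions


total = decision_points(2400, 1227, 4)  # Clinton tweets, Trump tweets, areas
agreed = 13293
disagreed = total - agreed
ratio = coefficient_of_reliability(agreed, total)

print(total, disagreed, f"{ratio:.2%}")
```

The computed ratio agrees with the reported 91.62% up to rounding in the second decimal place and is well above the 85% threshold.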

Fig. 6. Issues and Policies discussed by Clinton (left cloud) and Trump (right cloud)
There were around 167 tweets posted by Hillary Clinton regarding policies and issues, whereas Donald Trump posted only 138. Clinton discussed various social issues surrounding women and children, related to equality, safety, empowerment, child care leave, disability, free education, career progression and mental stability. Clinton's tweets focused more on social issues (and on Trump's policies), whereas Trump focused more on the economy and on foreign policy, such as fighting terrorism and crime, immigration, creating jobs and easing business processes in the USA. The literature highlights that women representatives focus more on women's issues and that their communication style leans towards attack (Evans and Clark, 2016); our finding is thus in line with the literature.
To investigate how people responded to the issues-and-policies tweets posted by the candidates during the election period, the retweet counts and like counts of the tweets containing the policies were aggregated. From fig. 7, it can be concluded that Trump tweeted about issues and policies that related to the people, which may be why people supported him by retweeting and liking his tweets. Both the frequency and the content of tweets matter during the election period. Clinton tweeted more but was not able to touch the voters' hearts, whereas Trump tweeted less but did touch the hearts of the voters.

Social imagery
This domain refers to the image of the candidate as perceived in the voter's mind. The voter can hold positive or negative stereotypes of the candidate depending on various attributes such as demographic, socio-economic, cultural, ethical, political and ideological dimensions. Fig. 8 shows the top 30 popular hashtags in the election period, through which the social image of the candidates can be highlighted. Interestingly, WikiLeaks had released around 20,000 emails with almost 8,000 attachments from the Democratic National Committee, which indicated the possibility of corruption in campaigns led by Clinton. Such discussions are indicated by hashtags like #podestaemails, #wikileaks, and #crookedhillary. However, #iamwithher was also among the dominant hashtags, which indicates a large amount of support for Clinton.

Fig. 8. Top 30 hashtags in election discussions on X-axis and frequency on Y-axis and reflecting the imagery of presidential candidates in Tweets
The hashtags in the green box indicate positive imagery of Clinton, whereas the hashtags in the red box indicate negative imagery of Clinton. In contrast, the hashtags in the blue box show positive imagery of Trump, and no negative imagery of Trump appears among the top 30 hashtags. The hashtag feature offered by Twitter helps the candidate in campaigning, as followers can join and take part in the discussion of a particular campaign by using these hashtags (Jensen, 2017).

Emotional feelings
Emotional feelings refer to the personal feelings held by voters towards the candidate. A comparative analysis of all discussions surrounding the two candidates was conducted in terms of emotion analysis, as illustrated in Fig. 9. In sheer volume, discussions centered on Clinton surpassed discussions surrounding Trump for all sentiments. This outcome is also visible in the emotion comparison, where the difference is most pronounced for emotions like trust, anger, anticipation, fear and disgust. Fig. 9 contains two bar charts: the left one shows the emotion comparison in percentage terms, whereas the right one shows the emotion comparison over all social media buzz tweets surrounding the three keywords "USA Election", "Hillary Clinton" and "Donald Trump". From the left bar chart, it can be concluded that users trusted Clinton and Trump equally, but posted more fear tweets about Clinton than about Trump. In terms of surprise, however, the counts of tweets were roughly comparable for both candidates. The literature highlights that different emotions have different effects and that people are influenced more by emotional discussions than by cognitive discussions (Song et al., 2016).

Candidate image
This factor refers to the salient personality traits that make up the candidate's image. Voters form their voting opinion on the basis of "candidate image" rather than by referring to election campaign issues, which results in interaction and engagement. In terms of the percentage of tweets, the polarity is somewhat similar for both candidates, as illustrated in fig. 10. Given the difference in the number of tweets, however, it is apparent that the discussions surrounding Clinton, both negative and positive, are more numerous than those surrounding Trump.

Current events
This factor takes into account all the events that happened during the course of the election campaign, including both domestic and international situations that could cause voters to switch their voting preference. Since topic modeling is computationally intensive, only the tweets from selected days on which user sentiment on Twitter fluctuated significantly (i.e. over mean tweet polarity + 2 x standard deviation) were analyzed. The topics identified from these 18 days were then used to create a word cloud identifying the main concerns during the periods that enhanced user activity and resulted in major fluctuations of sentiment over the election period. For topic modeling, the top 15 topics were identified for each day. Fig. 11 illustrates the word cloud created based on the popularity of the 15 topics across each of the 18 days, to visually present the dominance among the emerged topics. Trump has 17.6 million followers on Twitter with 34,160 tweets, whereas Hillary Clinton has 11.7 million followers with 9,838 tweets. From these statistics, it can be said that Donald Trump had more reach than Hillary Clinton. However, fig. 11 still indicates that Twitter users discussed Clinton more frequently than Trump. WikiLeaks appears to have played an important role in the discussions surrounding Clinton. Despite such popularity, the final outcome may be attributed to the nature of the popularity in such discussions, which may have polarized the citizens of the USA. The literature shows that increased citizen activity on Twitter about a presidential candidate can be related to negative campaigning or to citizen incivility (Hopp and Vargo, 2017).

Fig. 11. Word cloud on the topics identified in the discussions on US elections
From the above visualization, it can be concluded that Hillary Clinton posted more and was discussed more on Twitter during the election period when social media discussions on the event increased significantly, maybe due to the emergence of popular news and incidents.
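The day-selection rule used before topic modeling (days whose polarity exceeds the mean plus two standard deviations) can be sketched as follows. The function name, the use of the population standard deviation and the toy polarity values are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def high_fluctuation_days(daily_polarity, k=2.0):
    """Return the days whose polarity exceeds mean + k * standard deviation.

    daily_polarity maps a day label to an aggregate tweet polarity.
    The population standard deviation is an assumption of this sketch.
    """
    values = list(daily_polarity.values())
    threshold = mean(values) + k * pstdev(values)
    return [day for day, value in daily_polarity.items() if value > threshold]


# Hypothetical daily polarity: nine quiet days and one spike.
toy = {f"2016-10-{d:02d}": 0.1 for d in range(1, 10)}
toy["2016-10-10"] = 0.9

print(high_fluctuation_days(toy))
```

Only the tweets from the selected days would then be passed to the topic model, which keeps the computational cost manageable.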

Personal events
This factor refers to all the events which had happened in the past of the presidential candidate and which can cause the voter to switch his/her voting preference. The personal events can influence the voters positively or negatively. Literature highlights how social media has made journalism focus on a politician's private life (Ekman and Widholm, 2015), which users disseminate using tweets connecting to the URL.
Some of the personal events surrounding Clinton's activities that were discussed negatively and extensively on Twitter concern the deletion of emails using BleachBit; WikiLeaks releasing information regarding the governance of Clinton; the FBI releasing detailed interview notes from its investigation of Clinton's email practices; and USA WTFM declaring Clinton an insider. Fig. 8 shows the top 30 hashtags in election discussions, and WikiLeaks appears in 13th position, which gives an indication of its popularity among Twitter users. Trump, in contrast, did not hold a government post before winning the election, so such material for social discussions was not available. To analyze the impact of personal events, the top 10 URLs of each month that created a buzz in the social media discussions were extracted (Annexure 1). In every month, the top 10 URLs revolved around Clinton's personal life and had a negative impact on her personal image. Some of the most shared URLs include: (a) a video link posted by Trump showcasing the activities undertaken by Clinton to raise funds; (b) a video posted by the Atlantic differentiating between Clinton and Trump in terms of ethical disposition; and (c) links posted by WikiLeaks containing information regarding the governance of Clinton. These events affected the participants in the Twitter discussions, thereby polarizing them.

Epistemic issues
This factor refers to the issues raised by candidates to change the pace of the time and bring something new and different. Issues that raise the curiosity of the voters also come under this factor. In fig. 8, the analysis highlights that the hashtag #maga, which relates to the nationalist campaign "Make America Great Again", has the highest frequency among all hashtags. Other campaigns driven by Donald Trump, "Big League Truth" and "Drain The Swamp", were also popular. In contrast, #strongertogether, launched by Hillary Clinton to motivate citizens to unite and fight against social issues, had much less popularity among followers. While fig. 6 illustrates that Trump's campaign received social support, Clinton's campaign did not receive much social support in terms of Twitter retweets and mentions.

Overview of presidential candidate engagement from Twitter screen
In line with the previous analysis, we wanted to explore the participants who took part in this discussion as influencers and how they were connected in the network. The top 50 @mentions were extracted from the presidential candidates' Twitter screens and mapped in an @mention network in fig. 12, where the size of a node indicates its frequency. Fig. 12 highlights how, through the Twitter platform, voters, officials and media houses can reach out to their presidential candidates for queries and inquiries. Mostly media houses and officials are actively using Twitter for queries and discussions.

Fig 12. Top 50 @mention network for the candidates along with strength of association
From fig. 12 it can be inferred that media personalities and media houses interacted more with Clinton on Twitter, which is in line with the literature indicating that the more active a politician is on social media, the more journalists will follow him/her (Rauchfleisch and Metag, 2016).

Acculturation and Polarization of the users in the online environment
The line between social media and traditional media is getting blurred day by day. The literature indicates that social media platforms play significant roles in shaping users' cultural orientation (Li and Tsai, 2015). Therefore, we think that the hashtags or campaigns run on Twitter have the potential to connect users in different geographical locations and to initiate a process of acculturation among users.

H5: Popular campaigns may initiate acculturation among Twitter users in different geographical locations.
To explore this, the tweets posted in English (754,109 in number) were extracted. Only around 412,767 of these tweets contain the location of the author. From these, the tweets containing the names of USA states were extracted through content analysis. The analysis resulted in 148,881 tweets posted by 26,386 users. The graphical distribution of the tweets (in red), users (in green) and users per tweet (in blue) is given in figure 13. In terms of sheer volume of tweets surrounding the top 5 campaigns, the highest contributing states in decreasing order are Tennessee (15,815), Arkansas (14,359) and Georgia (13,283). In all these states, Trump won over Clinton in the election, which indicates that the popularity of the #MAGA campaign may have affected the outcome of the election. Figure 14 illustrates the support for the five popular campaigns (Jensen, 2017) across the states. The highest numbers of instances captured in the sample belong to Texas and California, whereas the states of Delaware, South Dakota and West Virginia did not contribute to the top five hashtags. The instances captured in the sample surrounding #maga from Texas (422) and California (328) account for around 28.7% of the total instances captured for #maga. In California and Texas, Clinton and Trump won respectively, so the direct impact of the top campaigns appears inconclusive, although discussions on the top 5 campaigns are prevalent across the states. Figure 15 shows the distribution of the tweets containing the top five popular hashtag campaigns (in section 5.2.2) during the USA Election. Figure 15 demonstrates how users living in dispersed locations are getting connected through hashtags on Twitter. On Twitter, many dispersed people contribute to the hashtags.
Thus, from figure 14 and figure 15 it can be inferred that campaigns lead to political integration through the acculturation of ideology (Hindriks et al., 2016) on social media, irrespective of race, ethnicity, religion, income and profession, in the USA Election.
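The state-level counts discussed above depend on recognising state names in free text. The sketch below shows one simple way this could be done, assuming the state is read from the user's self-reported location string; the shortened state list, the function names and the sample records are hypothetical, and a real pipeline would need all states and would have to handle ambiguous names (for example, Georgia is also a country).

```python
import re
from collections import Counter

# Shortened list for illustration; a full list of USA states is needed in practice.
STATES = ["Texas", "California", "Tennessee", "Arkansas", "Georgia"]
STATE_RE = re.compile(
    r"\b(" + "|".join(re.escape(s) for s in STATES) + r")\b", re.IGNORECASE
)


def state_of(location):
    """Return the first state name found in a free-text location, or None."""
    match = STATE_RE.search(location or "")
    return match.group(1).title() if match else None


def tweets_per_state(tweets):
    """Count tweets per state; tweets is an iterable of (location, text)."""
    counts = Counter()
    for location, _text in tweets:
        state = state_of(location)
        if state is not None:
            counts[state] += 1
    return counts


# Hypothetical records (location, tweet text).
sample = [
    ("Austin, Texas", "#maga"),
    ("somewhere in TEXAS", "#draintheswamp"),
    ("Nashville, Tennessee", "#maga"),
    ("Earth", "#imwithher"),
    (None, "#maga"),
]

print(tweets_per_state(sample))
```

Tweets whose location is missing or does not name a state are dropped, which mirrors the reduction from 412,767 located tweets to 148,881 state-level tweets reported above.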

Fig. 15. Top 5 hashtags usage by different geographical locations
We also attempt to assess the possibility of voter polarization in terms of their preferences. To this end, the election period was divided into two phases. For both phases, the tweets were segregated on the basis of Clinton and Trump. Sentiment analysis was applied to the tweets to identify their polarity. A user's polarity towards a candidate in each phase can be mapped through the tweets posted by the user in that phase. On the basis of the transition undergone, users can be segregated into four groups: users who were positive towards the candidate in the first phase and became negative in the second phase; users who were negative in the first phase and became positive in the second phase; users who were positive in the first phase and remained positive in the second phase; and users who were negative in the first phase and remained negative in the second phase.
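The four-way grouping described above can be sketched as follows. The phase boundaries are those stated in the methodology of this study; the rule for labelling a user (sign of the mean sentiment score within a phase, with zero counted as positive), the function names and the sample records are assumptions made for illustration.

```python
from datetime import date

# Phase boundaries as stated in the study's methodology (60 days each).
PHASE_1 = (date(2016, 8, 13), date(2016, 10, 11))
PHASE_2 = (date(2016, 10, 12), date(2016, 12, 10))


def phase_of(day):
    """Return 1 or 2 for a date inside a phase, otherwise None."""
    if PHASE_1[0] <= day <= PHASE_1[1]:
        return 1
    if PHASE_2[0] <= day <= PHASE_2[1]:
        return 2
    return None


def user_polarity(scores):
    """Assumed rule: label a user by the sign of the mean tweet score."""
    return "positive" if sum(scores) / len(scores) >= 0 else "negative"


def transition_groups(tweets):
    """Map each user seen in both phases to (phase-1 label, phase-2 label).

    tweets is an iterable of (user, day, sentiment_score) for ONE candidate.
    """
    by_user = {}
    for user, day, score in tweets:
        phase = phase_of(day)
        if phase is not None:
            by_user.setdefault(user, {1: [], 2: []})[phase].append(score)
    return {
        user: (user_polarity(scores[1]), user_polarity(scores[2]))
        for user, scores in by_user.items()
        if scores[1] and scores[2]
    }


# Hypothetical records about one candidate.
sample = [
    ("u1", date(2016, 9, 1), 0.5),
    ("u1", date(2016, 11, 1), -0.4),
    ("u2", date(2016, 9, 2), -0.2),
    ("u2", date(2016, 10, 20), -0.1),
    ("u3", date(2016, 9, 3), 0.3),  # only seen in phase 1, so not grouped
]

print(transition_groups(sample))
```

Users who tweet in only one phase cannot be assigned a transition and are left out of the four groups in this sketch.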

H6: Discussions on social media platforms demonstrate the occurrence of polarization among voter groups based on participation in political discussions like elections.
To investigate research question 3, the following methodology was adopted:
Step 1: The dataset of collected tweets was divided into two phases of 60 days each: phase 1 from August 13, 2016 to October 11, 2016, and phase 2 from October 12, 2016 to December 10, 2016.
Step 2: For both phases, the tweets were segregated on the basis of the presidential candidates Hillary Clinton and Donald Trump.
Step 4: Positive users and negative users from phase 1 and phase 2 were extracted for both Hillary Clinton and Donald Trump.
Step 5: For both Hillary Clinton and Donald Trump, the users were mapped to the four transition groups. Table 3 illustrates the count of users for whom a sentiment transition occurred during the election period, for Trump and Clinton respectively.
Table 3. Users by sentiment transition during the election period (table body not recoverable; surviving value: 11057)

H7: Communities are formed among the groups that are polarized during social media discussions of political events like elections.
Hypotheses 6 and 7 from research question 3 required the segregation of the user sample into the four groups. This exploration investigates how the top 15 hashtags of the sample collected from Twitter were used by these four groups. The literature indicates that network clustering has been done on the basis of hashtag usage (Bode et al., 2015). We investigated how the top 15 hashtags in fig. 8 were used by the four groups identified in Table 3 and whether these groups form communities with the help of the hashtags. For this, the users from Table 3 who had used the top 15 hashtags were extracted. The count of users in each group is given in Table 4. A network graph was plotted showing the usage of the top 15 hashtags, where each user and each hashtag is a node. A user is represented as a circle. The node colour demarcates the user on the basis of polarization. A green node indicates a user who underwent a transition from negative in the first phase to positive in the second phase. A red node indicates a user who underwent a transition from positive in the first phase to negative in the second phase. A yellow node indicates a user who did not undergo any transition. A hashtag is represented as a square node, and the size of the square indicates the frequency of the hashtag. If a user had used a hashtag, an edge was drawn connecting the user node and the hashtag node. The hashtag usage graph was drawn for each presidential candidate individually and is given in fig. 13.
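The user-hashtag graph described above can be represented with plain data structures before it is drawn. The sketch below follows the colour scheme stated in the text; the function names, the case-insensitive matching of hashtags, the use of raw usage counts as hashtag size and the sample records are assumptions made for illustration.

```python
from collections import Counter

# Colour scheme from the text: green = negative to positive,
# red = positive to negative, yellow = no transition.
COLOURS = {
    ("negative", "positive"): "green",
    ("positive", "negative"): "red",
}


def node_colour(transition):
    """Colour of a user node given (phase-1 label, phase-2 label)."""
    return COLOURS.get(transition, "yellow")


def build_hashtag_graph(usage, transitions):
    """Build a bipartite user-hashtag graph.

    usage: iterable of (user, hashtag) pairs, one per use.
    transitions: {user: (phase-1 label, phase-2 label)}.
    Returns (user_nodes, hashtag_sizes, edges).
    """
    kept = [(user, tag.lower()) for user, tag in usage if user in transitions]
    edges = sorted(set(kept))  # one edge per distinct user-hashtag pair
    user_nodes = {user: node_colour(transitions[user]) for user, _ in edges}
    hashtag_sizes = Counter(tag for _, tag in kept)  # size = number of uses
    return user_nodes, hashtag_sizes, edges


# Hypothetical usage records and transition labels.
usage = [
    ("u1", "#MAGA"), ("u1", "#maga"), ("u2", "#maga"),
    ("u3", "#imwithher"), ("u9", "#maga"),  # u9 has no transition label
]
transitions = {
    "u1": ("negative", "positive"),
    "u2": ("positive", "negative"),
    "u3": ("positive", "positive"),
}

users, sizes, edges = build_hashtag_graph(usage, transitions)
print(users, dict(sizes), edges)
```

The resulting edge list can be handed to a graph library for plotting and for modularity-based community detection.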

Fig. 17. Community detection based on greedy optimization of modularity of the above graph for Clinton (left) and Trump (right) respectively
From Fig. 17, it may be inferred that users formed their communities on Twitter through hashtags, with a high degree of overlap based on the discussions they took part in. For Clinton, the user groups were somewhat segregated and not very unified, as depicted in the visualisation of the network analysis. In comparison, the users who were discussing Trump had synergy among the topics discussed and took part in many of the issues and campaigns highlighted by Trump. The theory of homophily (Aral and Walker, 2012) is seen to be satisfied in the USA election discussions, which was also evident in the outcome of the election, where Trump won. The graph also indicates that integrative acculturation may have happened across the communities that supported Trump.

Discussion
Our study highlights that the discussions of policies and campaigning on Twitter may have affected the outcome of the USA elections. The study helps us understand the possible reasons for the polarization of voters among Twitter users during the USA election. It helps us identify the popular hashtags, @mentions and the various domains influencing voter behavior on Twitter (section 5.2). However, the analysis of tweets highlights that the election outcome may have been strongly polarized by the way the Twitter handles were used by the presidential candidates.
A high frequency of social media activity can lead to popularity, but in Clinton's case it led to negative popularity, with a lot of criticism. Engagement through Twitter leads to two-way dialogues between presidential candidates and voters (fig. 12) (Enli and Skogerbø, 2013). The topics of tweets matter a lot during the election period (fig. 7), and if the topics are discussed by the presidential candidates, they are liked and retweeted by voters, which spreads the message very fast (Zhang et al., 2016). Less variation in time between consecutive campaigns increases the campaigns' popularity among voters/Twitter users by strengthening the nexus through which engagement improves in virtual communities.
The study also helps us in examining the reactions of the users towards news evolving over the period of the elections. Despite Clinton having much more visibility in terms of interaction, the outcome of the election was affected by the nature of visibility and the resonance the voters had with her content.
Not much overlap was visible among the supporters of Clinton in terms of affinity towards campaigns. Twitter users were more inclined towards the policies discussed by Trump, as users liked and retweeted the policies introduced by Trump more than those introduced by Clinton, as shown in fig. 7.
It appears that Clinton's campaigns failed to gain popularity, whereas Trump's campaigns gathered significant support, in terms of their presence in the descriptive analytics of hashtags, @mentions and the word cloud built from the topics. Beyond the campaigns and their outcome, Clinton also appeared to have spoken more about her competitor, in strong contrast to Trump, who focused more on his policies and their outcome. From the network analysis of fig. 13 and fig. 14, it can be concluded that hashtags help users in forming communities. The community of Trump users (polarized or non-polarized) is more diverse in its use of hashtags than that of Clinton users. From this, it can be concluded that Trump users had more reach than Clinton users on Twitter. This study indicates that Twitter is an extension of off-line interactions between candidates and voters (Miller and Ko, 2015).
Newman and Sheth (1985) discussed seven domains that drive voter behavior. Through this study, we have highlighted that these domains are discussed on Twitter and may have played a significant role in the election outcome. The count of tweets containing the issues and policies raised by Clinton and Trump is shown in fig. 6. The social imagery of the presidential candidates is shown through the hashtags used by voters/Twitter users for communicating among themselves (fig. 8). The emotional feelings of voters/Twitter users were examined by applying a sentiment analysis algorithm to the social media buzz (Berger, 2011). The literature highlights that candidates are the main characters and capture most of the attention (Borondo et al., 2014; Gonzalez-Bailon et al., 2014); therefore, to map the candidate images among voters/Twitter users, the polarity of the social media buzz along with the @mentions was analyzed. The epistemic issues raised by the presidential candidates were mapped by analyzing their popular campaigns such as "maga", "Big League Truth", "Drain the Swamp" and "strongertogether".
The discussions happening on Twitter can polarize users towards a presidential candidate, as depicted in Table 3 and in the literature (Larsson and Moe, 2012; Bode et al., 2015; Kruikemeier et al., 2016). The number of polarized users for Clinton is larger than that for Trump. This may be because of the high frequency of tweets by Clinton, because of the high social media buzz (on Twitter) around Clinton, or because of both. This is an open research question that can be investigated in future studies. Polarized users form communities among themselves through hashtags.
Among the top 15 hashtags, negatively polarized users for Clinton mostly use the hashtags #podestaemails, #tcot and #pjnet; positively polarized users towards Clinton mostly use the hashtags #hillaryclinton and #imwithher; and non-polarized users towards Clinton mostly use the hashtags #neverhillary and #crookedhillary. For Trump, the polarized and non-polarized users are randomly distributed across the hashtags, and no clear pattern of hashtag usage can be identified from the polarized behavior of users. This may be because only a small group of such users was captured in the data extracted from Twitter.

Theoretical Contributions
Methodologically, the study presents a way in which user-generated data (tweets) can be collected from Twitter (Stieglitz and Dang-Xuan, 2012) and in which insights can be gathered by applying Twitter analytics (Bruns and Stieglitz, 2013; Chae, 2015) and data mining approaches like regression analysis and community detection. The research paper presents an extensive list of Twitter analytics methods (descriptive analytics, content analysis, network analysis and geospatial analysis) that can be used to derive insights from user-generated content (tweets). The adopted methods highlight how big data analytics approaches can be applied to social media data to provide innovative insights into complex problem domains by uncovering insights that are otherwise not evident.
In terms of the domain, the findings of our study contribute to the literature on how social ecosystems use social media for conversing on topics across geographically diverse states. Higher uniformity in the frequency of social media activity by a candidate leads to higher popularity and engagement among followers, along with criticism and harassment of the candidate. Consecutive campaigns on social media gain higher popularity and engagement among Twitter users. The study also depicts how strong emotional elements (like surprise) in a tweet can increase the social buzz on social media platforms. Further, greater coverage of factors like issues and policies, social imagery, emotional feelings, candidate image, current events, personal events and epistemic issues creates a greater connection with otherwise geographically segregated social communities. Trump had greater coverage of these factors of voter choice behavior than Clinton, which may have impacted the outcome of the election. The study reveals that popular campaigns during the USA election connect dispersed users on social media platforms and bring about acculturation of ideologies among users, which may be the reason why users become polarized through the discussions and form virtual communities on social media platforms.
The results can be used in future for election campaigning, for analysing the impact of digital communication on various social media platforms (i.e. between political actors and voters, and among voters in virtual communities), and for identifying influencers and communities in the digital world (Larsson and Moe, 2012; LaMarre and Suzuki-Lambrecht, 2013; Bode et al., 2015; Frame and Brachotte, 2015; Kruikemeier et al., 2016). The study proposes a method for visually representing the communities (Herdağdelen et al., 2013) and the information flow among communities (Park et al., 2015). The research paper illustrates how popular frameworks such as Newman and Sheth's Voter's Choice Behavior (Newman and Sheth, 1985) and the SPIN framework (Mills, 2012) can be adopted for promoting and analysing the communication undertaken in virtual communities like social media.

Implications to Practice and Policy
The implications of the study for practice and policy can be divided into three parts: (a) an overview of best practices for the candidate standing in the election; (b) the characteristics of good campaigns launched during the election period; and (c) strategies for polarizing voters' behavior on social media platforms such as Twitter.

Overview of the best practices for the candidate standing in the election (Individual level)
Literature indicates that political actors use Twitter to reach out to the public and the media (Shapiro and Hemphill, 2016; Waisbord and Amado, 2017), as it is multi-directional and offers interactive communication along with a message broadcast facility (Ross and Bürger, 2014; Kruikemeier et al., 2016; Hutchins, 2016; Theocharis et al., 2016). Therefore, some best practices for the candidate during a national election are listed below: (a) the Twitter handle should be used responsibly by the main political actor of the party. The political actor should not respond to every comment made by protestors in the public forum (Poell, 2014; Ernst et al., 2017); (b) the candidate should make sure that the wording used in tweets does not reveal negative emotions such as anger or disgust (Theocharis et al., 2016). The study indicates that different stakeholders in the voting process, such as protesters, supporters, officials, celebrities, corporates, media and social workers, use Twitter to engage with the presidential election. The candidate should strategically handle engagement on Twitter in order to act as an influencer on social media platforms (Conway et al., 2015; Karlsen and Enjolras, 2016); (c) the candidate should use information about their personal and professional background wisely during the election and should take precautions to protect secrets from their past. The study illustrates the impact of the release of past internal government information on Twitter in Clinton's case; (d) the candidate should balance the use of social media platforms and traditional media, because the literature as well as this study indicates that the more active the candidate is on social media, the more journalists and criticism could follow the candidate (Karlsen and Enjolras, 2016; Rauchfleisch and Metag, 2016; Theocharis et al., 2016).

Characteristics of good campaigns or hashtags launched during the election period (Organisational level)
The campaigns launched during an election reveal where the candidate's real attention lies, and when citizens participate in election campaigns their political knowledge increases (Dimitrova et al., 2014; Ogola, 2015). Campaigns on social media platforms are launched through hashtags (Abascal-Mena et al., 2015). The study reveals that the use of actionable, agenda-focused campaign hashtags (#maga, #draintheswamp) in Trump's tweets led to higher campaign polarity among users, which further helped in propagating the core message of the campaigns. The key characteristics that digital campaigns or hashtags introduced during an election should have are listed below: (a) they should have conviction value; (b) they should be true; (c) they should connect with a large population emotionally, professionally, etc. The study as well as the literature highlights that people are more influenced by emotional content (Song et al., 2016); (d) they should be capable of holding the voter's attention; and (e) they should demonstrate the benefit or value to voters in the long run.

Strategies for polarizing the voter's behavior on social media platforms
Twitter has been used by political actors for engaging voters (Graham et al., 2013; Purohit et al., 2013; Raynauld and Greenberg, 2014), and connections among users on Twitter can be visually depicted using networks (Herdağdelen et al., 2013). When political parties design their agendas for elections, the key points they should consider are listed below: (a) before making strategies, the organization should investigate which issues and policies (economic policy, foreign policy and social policy) voters are concerned about. As the study reveals, USA voters were concerned about security issues, and Trump posted more on foreign policy regarding security issues, which increased the engagement between him and the voters; (b) any campaign launched during the election period should improve the social image of the candidate and the organization among voters.

Conclusion
The literature as well as this study indicates that social media discussion has affected national elections (Bruns and Stieglitz, 2013; Heo et al., 2016) and that politicians have been using social media platforms such as Twitter and Facebook for campaigning (Graham et al., 2013; Jungherr, 2014; Kelm et al., 2017) and for disseminating information (Klinger, 2013; Ross and Bürger, 2014).
The study throws light on (a) how Twitter was used by the presidential candidates during the 2016 USA election, in line with the SPIN framework (Mills, 2012); (b) how voters were influenced by online campaigns, using the Newman and Sheth model of voter behavior (Newman and Sheth, 1985); and (c) the polarization of users towards the presidential candidates on Twitter, which enabled users to form online communities using hashtags and can be explained using the theory of homophily (Himelboim et al., 2016).
This study contributes to the domains of media studies and political engagement by shedding light on the campaigns running on Twitter during the election period. The study helps us understand the dynamics of polarization of the preferences of Twitter users towards Clinton and Trump in an online environment. The study has tried to analyze a two-fold impact: how the presidential candidates used their Twitter accounts during the election period, and the impact of their activities on other Twitter users. This study has thus attempted to quantify the impact of the USA election on the presidential candidates and voters / Twitter users by converting qualitative tweets into quantified numbers using machine learning algorithms, content analysis, and network analysis.
The various factors influencing voters' behavior on Twitter have been highlighted in the study. The study also highlights how social media now plays a great role in the success of elections, as it can facilitate voter engagement, public scrutiny and public harassment, and finally polarise the voting outcome (Theocharis et al., 2016). Table 5 briefly presents an overview of the findings of the study.

Table 5. Overview of the findings of the study

| No. | Proposition | Finding |
|---|---|---|
| 1 | Reinvestigating whether higher frequency of social media activity always leads to higher popularity and engagement among followers. | Yes, but in a negative sense. Negative feedback may also become higher with higher engagement (Clinton's Twitter activity). |
| 2 | Lesser variation of time (greater nexus) between consecutive campaigns increases popularity and engagement. | Yes, in a positive sense (Trump's usage of campaigns). |
| 3 | Higher thresholds of sentiment (polarity) within tweets create greater popularity and engagement among followers. | Partially (there was very little difference between the emotion percentages in the tweets posted by Trump and Clinton, except for the surprise emotion). |
| 4 | Greater coverage in social discussions of the different factors of Newman and Sheth's Voter's Choice Behavior increases engagement with voters, actively or passively. | Yes. Greater coverage of all the factors in campaigns indicates a positive outcome with higher engagement. |
| 5 | Popular hashtags or campaigns can initiate the acculturation process of ideologies among Twitter users located in different geographical locations. | Yes. The #maga campaign built support communities from citizens across the USA. |
| 6 | Discussions on social media platforms demonstrate the occurrence of polarization among voter groups based on participation in political discussions like elections. | Yes. The analysis highlights that the count of voters moving from negative to positive support over a period of time is higher than the count moving from positive to negative support. |
| 7 | Communities are formed among the groups which are polarized during social media discussions during political events like elections. | Yes. Using hashtag analysis, it is evident that communities are formed around campaigns, and these communities often overlap. |
The current study extends the existing literature on social media by presenting how community formation and polarization of the voting outcome are feasible based on campaigns in social media. This study contributes to various research streams such as the role of influencers in cascading information over networks, the social psychology of online users, best practices in computer-mediated communication and social media usage.
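
The polarization finding in Table 5 compares the number of users who move from negative to positive support with the number who move the other way. A minimal sketch of such a count is given below. It assumes a hypothetical input format (a chronologically ordered list of "positive" or "negative" labels per user) and is an illustration only, not the procedure used in this study.

```python
from collections import Counter


def count_transitions(user_sentiments):
    """Count first-to-last sentiment transitions across users.

    user_sentiments maps a user id to a chronologically ordered list of
    labels, each "positive" or "negative" (hypothetical input format).
    """
    transitions = Counter()
    for labels in user_sentiments.values():
        if len(labels) < 2:
            continue  # a single observation cannot show a transition
        # Compare the user's earliest label with their latest label.
        transitions[(labels[0], labels[-1])] += 1
    return transitions


# Made-up sample data for illustration only.
sample = {
    "user_a": ["negative", "negative", "positive"],
    "user_b": ["positive", "negative"],
    "user_c": ["negative", "positive"],
    "user_d": ["positive"],
}
result = count_transitions(sample)
print(result[("negative", "positive")], result[("positive", "negative")])
```

Comparing only the first and last label is a deliberate simplification; a finer analysis could count every week-to-week change instead.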

Limitations and Future Work
Like all other studies, this study also faced limitations in building theory from multiple perspectives. First, the data set was extracted from Twitter, and Twitter allowed the extraction of only around 4,000 to 10,000 records per day. This restriction on the extraction of tweets poses a limitation for the study: we may not have been able to track some crucial events if they were not dominant in Twitter discussions. The second limitation is that if Twitter users were influenced by external events rather than by Twitter discussions, that influence cannot be mapped or taken into account in explaining the polarization of their preferences. Similarly, other popular social media platforms such as Facebook and LinkedIn have not been considered in this study, owing to challenges in accessing such data and in integrating the diverse data sets. The third limitation is that, for hashtag clustering of users, we limited ourselves to the top 15 hashtags. If a Twitter user is not aware of a hashtag that is being used popularly, he or she may not be able to contribute to the discussions on that theme. Fourthly, most of the analysis using social media analytics is still based on visualization to draw inferences. Statistical validation of the propositions could not be attempted due to the limitations of methodological advances. However, future research could try to address these limitations and take the existing study forward by analysing datasets of user-generated content from different platforms.
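
To make the hashtag restriction concrete, the sketch below shows one simple way of keeping only the k most frequent hashtags and grouping users by the hashtags they used. The function name, the (user, text) input format and the sample tweets are assumptions made for illustration (#exampletag is a made-up hashtag); the sketch is not the clustering procedure used in this study.

```python
import re
from collections import Counter, defaultdict

HASHTAG = re.compile(r"#(\w+)")


def top_hashtag_communities(tweets, k=15):
    """Group users by the k most frequent hashtags.

    tweets is an iterable of (user, text) pairs (hypothetical format).
    Returns the list of top hashtags and a dict mapping each of those
    hashtags to the set of users who used it.
    """
    tweets = list(tweets)
    counts = Counter()
    for _, text in tweets:
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    top = [tag for tag, _ in counts.most_common(k)]
    top_set = set(top)

    communities = defaultdict(set)
    for user, text in tweets:
        for tag in HASHTAG.findall(text):
            tag = tag.lower()
            if tag in top_set:  # hashtags outside the top k are ignored
                communities[tag].add(user)
    return top, dict(communities)


# Made-up sample tweets for illustration only.
sample_tweets = [
    ("u1", "#MAGA rally tonight"),
    ("u2", "#maga and #draintheswamp"),
    ("u3", "#draintheswamp now"),
    ("u2", "#exampletag"),
]
top, communities = top_hashtag_communities(sample_tweets, k=2)
overlap = communities["maga"] & communities["draintheswamp"]
print(top, overlap)
```

In this toy example the user u2 appears in both groups, which mirrors the observation that hashtag communities often overlap, while users of hashtags outside the top k are not represented at all.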