Visualization of the UK Stock Market Based on Complex Networks for Company’s Revenue Forecast

As an emerging research field, complex network theory can depict the topologies of most everyday complex systems, but it still needs more attention in financial market analysis. We can apply this theory to construct financial networks and examine them at both the macro and micro levels to support a company in forecasting its revenue. This paper aims to explore the macro-characteristics of the UK stock market. We examine the properties of the return ratio series of selected components of the FTSE100 index, adopt Kendall's τ rank correlation coefficient between series to write adjacency matrices, and transform these matrices into complex networks. We then visualize the networks, analyze their features at different thresholds, and find evidence of the WS small-world property in the UK stock networks. All this work follows the research framework proposed at the beginning of this paper. According to the framework, more work remains to be done to achieve the goal of supporting decision making in a company.


Introduction
The function of the stock market is to provide a public platform for trading businesses and raise capital for companies. Various information technologies and statistical methods have been engaged in analyzing its dynamics, such as the Internet, portable computers, regression analysis and so on. However, the interactions between companies are rarely taken into account, although they undoubtedly play an important role in the markets. From the inside perspective of a company, forecasting its revenue through the stock market's behavior could be one way to support decision making. To achieve this research aim, we propose in this paper to use complex network theory to visualize and analyze the complexity of these interactions.
Complex network theory is able to abstract and depict the topologies of most everyday complex systems. As an emerging hotspot, this theory has increasingly attracted the attention of scholars from various research fields and has become one of the most significant tools in social network analysis. Complex network theory was introduced decades ago. To date, it has been widely used in the study of social relationships, transport networks, biology, ecology, emergency management, power grids, etc., and as a credible and powerful tool to characterize markets.
To date, scholars have carried out several studies of the major stock markets around the world. In 1999, Mantegna [1] proposed a hierarchical tree to investigate the common economic factors affecting stock prices. In 2005, Boginski et al. [2,3] constructed a market graph that followed a power-law model and calculated cross-correlations to reflect market behavior, using data from US stock markets. In 2009, Haldane [4], the Chief Economist and Executive Director of Monetary Analysis and Statistics at the Bank of England, first convincingly explained the current financial markets with complex networks in an official speech. Caraianie (2012) [5] analyzed the properties of the returns of European stock markets with complex networks and found that the networks are scale-free and self-similar. Zhuang et al. [6] constructed a network of stock price fluctuations in the Shanghai stock market and analyzed its topology. Further, Zhang et al. [7] compared the networks extracted from the original series and the return series, which showed different characteristics because of the different data. The former fitted a power-law distribution and was a small-world, scale-free network, while the latter was governed by an exponential degree distribution. For the Hang Seng index of the Hong Kong stock market, Li and Wang [8] found fluctuation patterns based on network topological statistics. In these studies, networks from various databases and statistical results are visualized to make the texts more visually attractive and help the audience understand them more effectively.
The aim of this research is to apply complex network theory to the UK stock market to forecast companies' revenue. The research framework is shown in Fig. 1. The first two steps are covered in this paper and the others are future work. In this paper, we investigate the relationships between the return series of FTSE100 components with observations from 2004 to 2014. We find that the return series show significant non-linear correlations that do not meet the strict assumptions of Pearson's correlation; therefore we adopt Kendall's τ correlation as the edge weight to construct and analyze the networks more accurately.

Methodology
The FTSE100 index, one of the most professional stock market indices in the world, has 101 components from various industries. However, not every component contributed to the market throughout the period we chose. Hence, an analysis of the long-running companies gives us a big picture of the network structure of the British stock market.

Data collection and pre-processing
A weighted network describes the nodes and the relations between them more realistically and clearly than an unweighted network; therefore, a fully connected, undirected, weighted network is constructed in this paper. At different thresholds, we then examine its macro-characteristics.
The data are extracted from "Yahoo Finance", and the timespan considered in this paper is from 1st January 2004 to 10th October 2014. The dates of UK public holidays are deleted because the stock market is closed, including New Year's Day, Good Friday, Easter Monday, Early Bank Holiday, Spring Bank Holiday, Summer Bank Holiday, Christmas Day and Boxing Day in each year. To reflect the facts as closely as we can, companies that do not have data for the full period are deleted as well. Consequently, 86 components of the FTSE100 serve as the vertices of the network, such as BP plc (British Petroleum), LSE (London Stock Exchange Group PLC), LLOY (Lloyds Banking Group PLC) and so on. The edges are the relationships among the 86 stocks and the correlation coefficient is the weight of each edge.
Each company's closing price series needs pre-processing. Denote the original daily closing price of stock $i$ at time $t$ by $P_i(t)$. From these prices we calculate the return ratio series $r_i(t)$ of each stock. The more data we have, the better the network built from our results reflects the real UK stock market. Therefore, the components we choose are those with the most data during the period. Following this principle, there are 234,178 valid data points (2,723 per company).
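A minimal sketch of the pre-processing step, assuming the widely used logarithmic return in percent, $r_i(t) = 100\,(\ln P_i(t) - \ln P_i(t-1))$; the exact definition behind the paper's series may differ. The function name `log_returns`, the `scale` parameter and the sample prices are hypothetical.

```python
import math

def log_returns(prices, scale=100.0):
    """Scaled log returns r(t) = scale * (ln P(t) - ln P(t-1)).

    Assumption: the percent log return; the source does not show its formula.
    """
    return [scale * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices, prices[1:])]

# Hypothetical closing prices for one stock (not taken from the paper's data).
closing = [100.0, 101.0, 99.5, 99.5]
returns = log_returns(closing)
```

A price series of length n yields n - 1 returns, so every stock must cover the same trading days for the series to be comparable.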

Edges and their weights
The first step is to analyse the basic statistical characteristics of the return series. Six randomly chosen return ratio series are shown in Table 1. The table shows that the means are around 0 and the standard deviations are all over 1; that is, the return series fluctuate strongly. These fluctuations are driven by the uncertainty of the stock market. Most of the skewness values in the table are smaller than 0, and the kurtosis values are greater than 6 (twice the normal value of 3), which means that the return series are left-skewed, fat-tailed and high-peaked. Meanwhile, the Jarque-Bera probabilities are all 0, so the hypothesis that the return series are normally distributed is rejected. Because of these characteristics, these six return series can be treated as following a Student's t distribution, as can the remaining 80 series.
The Q-Q graphs (Fig. 2) support the judgment made above. As the graphs show, the return series have the features mentioned above and do not support the hypothesis of a normal distribution.
To examine the correlations within the series, we calculate the autocorrelation and partial autocorrelation of each series up to a lag of 21 periods. The Ljung-Box Q value is significant and the results show that there are no significant relations. Therefore, all the return series we chose are independent in the period.
According to the test results, in this paper we employ Kendall's τ correlation as the edge weight. A correlation coefficient is a measure of the consistency of changes between variables. The Pearson Product Moment Correlation or PPMC (Pearson's correlation, for short), which measures the linear correlation between two variables, is common in correlation research on stock markets because it is easier to understand and calculate than other measures. However, Pearson's correlation has some limitations. For example, the variables should be normally distributed, and it captures only the linear relationship between the two variables. Kendall's τ correlation measures the degree of concordance of variables based on the ranks of the random variables. Therefore, Kendall's τ correlation coefficient reflects nonlinear dependence better, without the restrictions mentioned above.
With these calculations, we can write adjacency matrices. Through the adjacency matrices, we construct complex networks and analyse them with respect to degree distribution, average shortest path length and clustering coefficient.
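The statistics discussed here can be sketched as follows, assuming population (biased) moments and the standard Jarque-Bera statistic $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$, whose asymptotic distribution is chi-square with two degrees of freedom. The function name `describe` and the sample values are hypothetical, and the software used for the paper may apply slightly different estimators.

```python
import math

def describe(series):
    """Mean, standard deviation, skewness, kurtosis and Jarque-Bera test."""
    n = len(series)
    mean = sum(series) / n
    m2 = sum((x - mean) ** 2 for x in series) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in series) / n   # third central moment
    m4 = sum((x - mean) ** 4 for x in series) / n   # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2                             # equals 3 for a normal law
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Survival function of a chi-square variable with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return {"mean": mean, "std": math.sqrt(m2), "skewness": skew,
            "kurtosis": kurt, "jarque_bera": jb, "jb_p_value": p_value}

# Hypothetical returns with one large loss, giving a long left tail.
sample = [-3.0, -0.5, 0.0, 0.2, 0.3, 0.5, 0.1, -0.1, 0.4, 0.1]
stats = describe(sample)
```

A negative skewness together with a kurtosis well above 3 and a small Jarque-Bera p-value is the pattern the text describes for the return series.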
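For illustration, a rank correlation of this kind can be computed by counting concordant and discordant pairs. The sketch below assumes the tau-b variant, which corrects for ties; the paper does not state which variant it uses, and the function name `kendall_tau_b` is hypothetical.

```python
import math

def kendall_tau_b(x, y):
    """Kendall's tau-b between two equally long sequences (O(n^2) pair scan)."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    concordant = discordant = only_x_tied = only_y_tied = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue              # tied in both series: ignored by tau-b
            if dx == 0:
                only_x_tied += 1
            elif dy == 0:
                only_y_tied += 1
            elif dx * dy > 0:
                concordant += 1       # the pair moves in the same direction
            else:
                discordant += 1       # the pair moves in opposite directions
    denom = math.sqrt((concordant + discordant + only_x_tied)
                      * (concordant + discordant + only_y_tied))
    return (concordant - discordant) / denom if denom else 0.0

tau = kendall_tau_b([1, 2, 3, 4], [1, 3, 2, 4])   # 5 concordant, 1 discordant
```

Because only the ordering of the observations matters, any monotone transformation of either series leaves the coefficient unchanged. The pair scan is quadratic in the series length, so for series of a few thousand observations across all stock pairs a library routine such as `scipy.stats.kendalltau` would be the practical choice.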
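The step from a correlation matrix to an adjacency matrix can be sketched as below. The rule "keep an edge when the weight is at least the threshold" is an assumption, since the paper does not spell out the comparison, and the 4-by-4 matrix `tau_matrix` is made up for illustration rather than taken from the FTSE100 data.

```python
def threshold_network(weights, threshold):
    """Binary adjacency matrix keeping edges whose weight is >= threshold."""
    n = len(weights)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):          # undirected: fill both halves
            if weights[i][j] >= threshold:
                adjacency[i][j] = adjacency[j][i] = 1
    return adjacency

# Hypothetical symmetric matrix of pairwise correlation weights.
tau_matrix = [
    [1.00, 0.62, 0.40, 0.10],
    [0.62, 1.00, 0.37, 0.05],
    [0.40, 0.37, 1.00, 0.20],
    [0.10, 0.05, 0.20, 1.00],
]
adjacency = threshold_network(tau_matrix, 0.36)
n_edges = sum(map(sum, adjacency)) // 2    # each edge is stored twice
```

With this toy matrix, a threshold of 0.36 keeps three edges among the first three nodes and leaves the fourth node isolated.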

Results and Discussion
With Pajek, a social network analysis tool, we transform the adjacency matrices into complex networks.

Network structure analysis from macro perspective
Fig. 3 visualises the network at a threshold of 0.36: the nodes are the 86 companies from the FTSE100 index and the edges are determined by the Kendall's τ correlations between the series. The network is a fully connected weighted network. In Fig. 3, darker edges represent larger correlation coefficients between companies. Five companies have the strongest relationships, and they form two clusters: AAL01 (Anglo American Plc.), BLT17 (BHP Billiton Plc.) and RIO73 (Rio Tinto Plc.); and BLND16 (British Land Co Plc.) and LAND53 (Land Securities Group Plc.). The first three companies, highlighted in green, are all multinational mining companies. The last two, highlighted in blue, are both property companies, and both converted into real estate investment trusts (REITs) in January 2007. British Land is the second largest REIT in the UK and Land Securities is the largest. Undoubtedly, companies in the same industry tend to have strong relations.
To describe a network from the macro perspective, it is usual to characterise its density and scale. The clustering coefficient and the average path length are the indicators that measure these two features. For a vertex $i$ with $k_i$ neighbours, among which $E_i$ edges exist, the local clustering coefficient is
$$C_i = \frac{2E_i}{k_i(k_i-1)}.$$
The overall level of clustering $C$ in a network is defined by Watts and Strogatz [9] as the average of all the $C_i$ over the $N$ vertices:
$$C = \frac{1}{N}\sum_{i=1}^{N} C_i.$$
Obviously, $0 \le C \le 1$. If $C = 0$, all the vertices in the network are independent. If $C = 1$, any two nodes in the network are directly connected, i.e. the network is a globally coupled network.
Average path length is defined as the average number of steps over all possible pairs of network nodes. It is a measure of the efficiency of information transport on a network, i.e. the scale of a network. Suppose that in a network $G$ the distance between two nodes $i$ and $j$ is $l_{ij}$, the smallest number of steps from node $i$ to node $j$. The average path length of the whole network is
$$L = \frac{2}{N(N-1)}\sum_{i<j} l_{ij},$$
where $N$ is the number of vertices in the network. We do not consider a vertex's distance to itself in this paper.
Clustering coefficients and average path lengths are calculated at different thresholds with the data described above and are shown in Table 2. At the various thresholds, the FTSE100 network has a small average shortest path length and a large clustering coefficient. With these two characteristics, and following the research by Watts and Strogatz [9,10] in 1998, it is a Watts-Strogatz small-world network (WS small-world network for short). With these features, fluctuations in this stock market would spread rapidly. Specifically, sharp changes in the stocks of some influential companies will spread at a faster pace and over a wider range. These two characteristics, small average path length and large clustering coefficient, are the two indicators used to measure the small-world property. A WS small-world network is a transition between a completely regular network and a completely random graph.
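A minimal sketch of the Watts-Strogatz clustering coefficient on a binary adjacency matrix: for each vertex, the fraction of pairs of its neighbours that are themselves connected, averaged over all vertices. Treating vertices with fewer than two neighbours as having a local coefficient of 0 is an assumed convention, and the function name and toy graph are hypothetical.

```python
def clustering_coefficient(adjacency):
    """Return (average clustering C, list of local coefficients C_i)."""
    n = len(adjacency)
    local = []
    for i in range(n):
        neighbours = [j for j in range(n) if j != i and adjacency[i][j]]
        k = len(neighbours)
        if k < 2:
            local.append(0.0)      # assumed convention for degree 0 or 1
            continue
        # Count the edges that exist among the neighbours of vertex i.
        links = sum(1
                    for a in range(k)
                    for b in range(a + 1, k)
                    if adjacency[neighbours[a]][neighbours[b]])
        local.append(2.0 * links / (k * (k - 1)))
    return sum(local) / n, local

# Toy graph: a triangle (0, 1, 2) with a pendant vertex 3 attached to 2.
toy = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
C, local_C = clustering_coefficient(toy)
```

In the toy graph, vertices 0 and 1 have a local coefficient of 1, vertex 2 has 1/3 and vertex 3 has 0, so the average is 7/12.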
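The average path length of an unweighted network can be sketched with a breadth-first search from every vertex. Averaging only over pairs that are actually connected is an assumption made here, because a thresholded network may contain isolated vertices; the function name and toy graph are hypothetical.

```python
from collections import deque

def average_path_length(adjacency):
    """Mean shortest-path length over all connected pairs i < j."""
    n = len(adjacency)
    total = 0
    pairs = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:                           # breadth-first search
            u = queue.popleft()
            for v in range(n):
                if adjacency[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for target, d in dist.items():
            if target > source:                # count each pair once
                total += d
                pairs += 1
    return total / pairs if pairs else float("inf")

# Toy path graph 0 - 1 - 2: the distances are 1, 2 and 1.
path_graph = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
L = average_path_length(path_graph)
```

Self-distances are excluded because only pairs with `target > source` are counted.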
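One common way to make "small" and "large" concrete, following Watts and Strogatz, is to compare the measured values with those expected for a random graph with the same number of vertices and edges, using the approximations $C_{rand} \approx \langle k \rangle / N$ and $L_{rand} \approx \ln N / \ln \langle k \rangle$. The sketch below is illustrative only: the paper does not report such a comparison, and the edge count used in the example is hypothetical.

```python
import math

def random_graph_baseline(n_nodes, n_edges):
    """Approximate clustering and path length of a comparable random graph."""
    mean_degree = 2.0 * n_edges / n_nodes
    if mean_degree <= 1.0:
        raise ValueError("approximation needs a mean degree above 1")
    c_rand = mean_degree / n_nodes
    l_rand = math.log(n_nodes) / math.log(mean_degree)
    return c_rand, l_rand

# 86 vertices as in the paper; the edge count of 430 is a made-up example.
c_rand, l_rand = random_graph_baseline(86, 430)
```

A network is usually called small-world when its clustering coefficient is much larger than `c_rand` while its average path length stays close to `l_rand`.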

Thresholds in networks
Reasonable threshold selection is a crucial step in constructing networks. If the selected threshold is too small, false associations due to random noise appear, resulting in a completely connected network; if the selected threshold is too large, more isolated points appear, the network contains fewer vertices, and much important related information is lost. In these situations, there is no need to analyse the degree distribution and clustering coefficient of the network any more. In addition, when the thresholds increase at equal intervals, if the number of relations among stocks decreases rapidly, the remaining companies contribute more significantly than the others to the national or local economy. Otherwise, the companies do not differ greatly in their contribution to a nation or a region. Research with appropriate thresholds can therefore explore the real relationships between stocks and the whole system.
In this paper, we draw 26 complex networks at 26 thresholds; the number of edges is shown in Fig. 4, which reveals a descending trend. Even with the same vertices, different numbers of edges produce different networks. Based on the specific networks, parameters can be studied, such as the adjacency matrix, the cumulative degree distribution, clustering coefficient, average nearest-neighbour degree, k-core, partitions and so on. For the financial network we construct, the number of edges is steadier at thresholds between 0.3 and 0.4.
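The edge count as a function of the threshold can be sketched as a simple sweep. The thresholds and the small weight matrix below are hypothetical (the 26 thresholds used in the paper are not listed), and an edge is assumed to be kept when its weight is at least the threshold.

```python
def edge_counts(weights, thresholds):
    """Number of edges with weight >= t for each threshold t."""
    n = len(weights)
    pair_weights = [weights[i][j]
                    for i in range(n)
                    for j in range(i + 1, n)]   # upper triangle only
    return {t: sum(1 for w in pair_weights if w >= t) for t in thresholds}

# Hypothetical symmetric matrix of pairwise correlation weights.
tau_matrix = [
    [1.00, 0.62, 0.40, 0.10],
    [0.62, 1.00, 0.37, 0.05],
    [0.40, 0.37, 1.00, 0.20],
    [0.10, 0.05, 0.20, 1.00],
]
counts = edge_counts(tau_matrix, [0.0, 0.2, 0.4, 0.6])
```

The counts can only stay equal or fall as the threshold rises, which is the descending trend described for Fig. 4; a range of thresholds over which the count changes slowly marks a more stable network structure.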

Conclusion and future work
As an emerging method in economic disciplines, complex networks still need more attention. In this study, we proposed a basic framework for forecasting a company's revenue to support decision making, based on complex network theory. We examined data from the FTSE 100 index and wrote adjacency matrices to construct and visualize networks. Based on basic statistics of the data, we explored features of the networks and found that the FTSE100 index network is a WS small-world network and that thresholds play a significant part in constructing networks. We investigated the UK stock market data independently of other research data. Similarly, WS small-world properties appear in the US stock market, the Shanghai stock market [7], international stock markets [11] and so on. Combining data analysis with visualised graphs, decision makers could make predictions much faster and more reliably to support company strategies. Other macro features and in-depth micro research will be conducted in the future, for example clustering communities and detecting topology changes in networks caused by internal and external events in companies. Many networks from stock markets have been studied, but some important questions still need to be answered: for example, what features international stock markets have in common, how these markets differ between countries, and how we can visualise them clearly to support decision making.
Acknowledgments. The authors would like to thank Prof. Kecheng Liu for engaging discussion. We also thank the editor and reviewers for their insightful comments and helpful suggestions. This research is supported by (1) the Research on Control and Efficiency of Emergency Support Networks (No. 71271126), financed from National Natural Science Foundation of China; (2) Study on Control and Efficiency of Emergency Transportation Support Networks (No. 20120078110002), financed from Research Fund for the Doctoral Program of Higher Education of China; (3) the Research

Fig. 1. A basic framework to forecast company's revenue using complex networks

Fig. 2. Q-Q graph of return series against normal distribution

Fig. 3. Complex Network of FTSE100 index at a threshold of 0.36

Fig. 4. The number of edges at different thresholds of financial network

Table 1. Basic statistics of six return ratio series

Table 2. Clustering coefficient and average path length at different thresholds