Overlapping Community Detection Combining Topological Potential and Trust Value of Nodes

To address the problems of existing algorithms, such as instability and the neglect of both the interaction between nodes and the attributes of nodes, an overlapping community detection algorithm combining the topological potential and the trust value of nodes (OCDTT) is proposed. First, the importance of each node is calculated from its topological potential and its trust value, and K core nodes are selected. Finally, the communities are formed by expanding from the core nodes using the extended modularity. Experimental results on LFR network datasets and three real network datasets verify the effectiveness of the proposed OCDTT algorithm.


Introduction
A complex network is an abstraction of a complex system. Most real-world networks, such as transportation, social and gene-regulatory networks, are complex networks. A "community" is a set of nodes with the same or similar characteristics in a complex network [1]. Communities may be overlapping or non-overlapping, reflecting the different ways in which nodes aggregate. Many algorithms have been proposed for overlapping community detection. Li [2] detects overlapping communities in unweighted and weighted networks with considerable accuracy. Gregory [3] proposes an overlapping community detection algorithm based on node splitting, guided by betweenness and edge betweenness. Inspired by label propagation and modularity optimization, Zhang [4] introduces a community detection algorithm based on fuzzy membership propagation: in each iteration, candidate seeds of potential communities are selected using topological features, and the membership of the selected seeds is then propagated to non-seed vertices, so that multiple communities can be obtained. Ahn [5] proposes the LINK algorithm, which performs hierarchical clustering on links, based on the idea of transforming overlapping communities of nodes into non-overlapping communities in the link network. Although researchers have made great progress in community detection [6,7], finding communities remains a challenging and promising research field. How to identify stable overlapping communities with efficient algorithms is still an open problem of wide interest.
To address the problems of existing algorithms, we propose an overlapping community detection algorithm combining the topological potential and the trust value of nodes (abbreviated as OCDTT). First, the importance I(vi) of each node in the network is computed by combining its topological potential and its trust value, and K core nodes are then selected according to I(vi). Starting from the core nodes, communities are expanded using the extended modularity.

Preliminary Study
We review existing concepts and define the basic concepts and the problem of community detection. Table 1 lists the symbols used in this paper.

Table 1. Symbols used in this paper
Symbol       Description
Γ(vi)        The set of p-order neighbor nodes of vi
s(vi, vj)    The similarity between nodes vi and vj
c            The size of the set Γ(vi)
α            The balance factor
φ(vi)        The topological potential of node vi
t(vi)        The trust value of node vi
I(vi)        The importance of node vi
λ            Regulation parameter (0 < λ < 1)

Trust Value of Nodes
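Definition 1. If pathmin(vi, vj) = p, where pathmin(vi, vj) is the length of the shortest path between vi and vj, the node vj is called a p-order neighbor node of node vi. For example, when pathmin(vi, vj) = 1, vj is a 1-order neighbor node of vi. The trust value of node vi is defined as the sum of the similarities s(vi, vj) between vi and all of its p-order neighbors [8]:

$$t(v_i) = \sum_{v_j \in \Gamma(v_i)} s(v_i, v_j) \qquad (1)$$

In formula (1), the similarity is computed from the attribute vectors of the nodes, where Ai denotes the attribute vector of node vi.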
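Definition 1 and the trust value can be illustrated with a minimal Python sketch. It assumes that the network is stored as an adjacency dictionary and that s(vi, vj) is the cosine similarity of the attribute vectors; the similarity measure and the names `p_order_neighbors`, `cosine_similarity` and `trust_value` are illustrative assumptions, not taken from the original method.

```python
from collections import deque
from math import sqrt


def p_order_neighbors(adj, source, p):
    """Return the nodes whose shortest-path distance from `source` is exactly p."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == p:
            continue  # nodes beyond distance p are not needed
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return {v for v, d in dist.items() if d == p}


def cosine_similarity(a, b):
    """Assumed similarity measure between two attribute vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def trust_value(adj, attrs, node, p=1):
    """Sum of similarities between `node` and all of its p-order neighbors."""
    return sum(
        cosine_similarity(attrs[node], attrs[v])
        for v in p_order_neighbors(adj, node, p)
    )
```

Breadth-first search is used because, in an unweighted network, it visits nodes in order of shortest-path distance, so the p-order neighbors are exactly the nodes first reached at depth p.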

Topological Potential of Nodes
In physics, potential refers to the work done when a particle moves from one point to another in a field; this work depends only on the positions of the particle and not on the path along which it moves. Each node in a complex network can be regarded as a particle in a field, and each edge between nodes as a link through which the particles interact to generate work [9]. Applying the concept of a potential field to complex networks in this way gives the connections between nodes well-defined physical characteristics and stability. Because the work decreases as the shortest path length between nodes increases, the topological potential of node vi in the complex network G is refined as formula (2). The impact factor δ ∈ (0, +∞) controls the scope of influence of each node.
Here, count(k) is the number of k-order neighbors of node vi. For a fixed k, φ(vi) grows as count(k) grows, that is, as the network around vi becomes denser. Therefore, the topological potential in equation (2) reflects the density of the network.
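The dependence of φ(vi) on count(k) and on the impact factor can be sketched in Python. The sketch assumes a Gaussian decay, so that each k-order neighbor contributes exp(-(k/δ)^2), a form commonly used for topological potential; this decay function, the default `delta` and the names `neighbor_counts` and `topological_potential` are assumptions for illustration and may differ from the exact formula of the paper.

```python
from collections import deque
from math import exp


def neighbor_counts(adj, source, max_order):
    """Return {k: count(k)}: the number of nodes at shortest-path distance k."""
    dist = {source: 0}
    counts = {}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_order:
            continue  # do not explore beyond the requested order
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                counts[dist[w]] = counts.get(dist[w], 0) + 1
                queue.append(w)
    return counts


def topological_potential(adj, node, delta=1.034, max_order=None):
    """Assumed form: sum over k of count(k) * exp(-(k / delta) ** 2)."""
    if max_order is None:
        max_order = len(adj)  # large enough to reach every node
    counts = neighbor_counts(adj, node, max_order)
    return sum(c * exp(-((k / delta) ** 2)) for k, c in counts.items())
```

Under this assumed decay, the contribution of distant neighbors shrinks quickly, so nodes with many close neighbors obtain the largest potential.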

Implementation on Our Framework
In this section, we describe how OCDTT solves the community detection problem under this framework. The overall procedure of the algorithm is shown in Fig. 1.

[Figure: overall procedure of the algorithm. Inputs: nodes set, edges set.]

Select Core Nodes
Because nodes occupy different positions in a complex network and interact with each other differently, each node has a different importance and contributes differently to the network [10]. We therefore estimate the importance I(vi) of a node from both the structure of the network and the attributes of the node:

$$I(v_i) = \alpha\,\varphi(v_i) + (1 - \alpha)\,t(v_i)$$

where α is the balance factor. Next, the top K nodes ranked by I(vi) are selected as core nodes.
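The selection of core nodes can be sketched as follows. The sketch assumes that the importance is a weighted sum of the topological potential and the trust value controlled by the balance factor α, and that both quantities have already been computed for every node; the function names and the tie-breaking rule are illustrative assumptions.

```python
def node_importance(potential, trust, alpha=0.5):
    """Assumed combination: alpha * potential + (1 - alpha) * trust per node."""
    return {v: alpha * potential[v] + (1 - alpha) * trust[v] for v in potential}


def select_core_nodes(importance, k):
    """Return the k nodes with the largest importance (ties broken by node id)."""
    ranked = sorted(importance, key=lambda v: (-importance[v], v))
    return ranked[:k]
```

With α = 0.5, structure and attributes contribute equally to the ranking; in practice the two quantities may need to be brought to a comparable scale before they are combined.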

Division of The Communities
We select the adjacent nodes of the core nodes in turn, tentatively add each one to a community, and calculate the value of the extended modularity EQ [11]. If adding the adjacent node increases EQ, the node is added to the community. Otherwise, another adjacent node is selected, and the operation is repeated until all adjacent nodes of the core nodes have been examined. In short, the procedure starts from the core nodes and expands the communities around them.
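The expansion step can be sketched in Python as follows. The sketch assumes the widely used definition of extended modularity for overlapping communities, in which each node pair inside a community contributes (A_vw - k_v k_w / 2m) / (O_v O_w), where O_v is the number of communities containing v. It also assumes one community per core node and examines only the direct neighbors of each core node, as described in the text; the function names are hypothetical.

```python
def extended_modularity(adj, communities):
    """EQ for overlapping communities; `adj` maps each node to a set of neighbors."""
    two_m = sum(len(nbrs) for nbrs in adj.values())  # twice the number of edges
    if two_m == 0:
        return 0.0
    membership = {}  # O_v: number of communities that contain node v
    for community in communities:
        for v in community:
            membership[v] = membership.get(v, 0) + 1
    total = 0.0
    for community in communities:
        for v in community:
            for w in community:
                a_vw = 1.0 if w in adj[v] else 0.0
                expected = len(adj[v]) * len(adj[w]) / two_m
                total += (a_vw - expected) / (membership[v] * membership[w])
    return total / two_m


def expand_communities(adj, core_nodes):
    """Greedily add neighbors of each core node while EQ increases."""
    communities = [{core} for core in core_nodes]
    best = extended_modularity(adj, communities)
    for community, core in zip(communities, core_nodes):
        for candidate in sorted(adj[core]):
            if candidate in community:
                continue
            community.add(candidate)  # tentative addition
            score = extended_modularity(adj, communities)
            if score > best:
                best = score  # keep the node
            else:
                community.remove(candidate)  # undo the addition
    return communities, best
```

On a small network made of two triangles joined by a single edge, with one core node chosen in each triangle, this sketch recovers the two triangles as communities. Recomputing EQ from scratch for every candidate keeps the sketch short but is costly on large networks; an incremental update of EQ would be a natural optimization.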

Experimental Results and Analysis
To evaluate the performance of the OCDTT method, we apply it to two kinds of datasets: synthetic LFR networks and real networks. We use Normalized Mutual Information (NMI) to evaluate the quality of the detected communities.
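As a simplified illustration of the measure, the following Python sketch computes the classical NMI between two non-overlapping partitions given as label lists. This is an assumption made for brevity: evaluations of overlapping communities usually rely on an extended variant of NMI that compares covers rather than partitions, which is not reproduced here.

```python
from collections import Counter
from math import log


def nmi(labels_a, labels_b):
    """Classical NMI of two disjoint partitions: 2 * I(A; B) / (H(A) + H(B))."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual = sum(
        (n_ab / n) * log(n_ab * n / (count_a[a] * count_b[b]))
        for (a, b), n_ab in joint.items()
    )
    entropy_a = -sum((c / n) * log(c / n) for c in count_a.values())
    entropy_b = -sum((c / n) * log(c / n) for c in count_b.values())
    if entropy_a + entropy_b == 0:
        return 1.0  # both partitions consist of a single community
    return 2 * mutual / (entropy_a + entropy_b)
```

A value of 1 means that the detected partition matches the ground truth up to a renaming of the communities, and a value of 0 means that the two partitions are statistically independent.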

We use synthetic LFR networks, in which the internal structure and the sizes of communities are adjustable, to demonstrate the effectiveness of our method. Table 2 gives the parameter settings of the four groups of synthetic LFR networks. [3,8]. For LINK, the threshold ranges over [0.1, 0.9] with a step of 0.1. For OCDTT, the impact factor is δ = 1.034. Assuming that structure and attributes have equal impact, we set the balance factor α = 0.5. Fig. 2 shows that OCDTT partitions communities better than the other algorithms in most conditions. The results also indicate that the more communities a node belongs to, the more complex the network becomes, and the poorer the algorithms perform. Comparing the results on S1 and S2 shows that OCDTT is more stable than the other algorithms as the number of communities increases.

Conclusion
We combine the topological potential and the trust value of nodes to compute the importance of nodes in the network. Starting from the selected core nodes, we use the extended modularity function to generate the final community partition. Experimental results demonstrate that OCDTT achieves better results than comparable algorithms. In future work, we will continue to study community partitioning from the aspects of both structure and attributes, optimize OCDTT on this basis, improve the efficiency of its implementation, and apply the algorithm to real network analysis and web recommendation.

Fig. 2. Results of four algorithms on different subsets

Table 2. Two groups of LFR network parameter information

The three real networks are Karate, Dolphins and Football. Karate is the friendship network of the 34 members of a karate club; its 78 edges represent the relations among the members. The Dolphins dataset describes a group of dolphins in New Zealand; the network represents biological family relations among 62 dolphins and has 159 edges. The Football dataset represents matches among university football teams, with 115 members and 613 relations. Table 3 shows that the EQ of the partitions produced by OCDTT is never below 0.5000, which indicates good partition quality. The LINK algorithm produces link communities of small size, which hampers the formation of communities, so its results are the worst.

Table 3. EQ values on real networks