IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS, VOL. , NO. , MONTH YEAR 1
A Community Structure Enhancement Based
Community Detection Algorithm for Complex
Networks
Yansen Su, Chunlong Liu, Yunyun Niu, Fan Cheng, and Xingyi Zhang, Senior Member, IEEE
Abstract—Community detection has been recognized as one of
the most important tools for discovering useful information hidden
in complex networks, information that is usually hard to obtain by
simple observation. Existing community detection algorithms
have demonstrated their effectiveness on a variety of complex
networks. Most of them, however, do not scale well to complex
networks without a clear community structure, because an ambiguous
community structure is difficult to detect.
To address this issue, in this paper, we propose a community
structure enhancement method, termed CSE, for community
detection in complex networks. In the proposed CSE, the community
structure of a network is enhanced by adding links between
nodes that possibly belong to the same community and removing
links between nodes that belong to different communities, thereby
converting an ambiguous community structure into a structure
much clearer than the original one. Experimental results show the
superior performance of the proposed CSE over five state-of-the-art
community detection algorithms on both synthetic benchmark
networks and real-world networks, especially for those without
a clear community structure.
Index Terms—Complex network, community detection, com-
munity structure enhancement.
I. INTRODUCTION
In the past decades, complex networks have received increasing
attention from a variety of research fields, including biology [1], [2],
sociology [3], [4], psychology [5], business [6], and engineering [7].
Among the various tools for investigating complex networks, community
detection has been considered one of the most important for discovering
useful information hidden in complex networks [8]–[10].
Intuitively speaking, a community is a set of nodes in the network
such that nodes within the set are densely connected, whereas the
connections between nodes in the set and those outside the set are
sparse [11]. Community detection has become a very important tool for
mining the information hidden in complex networks, e.g., functional
modules in biological networks [12], worm containment in
online social networks [13], and routing protocols in pocket
switched networks of smart devices [14].
This work was supported by the National Natural Science Foundation of
China (61672033, 61822301, U1804262, 61872325), the Anhui Provincial Natural
Science Foundation for Distinguished Young Scholars (1808085J06), the State
Key Laboratory of Synthetical Automation for Process Industries (PAL-N201805),
the Recruitment Program for Leading Talent Team of Anhui Province
(2019-16), the Humanities and Social Sciences Project of Chinese Ministry of
Education (18YJC870004), and the Natural Science Foundation of Anhui
Province (1708085MF166, 1908085MF219). (Corresponding author: Xingyi
Zhang.)
Y. Su, C. Liu, F. Cheng, and X. Zhang are with the Key Lab of
Intelligent Computing and Signal Processing of Ministry of Education,
School of Computer Science and Technology, Anhui University, Hefei
230601, China (email: suyansen1985@163.com, 18255811669@163.com,
chengfan@mail.ustc.edu.cn, xyzhanghust@gmail.com). Y. Niu is with the School
of Information Engineering, China University of Geosciences, Beijing 100083,
China (email: niuyunyun1003@163.com).
A large number of community detection algorithms have
been developed for complex networks based on different ideas
[15], [16], and they can roughly be divided into the
following five categories. The first category directly adopts
graph partitioning algorithms,
which have been widely investigated in graph theory [17].
Graph partitioning algorithms were shown to be effective for
community detection when the number of communities
is known in advance. The Kernighan-Lin algorithm [18] and
the spectral bisection method [19] are two representative graph
partitioning algorithms that have been frequently used for
community detection in complex networks. The second category
of community detection algorithms adopts hierarchical
clustering, in which two different ideas, agglomeration
and division, are often used. Agglomeration iteratively
merges communities whose similarity is sufficiently
high, whereas division iteratively splits communities by
removing links connecting nodes with low similarity. The
GN algorithm [11] and the FN algorithm [20] are two widely
used hierarchical clustering algorithms, which are based on
division and agglomeration, respectively.
The third category is spectral
clustering, which includes all methods and techniques that perform
community detection by using the eigenvectors of the adjacency
matrix of the network or of other matrices derived from it.
The first contribution on spectral clustering was made by
Donath and Hoffmann in [21], and some promising spectral
clustering algorithms include the unnormalized and normalized
spectral clustering techniques proposed by Shi
and Malik [22] and by Ng et al. [23], respectively. Recently,
Mahmood and Small [24] also suggested an interesting spectral
clustering based community detection algorithm, termed
SSCF, which uses sparse linear coding with an $\ell_1$-norm constraint.
The fourth category focuses on community detection by
optimizing the modularity, which is by far the most used and best
known metric for evaluating the quality of communities [25].
Since modularity optimization is an NP-hard
problem [26], many approximate algorithms have been
proposed to maximize the modularity in a reasonable time,
such as greedy techniques [10], simulated annealing [27], and
genetic algorithms [28], [29]. Among these algorithms, the Louvain
method [30] is a representative modularity optimization based
community detection algorithm.
The fifth category is based on
label propagation. The first label propagation
based community detection algorithm, named LPA, was
reported in [31]. In LPA, each node is initially given a unique
label, and labels propagate across the network: each node
repeatedly takes the label shared by the majority of its neighbors
until the process converges. Further work
on label propagation based community detection
includes LPAm [32] and CK-LPA [33]. It is worth noting that
there are also some community detection algorithms which
adopt ideas different from the above five categories, such as
random walk [34], [35], local expansion [36], multi-objective
evolutionary algorithms [37]–[39], and nonnegative matrix
factorization [40].
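The label propagation procedure described above can be illustrated with a minimal Python sketch. It is not the implementation of [31]: the function name `label_propagation`, the adjacency-dictionary input, the fixed visiting order, and the smallest-label tie-break are all assumptions made here so that the example is reproducible (LPA itself breaks ties at random).

```python
from collections import Counter


def label_propagation(adj, max_rounds=100):
    """Asynchronous label propagation on an undirected graph.

    adj maps each node to the set of its neighbors.
    Assumptions of this sketch: nodes are visited in sorted order and
    ties are broken by taking the smallest label.
    """
    labels = {v: v for v in adj}              # every node starts with a unique label
    for _ in range(max_rounds):
        changed = False
        for v in sorted(adj):
            if not adj[v]:
                continue                      # an isolated node keeps its own label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            # most frequent neighbor label, smallest label on ties
            new_label = min(l for l, c in counts.items() if c == best)
            if new_label != labels[v]:
                labels[v] = new_label
                changed = True
        if not changed:                       # no label changed in a full pass
            break
    return labels


if __name__ == "__main__":
    # Two disjoint triangles end up with one label each.
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1},
             3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
    print(label_propagation(graph))
```

Nodes that share a label at convergence are reported as one community.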
The community detection algorithms mentioned above have
been shown to be competitive on complex networks whose
community structure is clear. The performance of most existing
algorithms, however, deteriorates considerably as the
community structure becomes unclear. To address this issue, in
this paper we propose a community structure enhancement
method for community detection in complex networks. To be
specific, the main contributions of this paper are summarized
as follows.
(1) A community structure enhancement method is suggested
to address community detection for complex
networks whose community structure is not clear. The
proposed method enhances the community structure of
a network by adding links between nodes which
possibly belong to the same community and removing
links between nodes in different communities. To the
best of the authors’ knowledge, several works on
community detection by changing the network topology
have been reported, but little work focuses
on enhancing the community structure.
(2) Based on the proposed community structure enhancement
method, a community detection algorithm, termed
CSE, is suggested for complex networks, especially for
those with an ambiguous community structure. In the
suggested CSE, a local community merging strategy is
also developed, since the community structure enhancement
method often divides a community into several
local communities.
(3) The effectiveness of the proposed CSE is verified on
both synthetic benchmark networks and real-world networks.
Experimental results demonstrate that the proposed
CSE is superior to five existing community
detection algorithms, especially for networks with an
ambiguous community structure.
The rest of this paper is organized as follows. In Sec-
tion II, existing work on network topology changing based
community detection is briefly reviewed. The details of the
proposed community structure enhancement based community
detection algorithm CSE are presented in Section III, followed
by the experimental results of CSE and five state-of-the-art
community detection algorithms on synthetic and real-world
networks in Section IV. Conclusions and future work are given
in Section V.
II. RELATED WORK
In the past years, a large number of algorithms have been
proposed for community detection in complex networks, and
there are also a few works focusing on community detection by
changing the topology of networks. In what follows, we only
review these network topology changing based community
detection algorithms, since the proposed CSE also belongs
to this category in the sense that it enhances the community
structure by changing network topology.
The idea of changing network topology for community
detection was first adopted in division based hierarchical
clustering algorithms, which have become a class of widely
used community detection algorithms in complex networks.
In divisive algorithms, communities are iteratively split by
removing links connecting nodes with low similarity. One
representative community detection algorithm in
this category is the GN algorithm, in which the links with
the largest betweenness are removed step by step until no
edges remain in the network, and the division
with the maximum modularity is taken as the final
result [11], [20]. Further work on the GN algorithm has
also been reported in [41], [42].
Another idea of changing the network topology for community
detection was suggested in [43] for incomplete networks,
where most real-world networks were regarded as
incomplete because some links are often missed in the data
collection process. A community detection algorithm, called
EdgeBoost [43], was developed, which adds links based on a link
prediction strategy to obtain a series of networks. Several
existing community detection algorithms were used to detect
community partitions in these networks and the final result was
obtained by merging the different partitions. Empirical results
demonstrated that EdgeBoost outperformed some existing
community detection algorithms, such as the Louvain algorithm
and label propagation algorithms [43]. Based on a similar idea,
Cheng et al. [44] proposed two novel indices to predict links
for community detection, which confirmed the effectiveness
of link prediction in improving the precision of community
detection.
Some semi-supervised community detection algorithms
were also proposed based on changing the network topology.
This class of algorithms uses prior information on
some nodes and/or links in the network, and adds and/or
removes links using the known information. Based on this
idea, Zhang et al. [45] developed a semi-supervised learning
algorithm for community detection, where two types of prior
information on some links, must-link and cannot-link, were
considered. The must-links were added to the network and the
cannot-links were removed from it. Different from
the work in [45], Yang et al. [46] suggested a semi-supervised
community detection algorithm using prior information
on some nodes in the network. Links were added between
nodes known to be in the same community and
removed between nodes known to be in different communities.
Experimental results indicated that these
community detection algorithms based on semi-supervised
learning can considerably enhance the community structure
in complex networks provided that enough prior information is
available [45], [46].
From the above analysis, we can find that there are some
works on changing network topology for community detec-
tion, but little work focuses on enhancing the community
structure to address community detection on networks without
a clear community structure. In this paper, we propose a
community detection algorithm, termed CSE, for complex
networks, where a community structure enhancement method
is suggested by changing the network topology, which does
not need any prior information in advance.
III. THE PROPOSED ALGORITHM CSE
In this section, we present the proposed community de-
tection algorithm CSE for complex networks without a clear
community structure. The main component of the proposed
CSE is a community structure enhancement method, thus we
first give the details of this method.
A. The Community Structure Enhancement Method
It is a widely accepted fact that the ambiguous community
structure detection is challenging due to the small difference
between intra-link and inter-link densities of communities. The
driving idea of the suggested method is to enlarge the differ-
ence for enhancing the community structure by weakening
the connections between communities and strengthening the
connections in a community. Due to the fact that the proposed
method is suggested for unsupervised community detection,
we do not have any prior information on the communities in
a network and thus we need first to find the communities. To
this end, in the networks the proposed method detects some
local communities in which nodes are densely connected, since
these local communities have a high probability of belonging
to the communities in ground truth. To be specific, the pro-
posed community structure enhancement method consists of
three main steps: (1) local community detection, (2) central
and boundary node identification, and (3) adding and removing
links.
Algorithm 1 presents the procedure of local community
detection, which is performed as follows. For each node $v_i$
which has not been assigned to a local community in the
network, we first find each neighbor of $v_i$ whose similarity
with $v_i$ is not smaller than a predefined threshold $\alpha$. If such a
neighbor $v_i^n$ has not been assigned to a local community,
then $v_i$ and the neighbor $v_i^n$ are considered to be in the same
local community. The local community is further extended by
considering the neighbors of $v_i^n$ whose similarity with $v_i^n$ is
not smaller than $\alpha$. If the neighbor $v_i^n$ has already been assigned to
a local community, then $v_i$ and the local community of $v_i^n$ are
considered to belong to the same local community. Once the
local community associated with $v_i$ is found, the algorithm
starts to detect the local community associated with another
node which has not been assigned to a local community. The
algorithm halts when the local communities of all nodes in the
network are determined.
Algorithm 1: LC detection(G)
Input: Network G(V, E)
Output: The set of local communities LC = {LC_1, ..., LC_m}
Marked ← ∅;
for i = 1 to |V| do
    if v_i ∉ Marked then
        LC_current ← {v_i}; Expand ← ∅; SI ← ∅;
        temp ← {v_i};
        LC ← LC ∪ {LC_current};
        Marked ← Marked ∪ {v_i};
        while temp ≠ ∅ do
            v ← randomly select a node from temp and remove it from temp;
            SI ← Calculate the similarity between v and its neighbors according to Formula (1);
            SI ← Normalize SI based on the maximum value in SI;
            Expand ← Find the neighbors whose similarity with v is not smaller than threshold α by using SI;
            if Expand ∩ Marked ≠ ∅ then
                Merge the communities of LC having at least one node in Expand into LC_current;
            Expand ← delete the nodes in Expand belonging to Marked;
            temp ← add nodes in Expand to temp;
            LC_current ← LC_current ∪ Expand;
            Marked ← Marked ∪ Expand;
For measuring the similarity between two nodes in the
network, we adopt the Jaccard similarity [47] defined as
follows.
$$S(u, v) = \frac{|N(u) \cap N(v)|}{|N(u) \cup N(v)|}, \tag{1}$$
where $|x|$ denotes the number of elements in the set $x$ and
$N(y) = \{w \mid w \neq y \text{ is a node in the network which has a link with node } y\}$
(namely, $N(y)$ is the set of all neighbors of node
$y$). Due to the non-uniform degree distribution of the network,
in this paper the similarity between a node and its neighbors is
obtained by normalizing the Jaccard similarity according to the
maximum Jaccard similarity with its neighbors. It is necessary
to mention that the idea adopted in the above local community
detection is somewhat similar to the well-known algorithm
DBSCAN developed by Ester et al. for data clustering in
1996 [48], in the sense that they both find components with a
high local density. The main difference between them lies in
the fact that DBSCAN uses the density of data points, whereas
the proposed local community detection adopts the density
of the topological structure of the network. It is also worth noting that
the order of nodes in the above local community detection
has little influence on the detected results, which can be seen
from Table I.
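The local community detection step can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the graph is an adjacency dictionary of neighbor sets, the names `jaccard`, `normalized_similarity` and `detect_local_communities` are hypothetical, and the random node selection of Algorithm 1 is replaced by a deterministic stack so that the result is reproducible.

```python
def jaccard(adj, u, v):
    """Formula (1): |N(u) & N(v)| / |N(u) | N(v)|."""
    union = len(adj[u] | adj[v])
    return len(adj[u] & adj[v]) / union if union else 0.0


def normalized_similarity(adj, v):
    """Jaccard similarity of v to each neighbor, divided by the largest value."""
    raw = {u: jaccard(adj, v, u) for u in adj[v]}
    top = max(raw.values(), default=0.0)
    if top == 0.0:
        return {u: 0.0 for u in raw}
    return {u: s / top for u, s in raw.items()}


def detect_local_communities(adj, alpha):
    """Sketch of Algorithm 1; returns a list of node sets."""
    community_of = {}      # node -> index of its local community (the Marked set)
    communities = []       # list of sets; merged-away entries become None
    for start in sorted(adj):
        if start in community_of:
            continue
        current = len(communities)
        communities.append({start})
        community_of[start] = current
        pending = [start]                    # plays the role of temp
        while pending:
            v = pending.pop()                # assumption: stack order, not random
            sims = normalized_similarity(adj, v)
            expand = {u for u, s in sims.items() if s >= alpha}
            for u in expand:
                other = community_of.get(u)
                if other is None:            # unmarked neighbor joins and is expanded later
                    community_of[u] = current
                    communities[current].add(u)
                    pending.append(u)
                elif other != current:       # marked neighbor: merge its whole community
                    for w in communities[other]:
                        community_of[w] = current
                    communities[current] |= communities[other]
                    communities[other] = None
    return [c for c in communities if c]


if __name__ == "__main__":
    # Two triangles joined by the single link (2, 3).
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
             3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
    print(detect_local_communities(graph, alpha=1.0))
```

In this toy graph the bridge endpoints 2 and 3 share no neighbor, so their similarity is zero and the two triangles are reported as separate local communities.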
TABLE I
NUMBERS OF LOCAL COMMUNITIES DETECTED BY RUNNING THE
PROPOSED METHOD 30 TIMES ON SIX REAL-WORLD NETWORKS.
Network    Ave num    St dev
Karate     8          0
Dolphin    21         0
Football   14         0
Polbooks   6          0
Yeast1     79         0
Yeast2     1439       0
‘Ave num’ represents the average number of local communities
obtained by the proposed algorithm over 30 runs, and ‘St dev’
denotes the standard deviation.
Algorithm 2: Find-BN-CN(G, LC)
Input: G: a complex network; LC: a local community;
Output: CN: the central node set of LC; BN: the boundary node set of LC;
Calculate the local centrality score of each node in LC;
CN ← Nodes with the maximum value of local centrality score;
BN ← Nodes in LC which are not in CN;
After all local communities in the network are detected, the
proposed community structure enhancement method starts to
identify central and boundary nodes in each local community.
To this end, we suggest a measure, termed local centrality
score, to evaluate whether a node in the local community is
a central or boundary one. To be specific, the local centrality
score of a node is defined as follows.
$$CS_u = \frac{|N(u) \cap N_{LS}|}{|N(u)|}, \tag{2}$$
where $N(u)$ is the set of all neighbors of node $u$ in a network,
$N_{LS}$ is the set of all nodes in the local community $LS$ and
$|x|$ denotes the number of elements in the set $x$.
With the above definition, we calculate the local centrality
score of each node in a local community. The nodes with the
largest local centrality score are identified as central nodes
of the local community and the remaining nodes in the local
community are considered as boundary nodes. Hence, only
the nodes having the largest fraction of their neighbors inside
the local community are identified as central nodes, which
means that the central nodes are closely connected with
nodes in the local community. The boundary nodes are
connected with nodes in the local community but also have
many connections with nodes belonging to other local
communities. This means that the community structure can
be enhanced by adding and/or removing the links related to
boundary nodes in local communities. Algorithm 2 gives the
procedure of central and boundary node identification in each
local community.
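The identification of central and boundary nodes can be sketched as below. The function names `local_centrality` and `split_central_boundary` and the adjacency-dictionary representation are assumptions of this sketch; the score follows Formula (2).

```python
def local_centrality(adj, community, u):
    """Formula (2): fraction of u's neighbors that lie inside the community."""
    if not adj[u]:
        return 0.0
    return len(adj[u] & community) / len(adj[u])


def split_central_boundary(adj, community):
    """Return (central nodes, boundary nodes) of one local community."""
    scores = {u: local_centrality(adj, community, u) for u in community}
    top = max(scores.values())
    central = {u for u, s in scores.items() if s == top}
    boundary = set(community) - central      # everything that is not central
    return central, boundary


if __name__ == "__main__":
    # Two triangles joined by the single link (2, 3).
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
             3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
    print(split_central_boundary(graph, {0, 1, 2}))
```

Node 2 has one of its three neighbors outside the community, so its score is 2/3 and it becomes the only boundary node, while nodes 0 and 1 score 1 and are central.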
Once the central and boundary nodes in each local commu-
nity are found, the proposed community structure enhancement
method starts to perform the third step, adding and removing
links. As mentioned above, the boundary nodes in local com-
munities are vital for enhancing the community structure of a
network. For this reason, we add and/or remove links related
to boundary nodes based on whether the boundary nodes of
a local community have a large probability of belonging to
another local community. Specifically, the step of adding and
removing links is performed as follows.
For each detected local community $LC_i$, the proposed
algorithm checks whether the boundary nodes in $LC_i$ are
closely connected to the remaining local communities. To
measure the closeness between any two nodes $x$ and $y$, we
adopt the following definition.
$$c(x, y) = \frac{|N(x) \cap N(y)|}{\min\{deg(x), deg(y)\}}, \tag{3}$$
where $N(x)$ denotes the set of all neighbors of $x$ and $deg(x)$ is
the degree of $x$. Let $C(LC)_{\min}$ and $C(LC)_{\max}$
denote the minimum and maximum closeness of nodes in
the local community $LC$, and let
$C(LC)_{ave} = \frac{1}{2}\left(C(LC)_{\min} + C(LC)_{\max}\right)$.
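A small Python sketch of the closeness measure and the three community-level statistics is given below. The text does not state over which node pairs the minimum and maximum are taken; this sketch assumes they are taken over linked pairs of nodes inside the local community. The function names are hypothetical.

```python
def closeness(adj, x, y):
    """Formula (3): common neighbors divided by the smaller degree."""
    denom = min(len(adj[x]), len(adj[y]))
    return len(adj[x] & adj[y]) / denom if denom else 0.0


def closeness_stats(adj, community):
    """Return (C_min, C_max, C_ave) of a local community.

    Assumption of this sketch: the statistics range over linked pairs
    of nodes that both belong to the community.
    """
    values = [closeness(adj, x, y)
              for x in community
              for y in adj[x] & community
              if x < y]                       # count each linked pair once
    if not values:
        return 0.0, 0.0, 0.0
    lo, hi = min(values), max(values)
    return lo, hi, (lo + hi) / 2              # C_ave is the midpoint, not the mean


if __name__ == "__main__":
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
             3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
    print(closeness_stats(graph, {0, 1, 2}))
    print(closeness(graph, 2, 3))
```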
For each boundary node $u$ in $LC_i$, the proposed method
finds all neighbors of $u$ in another local community $LC_j$.
Assume that $v$ is a neighbor of $u$ in $LC_j$. The proposed method
adds and removes the links related to $u$ by comparing the value
of $c(u, v)$ with $C(LC_i)_{\min}$, $C(LC_j)_{\min}$, $C(LC_i)_{\max}$ and
$C(LC_j)_{\max}$. When $c(u, v)$ is smaller than both $C(LC_i)_{\min}$
and $C(LC_j)_{\min}$, the proposed method removes the link
between $u$ and $v$ to weaken the connections between local
communities $LC_i$ and $LC_j$, since a small closeness often indicates
that the two nodes are not closely connected and thus have a
small probability of belonging to the same community. When
$c(u, v)$ is not smaller than both $C(LC_i)_{\max}$ and $C(LC_j)_{\max}$,
the proposed method adds links between $v$ and each
central node in $LC_i$ to strengthen the connections between
$LC_i$ and $LC_j$. When neither of the two conditions is
satisfied, the proposed method neither adds nor removes
links between $LC_i$ and $LC_j$. In this way, the connections
between local communities which have a high probability
of belonging to one community are strengthened, whereas
the connections between those having a small probability
are reduced, thereby enhancing the community structure of
networks.
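The decision rule for a single link between a boundary node u of LC_i and its neighbor v in LC_j can be sketched as follows. The function names, the string return values and the in-place update of an adjacency dictionary are assumptions of this sketch.

```python
def link_action(c_uv, min_i, max_i, min_j, max_j):
    """Decide what to do with the link (u, v) given its closeness c(u, v)."""
    if c_uv < min(min_i, min_j):     # smaller than both minimum closenesses
        return "remove"
    if c_uv >= max(max_i, max_j):    # not smaller than both maximum closenesses
        return "add"
    return "keep"


def apply_link_action(adj, u, v, central_i, action):
    """Update the adjacency dictionary in place according to the action."""
    if action == "remove":
        adj[u].discard(v)
        adj[v].discard(u)
    elif action == "add":
        for w in central_i:          # link v to every central node of LC_i
            if w != v:
                adj[w].add(v)
                adj[v].add(w)


if __name__ == "__main__":
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
             3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
    # c(2, 3) = 0 is below both minimum closenesses (0.5 each), so the bridge goes.
    action = link_action(0.0, 0.5, 0.5, 0.5, 0.5)
    apply_link_action(graph, 2, 3, {0, 1}, action)
    print(action, graph[2], graph[3])
```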
After the above operation completes, the proposed method
further enhances the community structure by considering the
boundary nodes with the largest degree in $LC_i$. For a boundary
node $u_{\max}$ with the largest degree in local community $LC_i$,
all nodes which are not neighbors of $u_{\max}$ but have common
neighbors with $u_{\max}$ are found in each of the remaining local
communities $LC_j$, $j \neq i$. When the closeness between $u_{\max}$
and one of these nodes is not smaller than both $C(LC_i)_{ave}$ and
$C(LC_j)_{ave}$, the proposed method adds a link between $u_{\max}$
and the node, since a high closeness with $u_{\max}$ indicates that
the node has a large probability of belonging to the same
community as $u_{\max}$. Algorithm 3 presents the procedure of
adding and removing links.
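The candidate search for this second phase can be sketched as follows. The helper names and the adjacency-dictionary representation are assumptions of this sketch; the closeness values and averages are taken as given inputs to the final rule.

```python
def max_degree_boundary_nodes(adj, boundary):
    """Boundary nodes whose degree is the largest among all boundary nodes."""
    if not boundary:
        return set()
    top = max(len(adj[u]) for u in boundary)
    return {u for u in boundary if len(adj[u]) == top}


def second_order_candidates(adj, community, u_max):
    """Nodes outside the community that are not linked to u_max
    but share at least one neighbor with it."""
    reachable = set()
    for w in adj[u_max]:
        reachable |= adj[w]                   # neighbors of neighbors
    return reachable - adj[u_max] - set(community) - {u_max}


def passes_average_rule(c, ave_i, ave_j):
    """A link is added when c is not smaller than both average closenesses."""
    return c >= max(ave_i, ave_j)


if __name__ == "__main__":
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
             3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
    for u in max_degree_boundary_nodes(graph, {2}):
        print(u, second_order_candidates(graph, {0, 1, 2}, u))
```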
Fig. 1 gives an illustrative example of the strategy of adding
and removing links. Let us assume that the threshold $\alpha$ is set
to 1 and the local communities are detected starting from node
1. Nodes 1 and 2 are considered to be in the same community since
only the similarity between node 1 and its neighbor 2 is not
smaller than $\alpha$. For node 2, we continue to find its neighbors
whose similarity with node 2 is not smaller than $\alpha$. Node 1
is the only neighbor of node 2 whose similarity with node
2 is not smaller than $\alpha$, thus we obtain the local community
$\{1, 2\}$. Then, we find another local community starting from the nodes
which have not been assigned to a local community (assume
node 3). The similarity between node 3 and node 1, as well
as that between node 3 and node 2, are both not smaller than
$\alpha$ according to Formula (1), hence node 3 is added into the
local community $\{1, 2\}$. Similarly, it can also be found that
node 4 belongs to the local community $\{1, 2, 3\}$. In this way,
a local community $LC_1 = \{1, 2, 3, 4\}$ is obtained. The other
local communities which can be found in the network $G$ are
$LC_2 = \{5, 6, 7, 8, 9\}$ and $LC_3 = \{10, 11, 12, 13, 14, 15, 16\}$,
as shown in Fig. 1 (a).
Fig. 1. An illustrative example of the strategy of adding and removing links. (a) A complex network $G$ contains three local communities, where $LC_1 =
\{1, 2, 3, 4\}$, $LC_2 = \{5, 6, 7, 8, 9\}$ and $LC_3 = \{10, 11, 12, 13, 14, 15, 16\}$. (b) The community structure is enhanced by removing the links (3,6) and (4,8)
and adding the links (5,11), (5,12), (5,13), (5,14), (5,15), (8,10), (9,11), (9,12), (9,14), (9,15) and (9,16), where the bold solid lines represent
added links, and the dotted lines denote removed links.
After the local communities are found, the proposed method
identifies the boundary nodes and central nodes in each local
community. For the local community $LC_3$, the boundary nodes
are those labeled 10 and 13, and nodes 11, 12, 14, 15 and
16 are central nodes. The proposed method checks whether
some links related to these boundary nodes can be removed
or added. The boundary node 10 has two neighbors, nodes
7 and 9, in $LC_2$ and no neighbor in $LC_1$; the boundary node
13 has only one neighbor, node 9, in $LC_2$, and no neighbor
in $LC_1$. The closeness between nodes 9 and 10 is not smaller than
the maximum closeness of nodes in $LC_2$ and $LC_3$, hence the
links between node 9 and all central nodes in $LC_3$ are added,
namely, nodes 11, 12, 14, 15 and 16. The closeness between
nodes 7 and 10 is neither smaller than the minimum closeness
nor larger than the maximum closeness of nodes in $LC_2$ and
$LC_3$, thus no link related to node 7 is added or removed.
Similarly, we can find that no link related to node 13 is added
or removed when comparing the closeness between nodes 9
and 13 with the minimum and maximum closeness in $LC_2$
and $LC_3$. A link between nodes 8 and 10 is also added since
10 is a boundary node with the maximum degree in $LC_3$ and
the closeness between nodes 8 and 10 is not smaller than
$C(LC_2)_{ave}$ and $C(LC_3)_{ave}$.
For the local community $LC_2$, it can also be checked that
the proposed method can add links between node 5 and nodes
11, 12, 14, 15 and 16. The link between nodes 3 and 6 and
that between nodes 4 and 8 are removed since their closeness
is smaller than the minimum closeness of nodes in $LC_1$ and
$LC_2$. In this way, the community structure of $G$ is enhanced
by weakening the connections between $LC_1$ and $LC_2$, which
do not belong to the same community, and strengthening the
connections between $LC_2$ and $LC_3$, which belong to the same
community.
B. The Proposed CSE
Based on the proposed community structure enhancement
method, we develop an algorithm, termed CSE, for community
detection in complex networks, which will be shown to be well
suited to networks without a clear community structure. The
general framework of CSE is presented in Algorithm 4, which
consists of the following four steps.
At the first step, Algorithm 1 is used to detect the local
communities in a network. At the second step, the central
nodes and boundary nodes in each local community are found
by Algorithm 2. At the third step, Algorithm 3 is adopted to
add and/or remove some links in the network for enhancing
the community structure. At the last step, the communities of
the network are obtained by merging the closely connected
local communities.
Algorithm 5 presents the procedure of merging the local
communities, which is performed as follows. The local
communities are first sorted according to their sizes and then
merged starting from the one with the smallest
size. To determine whether a local community $LC_i$ can be
merged with a local community $LC_j$, we suggest the following
measure, called the merging degree.
$$M(LC_i, LC_j) = \frac{A(LC_j, LC_i)}{\sum_{j=1}^{m} A(LC_j, LC_i)} + J(LC_i, LC_j), \tag{4}$$
where $m$ is the number of local communities, $A(LC_j, LC_i)$
denotes the interest of $LC_j$ accepting $LC_i$ and $J(LC_i, LC_j)$
is the interest of $LC_i$ joining $LC_j$, which are defined as
$$A(LC_j, LC_i) = \frac{2|E_{in}(LC_i \cup LC_j)|}{2|E_{in}(LC_i \cup LC_j)| + |E_{out}(LC_i \cup LC_j)|} - \frac{2|E_{in}(LC_j)|}{2|E_{in}(LC_j)| + |E_{out}(LC_j)|}, \tag{5}$$
$$J(LC_i, LC_j) = \frac{|E(LC_i, LC_j)|}{|E_{out}(LC_i)|}. \tag{6}$$
Algorithm 3: Link-delete-add(G, LC, LC, CN, BN)
Input: G: a complex network; LC = {LC_1, ..., LC_m}: the set of local communities; LC: a local community in LC; CN: the central node set of LC; BN: the boundary node set of LC;
Output: G′: the complex network with enhanced community structure;
G′ ← G;
Calculate C(LC)_min and C(LC)_max;
for i = 1 to |BN| do
    NI ← find the neighbors of BN_i which are not in LC;
    for j = 1 to |NI| do
        LC_NIj ← find the local community which contains NI_j;
        Calculate C(LC_NIj)_min and C(LC_NIj)_max;
        c(BN_i, NI_j) ← calculate the closeness between BN_i and NI_j;
        s_min ← min{C(LC)_min, C(LC_NIj)_min};
        s_max ← max{C(LC)_max, C(LC_NIj)_max};
        if c(BN_i, NI_j) < s_min then
            Delete the link between BN_i and NI_j from G′;
        else if c(BN_i, NI_j) ≥ s_max then
            Add the link between each node in CN and NI_j to G′;
G ← G′;
D ← find the nodes with the maximum degree in BN;
for i = 1 to |D| do
    M ← the set of nodes which are not connected with D_i but have common neighbors with it;
    for j = 1 to |M| do
        LC_Mj ← the local community which contains M_j;
        c(D_i, M_j) ← the closeness between D_i and M_j;
        Calculate C(LC)_ave and C(LC_Mj)_ave;
        if c(D_i, M_j) ≥ max{C(LC)_ave, C(LC_Mj)_ave} then
            Add the link between D_i and M_j to G′;
In Formulas (5) and (6), $|E_{in}(LC)|$ is the number of links in local community $LC$,
$|E_{out}(LC)|$ is the number of links between nodes in $LC$
and nodes outside $LC$, and $|E(LC_1, LC_2)|$ is the number of links
between nodes in $LC_1$ and nodes in $LC_2$.
From the definition of the merging degree, it can be seen that
the larger the value of $M(LC_i, LC_j)$, the better the quality
of the community obtained by merging $LC_i$ with $LC_j$. To ensure the
quality of merged local communities, a local community $LC_i$
is only merged with the local community having the largest
merging degree among all local communities $LC_j$ satisfying
$A(LC_j, LC_i) > \beta$ and $A(LC_i, LC_j) > \beta$, where $\beta$ is a
predefined parameter.
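The link counts that Formulas (5) and (6) are built from can be computed as in the following sketch. The function names and the adjacency-dictionary representation are assumptions of this sketch. The helper `internal_ratio` evaluates the ratio 2|E_in| / (2|E_in| + |E_out|) for one node set, which is the building block of Formula (5); the sketch deliberately stops short of combining these terms into A and M.

```python
def internal_links(adj, c):
    """|E_in(C)|: links with both endpoints in C (each link counted once)."""
    return sum(len(adj[u] & c) for u in c) // 2


def outgoing_links(adj, c):
    """|E_out(C)|: links with exactly one endpoint in C."""
    return sum(len(adj[u] - c) for u in c)


def links_between(adj, a, b):
    """|E(A, B)|: links joining a node of A to a node of B (A, B disjoint)."""
    return sum(len(adj[u] & b) for u in a)


def internal_ratio(adj, c):
    """2|E_in| / (2|E_in| + |E_out|) for the node set c."""
    e_in, e_out = internal_links(adj, c), outgoing_links(adj, c)
    total = 2 * e_in + e_out
    return 2 * e_in / total if total else 0.0


def joining_interest(adj, lc_i, lc_j):
    """Formula (6): share of LC_i's outgoing links that lead into LC_j."""
    e_out = outgoing_links(adj, lc_i)
    return links_between(adj, lc_i, lc_j) / e_out if e_out else 0.0


if __name__ == "__main__":
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
             3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
    lc_i, lc_j = {0, 1, 2}, {3, 4, 5}
    print(internal_ratio(graph, lc_j), internal_ratio(graph, lc_i | lc_j))
    print(joining_interest(graph, lc_i, lc_j))
```

In this toy graph the only outgoing link of the first triangle leads into the second one, so the joining interest is 1, and merging the two triangles raises the internal ratio from 6/7 to 1.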
Algorithm 4: Framework of the proposed CSE
Input: G: a complex network;
Output: Com: the communities of G;
LC ← LC detection(G);
Sort LC in a descending order by the size of local communities;
for i = 1 to |LC| do
    (CN_i, BN_i) ← Find-BN-CN(G, LC_i);
t ← 1;
G′ ← G;
while |LC_t| > 3 and t ≤ |LC| do
    G′ ← Link-delete-add(G′, LC, LC_t, CN_t, BN_t);
    t ← t + 1;
Com ← Merge(G′, LC);
Algorithm 5: Merge(G, LC)
Input: G: a complex network; LC: a set of local communities;
Output: Com: the communities;
Com ← ∅;
while |LC| ≠ 0 do
    LC_s ← find the local community in LC with the smallest size;
    NC ← find the neighbors of LC_s;
    Temp ← ∅;
    for k = 1 to |NC| do
        A(LC_k, LC_s) ← calculate the interest of LC_k accepting LC_s;
        A(LC_s, LC_k) ← calculate the interest of LC_s accepting LC_k;
        if A(LC_k, LC_s) > β and A(LC_s, LC_k) > β then
            Temp ← Temp ∪ {LC_k};
    if |Temp| ≠ 0 then
        for j = 1 to |Temp| do
            M(LC_s, LC_j) ← calculate the degree for merging LC_s into LC_j;
        LC_t ← find the local community with the largest M(LC_s, LC_t);
        Merge LC_s into LC_t, and delete LC_s from LC;
    else
        Add LC_s into Com and delete LC_s from LC;
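The merging loop of Algorithm 5 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name and the three callables `neighbors`, `interest` and `merging_degree` are hypothetical stand-ins for the neighbor search, the interest A and the merging degree M defined earlier in the paper, and must be supplied by the caller.

```python
def merge_local_communities(local_communities, neighbors, interest,
                            merging_degree, beta=0.05):
    """Greedy merging loop in the spirit of Algorithm 5 (illustrative sketch).

    local_communities: iterable of disjoint node sets.
    neighbors(lc, pool): hypothetical callable returning the members of
        `pool` that are adjacent to `lc`.
    interest(a, b): hypothetical callable for A(a, b), the interest of
        community a in accepting community b.
    merging_degree(a, b): hypothetical callable for M(a, b), the degree
        for merging a into b.
    """
    pool = [frozenset(c) for c in local_communities]
    final = []
    while pool:
        lc = min(pool, key=len)          # smallest remaining local community
        pool.remove(lc)
        # Keep only neighbors for which the interest exceeds beta both ways.
        candidates = [
            other for other in neighbors(lc, pool)
            if interest(other, lc) > beta and interest(lc, other) > beta
        ]
        if candidates:
            # Merge lc into the candidate with the largest merging degree.
            target = max(candidates, key=lambda other: merging_degree(lc, other))
            pool.remove(target)
            pool.append(target | lc)
        else:
            final.append(lc)             # lc becomes a final community
    return final
```

Each pass removes exactly one community from the pool, either by merging it or by moving it to the output, so the loop terminates after at most as many passes as there are local communities.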
C. Complexity Analysis
In this subsection, we give an upper bound on the time complexity of the proposed CSE. As stated in the above subsection, the proposed CSE consists of four steps: 1) local community detection, 2) central and boundary node identification, 3) adding and removing links, and 4) merging local communities. The first step has a time complexity of O(d^2 N), since the most time-consuming operation is the calculation of similarity between the neighbors of each node, where N is the number of nodes in the network and d is the maximum degree of nodes. The time complexity of step 2) is O(dN) due to the fact that
TABLE II
TIME COMPLEXITY OF SEVERAL EXISTING COMMUNITY DETECTION ALGORITHMS.
Algorithm        Time complexity
CSE              O(N^2)
SSCF [24]        O(N^2)
Louvain [30]     O(N)
EdgeBoost [43]   O(N^2)
LMDR [36]        O(N^2)
Walktrap [35]    O(N^2 log(N))
N denotes the number of nodes in a network.
TABLE III
THE INFORMATION OF SIX REAL-WORLD NETWORKS.
Network    N_node   N_link   N_com   Description
Karate     34       78       2       Zachary's karate club
Dolphin    62       160      2       Dolphin social network
Football   115      613      12      American College football
Polbooks   105      441      12      Books about US politics
Yeast 1    990      4687     81      Yeast-D1
Yeast 2    2361     7182     12      PPI network in budding yeast
N_node represents the number of nodes, N_link denotes the number of links, and N_com represents the number of communities.
the calculation of the local centrality score needs to consider at most d neighbors for each node. The third step has a time complexity of O(k^2 d^2 N), since the maximum number of local communities is N and the time complexity of calculating the closeness of nodes is O(k^2 d^2) in each local community, where k is the largest number of nodes in a local community. The last step has a time complexity of O(N^2), since the merging degree of each pair of local communities needs to be calculated and the maximum number of local communities is N. Therefore, the time complexity of the entire algorithm is O(N^2), since O(d^2 N + dN + k^2 d^2 N + N^2) = O(N^2).
Table II gives the time complexity of the proposed CSE and several existing community detection algorithms, from which it can be seen that the time complexity of the proposed CSE is comparable to that of the compared algorithms.
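A quick check of when the final equality in the complexity analysis holds may be useful. Under the assumption that d ≥ 1 and k ≥ 1 (not stated explicitly in the analysis), the first two terms are bounded by the third, so only the comparison between k^2 d^2 N and N^2 matters:

```latex
\[
d^2 N + dN + k^2 d^2 N + N^2 \;\le\; 3k^2 d^2 N + N^2,
\qquad
k^2 d^2 N \le N^2 \iff kd \le \sqrt{N}.
\]
```

Read this way, the O(N^2) bound presumes that the product of the largest local community size and the maximum degree grows no faster than the square root of N. For a network with N = 10000 nodes, for instance, this corresponds to kd of order at most 100.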
IV. EXPERIMENTAL RESULTS AND DISCUSSIONS
In this section, we verify the performance of the proposed CSE by comparing it with five state-of-the-art community detection algorithms, namely, SSCF [24], Louvain [30], EdgeBoost [43], LMDR [36], and Walktrap [35]. The experiments are performed on both synthetic benchmark networks and real-world networks, and all results reported for the proposed CSE are obtained with α = 1 and β = 0.05.
A. Experiment Setting and Evaluation Measures
1) Test Networks. The synthetic networks we employ in the experiments are the LFR benchmark networks proposed by Lancichinetti et al. [49], which have been widely used for testing the performance of community detection algorithms. The LFR networks are defined as LFR(N, d_ave, d, t_1, t_2, min_c, max_c, µ), where N is the number of nodes of the network, d_ave is the average degree of nodes, d is the maximum degree, t_1 and t_2 are the exponents of the degree distribution and the community size distribution, respectively, min_c is the number of nodes in the smallest community, max_c is the number of nodes in the largest community, and µ is the mixing parameter, i.e., the probability of a node in a community connecting with nodes outside the community. The larger the value of µ, the more ambiguous the community structure, and thus the more challenging the network is for a community detection algorithm. In the experiments, we consider four groups of LFR networks with different network sizes and community sizes. To validate the performance of the proposed CSE on networks with different levels of ambiguity in community structure, for each group of LFR networks we consider eight networks with the value of µ ranging from 0.1 to 0.8 with an interval of 0.1.
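To make the role of µ concrete, the short Python sketch below estimates an empirical mixing value for a network whose ground-truth communities are known, as the fraction of links that join different communities. This is an illustrative simplification: the LFR generator specifies µ per node, and the function name and input layout (an edge list plus a node-to-community mapping) are assumptions made here, not part of the paper.

```python
def mixing_parameter(edges, community):
    """Fraction of links whose two endpoints lie in different communities.

    edges: iterable of (u, v) pairs.
    community: mapping from node to its ground-truth community label.
    """
    edges = list(edges)
    if not edges:
        raise ValueError("the network has no links")
    external = sum(1 for u, v in edges if community[u] != community[v])
    return external / len(edges)


# Two triangles joined by a single link: 1 of the 7 links is external.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(mixing_parameter(edges, community))
```

A value close to 0 indicates a clear community structure, while values above 0.5 mean that most links leave their community, which is the ambiguous regime the experiments focus on.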
For real-world networks, six networks with known ground truth are tested in the experiments, namely, Zachary's karate club network (Karate, for short) [50], the bottlenose dolphins network (Dolphin, for short) [51], the American college football network (Football, for short) [11], the Books about US politics network (Polbooks, for short) [52], and two different yeast PPI networks, Yeast 1 [53] and Yeast 2 [54]. The detailed information of the six real-world networks is listed in Table III.
2) Evaluation Measure. Since all networks considered here have known ground truth, we adopt the widely used performance indicator, normalized mutual information (NMI for short) [55]–[59], to measure the quality of communities detected by the algorithms. The NMI is defined as follows.
\[
\mathrm{NMI}(P, P') = \frac{-2 \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} M_{ij} \log\left(\frac{M_{ij} N}{M_{i.} M_{.j}}\right)}{\sum_{i=1}^{n_1} M_{i.} \log\left(\frac{M_{i.}}{N}\right) + \sum_{j=1}^{n_2} M_{.j} \log\left(\frac{M_{.j}}{N}\right)}, \tag{7}
\]
where P represents the true partition of a network and P' denotes a partition of the network obtained by a community detection algorithm, n_1 and n_2 are the numbers of communities in partitions P and P', respectively, M is the confusion matrix whose element M_ij is the number of nodes shared by community i in partition P and community j in partition P', M_i. is the sum of the elements of M in row i, M_.j is the sum of the elements of M in column j, and N is the number of nodes in the network. The larger the value of NMI(P, P'), the more similar the detected result is to the ground truth of the network. If the detected result P' is the same as P, then NMI(P, P') = 1; if they are totally different, then NMI(P, P') = 0.
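The NMI of Eq. (7) can be computed directly from two label lists, as in the following Python sketch. It uses only the standard library; the function name and the convention of returning 1.0 when both partitions consist of a single community (where the formula is 0/0) are choices made for this illustration.

```python
import math
from collections import Counter


def nmi(true_labels, found_labels):
    """Normalized mutual information of two partitions, in the form of Eq. (7)."""
    if len(true_labels) != len(found_labels):
        raise ValueError("both partitions must label the same nodes")
    n = len(true_labels)
    # Sparse confusion matrix: m[(i, j)] is the number of nodes shared by
    # true community i and detected community j.
    m = Counter(zip(true_labels, found_labels))
    row = Counter(true_labels)   # row sums M_i.
    col = Counter(found_labels)  # column sums M_.j
    numerator = -2.0 * sum(
        m_ij * math.log(m_ij * n / (row[i] * col[j]))
        for (i, j), m_ij in m.items()
    )
    denominator = sum(r * math.log(r / n) for r in row.values()) + sum(
        c * math.log(c / n) for c in col.values()
    )
    if denominator == 0.0:
        # Both partitions are a single community; treated as identical here.
        return 1.0
    return numerator / denominator
```

Only the non-zero entries of the confusion matrix are visited, so zero counts never reach the logarithm. Community labels may be any hashable values, and relabeling the communities of either partition does not change the result.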
B. Experiments on Synthetic Benchmark Networks
Fig. 2 plots the NMI values of the proposed CSE and the five compared community detection algorithms on the four groups of LFR networks, where the NMI value reported for each LFR network is averaged over 30 networks with the same parameter setting. From the figure, the following three observations can be made.
First, the proposed CSE algorithm achieves an overall better performance than the five compared community detection algorithms on LFR networks. For µ ≤ 0.5, the proposed CSE has a comparable performance, although it performs slightly worse than EdgeBoost and SSCF, which have the best performance on LFR networks with µ ≤ 0.5. As for µ > 0.5, the proposed CSE achieves the best NMI value among all the considered algorithms on
Fig. 2. NMI values of the proposed CSE and five compared algorithms (SSCF, Louvain, EdgeBoost, LMDR, and Walktrap) on four groups of LFR benchmark networks with different values of mixing parameter µ. (a) NMI values on networks with N = 5000, min_c = 10, max_c = 50; (b) NMI values on networks with N = 5000, min_c = 20, max_c = 100; (c) NMI values on networks with N = 10000, min_c = 10, max_c = 50; (d) NMI values on networks with N = 10000, min_c = 20, max_c = 100. [Plots: x-axis µ from 0.1 to 0.8; y-axis NMI from 0 to 1.]
LFR networks. Second, the proposed CSE is more suited to detecting communities in LFR networks whose community structure is ambiguous. The performance of all community detection algorithms considered here deteriorates considerably as the value of µ increases, because the community structure becomes more ambiguous. However, the proposed CSE shows relatively robust performance on LFR networks with µ > 0.5, where it achieves significantly better performance than all the compared algorithms. In contrast, the values of NMI obtained by EdgeBoost and SSCF decrease sharply when µ ≥ 0.6. The promising performance of the proposed CSE may partly be due to the number of detected communities, since the CSE finds more communities, as shown in Fig. 3. Third, it seems that the sizes of networks and communities have little influence on the performance of the proposed CSE on LFR networks, whereas some of the compared algorithms, e.g., Louvain, are somewhat sensitive to these two factors.
To further show the effectiveness of the proposed CSE in community detection, Fig. 4 presents the numbers of correctly and incorrectly added and removed links on the LFR networks with different values of µ, averaged over 30 networks with the same parameter setting. From the figure, it can clearly be seen that a large number of links have been added to and removed from the original LFR networks, which can considerably enhance the community structure. To visually illustrate this fact, Fig. 5 plots the original LFR network and the network obtained by performing the operation of adding
Fig. 3. Numbers of communities detected by the six algorithms (CSE, SSCF, Louvain, EdgeBoost, LMDR, and Walktrap) on LFR networks with different values of mixing parameter µ, where N = 5000, min_c = 10, max_c = 50. [Plot: x-axis µ from 0.1 to 0.8; y-axis community number from 0 to 1000.]
and removing links by the proposed CSE, where different communities are marked by different colors. As shown in the figure, there are a number of links between communities in the original LFR network with µ = 0.2. After performing the operation of adding and removing links by the proposed CSE, the number of links between communities has been significantly reduced, so that the community structure is converted into a clearer one, which naturally improves the quality of community detection. It can also be found from Fig. 4 that only a small number of links are incorrectly added and removed by the proposed CSE for µ ≤ 0.5. It is worth noting that, although
Fig. 4. Numbers of correctly and incorrectly added links and removed links on LFR networks by the proposed CSE. Each histogram shows the mean value of the numbers of links averaged over 30 LFR networks with the same parameter setting. (a) Networks with N = 5000, min_c = 10, max_c = 50; (b) networks with N = 5000, min_c = 20, max_c = 100; (c) networks with N = 10000, min_c = 10, max_c = 50; (d) networks with N = 10000, min_c = 20, max_c = 100. [Plots: x-axis µ from 0.1 to 0.8; y-axis number of edges; legend: add, delete, incorrect.]
Fig. 5. The LFR network with N = 1000, min_c = 10, max_c = 50 and µ = 0.2 before and after performing the operation of adding and removing links by the proposed CSE. (a) The original LFR network; (b) the LFR network after performing the operation of adding and removing links.
the proposed CSE also produces many incorrectly added and removed links on LFR networks with µ > 0.5, it still performs much better than the existing community detection algorithms, since detecting an ambiguous community structure is very challenging. By enhancing the community structure, it becomes relatively easy to detect the ambiguous community structure, and thus the proposed CSE can achieve a competitive performance in detecting communities for LFR networks with an ambiguous community structure.
From the above empirical results, we can conclude that the proposed CSE is a promising community detection algorithm on LFR benchmark networks, especially for those with an ambiguous community structure.
C. Experiments on Real-World Networks
Table IV presents the NMI values of the proposed CSE and the five compared community detection algorithms on the six real-world networks, where the best NMI value on each network is highlighted. From the table, it can be seen that the proposed CSE achieves the largest NMI value among all the community detection algorithms under consideration on five of the six real-world networks. The main reason may be that most real-world networks have an ambiguous community structure, and the proposed CSE is more suited to community detection in networks with ambiguous community structures, as analyzed in the previous subsection.
Fig. 6 presents the accuracy of the proposed CSE algorithm in correctly adding and removing links on the six real-world networks. As shown in the figure, some links are incorrectly added to and removed from the original real-world networks, but these errors do not prevent the proposed CSE from achieving a competitive performance on the real-world networks, since the detection of ambiguous community structure is always a challenging task for all existing community detection algorithms. The experimental results in Table IV demonstrate the
TABLE IV
NMI VALUES OF THE PROPOSED CSE ALGORITHM AND THE FIVE COMPARED COMMUNITY DETECTION ALGORITHMS ON SIX REAL-WORLD NETWORKS.
Network    CSE    Louvain   SSCF   EB     LMDR   Walktrap
Karate     1      0.59      0.84   0.69   0.72   0.50
Dolphin    1      0.52      0.89   0.50   0.76   0.54
Football   0.93   0.89      0.93   0.89   0.92   0.89
Polbooks   0.49   0.45      0.49   0.50   0.54   0.49
Yeast 1    0.94   0.89      0.91   0.90   0.93   0.90
Yeast 2    0.26   0.17      0.06   0.21   0.24   0.25
"EB" represents the EdgeBoost algorithm.
Fig. 6. Accuracy of correctly adding and removing links by the proposed CSE on the six real-world networks. A link is considered to be correctly added if it is a link between nodes in the same community, and a link is regarded as correctly removed if it is a link between nodes in different communities. [Plot: x-axis real-world networks (Karate, Dolphin, Football, Polbooks, Yeast_1, Yeast_2), with separate bars for added and removed links; y-axis ratio from 0 to 1; legend: correct ratio, incorrect ratio.]
effectiveness of the community structure enhancement method suggested in the proposed CSE in addressing the challenge on real-world networks, even for some networks on which the proposed CSE incorrectly adds and removes a large number of links (e.g., Yeast 2).
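The correctness criterion behind Fig. 6 (an added link is correct if it joins nodes of the same ground-truth community, a removed link is correct if it joins nodes of different communities) can be expressed as a small Python helper. The function name and the input layout (two edge lists and a node-to-community mapping) are assumptions made for this sketch.

```python
def link_edit_accuracy(added, removed, community):
    """Return (correct ratio of added links, correct ratio of removed links).

    added, removed: iterables of (u, v) pairs changed by the enhancement step.
    community: mapping from node to its ground-truth community label.
    A ratio is None when the corresponding list is empty.
    """
    added, removed = list(added), list(removed)
    # An added link is correct when both endpoints share a community.
    good_added = sum(1 for u, v in added if community[u] == community[v])
    # A removed link is correct when its endpoints are in different communities.
    good_removed = sum(1 for u, v in removed if community[u] != community[v])
    add_ratio = good_added / len(added) if added else None
    remove_ratio = good_removed / len(removed) if removed else None
    return add_ratio, remove_ratio


community = {1: "a", 2: "a", 3: "b", 4: "b"}
print(link_edit_accuracy([(1, 2), (1, 3)], [(2, 3)], community))  # (0.5, 1.0)
```

The incorrect ratios shown in the figure are the complements of these two values.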
To visually illustrate the effectiveness of the community structure enhancement method in CSE, Fig. 7 plots the original Karate network and the enhanced network after performing the operation of adding and removing links by the proposed CSE. As can be seen from the figure, the community structure of the Karate network has been significantly enhanced after performing the operation of adding and removing links by the proposed CSE, which makes community detection in the enhanced Karate network much easier than in the original one.
Therefore, we can conclude that the proposed CSE is well suited for community detection in real-world networks, in comparison with existing community detection algorithms.
D. Sensitivity of Parameters α and β
As mentioned in Algorithm 1, there is a parameter α, 0 ≤ α ≤ 1, in local community detection. There also exists a parameter β, 0 ≤ β ≤ 1, in merging local communities in Algorithm 5. In what follows, we empirically investigate the influence of α and β on the performance of the proposed CSE.
Fig. 8 presents the NMI values of CSE on LFR networks with the value of parameter α varying from 0 to 1 with an interval of 0.1. As shown in the figure, it can be found that
Fig. 7. The original Karate network and the enhanced Karate network after performing the operation of adding and removing links by the proposed CSE. (a) The original Karate network; (b) the enhanced Karate network after performing the operation of adding and removing links. In each network, different communities are marked by different colors. [Drawings of the network with nodes labeled 1 to 34.]
the proposed CSE demonstrates a better overall performance on LFR networks as the value of α increases, and the best performance of the proposed CSE is achieved when the parameter α is set to 1. The main reason why a large value of α often achieves a better performance is that a large value of α makes the detected local communities small enough, thus ensuring the accuracy of the operation of adding and removing links in the proposed CSE.
Fig. 9 presents the NMI values of the proposed CSE on the six real-world networks with the value of α varying from 0 to 1 with an interval of 0.1. Similar results can be observed from the figure. The proposed CSE achieves the best performance when α is set to 1 on all the real-world networks, with the exception of the Dolphin and Polbooks networks. On the Dolphin network, the proposed CSE achieves the best detection result when α is fixed to a value between 0.5 and 0.8. When the value of α is larger than 0.8, the performance of CSE deteriorates slightly on the Dolphin network. On the Polbooks network, CSE achieves the best result when α is fixed to 0.55, and gives the second best result when α is fixed to 1. In summary, the parameter α is suggested to be set to 1 on all networks when the proposed CSE is adopted for community detection.
Figs. 10 and 11 present the NMI values of the proposed CSE on LFR benchmark networks and real-world networks, respectively, with the value of β varying from 0 to 0.1 with an interval of 0.01. We do not present the performance of CSE for β > 0.1, since a large value of β often leads to the detected communities having a small size, which considerably deteriorates the performance of CSE. As shown in the figures, the parameter β is suggested to be set to 0.05 for the proposed CSE, which enables the CSE to achieve an overall better performance on these networks.
Fig. 8. NMI values of the proposed CSE on LFR networks under different values of parameter α, averaged over 30 networks with the same parameter setting. (a) NMI values on networks with N = 5000, min_c = 10, max_c = 50; (b) NMI values on networks with N = 5000, min_c = 20, max_c = 100; (c) NMI values on networks with N = 10000, min_c = 10, max_c = 50; (d) NMI values on networks with N = 10000, min_c = 20, max_c = 100. [Plots: x-axis parameter α; y-axis NMI from 0 to 1; one curve for each of µ = 0.2, 0.4, 0.6, and 0.8.]
Fig. 9. NMI values of the proposed CSE on the six real-world networks under different settings of parameter α. [Plot: x-axis parameter α from 0 to 1; y-axis NMI from 0 to 1; one curve for each of Karate, Dolphin, Football, Polbooks, Yeast_1, and Yeast_2.]
V. CONCLUSIONS
In this paper, we have proposed an algorithm, namely CSE, for community detection in complex networks, especially for those with ambiguous community structures. The proposed CSE detects communities by enhancing the community structure, i.e., by adding links between nodes that are likely to belong to the same community and removing links between nodes in different communities. To this end, an effective method of adding and removing links is suggested in the proposed CSE. Experimental results on both synthetic networks and real-world networks have demonstrated the competitive performance of the proposed CSE in community detection, and the CSE has been shown to be well suited to community detection in networks with an ambiguous community structure.
The work in this paper has shown the effectiveness of the community structure enhancement method suggested in the proposed CSE, which could provide a promising idea for community detection in networks with an ambiguous community structure. Hence, it is worth further exploring the potential of the idea by developing more effective methods for community structure enhancement, especially for networks like Yeast 2 whose community structure is very ambiguous. There are also many other interesting problems related to the proposed CSE. For example, is it possible to adopt this method for overlapping community detection? Could the idea of community structure enhancement be applied to other types of networks? These problems all deserve to be investigated in the future.
REFERENCES
[1] X. Zeng, X. Zhang, Y. Liao, and L. Pan, “Prediction and validation
of association between micrornas and diseases by multipath methods,”
Fig. 10. NMI values of the proposed CSE on LFR networks under different values of parameter β, averaged over 30 networks with the same parameter setting. (a) NMI values on networks with N = 5000, min_c = 10, max_c = 50; (b) NMI values on networks with N = 5000, min_c = 20, max_c = 100; (c) NMI values on networks with N = 10000, min_c = 10, max_c = 50; (d) NMI values on networks with N = 10000, min_c = 20, max_c = 100. [Plots: x-axis parameter β from 0 to 0.1; y-axis NMI from 0 to 1; one curve for each of µ = 0.2, 0.4, 0.6, and 0.8.]
Fig. 11. NMI values of the proposed CSE on the six real-world networks under different settings of parameter β. [Plot: x-axis parameter β from 0 to 0.1; y-axis NMI from 0 to 1; one curve for each of Karate, Dolphin, Football, Polbooks, Yeast_1, and Yeast_2.]
Biochimica Et Biophysica Acta, vol. 1860, no. 11, pp. 2735–2739, 2016.
[2] X. Zeng, X. Zhang, and Q. Zou, “Integrative approaches for predicting
microrna function and prioritizing disease-related microrna using biolog-
ical interaction networks,” Briefings in Bioinformatics, vol. 17, no. 2, pp.
193–203, 2016.
[3] K. Kandhway and J. Kuri, “Using node centrality and optimal control to
maximize information diffusion in social networks,” IEEE Transactions
on Systems, Man and Cybernetics: Systems, vol. 47, no. 7, pp. 1099–
1110, 2017.
[4] S. Jaroszewicz and A. Wierzbicki, “Verifying social network models of
Wikipedia knowledge community,” Information Sciences, vol. 339, pp.
158–174, 2016.
[5] J. H. Dalton, M. J. Elias, and A. Wandersman, Community psychology:
linking individuals and communities. Belmont, California: Thomson
Wadsworth, 2007.
[6] A. Cheng, Y. Chen, Y. Huang, W. H. Hsu, and H. M. Liao, “Personalized
travel recommendation by mining people attributes from community-
contributed photos,” in Proceedings of the 19th ACM International
Conference on Multimedia, 2011, pp. 83–92.
[7] M. Kárný and R. Herzallah, “Scalable harmonization of complex
networks with local adaptive controllers,” IEEE Transactions on Systems,
Man and Cybernetics: Systems, vol. 47, no. 3, pp. 394–404, 2017.
[8] S. Deng, L. Huang, J. Taheri, J. Yin, M. C. Zhou, and A. Y. Zomaya,
“Mobility-aware service composition in mobile communities,” IEEE
Transactions on Systems, Man and Cybernetics: Systems, vol. 47, no. 3,
pp. 555–568, 2016.
[9] Z. Wang, D. Zhang, X. Zhou, D. Yang, Z. Yu, and Z. Yu, “Discovering
and profiling overlapping communities in location-based social net-
works,” IEEE Transactions on Systems, Man and Cybernetics: Systems,
vol. 44, no. 4, pp. 499–509, 2014.
[10] Z. Bu, C. Zhang, Z. Xia, and J. Wang, “A fast parallel modularity
optimization algorithm (FPMQA) for community detection in online
social network,” Knowledge Based Systems, vol. 50, no. 3, pp. 246–
259, 2013.
[11] M. Girvan and M. E. Newman, “Community structure in social and
biological networks,” Proceedings of the National Academy of Sciences,
vol. 99, no. 12, pp. 7821–7826, 2002.
[12] C. Conaco and K. S. Kosik, “Functionalization of a protosynaptic gene
expression network.” Proceedings of the National Academy of Sciences,
vol. 109, no. Supplement1, pp. 10 612–10 618, 2012.
[13] Z. Lu, X. Sun, Y. Wen, G. Cao, and T. L. Porta, “Algorithms and
applications for community detection in weighted networks,” IEEE
Transactions on Parallel and Distributed Systems, vol. 26, no. 11, pp.
2916–2926, 2015.
[14] P. Hui, J. Crowcroft, and E. Yoneki, “BUBBLE rap: social-based
forwarding in delay-tolerant networks,” IEEE Transactions on Mobile
Computing, vol. 10, no. 11, pp. 1576–1589, 2010.
[15] S. Fortunato, “Community detection in graphs,” Physics Reports, vol.
486, no. 3, pp. 75–174, 2010.
[16] X. Zhang, C. Wang, Y. Su, L. Pan, and H.-F. Zhang, “A fast overlapping
community detection algorithm based on weak cliques for large-scale
networks,” IEEE Transactions on Computational Social Systems, vol. 4,
pp. 218–230, 2017.
[17] A. Pothen, Graph partitioning algorithms with applications to scientific
computing. Norfolk, Virginia, USA: Springer Netherlands, 1997.
[18] B. W. Kernighan and S. Lin, “An efficient heuristic procedure for
partitioning graphs,” The Bell System Technical Journal, vol. 49, no. 2,
pp. 291–307, 1970.
[19] E. R. Barnes, “An algorithm for partitioning the nodes of a graph,” SIAM
Journal on Algebraic Discrete Methods, vol. 3, no. 4, pp. 541–550, 1982.
[20] M. E. Newman and M. Girvan, “Finding and evaluating community
structure in networks,” Physical Review E, vol. 69, no. 2, p. 026113,
2004.
[21] W. E. Donath and A. J. Hoffman, “Lower bounds for the partitioning
of graphs,” IBM Journal of Research and Development, vol. 17, no. 5,
pp. 420–425, 1973.
[22] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 22,
no. 8, pp. 888–905, 2000.
[23] A. Y. Ng, M. I. Jordan, and Y. Weiss, “On spectral clustering: analysis
and an algorithm,” in Proceedings of the 14th International Conference
on Neural Information Processing Systems: Natural and Synthetic, 2001,
pp. 849–856.
[24] A. Mahmood and M. Small, “Subspace based network community
detection using sparse linear coding,” IEEE Transactions on Knowledge
and Data Engineering, vol. 28, no. 3, pp. 801–812, 2016.
[25] M. E. Newman, “Fast algorithm for detecting community structure in
networks,” Physical Review E, vol. 69, no. 6, p. 066133, 2004.
[26] U. Brandes, D. Delling, M. Gaertler, R. Görke, M. Hoefer, Z. Nikoloski,
and D. Wagner, “On modularity: NP-completeness and beyond,” Faculty
of Informatics, Univ. Karlsruhe, Wagner, Tech. Rep. 2006-19, 2006.
[27] C. Mu, J. Xie, Y. Liu, F. Chen, Y. Liu, and L. Jiao, “Memetic algorithm
with simulated annealing strategy and tightness greedy optimization for
community detection in networks,” Applied Soft Computing, vol. 34, pp.
485–501, 2015.
[28] C. Pizzuti, “GA-Net: a genetic algorithm for community detection in
social networks,” in Proceedings of 2008 International Conference on
Parallel Problem Solving from Nature, 2008, pp. 1081–1090.
[29] R. Shang, J. Bai, L. Jiao, and C. Jin, “Community detection based on
modularity and an improved genetic algorithm,” Physica A, vol. 392,
no. 5, pp. 1215–1231, 2013.
[30] V. D. Blondel, J. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast
unfolding of communities in large networks,” Journal of Statistical
Mechanics: Theory and Experiment, vol. 2008, no. 10, p. P10008, 2008.
[31] U. N. Raghavan, R. Albert, and S. Kumara, “Near linear time algorithm
to detect community structures in large-scale networks,” Physical Review
E, vol. 76, no. 3, p. 036106, 2007.
[32] M. J. Barber and J. W. Clark, “Detecting network communities by
propagating labels under constraints,” Physical Review E, vol. 80, no. 2,
p. 026129, 2009.
[33] Z. Lin, X. Zheng, N. Xin, and D. Chen, “CK-LPA: efficient community
detection algorithm based on label propagation with community kernel,”
Physica A, vol. 416, pp. 386–399, 2014.
[34] R. Lambiotte, J. C. Delvenne, and M. Barahona, “Random walks,
markov processes and the multiscale modular organization of complex
networks,” IEEE Transactions on Network Science and Engineering,
vol. 1, no. 2, pp. 76–90, 2014.
[35] P. Pons and M. Latapy, “Computing communities in large networks
using random walks,” in International Symposium on Computer and
Information Sciences, Istanbul, Turkey, 2005, pp. 284–293.
[36] Q. Chen, T. Wu, and M. Fang, “Detecting local community structures
in complex networks based on local degree central nodes,” Physica A,
vol. 392, no. 3, pp. 529–537, 2013.
[37] C. Liu, J. Liu, and Z. Jiang, “A multiobjective evolutionary algorithm
based on similarity for community detection from signed social net-
works,” IEEE Transactions on Cybernetics, vol. 44, no. 12, pp. 2274–
2287, 2014.
[38] M. Gong, Q. Cai, X. Chen, and L. Ma, “Complex network clustering by
multiobjective discrete particle swarm optimization based on decompo-
sition,” IEEE Transactions on Evolutionary Computation, vol. 18, no. 1,
pp. 82–97, 2014.
[39] X. Zhang, K. Zhou, H. Pan, L. Zhang, X. Zeng, and Y. Jin, “A network
reduction based multi-objective evolutionary algorithm for community
detection in large-scale complex networks,” IEEE Transactions on
Cybernetics, 2018, in press.
[40] Z. Y. Zhang, Y. Wang, and Y. Y. Ahn, “Overlapping community detection
in complex networks using symmetric binary matrix factorization,”
Physical Review E, vol. 87, no. 6-1, p. 062803, 2013.
[41] U. Brandes, M. Gaertler, and D. Wagner, “Experiments on graph clus-
tering algorithms,” in Proceedings of the Eleventh European Symposium
on Algorithms, Heidelberg, Germany, 2003, pp. 568–579.
[42] P. Holme, M. Huss, and H. Jeong, “Subnetwork hierarchies of biochem-
ical pathways,” Bioinformatics, vol. 19, no. 4, p. 532, 2003.
[43] M. Burgess, E. Adar, and M. Cafarella, “Link-prediction enhanced
consensus clustering for complex networks,” Plos One, vol. 11, no. 5,
p. e0153384, 2016.
[44] H. M. Cheng, Y. Z. Ning, Z. Yin, C. Yan, X. Liu, and Z. Y. Zhang,
“Community detection in complex networks using link prediction,”
Modern Physics Letters B, vol. 32, no. 3, p. 1850004, 2016.
[45] Z. Zhang, K. Sun, and S. Wang, “Enhanced community structure
detection in complex networks with partial background information,”
Scientific Reports, vol. 3, no. 11, p. 3241, 2013.
[46] L. Yang, D. Jin, X. Wang, and X. Cao, “Active link selection for efficient
semi-supervised community detection,” Scientific Reports, vol. 5, p.
9039, 2015.
[47] P. Jaccard, “Etude de la distribution florale dans une portion des alpes
et du jura,” Bulletin De La Societe Vaudoise Des Sciences Naturelles,
vol. 37, no. 142, pp. 547–579, 1901.
[48] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, “Density-based spatial
clustering of applications with noise,” in Proceedings of the Second
International Conference on Knowledge Discovery and Data Mining,
1996, pp. 226–231.
[49] A. Lancichinetti, S. Fortunato, and F. Radicchi, “Benchmark graphs for
testing community detection algorithms.” Physical Review E, vol. 78,
no. 2, p. 046110, 2008.
[50] W. W. Zachary, “An information flow model for conflict and fission in
small groups,” Journal of Anthropological Research, vol. 33, no. 4, p.
473, 1977.
[51] D. Lusseau, “The emergent properties of a dolphin social network,”
Proceedings of the Royal Society of London Series B, vol. 270, no.
Suppl 2, pp. 186–188, 2003.
[52] M. E. J. Newman, “Modularity and community structure in networks,”
Proceedings of the National Academy of Sciences, vol. 103, no. 23, pp.
8577–82, 2006.
[53] A. C. Gavin, P. Aloy, P. Grandi, R. Krause, M. Boesche, M. Marzioch,
C. Rau, L. J. Jensen, S. Bastuck, and B. Dümpelfeld, “Proteome survey
reveals modularity of the yeast cell machinery,” Nature, vol. 440, no.
7084, pp. 631–636, 2006.
[54] D. Bu, Y. Zhao, L. Cai, H. Xue, X. Zhu, H. Lu, J. Zhang, S. Sun,
L. Ling, N. Zhang, G. Li, and R. Chen, “Topological structure analysis
of the protein-protein interaction network in budding yeast,” Nucleic
Acids Research, vol. 31, no. 9, pp. 2443–2450, 2003.
[55] J. P. Bagrow, “Evaluating local community methods in networks,”
Journal of Statistical Mechanics: Theory and Experiment, vol. 5, p.
P05001, 2008.
[56] Y. Su, B. Wang, and X. Zhang, “A seed-expanding method based on
random walks for community detection in networks with ambiguous
community structures,” Scientific Reports, vol. 7, p. 41830, 2017.
[57] Z. Ding, X. Zhang, D. Sun, and B. Luo, “Overlapping community
detection based on network decomposition,” Scientific Reports, vol. 6,
p. 24115, 2016.
[58] L. Zhang, H. Pan, Y. Su, X. Zhang, and Y. Niu, “A mixed representation-
based multiobjective evolutionary algorithm for overlapping community
detection,” IEEE Transactions on Cybernetics, vol. 47, no. 9, pp. 2703–
2716, 2017.
[59] J. Ying, S. Zhang, N. Ding, X. Zeng, and X. Zhang, “Complex
network clustering by a multi-objective evolutionary algorithm based
on decomposition and membrane structure,” Scientific Reports, vol. 6,
p. 33870, 2016.
Yansen Su received the B.Sc. degree from Tangshan
Normal University, Tangshan, China, in 2007, the
M.Sc. degree from Shandong University of Science
and Technology, Qingdao, China, in 2010, and the
Ph.D. degree from Huazhong University of Science
and Technology, Wuhan, China, in 2014. She is
currently an Associate Professor in the School of
Computer Science and Technology, Anhui Univer-
sity, Hefei, China. Her main research interests in-
clude complex networks, computational biology, and
multi-objective optimization.
Chunlong Liu received the B.Sc. degree from
Fuyang Normal College, Fuyang, China, in 2015.
He is currently pursuing the master's degree with the
School of Computer Science and Technology, Anhui
University, Hefei. His current research interest is
complex network clustering.
Yunyun Niu received the B.S. degree in applied
mathematics from Qufu Normal University, Jining,
China, in 2004, the M.S. degree in electric ma-
chines and electric apparatus from the Zhengzhou
University of Light Industry, Zhengzhou, China, in
2007, and the Ph.D. degree in systems analysis and
integration from the Huazhong University of Science
and Technology, Wuhan, China, in 2012. She is
currently an Associate Professor with the School
of Information Engineering, China University of
Geosciences, Beijing, China. Her current research
interests include membrane computing and artificial intelligence.
Fan Cheng received the B.Sc. degree in 2000 and
the M.Sc. degree in 2003 from Hefei University of
Technology, and the Ph.D. degree in 2012 from the
University of Science and Technology of China. He
is currently an Associate Professor with the School
of Computer Science and Technology, Anhui University,
China. His main research interests include machine
learning, multi-objective optimization, and complex networks.
Xingyi Zhang (M’16-SM’18) received the B.Sc. de-
gree from Fuyang Normal College, Fuyang, China,
in 2003, and the M.Sc. and Ph.D. degrees from
Huazhong University of Science and Technology,
Wuhan, China, in 2006 and 2009, respectively. He
is currently a Professor with the Key Lab of In-
telligent Computing and Signal Processing of Min-
istry of Education, School of Computer Science
and Technology, Anhui University, Hefei, China.
His current research interests include unconventional
models and algorithms of computation, evolutionary
multiobjective optimization, and complex network analysis. He is the recipient
of the 2017 IEEE Transactions on Evolutionary Computation Outstanding
Paper Award.