
Detection of Users’ Abnormal Behavior on Social Networks

March 2020


DOI:10.1007/978-3-030-44041-1_55

In book: Advanced Information Networking and Applications (pp.617-629)

Authors:

Nour El Houda Ben Chaabene

Sorbonne Université

Amel Bouzeghoub

Ramzi Guetari

University of Tunis El Manar


Detection of Users’ Abnormal Behavior on Social Networks

Nour El Houda Ben Chaabene1,2, Amel Bouzeghoub1, Ramzi Guetari3, Samar Balti3, and Henda Hajjami Ben Ghezala2

1 SAMOVAR, Telecom SudParis, 19 Place Marguerite Perey, 91120 Palaiseau, France
{Nourelhouda.Benchaabene,Amel.Bouzeghoub}@telecom-sudparis.eu

2 RIADI, National School of Computer Science, Campus Universitaire de la Manouba, 2010 Manouba, Tunisia
{Nourelhouda.Benchaabene,Henda.Benghezala}@ensi.rnu.tn

3 LIMTIC, Higher Institute of Computer Science, 2 Rue Abou Rayhane Bayrouni, 2080 Ariana, Tunisia
Ramzi.guetari@isi.utm.tn, Samarbaltia@gmail.com

Abstract. In just a few years, social networking sites have become the most popular landmarks on the Internet. They have revolutionized the way we communicate and socialized the Web. However, while their impact is now impossible to deny, it takes a variety of forms, not all of which are positive. As a result, the detection of anomalies on social networks has attracted researchers since the 2000s and remains a topic of current research. This problem is of crucial importance for preventing abnormal activities. So far, existing works have been devoted almost exclusively to one-dimensional networks. Our approach provides a new anomaly detection method based on examining relationships between Online Social Network (OSN) users using multidimensional networks.

1 Introduction

The term “social network” is ambiguous. In the strict sense, a social network refers to the different relationships that people have with one another and the way in which these relationships are structured; these relationships help us to understand the behavior of individuals. Today, however, people commonly use the term to designate an application dedicated to communication or, more specifically, a social networking service that lets users stay in touch over the Internet with their families, friends or coworkers and meet new people. An Online Social Network (OSN) is an asset for those who want to learn, broaden their culture, discover new areas or communicate. However, OSNs expose their users to a multitude of dangers, especially young users, who are the most vulnerable because they lack the hindsight or the experience to recognize a risky situation or potentially harmful content.

Springer Nature Switzerland AG 2020

L. Barolli et al. (Eds.): AINA 2020, AISC 1151, pp. 617–629, 2020.

https://doi.org/10.1007/978-3-030-44041-1_55

618 N. E. H. Ben Chaabene et al.

Several methods have been proposed in the literature to solve the problem of anomaly detection in OSNs. Two main families of techniques are discussed: behavior-based anomaly detection and structure-based anomaly detection. The behavioral approach is based on the analysis of the user’s behavior; it deals with how interactions occur between a user and the other users in the system. In contrast, the structural approach focuses primarily on particular types of structure in the social network graph. According to [1], the structural properties of a graph underline the importance of the structural approach relative to the behavioral approach. Several works have addressed the problem of detecting anomalies in OSNs by using monodimensional graphs, where the nodes are interconnected by a single type of link. Given the evolution and synchronization of social networks, however, two users can be related on several social networks at the same time. In this context, representing the links between users by a multidimensional graph has three major advantages: (1) it clarifies the type of link between users given the increase in the communication rate, (2) it reduces information loss and (3) it collects more information about each user.

The rest of the paper is organized as follows: Sect. 2 presents a literature review of anomaly detection in OSNs. Section 3 describes our approach to detecting abnormal behaviors by examining users’ interactions on a multidimensional network. Section 4 presents an evaluation of the results obtained, before we conclude in Sect. 5.

2 State of the Art

Abnormal activities in social networks are deviations from usual and legal activities. The detection of abnormal behavior is defined in the literature as the detection of surprising data: a situation in which a point is placed in class B but in reality belongs to class A [1–6]. Hawkins [7] defined an anomaly as an observation that deviates so much from other observations that it arouses the suspicion that it was generated by a different mechanism. Kaur and Singh [8] considered anomaly detection as analogous to novelty detection, in which new patterns are observed in the data. Researchers have examined a significant number of solutions for detecting anomalies in OSNs, and the methods fall into two groups of techniques: behavior-based techniques [9–12] and structure-based techniques [13–18]. In structure-based approaches, research has focused on particular types of structure that abnormal activities create in the social network graph. The results of this research have shown the important role of structural properties in relation to the properties extracted from the user’s behavior [1], as well as the efficiency of the features collected from the graph topology [14]. In this context, we are only interested in works based on the exploitation of graphs, because topological features of social networks are sufficient to detect abnormal users. In this section, we summarize some of the solutions developed for the analysis, detection and prediction of users’ behaviors on social networks.

In Akoglu et al. [16], the OddBall algorithm provides a fast, unsupervised method for detecting anomalous nodes in weighted graphs by specifying the rules to check before classifying a node as an anomaly. This algorithm detects the deviation of abnormal behavior from a known normal behavior. The question that arises is: what is a known normal behavior? A behavior that is normal in 2019 may not have been normal in 1970. Because the same behavior changes over time, and because the graphs on which OddBall was tested are not time-evolving graphs, the effectiveness of OddBall in such settings has not been demonstrated.

Hassanzadeh et al. [17] proposed a new framework based on the calculation of a number of graph measures (ego, egonet, super egonet, centrality, community, etc.). This framework first detects the general pattern followed by most nodes, then computes an outlier score for each node based on its distance from the fitting line in order to single out the users who may be abnormal, and finally computes a threshold that minimizes the number of false negatives and the false positive rate. This work exploits the fact that social networks have a community structure: the majority of users belong to a small number of communities, whereas users with abnormal behavior tend to establish random relationships with users belonging to different communities. This method has been applied to static datasets from online social networks.

Rezaei et al. [18] proposed a methodology based on the calculation of graph metrics. This methodology uses the OddBall anomaly measurement formula [16], which is an effective way to detect abnormal behavior in OSNs. This work tagged 100 nodes as having a high probability of following an abnormal pattern. The results obtained indicated that abnormal users have a small number of mutual friends with their friends. However, the method takes no account of time and therefore lacks dynamism.

Fire et al. [14] suggested an algorithm for the detection of spammers and fake profiles in social networks. This algorithm takes into consideration the communities built from the relationships between users. It assumes that abnormal users connect randomly to other users belonging to different communities. Is it enough for this solution to rely solely on the analysis of the topology of social networks? The algorithm was evaluated by running it on different static social network graphs.

Zheleva et al. [15] presented a framework for predicting the type of relationships between users of social networks. This work combines social network links and affiliation links to instantiate friendship and family sub-graphs, whose purpose is to identify dynamic anomalies and anticipate future events. Results from three social media sites validated the effectiveness of the proposed framework. This method can only be applied to datasets where the links to groups are predefined.

Chen et al. [26] developed an algorithm for anomaly detection based on building communities in a dynamic network. This evolutionary algorithm showed the effectiveness of communities in dynamic networks compared to a non-representative algorithm on static networks. In this work, communities are the most important metric extracted from a graph. This approach allowed the detection of six possible types of community-based anomalies in evolutionary networks.


The previous works were developed with a common objective: the detection of anomalies in social network graphs. We can classify these methods into four subclasses: (1) methods based on static graphs without the construction of communities [16,18], (2) methods based on static graphs with the construction of communities [14,17], (3) methods based on dynamic graphs without the construction of communities [15] and (4) methods based on dynamic graphs with the construction of communities [26]. The evolving structure of social networks and the drift of user behavior over time call for a method that computes communities on a dynamic graph. In the literature, only the work in [26] meets both of these requirements.

All the works analyzed above studied information networks in the one-dimensional context, where the nodes are interconnected by a single type of link. Actual data, however, are better represented by complex information networks. Collapsing the complexity of interactions into one type of link reduces the richness of communication and leads to a considerable loss of information. Chouchane and Bouguessa [19] likewise focused on the problem of detecting atypical nodes. Their work is based on a particular type of graph: a multidimensional graph. The proposed method uses a static multidimensional graph where the nodes represent the users and the edges represent the relations between the users in the different dimensions of the graph. The solution rests on the hypothesis that an anomaly is a node sparsely connected to the other nodes of the network in all dimensions. As a result, nodes that share common neighbors receive a high AS(u) score, in contrast to randomly connected nodes that do not share enough neighbors with the rest of the network. Thus, nodes with atypical connections have low scores compared to densely connected nodes. To avoid having to choose by hand which low scores count as anomalous, the Beta distribution [19,20] is used to classify anomalies automatically. This method has two limitations. First, because it assumes that an anomaly is a node with sparse connections that does not belong to any dense region in any dimension of the multidimensional network, it may flag a node that is isolated simply because it is new on the network and does not yet have many links. Second, it deals with static networks and does not take into consideration the community structure of the network.

The need for anomaly detection, together with the limited amount of work on multidimensional networks [19], motivated us to develop a method that addresses the problem of anomaly detection in such networks.

3 Anomaly Detection Method

Nowadays, social networks are multiplying and are sometimes difficult to manage. The same user can have multiple accounts on different social networks, which makes synchronization between these accounts necessary. For example, synchronization allows the user to publish a photo simultaneously on Instagram and Facebook in one click. Little work has been done in the field of anomaly detection on multidimensional networks, and this scarcity of papers reflects the difficulty of the field. In this section, we present a method for detecting atypical nodes based on the analysis of the topology of a multidimensional graph.

3.1 Notation

In our approach, we follow the notation used in [21] to analyze the structure of a multidimensional graph. An undirected multigraph G is defined by the triplet (V, E, D), where V is a set of nodes, E is a set of edges, and D is a set of dimensions. An edge e ∈ E is a triplet (u, v, d), where u, v ∈ V are nodes and d ∈ D = {Twitter, Facebook, Instagram, ...} is a dimension. The triplet (u, v, d) specifies that nodes u and v are connected by an edge that belongs to dimension d. Figure 1 shows an example of a multidimensional network.

Fig. 1. Example of a multidimensional network
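As an illustration of this notation, the following minimal Python sketch stores a multidimensional graph as one adjacency map per dimension. The class name `MultiGraph` and its method names are hypothetical and introduced here only for illustration; they are not taken from [21].

```python
from collections import defaultdict


class MultiGraph:
    """Undirected multidimensional graph G = (V, E, D) (illustrative sketch)."""

    def __init__(self):
        self.nodes = set()        # V
        self.dimensions = set()   # D
        # adjacency[d][u] is the set of neighbours of u in dimension d
        self.adjacency = defaultdict(lambda: defaultdict(set))

    def add_edge(self, u, v, d):
        """Add the undirected edge (u, v, d) to E."""
        self.nodes.update((u, v))
        self.dimensions.add(d)
        self.adjacency[d][u].add(v)
        self.adjacency[d][v].add(u)

    def neighbors(self, u, d):
        """Neighbours of u in dimension d (empty set if u is absent from d)."""
        # .get avoids creating empty entries in the defaultdicts
        return set(self.adjacency.get(d, {}).get(u, set()))

    def egonet(self, u, d):
        """Node set of the egonet of u in dimension d: u and its direct neighbours."""
        return {u} | self.neighbors(u, d)
```

With this layout, each dimension can be analyzed on its own, which is what the three phases described next assume.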

Local graph properties are used to help us detect atypical nodes. These properties concern a single node (an ego) and its first-level neighborhood (an egonet). Our approach operates in three phases: (1) detection of communities in the different dimensions of the graph, (2) estimation of an anomaly score for each node and (3) automatic classification of the estimated anomaly scores via the Beta distribution.

3.2 Phase 1: Detection of Communities in Diﬀerent Dimensions

The goal of community detection is to cluster the nodes of the graph into groups that share common characteristics. This notion applies naturally to online social networks [22]: users tend to form communities based on their preferences and common interests. A range of techniques has been presented in various works to address this general problem. An interesting work [17] showed that the contribution of community detection lies in the usefulness of the information extracted from the structure of the communities formed. This information facilitates the analysis of a user’s behavior and allows the identification of abnormal behavior. In [16], the authors stated that egonet(u) forms a community with egonet(v) if at least half of the nodes of the smaller egonet connect to the other egonet. Applying Eq. (1) [17] allows us to calculate the communities of a graph.

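\[
\mathrm{Com}(u, v) =
\begin{cases}
1, & \text{if } \mathrm{degree}(u, v)_{\mathrm{norm}} \ge \min(|u|, |v|)/2 \\
0, & \text{otherwise}
\end{cases}
\tag{1}
\]

knowing that:

• Equation (2): Normalized external degree

\[
\mathrm{degree}(u, v)_{\mathrm{norm}} = \frac{\mathrm{degree}(u, v)}{\min(|u|, |v|)}
\tag{2}
\]

• Equation (3): External degree of egonet(u) to egonet(v)

\[
\mathrm{degree}(u, v) = \left| V_{\mathrm{egonet}(u)} \cap V_{\mathrm{egonet}(v)} \right| + \left| \{ uv \in E : u \in V_{\mathrm{egonet}(u)},\ v \in V_{\mathrm{egonet}(v)} \} \right|, \quad u, v \in G
\tag{3}
\]

where $V_{\mathrm{egonet}(u)}$ is the set of nodes of egonet(u) and $V_{\mathrm{egonet}(v)}$ is the set of nodes of egonet(v).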
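The community test of Eqs. (1) to (3) can be sketched in Python for a single dimension as follows. This is only one possible reading of the equations: we assume that |u| denotes the number of nodes in egonet(u), that each undirected edge between the two egonets is counted once, and that the graph has no self-loops. The function names and the `adj` adjacency dictionary are hypothetical.

```python
def egonet(adj, u):
    """Node set of egonet(u): u and its direct neighbours."""
    return {u} | adj.get(u, set())


def external_degree(adj, u, v):
    """Eq. (3): shared nodes plus edges running from egonet(u) to egonet(v)."""
    ego_u, ego_v = egonet(adj, u), egonet(adj, v)
    shared = len(ego_u & ego_v)
    # Assumption: an undirected edge is counted once, hence the frozenset
    cross = {
        frozenset((x, y))
        for x in ego_u
        for y in adj.get(x, set())
        if y in ego_v
    }
    return shared + len(cross)


def normalized_degree(adj, u, v):
    """Eq. (2): external degree divided by the size of the smaller egonet."""
    smaller = min(len(egonet(adj, u)), len(egonet(adj, v)))
    return external_degree(adj, u, v) / smaller


def com(adj, u, v):
    """Eq. (1), with the comparison taken exactly as printed."""
    smaller = min(len(egonet(adj, u)), len(egonet(adj, v)))
    return 1 if normalized_degree(adj, u, v) >= smaller / 2 else 0


# Toy dimension: a triangle a-b-c with a pendant node d, and a separate pair e-f
adj = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c"}, "e": {"f"}, "f": {"e"},
}
```

On this toy graph, `com(adj, "a", "b")` returns 1 because the two egonets overlap heavily, while `com(adj, "a", "e")` returns 0 because the egonets share no node and no edge.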

3.3 Phase 2: Anomaly Score Estimation

In this phase, we estimate an anomaly score between 0 and 1 for each node of the multidimensional network in order to decide on the nature of the user’s behavior. First, we calculate the anomaly score AS(u) of each node in each dimension $d_i$. To do so, we calculate two other scores, DE(u) and nbct(u), which specify the influence of the node u on the nodes belonging to its community. Finally, a total anomaly score AST(u) is estimated.

• Step 1: We start by calculating the anomaly score of each node in each dimension of our multidimensional network. The score can take the value 0, 0.5 or 1 and is attributed according to the influence of the node on its community (see Eq. (4)).

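\[
AS(u)_{d_i} =
\begin{cases}
1, & \text{if } u \in \mathrm{Com} \text{ and } u \text{ influences the construction of the community} \\
0.5, & \text{if } u \in \mathrm{Com} \text{ and } u \text{ does not influence the construction of the community} \\
0, & \text{if } u \notin \mathrm{Com} \text{ and } u \in d_i
\end{cases}
\tag{4}
\]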

A question that arises in the first case is: how can a node strongly influence the construction of the community? The answer requires the calculation of two scores. The first score, DE(u), denotes the distance between ego(u) and ego(v), and the second score, nbct(u), represents the total number of direct links from egonet(u) to egonet(v) relative to the size of the community. Equations (5), (6) and (7) show how the three scores DE(u), nbct(u) and nbc(u) are calculated in each dimension.

\[
DE(u)_{d_i} = (\text{number of possible outgoing links from node } u \text{ to all the nodes of } \mathrm{Com}(u)) - (\text{number of outgoing links from node } u \text{ to its neighbors})
\tag{5}
\]

\[
nbct(u)_{d_i} = \frac{nbc(u)_{d_i}}{\text{number of nodes that form } \mathrm{Com}(u)}
\tag{6}
\]

knowing that:

\[
nbc(u)_{d_i} = \text{number of direct links from } \mathrm{egonet}(u) \text{ to } \mathrm{egonet}(v)
\tag{7}
\]

The comparison of the two scores DE(u) and nbct(u) allows us to deduce the degree of influence of node u on the nodes of its community. If DE(u) is greater than or equal to nbct(u), then node u has a relation of average degree with the nodes that belong to its community. If DE(u) is lower than nbct(u), then node u is strongly connected to the nodes that form its community, as expressed in Eq. (8).

\[
AS(u)_{d_i} =
\begin{cases}
1, & \text{if } u \in \mathrm{Com} \text{ and } DE(u) < nbct(u) \\
0.5, & \text{if } u \in \mathrm{Com} \text{ and } DE(u) \ge nbct(u) \\
0, & \text{otherwise}
\end{cases}
\tag{8}
\]

• Step 2: The total anomaly score of each node u is computed from the sum of the anomaly scores of node u in each domain and the number of domains in which node u exists. Equation (9) gives the formula for the total anomaly score of each node.

\[
AST(u) = \frac{\sum_{d_i} AS(u)_{d_i}}{\text{number of domains where } u \text{ exists}}
\tag{9}
\]
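The scoring rules of Eqs. (5), (6), (8) and (9) can be sketched in Python as below. The function names are hypothetical, and the link counts are taken as inputs rather than derived from a graph, since how "possible outgoing links" are counted is not spelled out above.

```python
def de_score(possible_links, actual_links):
    """Eq. (5): possible links to the community minus actual links to neighbours."""
    return possible_links - actual_links


def nbct_score(nbc, community_size):
    """Eq. (6): direct links between the egonets, divided by the community size."""
    return nbc / community_size


def as_score(in_community, de, nbct):
    """Eq. (8): per-dimension anomaly score of a node."""
    if not in_community:
        return 0.0
    return 1.0 if de < nbct else 0.5


def total_score(scores_by_dimension):
    """Eq. (9): mean of the per-dimension scores over the dimensions where u exists."""
    return sum(scores_by_dimension.values()) / len(scores_by_dimension)


# Hypothetical node present in three dimensions
example = {"Facebook": 1.0, "Twitter": 0.5, "Instagram": 0.0}
ast = total_score(example)
```

Only the dimensions in which the node actually exists should appear in the dictionary passed to `total_score`, so that the denominator matches Eq. (9).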

3.4 Phase 3: Automatic Detection of Abnormal Behavior

In the literature, there are two frequent solutions to this classification problem. The first is to sort the AST(u) scores in increasing order and select the first or last k nodes. However, the main difficulty of this solution is choosing a value of k that is appropriate for all datasets, and a poor choice leads to errors. The second is to set a separation threshold on the AST(u) scores, but identifying a threshold that is appropriate for all types of data is a difficult task. To remedy this, we use a mixture model of Beta distributions, which is an effective way to solve this type of problem [23–25]. It is then sufficient to determine the conditions under which the mixture model is applied in order to recognize the nature of the behavior.

The Beta distribution is adaptable and flexible enough to model complex and variable situations, unlike other statistical distributions [27]. For example, the Gaussian distribution can only model symmetric modes, which may result in a less adequate model of the data [28]. The Beta distribution can also take various shapes, such as a U shape, an L shape, or a straight line [27], which shows its strong adaptability for accurate modeling of our anomaly scores.


The automatic anomaly detection phase requires two algorithms: (1) an algorithm that estimates the optimal number of components (p) of the Beta mixture and (2) an algorithm that automatically identifies abnormal nodes [19]. These two algorithms can be summed up in three steps: (1) the estimation of the parameters of a component, (2) the application of the EM (Expectation Maximization) algorithm to the Beta distribution and (3) the estimation of the optimal number of components.

3.4.1 Introduction to the Beta Distribution Model

In probability theory and statistics, the Beta distribution is a family of continuous probability distributions defined on the interval [0, 1] and parametrized by two shape parameters, typically denoted α and β. In our approach, the AST(u) scores estimated in the previous phase lie between 0 and 1. These scores are assumed to follow a Beta distribution, AST ∼ Be(α; β), whose probability density is given by Eq. (10):

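\[
B(AST) = \frac{1}{Be(\alpha; \beta)} \, AST^{\alpha - 1} (1 - AST)^{\beta - 1}
\tag{10}
\]

where the Beta function Be(α; β) and the Γ function are defined by Eqs. (11) and (12), respectively.

\[
Be(\alpha; \beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha + \beta)}
\tag{11}
\]

\[
\Gamma : z \mapsto \int_{0}^{+\infty} t^{z-1} e^{-t} \, dt
\tag{12}
\]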
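For readers who want to evaluate the density of Eq. (10) numerically, a minimal Python sketch is given below. It relies on `math.lgamma` from the standard library to compute the Beta function of Eq. (11) through the logarithm of Γ; the function names are hypothetical, and the sketch assumes scores strictly between 0 and 1.

```python
import math


def beta_function(alpha, beta):
    """Eq. (11), computed through log-gamma to avoid overflow for large arguments."""
    return math.exp(math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta))


def beta_pdf(x, alpha, beta):
    """Eq. (10): Beta density at a score x, assumed to satisfy 0 < x < 1."""
    return x ** (alpha - 1) * (1 - x) ** (beta - 1) / beta_function(alpha, beta)
```

For α = β = 1 the density is uniform on the interval, and for α = β = 2 it is symmetric around 0.5, which gives two quick sanity checks for the implementation.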

3.4.2 Estimation of the Parameters of a Component

To estimate the α and β parameters of a component of the Beta distribution, we calculate the empirical mean $\bar{x}$ (see Eq. (13)) of our sample (N: the number of nodes in the graph) and the variance v (see Eq. (14)) of a component.

\[
\bar{x} = \frac{1}{N} \sum_{i=1}^{N} AST_i
\tag{13}
\]

\[
v = \frac{1}{N} \sum_{i=1}^{N} (AST_i - \bar{x})^2
\tag{14}
\]

The estimates of α and β are calculated by Eqs. (15) and (16), respectively.

\[
\alpha = \bar{x} \left( \frac{\bar{x}(1 - \bar{x})}{v} - 1 \right)
\tag{15}
\]

\[
\beta = (1 - \bar{x}) \left( \frac{\bar{x}(1 - \bar{x})}{v} - 1 \right)
\tag{16}
\]


3.4.3 Application of the EM Algorithm for Beta Distribution

As its name indicates, the maximum likelihood (ML) approach consists in maximizing the likelihood, i.e. maximizing

\[
L(\Theta, AST) = \prod_{i=1}^{N} \sum_{k=1}^{p} \beta_k B_k(AST_i; \alpha_k)
\]

or, equivalently, maximizing the log-likelihood

\[
l(\Theta, AST) = \sum_{i=1}^{N} \log \sum_{k=1}^{p} \beta_k B_k(AST_i; \alpha_k)
\]

in order to estimate the unknown parameters, with $\Theta = (\alpha_1, \beta_1, \ldots, \alpha_p, \beta_p)$ the unknown parameters of the parametric model. However, this maximization problem cannot be solved analytically because of the hidden data. Solutions must be found using iterative algorithms, one of which is the EM algorithm [29].

This algorithm provides an estimator when the solution cannot be computed directly because of hidden or missing data, that is, when knowledge of these data would make it possible to estimate the parameters. The EM algorithm takes its name from the two distinct steps it performs at each iteration:

(i) the “Expectation” phase, often referred to as the “E step”, estimates the unknown data, given the observed data and the value of the parameters determined at the previous iteration;

(ii) the “Maximization” phase, or “M step”, then maximizes the likelihood, which is now possible thanks to the estimate of the unknown data from the previous step, and updates the value of the parameter(s) for the next iteration.

The algorithm ensures that the likelihood does not decrease from one iteration to the next, so the estimates improve progressively until a (possibly local) maximum is reached.
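The two steps described above can be sketched in Python for a mixture of Beta components. This is a simplified illustration under stated assumptions, not the implementation used in our experiments: the M step is replaced by weighted moment matching, because the exact maximization for Beta parameters has no closed form, so the monotonicity guarantee of exact EM need not hold strictly here. The function names, the fixed iteration count, the clamping constant `eps` and the variance floor are all hypothetical choices, and the sketch has no safeguard against numerical underflow.

```python
import math


def beta_pdf(x, alpha, beta):
    """Beta density, evaluated in log space for numerical stability."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1) * math.log(x) + (beta - 1) * math.log(1 - x))


def log_likelihood(xs, params, weights):
    """Log-likelihood of the scores under the current mixture."""
    return sum(
        math.log(sum(w * beta_pdf(x, a, b) for w, (a, b) in zip(weights, params)))
        for x in xs
    )


def em_beta_mixture(scores, params, weights, iterations=50, eps=1e-6):
    """params: list of (alpha, beta) pairs; weights: mixing proportions summing to 1."""
    # Assumption: clamp scores into (0, 1) so that both logarithms are defined
    xs = [min(max(s, eps), 1 - eps) for s in scores]
    for _ in range(iterations):
        # E step: responsibility of each component for each score
        resp = []
        for x in xs:
            dens = [w * beta_pdf(x, a, b) for w, (a, b) in zip(weights, params)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # Approximate M step: weighted mean and variance, then moment matching
        new_params, new_weights = [], []
        for k in range(len(params)):
            r = [row[k] for row in resp]
            mass = sum(r)
            mean = sum(ri * x for ri, x in zip(r, xs)) / mass
            var = sum(ri * (x - mean) ** 2 for ri, x in zip(r, xs)) / mass
            var = max(var, 1e-8)  # floor to avoid division by zero
            common = mean * (1 - mean) / var - 1
            new_params.append((mean * common, (1 - mean) * common))
            new_weights.append(mass / len(xs))
        params, weights = new_params, new_weights
    return params, weights, log_likelihood(xs, params, weights)
```

Calling `em_beta_mixture(scores, [(1.0, 5.0), (5.0, 1.0)], [0.5, 0.5])` on a list of scores starts from one component concentrated near 0 and one near 1; the returned log-likelihood can then be compared across different numbers of components.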

3.4.4 Estimation of the Number of Components (p) of the Beta Distribution

To find the right model for our data, we must estimate the number of components p and the α and β parameters of each component. The number of components p varies between 1 and $p_{\max}$. For each candidate number of components, a performance metric is computed in order to identify the optimal number of components. To do this, the Bayesian Information Criterion (BIC) was used [30]. The BIC is written as follows (see Eq. (17)):

\[
BIC(p) = -2 \log(L_p) + k_p \log(N)
\tag{17}
\]

with $L_p$ the likelihood of the estimated model with p components, N the number of observations in the sample and $k_p$ the total number of estimated model parameters.
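A short Python sketch of this selection step is given below. With the BIC written as in Eq. (17), the preferred model is the one with the smallest value. The parameter count 3p − 1 (two shape parameters and one weight per component, with the weights summing to 1) is our assumption for a Beta mixture, and the function names and the log-likelihood values in the example are hypothetical.

```python
import math


def bic(log_likelihood, n_parameters, n_observations):
    """Eq. (17): BIC from the maximized log-likelihood of a fitted model."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_observations)


def select_components(log_likelihoods, n_observations):
    """log_likelihoods maps each candidate p to the log-likelihood of its fitted model."""
    # Assumption: a p-component Beta mixture has 3p - 1 free parameters
    scores = {
        p: bic(ll, 3 * p - 1, n_observations)
        for p, ll in log_likelihoods.items()
    }
    best_p = min(scores, key=scores.get)
    return best_p, scores


# Hypothetical log-likelihoods for p = 1, 2, 3 on a sample of 100 scores
best_p, bic_scores = select_components({1: -120.0, 2: -100.0, 3: -99.5}, 100)
```

In this made-up example the third component barely improves the likelihood, so the penalty term makes p = 2 the selected model.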


4 Experimentation and Evaluation

This section presents an empirical assessment of the performance of our approach on a three-dimensional network. The dimensions of the tested network are: (1) Facebook, (2) Twitter and (3) Instagram. We have no prior knowledge of the node partitioning, i.e., of whether a node belongs to a specific category (anomalous or normal). Because of this, we cannot use supervised metrics, which rely on the existence of a reference partitioning. In this context, we adopted an objective approach that consists in interpreting node deviations by manual investigation and by graphical visualization of the adjacency matrix of the network. For each node of the network, we estimated a total anomaly score. We then modeled the distribution of these scores with our probabilistic model, which relies on a mixture of Beta distributions.

The result is shown in the density curve of the anomaly scores of the studied network (see Fig. 2). The density curve illustrates the great flexibility and adaptability of the Beta mixture model in modeling distributions. In Fig. 2, the first component (where the values are close to zero) groups the lowest anomaly scores. As a result, the nodes associated with the scores grouped in this component are identified as anomalies.

For the tested network, approximately 10K nodes out of 397K were identified as nodes with atypical connections. Figure 3 presents the adjacency matrix of the three dimensions of the network, with the nodes sorted in ascending order of their anomaly scores. Having the lowest AST(u) scores, the anomalies are placed at the top of the matrix. These anomalies are sparsely connected to the network, whereas the normal nodes are closely connected and appear in the matrix as dense regions.

Fig. 2. Probability density of the estimated scores


Fig. 3. Adjacency matrix of the studied network

5 Conclusion

In this article, we have studied several methods and approaches for detecting anomalies in OSNs. The reviewed works do not account for the synchronization of user accounts across networks. In our approach, OSNs are modeled as a multidimensional graph and analyzed based on the relationships between the nodes. Defined graph metrics are calculated to estimate the anomaly score of each node, and a classification based on the Beta distribution is used to detect atypical nodes. Given the quality of the results obtained, we believe that this work provides an effective means of anomaly detection that can be applied in different practical contexts. In future research, we will explore different ways to extend this work. One possibility is the upstream analysis of a node’s behavior and the prediction of the influence of nodes with abnormal behavior on the rest of the network users.

References

1. Anand, K., Kumar, J., Anand, K.: Anomaly detection in online social network: a

survey. In: 2017 International Conference on Inventive Communication and Com-

putational Technologies (ICICCT), pp. 456–459 (2017)

2. Grubbs, F.E.: Procedures for detecting outlying observations in samples. Techno-

metrics 11, 1–21 (1969)

3. John, G.H.: Robust decision trees: removing outliers from databases. In: Proceed-

ings of KDD, pp. 174–179 (1995)

4. Aggarwal, C.C., Yu, P.S. Outlier detection for high dimensional data. ACM SIG-

MOD Rec. (2002). https://doi.org/10.1145/376284.375668

5. Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: a survey. ACM Com-

put. Surv. 41, 1–72 (2009)

6. Savage, D., Zhanga, X., Yua, X., Chouab, P., Wanga, Q.: Anomaly detection in

online social networks. Soc. Netw. 39, 62–70 (2014)

7. Hawkins, D.M.: Identiﬁcation of Outliers, vol. 11. Springer, Dordrecht (1980)

8. Kaur, R., Singh, S.: A survey of data mining and social network analysis based

anomaly detection techniques. Egypt. Inf. J. 17, 199–216 (2016)

628 N. E. H. Ben Chaabene et al.

9. Vanetti, M., Binaghi, E., Carminati, B., Carullo, M., Ferrari, E.: Content-based

ﬁltering in on-line social networks. In: Dimitrakakis, C., Gkoulalas-Divanis, A.,

Mitrokotsa, A., Verykios, V.S., Saygin, Y. (eds.) Privacy and Security Issues in

Data Mining and ML, vol. 6549, pp. 127–140. Springer, Heidelberg (2011)

10. Holland, P.W., Leinhardt, S.: The structural implications of measurement error in

sociometry. J. Math. Sociol. 3(1), 85–111 (1973)

11. Viswanath, B., Bashir M.A, Crovella, M., Guha, S., Gummadi, K.P., Krishna-

murthy, B., Mislove, A.: Towards detecting anomalous user behavior in online social

networks. In: Proceedings of the 23rd USENIX Security Symposium (USENIX

Security) (2014)

12. Xiao, C., Freeman, D.M., Hwa, T.: Detecting clusters of fake accounts in online

social networks. In: Proceedings of the Eighth ACM Workshop on Artiﬁcial Intel-

ligence and Security, pp. 91–101 (2015)

13. Getoor, L., Dieh, C.P.: Link mining - a survey. ACM SIGKDD Explor. Newslett.

7, 3–12 (2005)

14. Fire, M., Katz, G., Elovici, Y.: Strangers intrusion detection - detecting spammers

and fake proﬁles in social networks based on topology anomalies. ASE Hum. J.

1(1), 26–39 (2012)

15. Zheleva, E., Getoor, L., Golbeck, J., Kuter, U.: Using friendship ties and fam-

ily circles for link prediction. In: Giles, L., Smith, M., Yen, J., Zhang, H. (eds.)

Advances in Social Network Mining and Analysis, vol. 5498, pp. 97–113. Springer,

Heidelberg (2008)

16. Akoglu, L., McGlohon, M., Faloutsos, C.: OddBall: spotting anomalies in weighted

graphs. In: Paciﬁc-Asia Conference on Knowledge Discovery and Data Mining, vol.

13, pp. 410–421 (2010)

17. Hassanzadeh, R., Nayak, R., Stebila, D.: Analyzing the eﬀectiveness of graph met-

rics for anomaly detection in online social networks. In: Proceedings of the 13th

International Conference on Web Information Systems Engineering (2012)

18. Rezaei, A., Kasirun, Z.M., Rohani, V.A., Khodadadi, T.: Anomaly detection in

online social networks using structure based technique. In: Eighth International

Conference on Internet Technology and Secured Transactions (ICITST), pp. 619–

622 (2013)

19. Chouchane, A., Bouguessa, M.: Identifying anomalous nodes in multidimensional

networks. In: International Conference on Data Science and Advanced Analytics

(2017)

20. Kruegel, C., Mutz, D., Robertson, W., Valeur, F.: Bayesian event classiﬁcation

for intrusion detection. In: Proceedings of the 19th Annual Computer Security

Applications Conference, pp. 14–23 (2003)

21. Boccaletti, S., Bianconi, G., Criado, R., Del Genio, C., Gómez-Gardeñes, J., Romance, M., Sendiña-Nadal, I., Wang, Z., Zanin, M.: The structure and dynamics of multilayer networks. Phys. Rep. 544, 1–22 (2014)

22. Yang, Y., Guo, Y.C., Ma, Y.N.: Characterization of communities in online social network. In: Proceedings of 2010 Cross-Strait Conference on Information Science and Technology, pp. 600–605 (2010)

23. Sawadogo, I., Odongo, L., Ly, I.: Maximum likelihood estimation of the parameters of exponentiated generalized Weibull based on progressive type II censored data. Open J. Stat. 7(6), 956–963 (2018)

24. Djrobie, D.: Modèle de mélange et classification. Open J. Stat. (2016)

25. Couton, F., Danech, M., Broniatousk, M.: Application des mélanges de lois de probabilité à la reconnaissance de regime trafic routier. RTS-Recherche n53, 49–57 (1996)

26. Chen, Z., Hendrix, W., Samatova, N.F.: Community-based anomaly detection in evolutionary networks. J. Intell. Inf. Syst. 39, 59–85 (2012)

27. Ma, Z., Leijon, A.: Bayesian estimation of beta mixture models with variational inference. IEEE Trans. Pattern Anal. Mach. Intell. 33, 2160–2173 (2011)

28. Boutemedjet, S., Ziou, D., Bouguila, N.: Model-based subspace clustering of non-Gaussian data. Neurocomputing 73, 1730–1739 (2010)

29. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. 39, 1–38 (1977)

30. Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978)
