P3: Privacy-Preserving Scheme Against Poisoning
Attacks in Mobile-Edge Computing
Ping Zhao, Member, IEEE, Haojun Huang, Member, IEEE, Xiaohui Zhao, and Daiyu Huang
Abstract— Mobile-edge computing (MEC) has emerged to
enable users to offload their location data into the MEC server,
and at the same time, the MEC server executes the location-aware
data processing to compute the statistical results about these
collected locations. However, malicious users may deliberately
generate poisoning locations and send these poisoning locations
to the MEC server, aiming to poison the statistical results learned
by the MEC server and even compromise other users’ location privacy.
Existing work concerning privacy preservation in MEC has
not studied such poisoning attacks. Another line of loosely
related work focused on poisoning attacks in a different
scenario, adversarial machine learning. However, MEC exhibits
different features from the machine learning settings, and thus,
privacy preservation against poisoning attacks in MEC faces
significantly new challenges. To address the problem, we propose
a privacy-preserving scheme against poisoning (P3) that
utilizes a feature learning model to infer the social relationships
among users from their location data and then constructs the
inferred social graph. Thereafter, it searches for the optimal map
between the inferred social graph and the social graph from
social networks to identify the poisoning locations. Experiments
on two real-world data sets, with two baseline works and two
kinds of poisoning attacks, have demonstrated the privacy
preservation that P3 provides against poisoning attacks in MEC.
Index Terms—Feature learning, location privacy, mobile-edge
computing (MEC), poisoning attacks, social relationship.
I. INTRODUCTION
WITH the development of the mobile Internet, mobile-
edge computing (MEC) has been widely applied.
In MEC systems, as shown in Fig. 1, users offload their
location data into the MEC server, and the MEC server
computes the statistical results about these locations to support
new edge intelligence applications.

Manuscript received July 1, 2019; revised October 7, 2019; accepted
November 18, 2019. Date of publication February 20, 2020; date of current
version June 10, 2020. This work was supported in part by the National
Natural Science Foundation of China under Grant 61902060, Grant 61801106,
Grant 61671216, Grant 61977064, and Grant 61871436, in part by the
Shanghai Sailing Program under Grant 19YF1402100, in part by the
Chenguang Program supported by the Shanghai Education Development
Foundation and the Shanghai Municipal Education Commission, in part by
the Fundamental Research Funds for the Central Universities under
Grant 2232019D3-51, in part by the Initial Research Funds for Young
Teachers of Donghua University, and in part by the Shanghai Rising-Star
Program under Grant 19QA1400300. (Corresponding author: Haojun Huang.)
Ping Zhao, Xiaohui Zhao, and Daiyu Huang are with the College of
Information Science and Technology, Donghua University, Shanghai 201620,
China (e-mail: pingzhao2018ph@dhu.edu.cn; 160910903@mail.dhu.edu.cn;
160910309@mail.dhu.edu.cn).
Haojun Huang is with the School of Electronic Information and
Communications, Huazhong University of Science and Technology, Wuhan
430074, China (e-mail: hjhuang@hust.edu.cn).
Digital Object Identifier 10.1109/TCSS.2019.2960824

However, malicious users
(hereafter, adversaries) may deliberately generate and send
poisoning locations to the MEC server to poison the statistical
results about users’ locations, leaving the MEC server and
even other users vulnerable to the poisoning attacks. Such
poisoning attacks have become a bottleneck for the wide
deployment of MEC systems and, thus, have attracted
widespread concern [1]–[3].

Fig. 1. Illustration of poisoning attacks in MEC.
Existing work concerning privacy preservation in
MEC [4]–[11] mainly used anonymous mechanisms and
traffic detection techniques combined with machine learning
to protect users’ location privacy or proposed relevant
algorithms to minimize the loss of privacy in MEC systems,
such as task offloading, mobile support system (MSS),
blockchain, and so on. However, these studies neither study
the defense mechanisms against poisoning attacks nor focus
on poisoning attack schemes. Another line of loosely related
work [12]–[19] focused on poisoning attacks in an adversarial
machine learning scenario. Nevertheless, compared with the
machine learning settings, MEC is a different scenario and
exhibits different features, e.g., task offloading and time-delay
sensitivity. Other works [20]–[27] focused on poisoning attack
schemes in the Internet of Things (IoT). Unfortunately, these
works did not study defense mechanisms against poisoning
attacks. In addition, several works [28]–[32] focused on the
defenses against poisoning attacks in other scenarios: machine
learning, named data networks, and IoT. However, these
works considered scenarios different from MEC, and,
more importantly, MEC exhibits different features
that bring significantly new challenges to the defense
mechanisms against poisoning attacks in MEC. In summary,
it is necessary to study defense mechanisms against poisoning
attacks in MEC.
To address the above problem, in this article, we propose P3,
i.e., the Privacy-Preserving scheme against Poisoning attacks
in MEC. Specifically, it first utilizes the feature learning model
to infer the social relationship among users from their location
data and then constructs the inferred social graph with the
help of the inferred relationships. On this basis, it searches the
optimal map between the inferred social graph and the social
graph from social networks to identify the poisoning locations.
Finally, it validates the performance utilizing real-world data
sets. In summary, we make the following contributions.
1) We propose to utilize the feature learning model to
construct the inferred social graph without any domain
experts’ knowledge. Most existing works relied on
simple heuristic algorithms that depend heavily on
domain experts’ knowledge and, thus, introduce many
errors. On the contrary, with the knowledge of loca-
tions collected by the MEC server, we propose to first
characterize the social behavior of users’ mobility using
feature learning and then, on this basis, infer the social
relationships and construct the inferred social graph.
2) We propose to mirror the inferred relationships to their
social relationships in social networks to identify the
poisoning locations. To be concrete, the inferred social
graph is structurally correlated with the social graph
in social networks, and thus, the inferred relationships
can be mapped to the relationships in a social graph.
As a result, the poisoning locations can be identified via
searching the optimal map between the inferred social
graph and the social graph from social networks.
3) We use two real-world data sets to validate the effective-
ness of the proposed scheme P3, and simulation results
validate the location privacy preservation against the
poisoning attacks. Specifically, we use the loc-Gwalla and
loc-Brightkite data sets, compare P3 with two baseline
works, and investigate the performance under two kinds of
poisoning attacks. Moreover, simulation results show
that P3 outperforms the two baselines on both
data sets and under both kinds of poisoning attacks.
The remainder of this article is organized as follows.
Section II introduces the adversary model. Then, Section III
proceeds to describe the design of the privacy-preserving
scheme P3 in detail, followed by the evaluation in Section IV.
Section V reviews the related work. Finally, Section VI
concludes this article.
II. ADVERSARY MODEL
As shown in Fig. 1, users offload the location data to
the MEC server, and the MEC server computes the corresponding
results and returns them to the users. At the same time,
the MEC server collects these users’ locations and executes
a certain kind of statistical function to calculate the statistical
results about these collected locations. These statistical results
are expected to support new edge intelligence applications.
However, malicious users (i.e., adversaries) generate poisoning
locations and send these poisoning locations to the MEC
server, aiming to poison the statistical results learned by the
MEC server [33], [34]. What’s more, these poisoned statistical
results will further lead to serious errors in applications,
e.g., map inference and smart transportation. For example,
in the hearing “The Dawn of AI,” the vulnerabilities of systems
to poisoning attacks have attracted widespread concern among
experts from academia and industry.
Adversaries have the background knowledge of the locations
of a small percentage of users. They conduct poisoning attacks
by generating poisoning locations and sending these poisoning
locations to the MEC server on the basis of these known
locations. Adversaries’ capability is limited by the number of
these poisoning locations. In this article, adversaries are
assumed to be able to control a small percentage of users, and
the rate of poisoning locations is set to less than 20%. In MEC,
adversaries indeed can inject such a small percentage of
poisoning locations, since a large number of users are involved
in MEC applications and send locations to the MEC server. Both
the MEC server and the cloud server are assumed to be trusted
and to honestly perform the proposed scheme P3. The goal of P3
is to identify these poisoning locations.
III. DESIGN OF P3
The main idea behind the proposed scheme P3, as shown
in Fig. 2, is that it first characterizes the social behavior of
users’ mobility using feature learning since users’ mobility
behavior is also shaped by their social relationships (i.e., why
they move) [1]–[3]. Then, on this basis, it infers the social
relationships among users and constructs the inferred social
graph. Thereafter, it searches the optimal map between the
inferred social graph and the social graph from social networks
utilizing the structural correlations between the two graphs
to identify the poisoning locations. In summary, it mainly
includes two steps: constructing the inferred social graph and
mapping the inferred social graph to the social graph from
social networks.
A. Construction of Inferred Social Graph
We first search the mobility neighborhoods using the
random walk, which is shown in Fig. 3. Specifically, we
construct the weighted bipartite graph G_o = {U, L, E}
[see Fig. 3(a) and (b)], where U is the set of users, L is the set
of locations of users, and E is the set of edges connecting users
and locations. The weight w_{u,l} of a specific edge (u, l) ∈ E
(u ∈ U, l ∈ L) is the number of times the user u checks in at
the location l. We define the graph neighborhood of a specific
node x ∈ U ∪ L by gn(x), which is the set of nodes connected
with x. Then, we generate π random walk traces [see Fig. 3(c)]
for each user, and the walk length of each trace is wl. For a
specific user u, we denote the current node in the random walk
trace by now_n and the next node by next_n. Thereafter, the next
node next_n is sampled with the transition probability

$$
P(\text{next}_n = x \mid \text{now}_n) =
\begin{cases}
\dfrac{w_{\text{now}_n,\,x}}{\sum_{i=1}^{|gn(\text{now}_n)|} w_{\text{now}_n,\,i}}, & \text{if } \text{now}_n \in U,\; x \in gn(\text{now}_n),\; (\text{now}_n, x) \in E\\[2ex]
\dfrac{w_{x,\,\text{now}_n}}{\sum_{i=1}^{|gn(\text{now}_n)|} w_{i,\,\text{now}_n}}, & \text{if } \text{now}_n \in L,\; x \in gn(\text{now}_n),\; (\text{now}_n, x) \in E\\[2ex]
0, & \text{otherwise.}
\end{cases}
\tag{1}
$$

Then, the mobility neighborhood gn(u) of the user u consists of
the nodes before and after the node u (i.e., user u) in all the
π random walk traces.

Fig. 2. Work flow of P3.

Fig. 3. Illustration of construction of inferred social graph. (a) Users and
locations. (b) Weighted bipartite graph G_o. (c) Random walk traces.
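To make the construction concrete, the following is a minimal Python sketch of the trace generation under the transition probability in (1), assuming check-in counts as edge weights; the function and parameter names (random_walk_traces, num_traces, walk_len) are illustrative, not from the paper.

```python
import random
from collections import defaultdict

def random_walk_traces(checkins, num_traces=10, walk_len=50, seed=0):
    """Generate the pi random-walk traces per user over the weighted
    bipartite graph G_o; `checkins` maps (user, location) pairs to
    check-in counts w_{u,l}."""
    rng = random.Random(seed)
    nbrs = defaultdict(list)  # node -> list of (neighbor, edge weight)
    for (u, l), w in checkins.items():
        nbrs[("u", u)].append((("l", l), w))  # tag nodes by side of G_o
        nbrs[("l", l)].append((("u", u), w))
    traces = []
    for user in {u for (u, _) in checkins}:
        for _ in range(num_traces):
            now = ("u", user)
            trace = [now]
            for _ in range(walk_len - 1):
                neighbors, weights = zip(*nbrs[now])
                # Sample the next node proportionally to its edge weight,
                # i.e., the transition probability in (1).
                now = rng.choices(neighbors, weights=weights, k=1)[0]
                trace.append(now)
            traces.append(trace)
    return traces
```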
Then, we use the machine learning model skip-gram to map
the random walk traces to vectors and infer the relationships
among users using such vectors. To be concrete, we assume
that the random walk traces of each user (e.g., u) are mapped
to a vector α(u), and the vectors of all nodes are denoted by
α ∈ R^{|U ∪ L| × d}. We define the objective function as

$$
\arg\max_{\alpha \in \mathbb{R}^{|U \cup L| \times d}} \sum_{n \in U \cup L} \sum_{i \in m(n)} \log \frac{\exp(\alpha(i) \cdot \alpha(n))}{\sum_{j \in U \cup L} \exp(\alpha(j) \cdot \alpha(n))}
\tag{2}
$$

where m(n) denotes the mobility neighborhood of node n. Then,
we use the negative sampling approach to reduce the computation
cost and redefine the objective function as

$$
\arg\max_{\alpha \in \mathbb{R}^{|U \cup L| \times d}} \sum_{n \in U \cup L} \sum_{i \in m(n)} \left[ \log \frac{1}{1 + \exp(-\alpha(i) \cdot \alpha(n))} + \sum_{j \in NS(n)} \log \frac{1}{1 + \exp(\alpha(j) \cdot \alpha(n))} \right]
\tag{3}
$$

where NS(n) is the set of negative samples drawn for node n.
Thereafter, in the learning process, we use stochastic gradient
descent. Finally, we compare the cosine similarity θ(u, v) of
any two users u and v. When the cosine similarity meets

$$
\theta(u, v) = \frac{\alpha(u) \cdot \alpha(v)}{\|\alpha(u)\|_2 \, \|\alpha(v)\|_2} \geq \theta_o
$$

(θ_o is the threshold), the users u and v are regarded as
friends. As a result, an inferred social graph G [see Fig. 4(a)] is
constructed, where nodes represent users, and edges represent
friendships.
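Continuing the sketch above, (2) and (3) can be plausibly realized with an off-the-shelf skip-gram implementation; here we assume gensim's Word2Vec with negative sampling, and build_inferred_graph and its defaults are illustrative rather than the paper's implementation (which is in C++).

```python
import numpy as np
from gensim.models import Word2Vec  # skip-gram with negative sampling

def build_inferred_graph(traces, users, dim=128, theta_o=0.8):
    """Embed the random-walk traces, then link users whose embedding
    cosine similarity reaches the threshold theta_o."""
    sentences = [[str(node) for node in trace] for trace in traces]
    model = Word2Vec(sentences, vector_size=dim, window=5, min_count=1,
                     sg=1, negative=5, epochs=5)  # sg=1 selects skip-gram
    vec = {u: model.wv[str(("u", u))] for u in users}
    friends = set()
    ulist = sorted(users)
    for i, u in enumerate(ulist):
        for v in ulist[i + 1:]:
            a, b = vec[u], vec[v]
            theta = float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
            if theta >= theta_o:  # cosine-similarity threshold test
                friends.add((u, v))  # edge of the inferred social graph G
    return friends
```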
B. Optimal Map Between Inferred Social Graph and Social
Graph
We first select k landmarks. Specifically, in both the inferred
social graph and the social graph from social networks,
the nodes with high betweenness are selected as the landmarks.
The betweenness of a specific node quantifies the number of
shortest paths that pass through the node and is measured via
the method in [35]. For example, in Fig. 4, the red nodes in the
inferred social graph G and the social graph G′ are picked out
and regarded as the landmarks.
Fig. 4. Illustration of map between inferred social graph and social graph from social networks in the loc-Gwalla data set [37]. (a) Nodes represent users,
and edges indicate the inferred social relationships. (b) Nodes represent users, and edges indicate the social relationships from social networks. Red nodes
in (a) and (b) are selected as the landmarks, and k = 14.
Then, based on the k landmarks, we search for the optimal
map between the inferred social graph G and the social graph G′.
Specifically, for each node c in the inferred social graph G,
the distances between the node c and the k landmarks in G are
{d_{c1}, d_{c2}, ..., d_{ck}}. Likewise, in the social graph G′,
the distances between a specific node s and the k landmarks in
G′ are {d_{s1}, d_{s2}, ..., d_{sk}}. Then, we define the map
score between node c in the inferred social graph G and node s
in the social graph G′ as (Σ_{i=1}^{k} (d_{ci} − d_{si})²)^{1/2}.
Given the k landmarks in G and the k landmarks in G′, there are
k! possible maps. For each map between the landmarks in G
and G′, we use the Hungarian algorithm [36] to search for maps
for the remaining nodes in the inferred social graph G and the
social graph G′. For the k! possible maps, it repeats this
operation k! times. Denote the outputs of the k! operations by
O_1, O_2, ..., O_i, ..., O_{k!}, which record the maps among
nodes in G and G′. Denote O_i = {(c_1, s_1), ..., (c_j, s_j), ...},
where c_j and s_j are nodes in the inferred social graph G and
the social graph G′, respectively, and node c_j in G is mapped
to node s_j in the ith operation. Finally, we select the output
(e.g., O_i) with the largest match score
Σ_j (Σ_{i=1}^{k} (d_{c_j i} − d_{s_j i})²)^{1/2} as the optimal
map between the inferred social graph G and the social graph G′
from social networks. The locations of the users in the inferred
social graph G that are unmapped to users in the social graph G′
are identified as the poisoning locations.
Note that existing work [38] has validated that the betweenness
of nodes in the social graph follows a heavy-tailed distribution
and that there exist only a small number of nodes with high
betweenness. Thus, k ≪ |G|, where |G| is the number of nodes
in G. Therefore, although there are k! maps between the
k landmarks in G and the k landmarks in G′, it is feasible to
enumerate all of them by brute force.
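As an illustration, the sketch below aligns the two graphs using networkx betweenness centrality for landmark selection and SciPy's linear_sum_assignment as the Hungarian algorithm; it keeps the alignment with the smallest total map score (reading the map score as a distance), and all names and defaults are illustrative.

```python
from itertools import permutations

import networkx as nx
import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian algorithm

def landmark_vectors(graph, landmarks, nodes):
    """One row per node: shortest-path distances to the k landmarks."""
    dists = [nx.single_source_shortest_path_length(graph, lm)
             for lm in landmarks]
    big = graph.number_of_nodes()  # stand-in distance if unreachable
    return np.array([[d.get(n, big) for d in dists] for n in nodes], float)

def optimal_map(G_inf, G_soc, k=5):
    """Enumerate the k! landmark alignments; for each, Hungarian-match
    the remaining nodes by map score ||d_c - d_s||_2, keep the best."""
    bw_inf = nx.betweenness_centrality(G_inf)
    bw_soc = nx.betweenness_centrality(G_soc)
    lm_inf = sorted(bw_inf, key=bw_inf.get, reverse=True)[:k]
    lm_soc = sorted(bw_soc, key=bw_soc.get, reverse=True)[:k]
    nodes_c = [n for n in G_inf if n not in lm_inf]
    nodes_s = [n for n in G_soc if n not in lm_soc]
    D_c = landmark_vectors(G_inf, lm_inf, nodes_c)
    best, best_score = None, np.inf
    for perm in permutations(lm_soc):
        D_s = landmark_vectors(G_soc, list(perm), nodes_s)
        # cost[i, j] = map score between candidate pair (c_i, s_j)
        cost = np.linalg.norm(D_c[:, None, :] - D_s[None, :, :], axis=2)
        rows, cols = linear_sum_assignment(cost)
        score = cost[rows, cols].sum()
        if score < best_score:
            best_score = score
            best = {nodes_c[i]: nodes_s[j] for i, j in zip(rows, cols)}
    # Users of G_inf left out of `best` would be flagged, and their
    # locations identified as the poisoning locations.
    return best
```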
IV. PERFORMANCE EVALUATION
A. Setup
1) Data Set: We use two real-world data sets, i.e., loc-
Gwalla and loc-Brightkite [37], to validate the performance
of P3. The loc-Gwalla data set consists of 196 591 users
(i.e., nodes), 950 327 edges (i.e., friendships), and 6.4 million
locations from February 2009 to October 2010. Another data
set, i.e., loc-Brightkite, records 4.5 million locations of 58 228
users from April 2008 to October 2010 and 214 078 edges
among users.
2) Baseline Works for Comparison: In addition, we compare
P3 with two baseline works, dubbed Baseline1 and Baseline2,
since there is no existing work concerning location privacy
protection against poisoning attacks in MEC.
Specifically, in Baseline1, the algorithm knows the number of
poisoning locations n in the location data set and randomly
picks out n locations from the N locations in the location data
set as the poisoning locations. On the contrary, in Baseline2,
the algorithm randomly picks out a certain number of locations
from the N locations without knowing the number of poisoning
locations and regards these selected locations as poisoning
locations. N is the number of locations in the location data
set. Note that both baseline works indeed quantify the
performance of random guessing, but the first baseline work
randomly guesses with the knowledge of the number of
poisoning locations in the location data set.
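For reference, a minimal sketch of the two random-guess baselines, assuming the pooled locations form a sortable collection; the names are illustrative.

```python
import random

def baseline1(locations, n, seed=0):
    """Random guess that knows the number n of poisoning locations."""
    return set(random.Random(seed).sample(sorted(locations), n))

def baseline2(locations, guess_size, seed=0):
    """Random guess without knowledge of n; picks a fixed-size subset."""
    return set(random.Random(seed).sample(sorted(locations), guess_size))
```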
3) Poisoning Attacks: Furthermore, to quantify the privacy
preservation that P3 provides, we consider two kinds of
poisoning attacks. To be concrete, in the first kind of
poisoning attacks, adversaries use locations synthesized as in
the existing work [39] as the poisoning locations. The existing
work [39] considered the geographic and semantic features
of locations when synthesizing locations. In the second kind
of poisoning attacks, adversaries randomly generate locations
in the region bounded by users’ locations as the poisoning
locations. Therefore, the poisoning locations in the first kind
of poisoning attacks are more plausible imitations of users’
locations. Note that, in fact, the two kinds of poisoning attacks
represent sophisticated and straightforward poisoning strategies,
respectively. Hereafter, the two kinds of poisoning attacks are
dubbed Poison1 and Poison2 for ease of presentation.
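A minimal sketch of the straightforward Poison2 strategy, which draws poisoning locations uniformly inside the bounding box of the genuine locations; the names are illustrative (Poison1 would instead use the synthesis technique of [39]).

```python
import random

def poison2(user_locations, n, seed=0):
    """user_locations: iterable of (lat, lon) pairs; returns n random
    points inside the region bounded by the genuine locations."""
    rng = random.Random(seed)
    lats = [lat for lat, _ in user_locations]
    lons = [lon for _, lon in user_locations]
    return [(rng.uniform(min(lats), max(lats)),
             rng.uniform(min(lons), max(lons))) for _ in range(n)]
```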
4) Metrics: What’s more, we use the metric identification
rate, i.e., the failure rate of poisoning attacks. The definition of
the identification rate is as follows. In addition, we investigate
the impact of the number of landmarks k, the number of
locations N, the rate of poisoning locations n/N (the number
of poisoning locations n), the walk length wl, the length of
each user’s trace le, and the threshold θ_o on the identification
rate.
Definition 1: Assume that n′ of the n poisoning locations
are identified. Then, the identification rate, i.e., the failure
rate of poisoning attacks, is n′/n. For example, if 150 of
n = 200 poisoning locations are identified, the identification
rate is 0.75.

Fig. 5. Identification rate varying with the number of locations N in both
loc-Gwalla and loc-Brightkite with the rate of poisoning locations
n/N = 0.2 and the length of each user’s trace le = 200. (a) Identification
rate in the loc-Gwalla data set. (b) Identification rate in the loc-Brightkite
data set.
5) Parameter Settings: Other default parameters are set as
follows: the number of landmarks k ∈ (3, 7); the number of
locations N ∈ (3.0, 6.4) million in the loc-Gwalla data set
and N ∈ (2.0, 4.5) million in the loc-Brightkite data set; the
rate of poisoning locations n/N ∈ (1%, 20%); the length of a
specific user’s trace le within (200, 500); the walk length of
the random walk trace wl ∈ (10, 100); the threshold θ_o = 0.8;
and the dimension of the learned vectors d = 128. Simulations
are implemented in C++ and conducted on a desktop PC with
an Intel Core i7 3.41-GHz processor and 8-GB RAM.
B. Impact of Number of Locations N on Identification Rate
Fig. 5 shows the impact of the number of locations N on
the identification rate. It can be observed that the identification
rates of the three algorithms, Baseline1, Baseline2, and P3,
decrease with the increasing number of locations N. The
reason is that, intuitively, the poisoning locations are more
likely to be cloaked by the locations of users when the
number of locations N is enlarged, thereby decreasing the
identification rate. Moreover, in the first step, constructing
the inferred social graph, more users result in more incorrect
social relationships inferred by the algorithm P3 and further
lead to failures in identifying poisoning locations.
In Baseline1 and Baseline2, it is more difficult for the
algorithms to pick out the poisoning locations via random
guessing when the number of locations N increases. In addition,
the identification rate of P3 is less affected by the number of
locations N than those of the baselines, as P3 executes
sophisticated algorithms to analyze the relationships among
users and the map between the inferred social graph and the
social graph from social networks.
In addition, the identification rate of the Baseline1 algorithm
is much larger than that of Baseline2 since the algorithm in
Baseline1 obtains more side information about the poisoning
locations, i.e., the number of poisoning locations n.
In addition, when adversaries launch the first kind of
poisoning attacks, the identification rate of P3 varies within
(0.6029, 0.796), and the identification rates of the baselines
decrease to 0.174 and 0.0198 from 0.2 and 0.06, respectively.
On the contrary, when adversaries launch the second kind
of poisoning attacks, the identification rates of P3, Baseline1,
and Baseline2 vary within (0.7485, 0.7975), (0.1744, 0.2), and
(0.0198, 0.066), respectively. It is obvious that the first kind
of poisoning attacks, i.e., Poison1, is more affected by the
increment of the number of locations N. What’s more, the
poisoning attacks of Poison1 are more effective than the second
kind of poisoning attacks, Poison2, as the identification rates
of the three algorithms under Poison1 are less than those under
Poison2. This is attributed to the fact that the poisoning
locations generated in Poison1 exhibit similar mobility patterns
and are plausible imitations of users’ locations.
Furthermore, it is interesting to observe that the identification
rates in the loc-Gwalla data set are larger than those
in the loc-Brightkite data set. To this end, we analyze the
two data sets and find that the average degree and graph
density are 9.7 and 4.92 × 10^-5 in loc-Gwalla and
7.5 and 1.32 × 10^-4 in loc-Brightkite, respectively. This means
that the loc-Gwalla data set includes many more relationships
among users than the loc-Brightkite data set, and thus, it is
difficult for the three algorithms to identify the poisoning
locations in the loc-Brightkite data set with fewer relationships.
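These data set statistics are easy to reproduce; a quick check, assuming the SNAP edge lists are loaded as undirected networkx graphs (the file name is illustrative):

```python
import networkx as nx

def degree_and_density(edge_list_path):
    g = nx.read_edgelist(edge_list_path)  # e.g., the loc-Gwalla edge list
    avg_degree = 2 * g.number_of_edges() / g.number_of_nodes()
    return avg_degree, nx.density(g)  # density = 2m / (n * (n - 1))
```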
C. Impact of Rate of Poisoning Locations n/N on
Identification Rate
Fig. 6 shows the impact of the rate of poisoning locations
n/N on the identification rate. We can observe that the
identification rate of the algorithm P3 increases with the
rate of poisoning locations n/N when n/N increases from 1%
to 15% in the loc-Gwalla data set and from 1% to 13% in
the loc-Brightkite data set. Then, the identification rate of P3
changes slowly when the rate of poisoning locations n/N
continually increases in both the loc-Gwalla and loc-Brightkite
data sets. The reason is that, in algorithm P3, when a small
number of poisoning locations exist in the data sets, the
poisoning locations are more likely to be cloaked in the users’
locations both in
the two steps: constructing the inferred social graph and
mapping the inferred social graph to the social graph from the
social networks. As a result, P3 can identify more poisoning
locations when the rate of poisoning locations n/N varies within
(1%, 15%). Likewise, it is easier for the Baseline1 algorithm to
pick out poisoning locations when n/N is increased. When the
rate of poisoning locations n/N increases to a certain value,
the identification rates of the algorithms P3 and Baseline1
no longer increase, as a larger n/N means more poisoning
locations and, thus, lower probabilities for the algorithms to
find all of them. Moreover, the Baseline1 algorithm is less
affected by n/N, with its identification rate increasing from
0.005 to 0.2, as the sophisticated algorithm P3 is more sensitive
to n/N than Baseline1. Finally, it is interesting to observe
that the identification rate of Baseline2 decreases with n/N,
as more locations decrease the probability of a successful
random guess.

Fig. 6. Impact of the rate of poisoning locations n/N on the identification
rate in both (a) loc-Gwalla and (b) loc-Brightkite with the number of
locations N = 3 million in loc-Gwalla and N = 2 million in loc-Brightkite
and the length of each user’s trace le = 200.

Fig. 7. Impact of the length of the user’s trace le on the identification rate
in both (a) loc-Gwalla and (b) loc-Brightkite with the number of locations
N = 3 million in loc-Gwalla and N = 2 million in loc-Brightkite and the
rate of poisoning locations n/N = 0.2.
In addition, the identification rate under the poisoning attacks
Poison1 is less than that under Poison2. The reasons are the
same as analyzed earlier, i.e., the poisoning locations in Poison1
are more plausible imitations of users’ locations than those in
Poison2. As a result, it is more difficult to identify such
poisoning locations.
Furthermore, the identification rate under the poisoning attacks
Poison1 is more affected by the varying rate of poisoning
locations n/N than the identification rate under Poison2.
To be concrete, in P3, the identification rate under Poison1
increases from 0.566 to 0.8, while the identification rate under
Poison2 varies within (0.73, 0.8). The reasons are the same as
analyzed earlier, i.e., Poison2 is a straightforward poisoning
strategy, and Poison1 is a sophisticated poisoning strategy.
As such, Poison1 is more likely to be affected by the rate of
poisoning locations n/N.
In addition, in P3, the identification rate in the loc-Gwalla
data set is larger than that in the loc-Brightkite data set, with
the identification rates varying within (0.56, 0.8) and
(0.608, 0.8), respectively. The reasons are the same as analyzed
earlier, i.e., loc-Gwalla contains many more relationships among
users than the loc-Brightkite data set, with an average degree
of 9.7 and a graph density of 4.92 × 10^-5. Thus, it is difficult
for the three algorithms to identify the poisoning locations in
the loc-Brightkite data set with fewer relationships.
D. Impact of Length of User’s Trace le on Identification Rate
Fig. 7 shows the impact of the length of the user’s trace le on
the identification rate. First, we can see that the identification
rates of the algorithms P3, Baseline1, and Baseline2 decrease
with the length of the user’s trace le. Specifically, in the
loc-Gwalla data set, the identification rate of P3 decreases
from 0.809 to 0.6568, and in the loc-Brightkite data set,
it decreases by 0.162. In Baseline1, the identification rate
decreases to 0.01 in the loc-Gwalla data set and 0.0067 in
loc-Brightkite. Similarly, in Baseline2, the identification rate
decreases from 0.094 to 0.064 in loc-Gwalla and from 0.096
to 0.065 in loc-Brightkite. The reason is that the poisoning
locations generated in the poisoning attacks Poison1 become
more plausible imitations of users’ locations when the length of the user’s trace
le is enlarged. As such, it is difficult for P3 to identify these
plausible locations (i.e., poisoning locations). Regarding the
identification rates of Baseline1 and Baseline2, the decrease is
attributed to the increasing number of locations resulting from
the greater length of the user’s trace le.

Fig. 8. Impact of (a) the number of landmarks k, (b) the walk length of the
random walk trace wl, and (c) the threshold θ_o on the identification rate in
both loc-Gwalla and loc-Brightkite.

Furthermore, in P3, the identification rate under the poisoning
attacks Poison1 is more affected by the length of the user’s
trace le than that under Poison2. Specifically, in P3,
the identification rate under Poison1 varies between 0.6389 and
0.8057, while under Poison2, it decreases from 0.809 to 0.7308.
As analyzed earlier, the adversaries launching Poison1 can
more credibly imitate the locations of users when the length
of the user’s trace le is enlarged, resulting in poisoning
locations that are more plausible imitations of users’ locations.
Finally, the identification rate in the loc-Brightkite data set is
larger than that in the loc-Gwalla data set.
E. Impact of Number of Landmarks k on Identification Rate
In the following, as shown in Fig. 8, we investigate the
impact of the number of landmarks k, the walk length of the
random walk trace wl, and the threshold θ_o on the
identification rate of the algorithm P3, since Baseline1 and
Baseline2 are not affected by these parameters.
The impact of the number of landmarks k on the identification
rate is shown in Fig. 8(a). We observe that, initially, as
k increases, the identification rate rapidly increases, since the
increasing number of landmarks improves the precision of the
map between the inferred social graph and the social graph
from social networks. Specifically, in loc-Gwalla, the
identification rate under Poison1 increases to 0.79 and, under
Poison2, increases to 0.8. On the contrary, in loc-Brightkite,
the identification rates under Poison1 and Poison2 vary within
(0.74, 0.8) and (0.78, 0.83), respectively. However, when the
number of landmarks continually increases, the identification
rate decreases, as more errors are injected during the process
of selecting landmarks, and thereby, the sets of landmarks in
the inferred graph and the social graph from social networks
become less identical. We can see that the identification rates
under the two kinds of poisoning attacks and on the two data
sets decrease to 0.66, 0.62, 0.74, and 0.79, respectively.
F. Impact of Walk Length wl on Identification Rate
Fig. 8(b) shows the impact of the walk length of the
random walk trace wl on the identification rate. It shows that
the identification rates in loc-Gwalla and loc-Brightkite increase
sharply when wl increases from 10 to 50 and from 10 to 70, respec-
tively. Thereafter, the identification rates in the loc-Gwalla
and loc-Brightkite data sets saturate. Moreover, the identifi-
cation rate in loc-Gwalla is larger than that in loc-Brightkite.
This is attributed to the fact that enlarging the walk length of
the random walk trace wl improves the precision of the inferred
social relationships and, thereby, the identification rate.
Moreover, the larger average degree and graph density of
loc-Gwalla contribute to the better privacy preservation that the
algorithms provide on the loc-Gwalla data set.
G. Impact of Threshold θ_o on Identification Rate
The identification rates on the two data sets and under the two
poisoning attacks in P3 are shown in Fig. 8(c) when we vary
the threshold θ_o from 0.4 to 0.8. The identification rates
increase with the threshold θ_o, since a larger threshold θ_o
improves the precision of the inferred social relationships and
further enlarges the identification rates. Moreover, the
identification rate under the poisoning attacks Poison1 is less
than that under Poison2, as the poisoning locations in Poison1
exhibit features more similar to users’ locations.
V. RELATED WORK
A. Privacy Preservation in MEC
One kind of work focused on launching attacks to disclose
location privacy in MEC scenarios. Specifically, the work [7]
used the chaff service to protect the location privacy of mobile
users in MECs. Another work [8] pointed out that if a third party
whose identity is not reliable accesses the MEC platform, it
will pose a potential threat. Similarly, Vratonjic et al. [40]
proposed that the use of shared public IP addresses would
pose a threat to location privacy. Li et al. [9] studied the
problem of online security-aware MEC under jamming attacks
and proposed a secure edge computing method based on the
MAB framework with sleeping arms to adaptively select the
trusted MEC server to protect the user’s location privacy.
Another kind of related work studied privacy preservation
in MEC. To be concrete, Zhang et al. [41] proposed to
solve the problem of data security and privacy in MEC by
using cryptography-based technology. Moreover, Du et al. [8]
analyzed the privacy issues in MEC from the aspects of
data aggregation and data mining and used anonymous
mechanisms and traffic detection techniques combined with
machine learning to protect users’ privacy. What’s more, Li et al. [42] proposed a data
aggregation scheme to protect the privacy of terminal devices
in the MEC-assisted IoT scenario.
In addition, other studies focused on privacy protection
in application scenarios such as task offloading, MSS,
blockchain, and big data support in MEC. For example,
literature [10] established a secure MEC framework for
pilgrimage and used it to switch between the fog computing
terminal (FCT) and the cloud. Studies such as [11] focused
on the privacy problems caused by the wireless task offloading
characteristics of MEC and proposed a privacy-aware task
offloading algorithm based on a CMDP to minimize the delay
and energy consumption. The work [5] proposed an MSS with
MEC as the core to protect the network privacy of mobile users.
The follow-up study [6] found anonymity, security, and privacy
issues in blockchain-based therapy in the context of MEC-enabled
in-home therapy management and proposed a secure treatment
framework. Moreover, the work [4] proposed an online learning
algorithm for user prediction and defined a strict attack model
to minimize privacy loss over the long term.
However, the above-mentioned literature either focused on
privacy protection or launched attacks to disclose data
privacy in MEC, and it studied neither the strategy of
poisoning attacks nor the defense mechanisms against poisoning
attacks in MEC. On the contrary, this article is dedicated to
designing a privacy-preserving algorithm against the poisoning
attacks in MEC.
B. Studies About Poisoning Attacks
Another line of loosely related works [12]–[19] focused
on poisoning attacks in an adversarial machine learning
scenario. Specifically, literature [12] reviewed security
threats to machine learning, including poisoning attacks.
The work [13] explored poisoning attacks on neural networks.
Moreover, the study [14] conducted a systematic study of data
poisoning attacks on online learning. Another work [15]
proposed a new method for generating undetectable attacks
automatically using the backpropagation characteristics of the
trained deep neural network (DNN). The work [16] developed
three new attacks that can bypass extensive data sanitization
defenses. On the contrary, the study in [17] discussed the
vulnerability of the weighted method of domain adaptation to
poisoning attacks in an adversarial machine learning
environment. The latest work [18] focused on the optimal
poisoning attack under the multitask learning (MTL) model.
Furthermore, literature [19] presented a new backdoor attack
without label poisoning, which proves that such attacks are
possible.
However, the above-mentioned work mainly focused
on designing poisoning attack algorithms in the machine
learning scenario without considering the corresponding
defense mechanisms. On the contrary, this article is dedicated
to a privacy-preserving scheme against the poisoning attacks
in MEC. Furthermore, MEC exhibits different characteristics
compared with the machine learning settings, e.g., task
offloading and time-delay sensitivity, and thus, the defense
against poisoning attacks in MEC faces significantly new
challenges.
Another kind of work [20]–[23] focused on poisoning
attacks in the IoT. To be concrete, the work in [20] studied
how to effectively carry out two types of data poisoning
attacks, namely, the exploitable attack and the target attack,
and proposed an optimal attack framework. Another work [21]
proposed a method to identify harmful data using the contextual
information of the origin and transformation of data
points in the training set. Moreover, literature [22] formulated
an optimization problem that addresses the problem of fake
user ratings. The latest work [23] designed an intelligent attack
mechanism to achieve the maximum attack effectiveness while
covering up the attack behavior.
Unfortunately, these works only launched the poisoning
attacks in the IoT without investigating the defense mech-
anisms against the poisoning attacks in IoT. In this article,
we concentrate on the privacy-preserving algorithm against
the poisoning attacks in MEC rather than the IoT.
The third kind of related work focused on poisoning
attacks in threat intelligence systems [24], malware detection
systems [25], unsupervised node embedding methods [26], and
naive Bayes spam filters [27]. Nevertheless, these
works considered different scenarios from that considered
in this article. More importantly, these works did not study
the defense mechanisms, while this article indeed focuses on
the privacy-preserving algorithm against the poisoning attacks
in MEC.
In addition, several works [28]–[32] focused on the
defenses against poisoning attacks. Specifically, studies [28]
and [29] proposed an efficient label-flipping poisoning attack
optimization algorithm and a suspicious data point detection
and relabeling mechanism, respectively, to mitigate the impact
of such poisoning attacks. Aiming at poisoning attacks,
literature [30] studied the function of transforming the data set
from the source domain to the target domain with cluster
separability under adversarial settings. Moreover, studies [31]
and [32] considered feedback-based content poisoning
mitigation in named data networks and the prevention of ARP
poisoning in the IoT, respectively.
However, these works studied the defenses in the scenarios of
machine learning, named data networks, and IoT, which are
different from the scenario considered in this article.
Furthermore, MEC exhibits different characteristics,
e.g., task offloading and time-delay sensitivity, and thus,
the above-mentioned work is not applicable to MEC.
VI. CONCLUSION
In this article, we propose P3, the first attempt toward a
scheme against poisoning attacks in MEC. The main idea is
to construct the inferred social graph utilizing feature learning
and search the optimal map between the inferred social graph
and the social graph from social networks to identify the
poisoning locations. Extensive experiments on two real-world
data sets, two baseline works, and two kinds of poisoning
attacks have demonstrated the effectiveness of P3.
REFERENCES
[1] F.-Y. Wang, Y. Tang, X. Liu, and Y. Yuan, “Social education: Opportuni-
ties and challenges in cyber-physical-social space,” IEEE Trans. Comput.
Social Syst., vol. 6, no. 2, pp. 191–196, Apr. 2019.
[2] R. Basak, S. Sural, N. Ganguly, and S. K. Ghosh, “Online
public shaming on Twitter: Detection, analysis, and mitigation,”
IEEE Trans. Comput. Soc. Syst., vol. 6, no. 2, pp. 208–220,
Apr. 2019.
[3] S. H. Sajadi, M. Fazli, and J. Habibi, “The affective evolution of social
norms in social networks,” IEEE Trans. Comput. Social Syst., vol. 5,
no. 3, pp. 727–735, Sep. 2018.
[4] P. Zhou, K. Wang, J. Xu, and D. Wu, “Differentially-private and
trustworthy online social multimedia big data retrieval in edge com-
puting,” IEEE Trans. Multimedia, vol. 21, no. 3, pp. 539–554,
Mar. 2019.
[5] P. Zhang, M. Durresi, and A. Durresi, “Mobile privacy protection
enhanced with multi-access edge computing,” in Proc. IEEE 32nd Int.
Conf. Adv. Inf. Netw. Appl. (AINA), May 2018.
[6] M. A. Rahman et al., “Blockchain-based mobile edge computing
framework for secure therapy applications,” IEEE Access, vol. 6,
pp. 72469–72478, 2018.
[7] T. He, E. N. Ciftcioglu, S. Wang, and K. S. Chan, “Location privacy
in mobile edge clouds,” in Proc. IEEE 37th Int. Conf. Distrib. Comput.
Syst. (ICDCS), Jun. 2017.
[8] M. Du, K. Wang, Y. Chen, X. Wang, and Y. Sun, “Big data privacy
preserving in multi-access edge computing for heterogeneous Inter-
net of Things,” IEEE Commun. Mag., vol. 56, no. 8, pp. 62–67,
Aug. 2018.
[9] B. Li, T. Chen, X. Wang, and G. B. Giannakis, “Secure edge computing
in IoT via online learning,” in Proc. 52nd Asilomar Conf. Signals, Syst.,
Comput., Oct. 2018.
[10] A. Rahman, E. Hassanain, and M. S. Hossain, “Towards a secure
mobile edge computing framework for Hajj,” IEEE Access, vol. 5,
pp. 11768–11781, 2017.
[11] X. He, J. Liu, R. Jin, and H. Dai, “Privacy-aware offloading in mobile-
edge computing,” in Proc. GLOBECOM IEEE Global Commun. Conf.,
Dec. 2017.
[12] Q. Liu, P. Li, W. Zhao, W. Cai, S. Yu, and V. C. M. Leung, “A survey
on security threats and defensive techniques of machine learning: A data
driven view,” IEEE Access, vol. 6, pp. 12103–12117, 2018.
[13] A. Shafahi et al., “Poison frogs! Targeted clean-label poisoning
attacks on neural networks,” in Proc. Adv. Neural Inf. Process. Syst.,
2018.
[14] Y. Wang and K. Chaudhuri, “Data poisoning attacks against
online learning,” Aug. 2018, arXiv:1808.08994. [Online]. Available:
https://arxiv.org/abs/1808.08994
[15] F. Khalid, M. A. Hanif, S. Rehman, and M. Shafique, “TrISec:
Training data-unaware imperceptible security attacks on deep
neural networks,” Nov. 2018, arXiv:1811.01031. [Online]. Available:
https://arxiv.org/abs/1811.01031
[16] P. W. Koh, J. Steinhardt, and P. Liang, “Stronger data poisoning
attacks break data sanitization defenses,” Nov. 2018, arXiv:1811.00741.
[Online]. Available: https://arxiv.org/abs/1811.00741
[17] M. Umer, C. Frederickson, and R. Polikar, “Adversarial poisoning of
importance weighting in domain adaptation,” in Proc. IEEE Symp. Ser.
Comput. Intell. (SSCI), Nov. 2018.
[18] M. Zhao, B. An, Y. Yu, S. Liu, and S. J. Pan, “Data poisoning attacks on
multi-task relationship learning,” in Proc. 32nd AAAI Conf. Artif. Intell.,
2018, pp. 2628–2635.
[19] M. Barni, K. Kallas, and B. Tondi, “A new backdoor attack in
CNNs by training set corruption without label poisoning,” Feb. 2019,
arXiv:1902.11237. [Online]. Available: https://arxiv.org/abs/1902.11237
[20] C. Miao, Q. Li, H. Xiao, W. Jiang, M. Huai, and L. Su, “Towards data
poisoning attacks in crowd sensing systems,” in Proc. 18th ACM Int.
Symp. Mobile Ad Hoc Netw. Comput. (Mobihoc), 2018.
[21] N. Baracaldo, B. Chen, H. Ludwig, A. Safavi, and R. Zhang, “Detecting
poisoning attacks on machine learning in IoT environments,” in Proc.
IEEE Int. Congr. Internet Things (ICIOT), Jul. 2018.
[22] M. Fang, G. Yang, N. Z. Gong, and J. Liu, “Poisoning attacks to graph-
based recommender systems,” in Proc. 34th Annu. Comput. Secur. Appl.
Conf. (ACSAC), 2018, pp. 1–12.
[23] C. Miao, Q. Li, L. Su, M. Huai, W. Jiang, and J. Gao, “Attack under
disguise: An intelligent data poisoning attack mechanism in crowdsourc-
ing,” in Proc. World Wide Web Conf. World Wide Web (WWW), 2018,
pp. 13–22.
[24] N. Khurana, S. Mittal, and A. Joshi, “Preventing poisoning attacks
on AI based threat intelligence systems,” Jul. 2018, arXiv:1807.07418.
[Online]. Available: https://arxiv.org/abs/1807.07418
[25] S. Chen et al., “Automated poisoning attacks and defenses in malware
detection systems: An adversarial machine learning approach,” Comput.
Secur., vol. 73, pp. 326–344, Mar. 2018.
[26] M. Sun et al., “Data poisoning attack against unsupervised node
embedding methods,” Oct. 2018, arXiv:1810.12881. [Online]. Available:
https://arxiv.org/abs/1810.12881
[27] D. J. Miller, X. Hu, Z. Xiang, and G. Kesidis, “A mixture
model based defense for data poisoning attacks against naive Bayes
spam filters,” Oct. 2018, arXiv:1811.00121. [Online]. Available:
https://arxiv.org/abs/1811.00121
[28] A. Paudice, L. Muñoz-González, and E. C. Lupu, “Label sanitization
against label flipping poisoning attacks,” in Proc. Joint Eur. Conf. Mach.
Learn. Knowl. Discovery Databases, 2018, pp. 5–15.
[29] A. Paudice, L. Muñoz-González, A. Gyorgy, and E. C. Lupu, “Detec-
tion of adversarial training examples in poisoning attacks through
anomaly detection,” Feb. 2018, arXiv:1802.03041. [Online]. Available:
https://arxiv.org/abs/1802.03041
[30] C. V. S. Praven and C. S. Kumar, “Domain adversarial representation
learning for data independent defenses against poisoning attacks,” in
Proc. ICLR, 2018, pp. 1–3.
[31] W. Cui, Y. Li, Y. Xin, and C. Liu, “Feedback-based content poisoning
mitigation in named data networking,” in Proc. IEEE Symp. Comput.
Commun. (ISCC), Jun. 2018, pp. 759–765.
[32] W. Gao et al., “ARP poisoning prevention in Internet of Things,” in Proc.
9th Int. Conf. Inf. Technol. Med. Edu. (ITME), Oct. 2018, pp. 733–736.
[33] P. Zhao, J. Li, F. Zeng, F. Xiao, C. Wang, and H. Jiang, “ILLIA: Enabling
k-anonymity-based privacy preserving against location injection attacks
in continuous LBS queries,” IEEE Internet Things J., vol. 5, no. 2,
pp. 1033–1042, Apr. 2018.
[34] P. Zhao et al., “P3-LOC: A privacy-preserving paradigm-driven frame-
work for indoor localization,” IEEE/ACM Trans. Netw., vol. 26, no. 6,
pp. 2856–2869, Dec. 2018.
[35] T. Opsahl, F. Agneessens, and J. Skvoretz, “Node centrality in weighted
networks: Generalizing degree and shortest paths,” Social Netw., vol. 32,
no. 3, pp. 245–251, Jul. 2010.
[36] R. E. Bellman, “Book review: Combinatorial optimization: Networks
and matroids,” Bull. Amer. Math. Soc., vol. 84, no. 3, pp. 461–464,
May 1978.
[37] Gwalla and Brightkite Data. Accessed: Dec. 21, 2019. [Online]. Avail-
able: http://snap.stanford.edu/data/index.html
[38] W. Gao, G. Cao, A. Iyengar, and M. Srivatsa, “Supporting cooperative
caching in disruption tolerant networks,” in Proc. 31st Int. Conf. Distrib.
Comput. Syst., Jun. 2011, pp. 1–12.
[39] V. Bindschaedler and R. Shokri, “Synthesizing plausible privacy-
preserving location traces,” in Proc. IEEE Symp. Secur. Privacy (SP),
May 2016, pp. 1–18.
[40] N. Vratonjic, K. Huguenin, V. Bindschaedler, and J.-P. Hubaux,
“A location-privacy threat stemming from the use of shared pub-
lic IP addresses,” IEEE Trans. Mobile Comput., vol. 13, no. 11,
pp. 2445–2457, Nov. 2014.
[41] J. Zhang, B. Chen, Y. Zhao, X. Cheng, and F. Hu, “Data security
and privacy-preserving in edge computing paradigm: Survey and open
issues,” IEEE Access, vol. 6, pp. 18209–18237, 2018.
[42] X. Li, S. Liu, F. Wu, S. Kumari, and J. J. P. C. Rodrigues, “Privacy
preserving data aggregation scheme for mobile edge computing assisted
IoT applications,” IEEE Internet Things J., vol. 6, no. 3, pp. 4755–4763,
Jun. 2019, doi: 10.1109/jiot.2018.2874473.
Authorized licensed use limited to: Donghua University. Downloaded on September 11,2020 at 01:59:52 UTC from IEEE Xplore. Restrictions apply.
... The concept of privacy known as l-diversity was accomplished by further improving k-anonymity. Recent work published in [26] established a feature learning approach to achieve privacy-preserving protection against poisoning. They first built an inferred social graph by employing a feature learning approach to define the social relationships that exist among the users of the social platform. ...
... OSNs have more personal data exposed to adversaries than banks/hospitals, making privacy maintenance harder. The research gaps that remain unaddressed, according to contemporary graph-based OSN privacy preservation methods [20][21][22][23][24][25][26][27][28][29][30][31][32][33], are outlined further. (1) Current OSN k-anonymization techniques use static/or sophisticated clustering, which restricts privacy protection due to under-/ over-clustering, (2) Due to AI, attackers may utilize AI to assault OSN privacy. ...
Article
Full-text available
Over the past few years, global use of Online Social Networks (OSNs) has increased. The rising use of OSN makes protecting users’ privacy from OSN attacks difficult. Finally, it affects the basic commitment to protect OSN users from such invasions. The lack of a distributed, dynamic, and artificial intelligence (AI)-based privacy-preserving strategy for performance trade-offs is a research challenge. We propose the Distributed Privacy Preservation (DPP) for OSN using Artificial Intelligence (DPP-OSN-AI) to reduce Information Loss (IL) and improve privacy preservation from different OSN threats. DPP-OSN-AI uses AI to design privacy notions in distributed OSNs. DPP-OSN-AI consists of AI-based clustering, l-diversity, and t-closeness phases to achieve the DPP for OSN. The AI-based clustering is proposed for dynamic and optimal clustering of OSN users to ensure personalized k-anonymization to protect from AI-based threats. First, the optimal number of clusters is discovered dynamically with simple computations, and then the Whale Optimization Algorithm is designed to optimally place the OSN users across the clusters such that it helps to protect them from AI-based threats. Because k-anonymized OSN clusters are insufficient to handle all privacy concerns in a distributed OSN environment, we systematically applied the l-diversity privacy idea followed by the t-closeness to it, resulting in higher DPP and lower IL. The DPP-OSN-AI model is assessed for IL Efficiency (ILE), Degree of Anonymization (DoA,) and computational complexity using publically accessible OSN datasets. Compared to state-of-the-art, DPP-OSN-AI model DoA is 15.57% higher, ILE is 17.85% higher, and computational complexity is 3.61% lower.
Chapter
Mobile Edge Computing (MEC) enables mobile users to run various delay-sensitive applications via offloading computation tasks to MEC servers. However, the location privacy and the usage pattern privacy are disclosed to the untrusted MEC servers. The most related work concerning privacy-preserving offloading schemes in MEC either consider an impractical MEC scenario consisting of a single user or take a large amount of computation and communication cost. In this paper, we propose a deep reinforcement learning-based joint optimization of delay and privacy preservation during offloading for multiple-user wireless powered MEC systems, preserving users’ both location privacy and usage pattern privacy. The main idea is that, to protect both the two kinds of privacy, we propose to disguise users’ offloading decisions and deliberately offloading redundant tasks along with the actual tasks to the MEC servers. On this basis, we further formalize the task offloading as an optimization problem of computation rate and privacy preservation. Then, we design a deep reinforcement learning-based offloading algorithm to solve such a non-convex problem, aiming to obtain the better tradeoff between the computation rate and the privacy preservation. Finally, extensive simulation results demonstrate that our algorithm can maintain a high level of computation rate while protecting users’ usage pattern privacy and location privacy, compared with two learning-based methods and two Baselines.
Article
Since social media such as Facebook and Twitter have permeated various aspects of daily life, people have strong incentives to influence information dissemination on these platforms and differentiate their content from the fierce competition. Existing dissemination strategies typically employ marketing techniques, such as seeking publicity through renowned actors or targeted advertising placements. Despite their various forms, most simply spread information to strengthen user impressions without conducting formal analyses of specific influence enhancement. And coupled with high costs, most fall short of expectations. To this end, we ingeniously formulate the task of social media dissemination as poisoning attacks, which influence specified content’s dissemination among target users by intervening in some users’ social media behaviors (including retweeting, following, and profile modifying). Correspondingly, we propose a novel poisoning attack, I nfluence-based S ocial M edia A ttack (ISMA) to generate discrete poisoning behaviors, which is difficult to achieve with existing attacks. In ISMA, we first contribute an efficient influence evaluator to quantify the spread influence of poisoning behaviors. Based on the estimated influence, we then present an imperceptible hierarchical selector and a profile modification method ProMix to select influential behaviors to poison. Notably, our attack is driven by custom attack objectives, which allows one to flexibly design different optimization goals to change the information flow, which could solve the blindness of existing influence maximization methods. Besides, behaviors such as retweeting are gentle and simple to implement. These properties make our attack more cost-effective and practical. Extensive experiments on two large-scale real-world datasets demonstrate the superiority of our method as it significantly outperforms baselines, and additionally, the proposed evaluator’s analysis of user influence provides new insights for influence maximization on social media.
Article
Mobile edge computing (MEC) technology is widely used for real‐time and bandwidth‐intensive services, but its underlying heterogeneous architecture may lead to a variety of security and privacy issues. Blockchain provides novel solutions for data security and privacy protection in MEC. However, the scalability of traditional blockchain is difficult to meet the requirements of real‐time data processing, and the consensus mechanism is not suitable for resource‐constrained devices. Moreover, the access control of MEC data needs to be further improved. Given the above problems, a data privacy protection model based on sharding blockchain and access control is designed in this paper. First, a privacy‐preserving platform based on a sharding blockchain is designed. Reputation calculation and improved Proof‐of‐Work (PoW) consensus mechanism are proposed to accommodate resource‐constrained edge devices. The incentive mechanism with rewards and punishments is designed to constrain node behavior. A reward allocation algorithm is proposed to encourage nodes to actively contribute to obtaining more rewards. Second, an access control strategy using ciphertext policy attribute‐based encryption (CP‐ABE) and RSA is designed. A smart contract is deployed to implement the automatic access control function. The InterPlanetary File System is introduced to alleviate the blockchain storage burden. Finally, we analyze the security of the proposed privacy protection model and statistics of the GAS consumed by the access control policy. The experimental results show that the proposed data privacy protection model achieves fine‐grained control of access rights, and has higher throughput and security than traditional blockchain.
Article
In recent years, the widespread adoption of Machine Learning (ML) at the core of complex IT systems has driven researchers to investigate the security and reliability of ML techniques. A specific class of threats concerns the adversarial mechanisms through which an attacker can induce a classification algorithm to produce a desired output. Such strategies, known as Adversarial Machine Learning (AML), have a twofold purpose: to compute a perturbation of the classifier's input that subverts the outcome, while maintaining the underlying intent of the original data. Although any manipulation that accomplishes these goals is theoretically acceptable, in real scenarios perturbations must correspond to a set of permissible manipulations of the input, a requirement rarely considered in the literature. In this paper, we present AdverSPAM, an AML technique designed to fool the spam-account detection system of an Online Social Network (OSN). The proposed black-box evasion attack is formulated as an optimization problem that computes the adversarial sample while maintaining two important properties of the feature space, namely statistical correlation and semantic dependency. Although demonstrated in an OSN security scenario, the approach can be applied in other contexts where the aim is to perturb data described by mutually related features. Experiments conducted on a public dataset show the effectiveness of AdverSPAM against five state-of-the-art competitors, even in the presence of adversarial defense mechanisms.
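The sketch below illustrates the general shape of such a constrained black-box evasion search, in a far simpler form than AdverSPAM's optimization: random perturbations are kept inside permissible per-feature ranges, and correlated feature pairs are forced to move in the same direction. The function names and the constraint handling are hypothetical.

```python
# A toy sketch, assuming a black-box classifier callable on feature
# vectors, per-feature bounds of shape (d, 2), and a list of positively
# correlated feature index pairs. Not the paper's algorithm.
import numpy as np

def evade(x, classifier, bounds, correlated_pairs, steps=1000, eps=0.1,
          rng=np.random.default_rng(0)):
    """Search for x' close to x with classifier(x') == 0 (benign)."""
    for _ in range(steps):
        delta = rng.uniform(-eps, eps, size=x.shape)
        # Feasibility constraint: correlated features shift together.
        for i, j in correlated_pairs:
            delta[j] = abs(delta[j]) * np.sign(delta[i])
        cand = np.clip(x + delta, bounds[:, 0], bounds[:, 1])
        if classifier(cand) == 0:   # evasion succeeded
            return cand
    return None  # no adversarial sample found within the search budget
```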
Article
Industry 4.0 is moving toward deployment with 5G as one of its main underlying communication infrastructures, and the vision of the Industry of the Future is consequently receiving more attention in research. Industry X (InX) is a significant thrust beyond the state of the art of current Industry 4.0, toward a mix of cyber and physical systems enabled by novel technological developments. In this survey, we define InX as the combination of the Industry 4.0 and 5.0 paradigms. Most of the novel technologies it relies on, such as cyber-physical systems, the industrial Internet of Things, machine learning, advances in cloud computing such as edge and fog computing, and blockchain, converge through advanced communication networks. Since communication networks are common targets of security attacks, these technologies must be secured to prevent vulnerabilities from propagating into InX and its components. Therefore, in this article, we break down the security concerns of converged InX-communication networks into the core technologies that tie these once-distinct fields together. The security challenges of each technology are highlighted and potential solutions discussed, and existing vulnerabilities and research gaps are brought forth to stir further research in this direction. New emerging visions in the context of InX are presented toward the end of the article to provoke researchers' further curiosity.
Article
Because it processes continuous, unstructured, large streams of data, mining real-time streaming data is a more challenging research issue than mining static data, and the privacy problem persists whenever sensitive data is included in the stream. In recent years, there has been significant progress in research on the anonymization of static data; generalization and suppression are the two typical strategies for anonymizing quasi-identifiers. However, the high dynamicity and potentially infinite nature of streaming data make anonymization a challenging task. To this end, we propose a novel Efficient Approximation and Privacy Preservation Algorithms (EAPPA) framework that achieves efficient pre-processing of live streaming data and preserves its privacy with minimal Information Loss (IL) and computational requirements. As existing privacy-preservation solutions for streaming data suffer from redundant data, we first propose an efficient technique for data approximation with data pre-processing, adapting the Flajolet-Martin (FM) algorithm for robust and efficient approximation of the unique elements in a data stream, together with a data-cleaning mechanism. The periodically approximated and pre-processed streaming data is then fed to the anonymization algorithm. Using adaptive clustering, we propose innovative k-anonymization and l-diversity privacy principles for data streams. The proposed approach scans a stream to detect and reuse clusters that fulfill the k-anonymity and l-diversity criteria, reducing anonymization time and IL. The experimental results reveal the efficiency of the EAPPA framework compared with state-of-the-art methods.
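For reference, the following is a minimal sketch of the classic Flajolet-Martin distinct-count estimate the abstract builds on: hash each stream element, track the maximum number of trailing zero bits across the hashes, and estimate the cardinality as 2^R divided by the correction constant 0.77351. Several independent hashes are averaged to tame the variance; the paper's adapted variant is more elaborate.

```python
# A minimal FM sketch using the standard library; hash-count and stream
# are illustrative choices.
import hashlib
import statistics

def trailing_zeros(n):
    return (n & -n).bit_length() - 1 if n else 32

def fm_estimate(stream, num_hashes=16):
    max_r = [0] * num_hashes
    for item in stream:
        for k in range(num_hashes):
            h = hashlib.sha1(f"{k}:{item}".encode()).digest()
            v = int.from_bytes(h[:4], "big")
            max_r[k] = max(max_r[k], trailing_zeros(v))
    # Average the exponents before exponentiating (reduces variance).
    r_bar = statistics.mean(max_r)
    return (2 ** r_bar) / 0.77351

# e.g. a stream with ~1000 distinct user IDs:
print(fm_estimate(i % 1000 for i in range(50_000)))
```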
Article
Mobile Edge Computing (MEC) is a new computing paradigm that delivers cloud computing and information technology (IT) services at the network's edge. By shifting the load of cloud computing to individual local servers, MEC helps meet the requirements of ultralow latency and localized data processing and extends the potential of the Internet of Things (IoT) for end users. However, the crosscutting nature of MEC and the multidisciplinary components necessary for its deployment introduce additional security and privacy concerns. Fortunately, Artificial Intelligence (AI) algorithms can cope with highly unpredictable and complex data, a distinct advantage when dealing with sophisticated and evolving adversaries. Hence, in this paper we provide a comprehensive survey of security and privacy in MEC from the perspective of AI. On the one hand, we use the European Telecommunications Standards Institute (ETSI) MEC reference architecture as our base framework, merging in Software-Defined Networking (SDN) and Network Function Virtualization (NFV) to better illustrate a serviceable MEC platform. On the other hand, we focus on new security and privacy issues, as well as potential solutions, from the viewpoint of AI. Finally, we discuss the opportunities and challenges of applying AI to MEC security and privacy as possible future research directions.
Article
Machine learning models trained on data from the outside world can be corrupted by data poisoning attacks that inject malicious points into the models’ training sets. A common defense against these attacks is data sanitization: first filter out anomalous training points before training the model. In this paper, we develop three attacks that can bypass a broad range of common data sanitization defenses, including anomaly detectors based on nearest neighbors, training loss, and singular-value decomposition. By adding just 3% poisoned data, our attacks successfully increase test error on the Enron spam detection dataset from 3 to 24% and on the IMDB sentiment classification dataset from 12 to 29%. In contrast, existing attacks which do not explicitly account for these data sanitization defenses are defeated by them. Our attacks are based on two ideas: (i) we coordinate our attacks to place poisoned points near one another, and (ii) we formulate each attack as a constrained optimization problem, with constraints designed to ensure that the poisoned points evade detection. As this optimization involves solving an expensive bilevel problem, our three attacks correspond to different ways of approximating this problem, based on influence functions; minimax duality; and the Karush–Kuhn–Tucker (KKT) conditions. Our results underscore the need to develop more robust defenses against data poisoning attacks.
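To ground the defense side of this abstract, the sketch below shows the kind of nearest-neighbor sanitization these attacks are designed to evade: discard any training point whose mean distance to its k nearest same-class neighbors is anomalously large. Placing poisoned points near one another defeats it because clustered poison points become each other's neighbors. The code uses scikit-learn; the threshold choice is illustrative.

```python
# A minimal kNN sanitizer sketch, assuming features X (n, d) and integer
# labels y (n,). Not the defenses' reference implementation.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_sanitize(X, y, k=5, quantile=0.95):
    keep = np.zeros(len(X), dtype=bool)
    for label in np.unique(y):
        idx = np.where(y == label)[0]
        nn = NearestNeighbors(n_neighbors=k + 1).fit(X[idx])
        dists, _ = nn.kneighbors(X[idx])   # column 0 is the point itself
        score = dists[:, 1:].mean(axis=1)
        # Keep points whose neighborhood distance is not in the top tail.
        keep[idx] = score <= np.quantile(score, quantile)
    return X[keep], y[keep]
```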
Conference Paper
Full-text available
Backdoor attacks against CNNs represent a new threat to deep learning systems, owing to the possibility of corrupting the training set so as to induce incorrect behaviour at test time. To prevent the trainer from recognising the presence of corrupted samples, the corruption of the training set must be as stealthy as possible. Previous works have focused on the stealthiness of the perturbation injected into the training samples; however, they all assume that the labels of the corrupted samples are also poisoned. This greatly reduces the stealthiness of the attack, since samples whose content does not agree with their labels can be identified by visual inspection of the training set or by running a pre-classification step. In this paper we present a new backdoor attack without label poisoning. Since the attack works by corrupting only samples of the target class, it has the additional advantage of not needing to identify beforehand the class of the samples to be attacked at test time. Results obtained on the MNIST digit recognition task and a traffic-sign classification task show that backdoor attacks without label poisoning are indeed possible, raising a new alarm regarding the use of deep learning in security-critical applications.
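The corruption step the paper describes can be illustrated with a toy sketch: a small trigger patch is stamped onto a fraction of target-class training images, and labels are left untouched. At test time the same patch would steer predictions toward the target class. Array shapes are assumed MNIST-like (N, 28, 28) with values in [0, 1]; the trigger and poisoning rate are illustrative, not the paper's.

```python
# A clean-label backdoor corruption sketch, assuming MNIST-like arrays.
import numpy as np

def add_trigger(images, size=3, value=1.0):
    """Stamp a bright square in the bottom-right corner of each image."""
    out = images.copy()
    out[:, -size:, -size:] = value
    return out

def poison_target_class(X, y, target, fraction=0.2,
                        rng=np.random.default_rng(0)):
    idx = np.where(y == target)[0]
    chosen = rng.choice(idx, size=int(fraction * len(idx)), replace=False)
    X = X.copy()
    X[chosen] = add_trigger(X[chosen])
    return X, y   # y is returned unchanged: no label poisoning
```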
Article
Public shaming in online social networks and related public forums such as Twitter has been increasing in recent years. These events are known to have a devastating impact on the victim's social, political, and financial life. Notwithstanding these known ill effects, popular online social media have done little to remedy the problem, often citing the large volume and diversity of such comments and, therefore, the unfeasibly large number of human moderators the task would require. In this paper, we automate the detection of public shaming on Twitter from the perspective of victims and explore two aspects in particular: events and shamers. Shaming tweets are categorized into six types: abusive, comparison, passing judgment, religious/ethnic, sarcasm/joke, and whataboutery; each tweet is classified into one of these types or as nonshaming. We observe that, of all the users who comment on a particular shaming event, the majority are likely to shame the victim. Interestingly, shamers' follower counts also grow faster than those of nonshamers on Twitter. Finally, based on the categorization and classification of shaming tweets, a web application called BlockShame has been designed and deployed for on-the-fly muting or blocking of shamers attacking a victim on Twitter.
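A minimal sketch of the seven-way classification step (six shaming types plus nonshaming) is given below, assuming scikit-learn; the paper's actual features and model are richer than this TF-IDF plus linear-SVM baseline, so treat it only as an outline of the task setup.

```python
# A baseline text-classification sketch; label names follow the abstract,
# the feature and model choices are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

LABELS = ["abusive", "comparison", "passing_judgment",
          "religious_ethnic", "sarcasm_joke", "whataboutery", "nonshaming"]

clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2, sublinear_tf=True),
    LinearSVC(),
)
# clf.fit(train_tweets, train_labels)    # train_labels drawn from LABELS
# predicted = clf.predict(test_tweets)
```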
Article
Multi-task learning (MTL) is a machine learning paradigm that improves the performance of each task by exploiting useful information contained in multiple related tasks. However, the relatedness of tasks can also be exploited by attackers to launch data poisoning attacks, which have been shown to be a serious threat to single-task learning. In this paper, we provide the first study of the vulnerability of MTL. Specifically, we focus on multi-task relationship learning (MTRL) models, a popular subclass of MTL models in which task relationships are quantified and learned directly from training data. We formulate the problem of computing optimal poisoning attacks on MTRL as a bilevel program that adapts to an arbitrary choice of target tasks and attacking tasks, and we propose an efficient algorithm, PATOM, for computing optimal attack strategies. PATOM leverages the optimality conditions of the MTRL subproblem to compute the implicit gradients of the upper-level objective function. Experimental results on real-world datasets show that MTRL models are very sensitive to poisoning attacks and that an attacker can significantly degrade the performance of target tasks, either by poisoning them directly or by poisoning related tasks and exploiting task relatedness. We also find that the attacked tasks are always strongly correlated, which provides a clue for defending against such attacks.
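The implicit-gradient idea behind such bilevel attacks can be shown on a much simpler lower-level problem than MTRL: single-task ridge regression, whose solution w*(X) = (X'X + lam I)^(-1) X'y can be differentiated through its optimality conditions to obtain the gradient of the attacker's loss with respect to one poisoned training point. The sketch below is that simplification, not PATOM itself; all names are illustrative.

```python
# Implicit gradient of an attacker loss through a ridge-regression lower
# level; a sketch of the general technique, assuming numpy arrays.
import numpy as np

def implicit_grad_xp(X, y, p, X_tgt, y_tgt, lam=0.1):
    """Gradient of attacker loss ||X_tgt w* - y_tgt||^2 w.r.t. row p of X."""
    n, d = X.shape
    A = X.T @ X + lam * np.eye(d)
    w = np.linalg.solve(A, X.T @ y)         # lower-level solution w*
    g = 2 * X_tgt.T @ (X_tgt @ w - y_tgt)   # d(attacker loss)/dw
    u = np.linalg.solve(A, g)               # adjoint solve through A
    x_p, y_p = X[p], y[p]
    residual = y_p - x_p @ w
    # Differentiating the normal equations A w = X'y w.r.t. x_p gives:
    return residual * u - (u @ x_p) * w     # d(attacker loss)/dx_p

# One gradient-ascent step on the poisoned point (projection omitted):
# X[p] += step * implicit_grad_xp(X, y, p, X_tgt, y_tgt)
```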
Article
We have made good progress in our impact and reputation over the last year. According to the latest data released by Scopus on February 11, 2019, our CiteScore hit a historical high of 3.94, and TCSS ranks 8th out of the 226 journals (top 3.54%) in the field of social sciences. This is a solid improvement over the corresponding 2017 figures (CiteScore: 2.36; rank: 17/226, top 8%). Thanks and congratulations to our authors, reviewers, and editorial board members. The current issue includes 17 regular papers and a brief discussion on social education.
Chapter
Many machine learning systems rely on data collected in the wild from untrusted sources, exposing the learning algorithms to data poisoning. Attackers can inject malicious data into the training dataset to subvert the learning process, compromising the performance of the algorithm and producing errors in a targeted or indiscriminate way. Label flipping attacks are a special case of data poisoning in which the attacker controls the labels assigned to a fraction of the training points. Even when the attacker's capabilities are constrained, such attacks have been shown to significantly degrade the performance of the system. In this paper we propose an efficient algorithm for performing optimal label-flipping poisoning attacks, together with a mechanism that detects and relabels suspicious data points, mitigating the effect of such poisoning attacks.
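The relabeling defense can be illustrated with a simplified neighborhood-consensus rule: a point whose label disagrees with most of its k nearest neighbors is treated as suspicious and relabeled to the neighborhood majority. The sketch below assumes scikit-learn and small non-negative integer labels; the k and agreement threshold are illustrative, and the chapter's actual mechanism may differ.

```python
# A minimal relabeling sketch against label flipping.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def relabel_suspicious(X, y, k=10, agreement=0.8):
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)             # column 0 is the point itself
    y_clean = y.copy()
    for i, neigh in enumerate(idx[:, 1:]):
        labels = y[neigh]                 # labels: small non-negative ints
        majority = np.bincount(labels).argmax()
        share = np.mean(labels == majority)
        if majority != y[i] and share >= agreement:
            y_clean[i] = majority         # flip back to the consensus label
    return y_clean
```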