RR-LADP: A Privacy-Enhanced Federated
Learning Scheme for Internet of Everything
Zerui Li, Harbin Institute of Technology, Shenzhen
Qing Liao, Harbin Institute of Technology, Shenzhen
Mohsen Guizani, Qatar University, Doha
Yuchen Tian, Harbin Institute of Technology, Shenzhen
Yang Liu, Harbin Institute of Technology, Shenzhen; Peng Cheng Laboratory
Weizhe Zhang, Harbin Institute of Technology, Shenzhen; Peng Cheng Laboratory
Xiaojiang Du, Temple University, Philadelphia
Abstract—While the widespread use of ubiquitously connected devices in IoE offers enormous benefits, it also raises serious privacy concerns. Federated learning, one of the promising solutions to alleviate such problems, is considered capable of training models without exposing the raw data kept by multiple devices. However, both malicious attackers and untrusted servers can deduce users' private information from the local updates of each device. Previous studies mainly focus on privacy-preserving approaches inside the servers, which requires the framework to be built on trusted servers. In this paper, we propose a privacy-enhanced federated learning scheme for IoE. Two mechanisms are adopted in our approach, namely the randomized response (RR) mechanism and the local adaptive differential privacy (LADP) mechanism. RR is adopted to prevent the server from knowing whose updates are collected in each round. LADP enables devices to add noise adaptively to their local updates before submitting them to the server. Experiments demonstrate the feasibility and effectiveness of our approach.
I. INTRODUCTION
The Internet of Everything (IoE) redefines the connections between people, things, and data and changes the way people interact with devices. In this era, any object can be transformed into network data through corresponding sensors. At the same time, advances in communication technology and the enhancement of edge computing capabilities facilitate the application of machine learning in the Internet of Everything, which means that network data can be effectively mined to support intelligent services. For example, by collecting static and dynamic information about users, smart furniture and intelligent software can improve the efficiency of work and life. In general, more data means better service. In this environment, most IoT devices continuously collect and upload private data during operation, which can potentially compromise users' privacy [?].

Recently, federated learning, which localizes the training process, has been seen as a promising machine learning mechanism for working with private data. It distributes the task of model training to multiple participants and aggregates local updates to iteratively generate a global model. The advantage is that the information delivered to the server is the model weight difference or the training gradient [?], instead of the raw data. This distributed learning setup decouples the training tasks from the need for the server to centralize data. It can use the private data of different users to learn a high-quality shared model, while leaving the raw data on the local IoT devices. Therefore, the risk of privacy leakage caused by data transmission, cloud storage, and centralized training can be effectively decreased. At the same time, the availability of the training data is guaranteed, as there is no need to encrypt the data before using it.
In fact, similar to most privacy solutions, federated learning is based on an important assumption: the process is scheduled by a trusted server. Untrusted servers and malicious attackers may perform model inversion attacks by obtaining the communication parameters of federated learning. In IoE solutions, data collected from sensors are transmitted through multiple layers to generate services. The multi-layer structure composed of hardware and software makes federated learning more vulnerable to potential attacks.

In this paper, we propose a privacy-enhanced federated learning scheme for IoE. Two mechanisms are adopted in our approach, namely the randomized response (RR) mechanism and the local adaptive differential privacy (LADP) mechanism. The main contributions are as follows.

• We propose the RR mechanism, which is executed by each device to enhance the privacy of device selection. In each training round of RR federated learning, the server cannot determine whether a particular device participated in the training. The mechanism, therefore, can prevent untrusted servers and malicious attackers from knowing which devices' updates are included in the communication content.

• We adopt the LADP mechanism in the local training stage. Gaussian noise is added adaptively to the local updates of each device before the updates are uploaded to the server. Hence, the mechanism can prevent even untrusted servers and malicious attackers from deducing information about the training data from the local updates.
II. BACKGROUND AND MOTIVATION

A. Federated learning

The general flow of federated learning using the FederatedAveraging (FedAvg) algorithm to aggregate updates is as follows [?].

Suppose there are a total of $K$ clients and each client has a private dataset. In each training round $t$, the server randomly selects $K_0$ ($K_0 \le K$) clients and sends them the global model with weights $\omega_{t-1}$. Each selected client $k$ trains the model on its private data and uploads the weight difference $\Delta\omega_t^k$. Finally, the server averages these local updates and generates a new global model, and the process repeats as:

$\omega_t = \omega_{t-1} + \frac{1}{K_0}\sum_{k}\Delta\omega_t^k$  (1)
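
As a point of reference, the following minimal sketch (in Python with NumPy; the function and variable names are ours, not from the paper) shows the server-side aggregation step of Equation (1): the weight differences uploaded by the $K_0$ selected clients are averaged and applied to the previous global model.

    import numpy as np

    def fedavg_aggregate(w_prev, client_deltas):
        # Equation (1): w_t = w_{t-1} + (1/K0) * sum_k dw_t^k
        #   w_prev        : flattened global weights w_{t-1}
        #   client_deltas : list of weight differences dw_t^k from the K0 selected clients
        k0 = len(client_deltas)
        return w_prev + sum(client_deltas) / k0

    # Toy usage: three clients, a five-parameter "model".
    w_prev = np.zeros(5)
    deltas = [0.01 * np.random.default_rng(i).standard_normal(5) for i in range(3)]
    w_next = fedavg_aggregate(w_prev, deltas)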
B. Differential privacy

Differential privacy provides a strong privacy guarantee for aggregate data [?]. Its definition is as follows.

Define two datasets to be adjacent if they differ in only a single record. A given mechanism $\mathcal{M}: \mathcal{D} \to \mathcal{R}$ has domain $\mathcal{D}$ and range $\mathcal{R}$. We say the mechanism $\mathcal{M}$ satisfies $(\varepsilon, \delta)$-differential privacy if, for any two adjacent inputs $d, d' \in \mathcal{D}$ and for any subset of outputs $S \subseteq \mathcal{R}$, the following inequality holds:

$\Pr[\mathcal{M}(d) \in S] \le e^{\varepsilon}\Pr[\mathcal{M}(d') \in S] + \delta$  (2)

The privacy budget $\varepsilon$ bounds the privacy loss, and the slack variable $\delta$ allows the definition to be broken with a given probability.

The general way to realize this mechanism is to add Gaussian noise to approximate a real-valued function $f: \mathcal{D} \to \mathbb{R}$ with differential privacy. The noise is calibrated to the sensitivity $S_f$, which is the maximum absolute distance $|f(d) - f(d')|$, where $f(d)$ and $f(d')$ are the function values corresponding to the adjacent inputs $d$ and $d'$. We define the Gaussian noise addition mechanism as $\mathcal{M}(d) = f(d) + \mathcal{N}(0, S_f^2\sigma^2)$, where $\mathcal{N}(0, S_f^2\sigma^2)$ is Gaussian noise with mean $0$ and standard deviation $S_f\sigma$.
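
As a concrete illustration (a minimal sketch of ours, not code from the paper), the Gaussian mechanism above amounts to adding zero-mean noise with standard deviation $S_f\sigma$ to the output of $f$:

    import numpy as np

    def gaussian_mechanism(value, sensitivity, sigma, rng=None):
        # Return f(d) + N(0, (S_f * sigma)^2), applied element-wise.
        #   value       : f(d), a scalar or NumPy array
        #   sensitivity : S_f, the maximum |f(d) - f(d')| over adjacent datasets
        #   sigma       : noise multiplier; a larger sigma gives a stronger guarantee
        rng = rng or np.random.default_rng()
        return value + rng.normal(loc=0.0, scale=sensitivity * sigma, size=np.shape(value))

    # Example: privatize a mean query whose sensitivity is 1/n for a dataset of size n,
    # assuming each record lies in [0, 1].
    data = np.array([0.2, 0.9, 0.4, 0.7])
    noisy_mean = gaussian_mechanism(data.mean(), sensitivity=1.0 / len(data), sigma=1.0)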
C. Motivation

Privacy-preserving federated learning can be realized by leveraging differential privacy [?], which effectively reduces the possibility of inferring extra information from the data transferred in each training round. Ideally, this requires a trusted server to complete the noise addition operation. For the real-world training process, we consider the following situations:

• The server is curious: the server can normally complete the privacy-processing steps such as noise addition after FedAvg. At the same time, it wants to infer the private data of the clients.
• The server is incompetent: the server may fail to add noise to the averaged updates for some reason before releasing the global model. This puts all participating clients at great risk of privacy leakage.

Due to the deficiency of the centralized privacy-preserving approach, we adjust the client selection mechanism and shift the noise addition to the client side, in order to reduce the dependence on the server. In the context of FedAvg, we are more interested in the weight difference contributed by each client. In this way, the noise contained in the aggregation is the sum of the noise added by each client, and the aggregate satisfies differential privacy if the process on each client does [?].
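
The additivity argument can be checked directly. The short sketch below (our own illustration with invented variable names) verifies that when each client perturbs its update with independent Gaussian noise, the aggregate carries the sum of those noises, whose variance is the sum of the per-client variances.

    import numpy as np

    rng = np.random.default_rng(0)
    k0, dim, sigma_local = 100, 10_000, 0.5

    clean_updates = rng.standard_normal((k0, dim))               # clean local updates
    local_noise = rng.normal(scale=sigma_local, size=(k0, dim))  # independent per-client noise
    noisy_sum = (clean_updates + local_noise).sum(axis=0)        # what the aggregator sees

    # The aggregate equals the clean sum plus Gaussian noise whose standard deviation
    # is sqrt(k0) * sigma_local, i.e. the per-client noises simply add up.
    residual = noisy_sum - clean_updates.sum(axis=0)
    print(residual.std(), np.sqrt(k0) * sigma_local)             # empirically close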
III. STATE-OF-THE-ART
Instead of collecting data from clients and training the model on the server in a centralized way, federated learning allows multiple clients to learn a model collaboratively while keeping their data local. It provides a new solution for preserving privacy in machine learning. Google first proposed federated learning [?], a privacy-preserving collaborative modeling mechanism, and applied it to input prediction and query suggestions in Gboard [?], [?]. Konecny et al. [?] used structured updates and model compression to reduce uplink and downlink communication costs. Bonawitz et al. [?] proposed a protocol to improve the robustness of federated learning. However, these works did not consider the privacy risks of the federated learning mechanism itself.
The research of Fredrikson et al. [?] shows that, after training, sample data involved in model training can be reconstructed from the model parameters, even if the data remain local. To minimize such disclosure, Geyer et al. [?] incorporated differential privacy into the aggregation update on the server side. Differential privacy can indeed reduce the correlation between the final model and the aggregated updates. However, it is a post-processing method with some limitations. Dealing with the results of FedAvg directly while ignoring the training process may reduce the usability of the model, and the added noise makes it harder to observe the true expression of the raw data. Moreover, such an approach ignores the protection of the updates transmitted during communication, leaving the model vulnerable to inversion attacks. Agarwal et al. [?] proposed to add noise in a distributed manner to approximate global privacy, and Wei et al. [?] applied this method to federated learning. The global model satisfies differential privacy when each part does. However, the noise addition is performed by the server, so such privacy preservation is invalid against untrusted servers. Some researchers [?], [?] incorporated homomorphic encryption into federated learning. In such encryption-based methods, local updates are transmitted and computed in the form of ciphertext, so privacy can be well preserved without losing model accuracy. However, the types of calculations supported by homomorphic encryption are limited. Encryption, decryption, and computation over ciphertext incur large computational overhead [?], and transmitting ciphertext is also time consuming. These limitations make the approach impractical for IoE.
IV. PROPOSED METHOD
A. The system design
Considering the diversity of devices, we uniformly term the actual devices in IoE as clients for the convenience of analysis. Fig. 1 illustrates the main components and the process of RR-LADP. For a collaborative learning task, the server distributes the initial model and samples clients. Then each client randomly responds to the training request. After that, the actual participants train the model locally with their private data and add an appropriate amount of noise to the weight differences for privacy preservation. Finally, an edge computing node (i.e., an edge server running a secure multi-party summation protocol) aggregates all local updates and returns the result to the server to update the global model. Through several rounds, the global model, which incorporates contributions from multiple clients, can perform well.

Fig. 1: The RR-LADP Framework.
B. RR federated learning
Due to the huge number of devices in the IoE environment, it is impracticable to involve every device that meets the training requirements in each training round. Training tasks involving a large number of devices are more likely to be interrupted, which poses a great challenge to the distributed decision-making capabilities of the server [?], [?]. In fact, only a fraction of the devices suffices to generate a desirable model, and in this way the communication pressure can be effectively reduced. However, in the traditional federated learning flow, the server controls the whole process of client selection. We propose a disturbance mechanism, termed RR, which introduces a deviation between the actual participants and the server's sampling results. Therefore, it is hard for a curious server or an attacker to link the final model to a particular client.
Before each round of communication, the server checks and establishes communication with the IoT devices that meet the training requirements. Suppose a total of $K$ eligible clients participate in the global model construction. The server randomly selects $K_0$ ($K_0 \le K$) clients for training. We define a state parameter $\lambda_t^k$, with value $0$ or $1$, representing whether client $k$ participates in round $t$, and a response probability $p$. Based on the server's sampling results, the clients initialize their state parameters. All clients keep their state parameters unchanged with probability $p$ and flip them with probability $(1 - p)$. Then, the participation probability of each client $k$ is:

$\Pr[\lambda_t^k = 1] = \frac{K_0}{K}p + \left(1 - \frac{K_0}{K}\right)(1 - p)$  (3)

According to Equation (3), the server can estimate the number of participants as $\hat{K}_t = \Pr[\lambda_t^k = 1] \cdot K$. By constructing a maximum likelihood function, we can verify that $\hat{K}_t$ is an unbiased estimate of $K_t$, the number of actual participants in round $t$. In this way, the server obtains a value close to $K_t$ for FedAvg [?] in each round.
Additionally, we set $p = \frac{e^{\varepsilon}}{e^{\varepsilon} + 1}$ to satisfy $\varepsilon$-differential privacy, so that RR federated learning meets rigorous rather than merely intuitive privacy guarantees [?]. The response probability $p$ and the privacy budget $\varepsilon$ are positively correlated. The higher the privacy budget, the more likely the selected clients are to respond, which means the actual participation is close to the server's sampling results and may cause a risk of privacy leakage. Conversely, a lower privacy budget leads to a higher flip probability and a lower risk.
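
In the sketch below (our own Python illustration; none of the names come from the paper), each client keeps the server's sampling decision with probability $p = e^{\varepsilon}/(e^{\varepsilon}+1)$ and flips it otherwise, and the server estimates the number of actual participants from Equation (3).

    import numpy as np

    def rr_participation(sampled_states, epsilon, rng):
        # Randomized response on each client's sampling state (1 = selected, 0 = not selected).
        p = np.exp(epsilon) / (np.exp(epsilon) + 1.0)                # keep probability
        keep = rng.random(len(sampled_states)) < p
        return np.where(keep, sampled_states, 1 - sampled_states)   # flip with probability 1 - p

    rng = np.random.default_rng(0)
    K, K0, epsilon = 100, 30, 1.0
    sampled = np.zeros(K, dtype=int)
    sampled[rng.choice(K, size=K0, replace=False)] = 1               # the server's sampling result

    lam = rr_participation(sampled, epsilon, rng)                    # actual participation states
    p = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    K_hat = ((K0 / K) * p + (1 - K0 / K) * (1 - p)) * K              # estimate from Equation (3)
    print(lam.sum(), K_hat)                                          # actual vs. estimated participants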
C. Local adaptive differential privacy
A centralized privacy-enhancing process carries certain risks, because the local updates of each participant can be obtained before aggregation. According to the composition theorems in [?], the global process satisfies $(\varepsilon, \delta)$-differential privacy if each local process satisfies $(\varepsilon_k, \delta)$-differential privacy and $\sum_{k=1}^{K}\varepsilon_k \le \varepsilon$. Therefore, we incorporate differential privacy into the client. The following details our solution based on FedAvg.
Algorithm 1 gives the basic process of LADP. For each responding client, we train the weight matrix to obtain the difference $\Delta\omega_t^k = \omega_t^k - \omega_{t-1}$ from the global model generated in the last round. In each epoch, we optimize the loss function by batch gradient descent (BGD) and record the 2-norm of the difference matrix $\|\omega - \omega_{t-1}\|_2$. When the number of epochs reaches the upper bound, we stop training and clip the difference matrix with $C$, the mean of the recorded norms: if $\|\omega - \omega_{t-1}\|_2 < C$, the elements of the difference matrix are kept unchanged; otherwise, they are scaled down by $\frac{C}{\|\Delta\omega\|_2}$. This effectively reduces the expression of private data and improves the generalization ability of the global model. Before sending the update to the aggregator, we add noise to it to enhance privacy.

Algorithm 1 Local update of LADP

update(k, $\omega_{t-1}$):
  initialize $\lambda_t^k$
  if $\lambda_t^k = 1$ then
    $\omega \leftarrow \omega_{t-1}$
    $\mathcal{B} \leftarrow$ split $Set_k$ into batches
    for each local epoch $i = 1, 2, 3, \ldots$ do
      for batch $b \in \mathcal{B}$ do
        $\omega \leftarrow \omega - \eta\nabla L(\omega, b)$
      $C_i \leftarrow \|\omega - \omega_{t-1}\|_2$
    $C \leftarrow$ mean of the $C_i$
    $\Delta\omega \leftarrow (\omega - \omega_{t-1}) + \frac{1}{|\mathcal{B}|}\mathcal{N}(0, \sigma^2 S^2)$
    $\Delta\omega \leftarrow \Delta\omega \cdot \min\left(1, \frac{C}{\|\Delta\omega\|_2}\right)$
  else
    $\Delta\omega \leftarrow 0$
  return $\Delta\omega$
We adopt the Gaussian mechanism to distort the local updates of each client. The noise variance $\sigma^2 S^2$ determines how much of each client's contribution is retained. Excessive noise means the updates are highly distorted, while too little noise cannot provide sufficient privacy preservation. In each training round, $\sigma$ is fixed and the value of $S$ is adjusted adaptively. On the one hand, we set $S = C$ to adjust the noise addition according to the update itself: if a single update is outstanding, the noise increases accordingly. On the other hand, we expect clients with different amounts of data to contribute similarly to the global model, so we scale down the noise by $\frac{1}{|\mathcal{B}|}$.
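
The following NumPy sketch (our own illustration; the model is abstracted as a flat weight vector and grad_fn is a caller-supplied gradient function, both assumptions of ours) follows the order of operations in Algorithm 1: local BGD, the adaptive clip value C taken as the mean of the per-epoch norms, noise scaled by 1/|B| with S = C, and the final clipping.

    import numpy as np

    def ladp_local_update(w_global, batches, grad_fn, lr=0.1, epochs=4, sigma=1.0, rng=None):
        # One responding client's LADP update (Algorithm 1), with the model as a flat vector.
        #   w_global : global weights w_{t-1}
        #   batches  : list of mini-batches of the client's private data
        #   grad_fn  : grad_fn(w, batch) -> gradient of the loss (supplied by the caller)
        rng = rng or np.random.default_rng()
        w = w_global.copy()
        norms = []
        for _ in range(epochs):
            for batch in batches:                          # batch gradient descent
                w = w - lr * grad_fn(w, batch)
            norms.append(np.linalg.norm(w - w_global))     # 2-norm recorded per epoch
        C = float(np.mean(norms))                          # adaptive clip value (mean of norms)
        noise = rng.normal(scale=sigma * C, size=w.shape)  # Gaussian noise with S = C
        delta_w = (w - w_global) + noise / len(batches)    # noise scaled down by 1/|B|
        delta_w *= min(1.0, C / (np.linalg.norm(delta_w) + 1e-12))  # clip with C
        return delta_w

    # Toy usage: least-squares on random data, two batches of ten samples each.
    rng = np.random.default_rng(0)
    batches = [(rng.standard_normal((10, 5)), rng.standard_normal(10)) for _ in range(2)]
    grad = lambda w, b: b[0].T @ (b[0] @ w - b[1]) / len(b[1])
    delta = ladp_local_update(np.zeros(5), batches, grad)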
D. Track privacy loss globally
Privacy loss reflects the risk of data privacy leakage. We adopt the moments accountant [?] to track and limit the privacy loss. In model training with multiple rounds, we take knowledge inheritance into account. Suppose $\xi$ is an observed outcome of the mechanism $\mathcal{M}$ on adjacent datasets $d$ and $d'$; we define the privacy loss as

$L^{(\xi)}_{\mathcal{M}(pre,d)\,\|\,\mathcal{M}(pre,d')} = \ln\frac{\Pr[\mathcal{M}(pre,d) = \xi]}{\Pr[\mathcal{M}(pre,d') = \xi]}$

where $pre$ includes all previous outputs. The privacy loss increases when the probability that the observation comes from the original dataset is higher. We define $\alpha^{(\tau)}_{\mathcal{M}(pre,d)\,\|\,\mathcal{M}(pre,d')}$ as the cumulant generating function of $L^{(\xi)}_{\mathcal{M}(pre,d)\,\|\,\mathcal{M}(pre,d')}$ at value $\tau$. Considering all adjacent datasets and all possible previous outputs, we track the privacy loss of client $k$ as follows:

$\alpha^{(\tau)}_{\mathcal{M}_k} \triangleq \max_{pre,d,d'} \alpha^{(\tau)}_{\mathcal{M}_k(pre,d)\,\|\,\mathcal{M}_k(pre,d')}$  (4)

Then we can track the global privacy loss with $\alpha^{(\tau)}_{\mathcal{M}} = \sum_k \alpha^{(\tau)}_{\mathcal{M}_k}$. For any fixed privacy budget $\varepsilon$, we can calculate the current value of the slack variable as $\delta = \min_{\tau} e^{\alpha^{(\tau)}_{\mathcal{M}} - \tau\varepsilon}$. When $\delta$ reaches the bound, the accumulated privacy loss after the current round is out of tolerance, so we stop training and return the result. The setting of the bound usually depends on the sample space. Considering the disturbance from both RR and LADP, we set the bound to $\frac{1}{|KB|}$.
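
The conversion from accumulated log moments to $\delta$ can be sketched as follows (our own Python illustration over a grid of moment orders; the per-round values below are synthetic placeholders, not the moments actually computed in the paper).

    import numpy as np

    def delta_from_moments(total_alpha, taus, epsilon):
        # delta = min over tau of exp(alpha_M(tau) - tau * epsilon)
        return float(np.min(np.exp(total_alpha - taus * epsilon)))

    taus = np.arange(1, 33, dtype=float)          # grid of moment orders tau
    per_round_alpha = 0.01 * taus * (taus + 1)    # stand-in for sum_k alpha_{M_k}(tau) in one round
    epsilon, bound = 8.0, 1e-5                    # fixed privacy budget and the bound 1/|KB|

    total_alpha = np.zeros_like(taus)
    for t in range(1000):
        total_alpha += per_round_alpha            # accumulate the global log moments
        if delta_from_moments(total_alpha, taus, epsilon) > bound:
            print("privacy budget exhausted after round", t + 1)
            break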
V. EXPERIMENT AND ANALYSIS

We simulate the training process of federated learning and apply our proposed mechanism to it. The experimental results verify the feasibility and effectiveness of the proposed mechanism. Each client trains a fully connected neural network with the same structure, containing two hidden layers with 600 and 400 neurons. The simple network structure allows us to better evaluate the impact of the mechanism. Cross entropy is chosen as the loss function, the learning rate is 0.1, and the optimization method used within clients is BGD. To simulate non-IID data, we divide MNIST into different subsets, each of which contains samples of only two or three digits (as sketched after this paragraph); a model trained on the dataset of a single client therefore cannot accurately recognize all digits. Before training, each client divides its dataset into multiple batches $B = \{b_1, b_2, b_3, \ldots\}$. For comparison, the value of $K_0$ follows the CDP setting [?], that is, $K_0 = 30, 100, 300$ when $K = 100, 1000, 10000$. Similarly, the batch size is set to 10 and the number of epochs trained by each client is 4.
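
The shard-style split below is a sketch of ours, under the assumption that a standard sort-and-shard simulation is acceptable (the paper does not specify its exact partitioning code); it produces disjoint client subsets, each covering roughly two or three digit classes.

    import numpy as np

    def non_iid_partition(labels, num_clients, shards_per_client=2, seed=0):
        # Sort sample indices by digit, cut them into shards, and hand each client a few
        # shards, so every client sees only about two or three digit classes.
        rng = np.random.default_rng(seed)
        order = np.argsort(labels, kind="stable")
        shards = np.array_split(order, num_clients * shards_per_client)
        shard_ids = rng.permutation(len(shards))
        return [np.concatenate([shards[s] for s in
                                shard_ids[c * shards_per_client:(c + 1) * shards_per_client]])
                for c in range(num_clients)]

    # Toy usage with synthetic labels standing in for the MNIST label vector.
    fake_labels = np.random.default_rng(1).integers(0, 10, size=60000)
    client_indices = non_iid_partition(fake_labels, num_clients=100)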
Separate performance: In the early stages of the experiment, we evaluate the effects of RR and LADP separately. To verify the feasibility of RR, we conducted experiments on RR federated learning under different levels of disturbance, which depend on $\varepsilon$, with $K = 100$ and $K_0 = 30$. Although the actual number of participants fluctuates around the estimated result, this has no consequence for the normal convergence and accuracy of the global model (see Fig. 2). At the same time, we designed a comparative experiment between LADP and CDP to observe the actual performance of LADP. Fig. 3 illustrates that the accuracy of LADP rises smoothly and performs well under different privacy budgets, especially when the participants are few and the privacy budget is low. LADP directly adds noise to the difference matrices $\Delta\omega$. Suppose the noise is $\zeta$; then $\Delta\omega + \zeta = \eta(\nabla L + \frac{\zeta}{\eta})$, which is equivalent to disturbing the gradient, so the influence on the final model is traceable and controllable. However, CDP deals with the average and treats the training process of each client as a black box, so its performance usually fluctuates.

Fig. 2: Training process of RR federated learning. (We do not apply the privacy bound here, but set the maximum number of rounds to 100. The legend in plot 3 also applies to the other two plots.)

Fig. 3: Results on the accuracy of LADP and CDP. ($\sigma = 1$)
Comparison with CDP: To compare RR-LADP and CDP, we track the privacy budget in each mechanism by calculating $\delta$. In fact, the accumulated knowledge inheritance $pre$ makes the privacy loss of each round increase rapidly, and saving a privacy budget of the same order of magnitude cannot support more training rounds. Therefore, careful allocation means efficient use of the data. Fig. 4 tracks the loss and $\delta$ as the communication rounds increase under the two mechanisms. It shows that RR-LADP allocates the privacy budget more carefully, while the loss of the global model drops similarly to CDP.

Fig. 4: Training process of RR-LADP and CDP when $K = 1000$ and $K_0 = 100$. The solid lines represent the loss and the dotted lines represent $\delta$. ($\varepsilon = 8$, $\sigma = 1$)
TABLE I: Accuracy and time spent per round for RR-LADP and CDP. ($\varepsilon = 8$, $\sigma = 1$)

Clients   Rounds   CDP              RR-LADP
100       100      0.73 | 25.9 s    0.80 | 26.5 s
1000      200      0.91 | 85.7 s    0.92 | 89.2 s
10000     400      0.96 | 315.9 s   0.97 | 336.1 s
Table I shows the average accuracy over multiple training runs and the training time per round. RR-LADP achieves higher accuracy than CDP at the cost of increased time. The accuracy of RR-LADP depends on the effects of RR and LADP, which have been described in the separate experiments. For CDP, the noise addition operation appears only once per training round, on the server side. For RR-LADP, however, every client in a training round performs the noise addition operation, which brings additional time delay.
Factors affecting RR-LADP: The accuracy of the final model is affected by multiple factors, such as the batch size, the learning rate, and other common machine learning parameters. We have not attempted to find the best combination of these parameters; instead, we discuss several significant factors in RR-LADP, including the number of clients, the privacy budget, and the noise.

Fig. 5: Results on the accuracy and $\delta$ for different privacy budgets. The solid lines represent accuracy and the dotted lines represent $\delta$. ($\sigma = 1$, $K = 1000$, $K_0 = 100$)
The number of clients determines the amount of training data and, by affecting the sampling probability, the possibility of privacy leakage. A lower privacy loss in each round allows more training rounds and higher accuracy. At the same time, the privacy bound, which is set to $\frac{1}{|KB|}$, decreases significantly as the number of clients increases.
The privacy budget $\varepsilon$ plays a pivotal role in RR-LADP (see Fig. 5). It controls the training process in two ways. First, it determines the response probability $p$, which affects the disturbance in the RR mechanism. In fact, $\frac{K_0}{K}p$ in Equation (3) represents the portion selected by the server that actually participates in training; the server can easily infer the local updates of a particular client if this overlap ratio is high. Second, after updating the global model in each round, we track $\delta$ under the fixed $\varepsilon$. The lower $\varepsilon$ is, the higher $\delta$ becomes, which means the bound is reached sooner and fewer rounds are allowed.
Our mechanism allows model performance to be controlled by choosing the value of $\sigma$. The level of noise added to the local updates directly affects the accuracy of the model. As shown in Fig. 6, the independent variable is the noise parameter $\sigma$. When less noise is added, the privacy loss in each round increases; after only a few rounds of training, $\delta$ reaches its bound while the accuracy of the model is still low. By adding more noise, the model gains more training rounds under a fixed privacy budget. However, too much noise reduces data availability and limits the accuracy.

Fig. 6: Results on the accuracy for different noise parameters. ($\varepsilon = 8$, $K = 1000$, $K_0 = 100$)
VI. CONCLUSIONS

This paper proposes RR-LADP, a federated learning mechanism for IoE based on randomized-response client selection and differentially private client model training. Different from existing approaches that require a trusted server to take charge of the privacy-enhancing process, the core strategy of our approach is to enhance clients' privacy locally, adapting the approach to an environment without trusted servers. It does so by preventing the server from knowing which clients' updates are collected in each round, and by adding noise adaptively to clients' local updates before they are submitted to the server. We show the reliable performance of RR-LADP through experiments with different parameter settings. While providing a higher level of privacy-preserving capability, our approach achieves 0.97 accuracy in the experiments on MNIST. Additionally, as a modified version of the traditional federated learning framework, our approach has the potential to be used to train various machine learning models rather than a single model structure. In the current form of RR-LADP, a global privacy budget controls the whole training process, with RR focusing on preserving the privacy of the client set and LADP focusing on preserving the private data of each client. Therefore, although the objects to be protected are different, the two mechanisms share a single privacy budget. Potential improvements can be achieved by revising this structure. Hence, a future study will explore a more fine-grained allocation of privacy budgets by tracking the privacy losses of the two mechanisms separately. In addition, we will improve RR-LADP by applying the mechanism to real-world IoE environments, such as intelligent wearable devices and the Internet of Vehicles.
ACKNOWLEDGMENT

This work was supported by the National Key Research and Development Program of China (2017YFB0802204), the Key-Area Research and Development Program for Guangdong Province, China (2019B010136001), the Basic Research Project of Shenzhen, China (JCYJ20190806143418198), and the Basic Research Project of Shenzhen, China (JCYJ20190806142601687). Corresponding authors: Weizhe Zhang and Yang Liu.
Zerui Li is currently an MSc student with the School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), China. His research interests include information security and privacy. Contact him at 18S151552@stu.hit.edu.cn.

Yuchen Tian is currently an MSc student with the School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), China. His research interests include information security and privacy. Contact him at 19S051060@stu.hit.edu.cn.

Weizhe Zhang is currently a professor in the School of Computer Science and Technology at Harbin Institute of Technology, China. He has published more than 100 academic papers in journals, books, and conference proceedings. He is a senior member of the IEEE. He is a corresponding author of this article. Contact him at wzzhang@hit.edu.cn.

Qing Liao is currently an associate professor with the School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), China. She received her Ph.D. degree from the Hong Kong University of Science and Technology. Contact her at liaoqing@hit.edu.cn.

Yang Liu is currently an assistant professor with the School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), China. He received his D.Phil. (Ph.D.) degree in computer science from the University of Oxford. He is a corresponding author of this article. Contact him at liu.yang@hit.edu.cn.

Xiaojiang Du is a tenured Full Professor and the Director of the Security And Networking (SAN) Lab in the Department of Computer and Information Sciences at Temple University, Philadelphia, USA. He has authored over 400 journal and conference papers in these areas, as well as a book published by Springer. He is an IEEE Fellow and a Life Member of ACM. Contact him at dxj@ieee.org.
Mohsen Guizani is currently a Professor with the CSE Department, Qatar University, Qatar. He is currently the Editor-in-Chief of IEEE Network Magazine, serves on the editorial boards of several international technical journals, and is the Founder and Editor-in-Chief of Wireless Communications and Mobile Computing (Wiley). Contact him at mguizani@ieee.org.
The problem of privacy-preserving data analysis has a long history spanning multiple disciplines. As electronic data about individuals becomes increasingly detailed, and as technology enables ever more powerful collection and curation of these data, the need increases for a robust, meaningful, and mathematically rigorous definition of privacy, together with a computationally rich class of algorithms that satisfy this definition. Differential Privacy is such a definition. After motivating and discussing the meaning of differential privacy, the preponderance of this monograph is devoted to fundamental techniques for achieving differential privacy, and application of these techniques in creative combinations, using the query-release problem as an ongoing example. A key point is that, by rethinking the computational goal, one can often obtain far better results than would be achieved by methodically replacing each step of a non-private computation with a differentially private implementation. Despite some astonishingly powerful computational results, there are still fundamental limitations – not just on what can be achieved with differential privacy but on what can be achieved with any method that protects against a complete breakdown in privacy. Virtually all the algorithms discussed herein maintain differential privacy against adversaries of arbitrary computational power. Certain algorithms are computationally intensive, others are efficient. Computational complexity for the adversary and the algorithm are both discussed. We then turn from fundamentals to applications other than query-release, discussing differentially private methods for mechanism design and machine learning. The vast majority of the literature on differentially private algorithms considers a single, static, database that is subject to many analyses. Differential privacy in other models, including distributed databases and computations on data streams is discussed. Finally, we note that this work is meant as a thorough introduction to the problems and techniques of differential privacy, but is not intended to be an exhaustive survey – there is by now a vast amount of work in differential privacy, and we can cover only a small portion of it.