FedGRU: Privacy-preserving Traffic Flow Prediction via Federated Learning
Yi Liu1,2,*, Shuyu Zhang1,*, Chenhan Zhang1, James J.Q. Yu1
Abstract— Existing traffic flow forecasting technologies achieve great success based on deep learning models trained on large amounts of data gathered by organizations. However, there are two critical challenges. One is that data exist in the form of "isolated islands". The other is the data privacy and security issue, which is becoming more significant than ever before. In this paper, we propose a Federated Learning-based Gated Recurrent Unit neural network framework (FedGRU) for traffic flow prediction (TFP) to address these challenges. Specifically, the FedGRU model differs from current centralized learning methods in that it updates a universal learning model through a secure parameter aggregation mechanism rather than sharing data among organizations. In the secure parameter aggregation mechanism, we introduce a Federated Averaging algorithm to control the communication overhead during parameter transmission. Through extensive case studies on the Performance Measurement System (PeMS) dataset, we show that the FedGRU model can achieve accurate and timely traffic prediction without compromising privacy.
I. INTRODUCTION
Urban residents, taxi drivers, business sectors, and government agencies have an immediate need for accurate and timely traffic flow information [1]. Such information can help the traffic sector alleviate congestion and control traffic lights, improve the efficiency of traffic operations, and help residents develop better travel plans [2]. Traffic flow prediction (TFP) provides such information by using historical traffic flow data to predict future traffic [3]. TFP is regarded as a critical technology in the deployment of Intelligent Transportation System (ITS) subsystems, particularly the advanced traveler information, online car-hailing, and traffic management systems.
In the previous TFP literature, Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and their variants have achieved gratifying results in predicting traffic flow. Such centralized machine learning methods are typically used to predict traffic flow by training on sufficient sensor data from mobile phones, cameras, radars, etc. In this context, these methods generally require data aggregation among public agencies and private companies. Indeed, the general public has witnessed partnerships between public agencies and mobile service providers such as DiDi Chuxing, Uber, and Hellobike in recent years. These partnerships extend the capability and services of companies that provide real-time traffic flow forecasting, traffic management, car sharing, and personal travel applications.

*Equal contributions. This work is supported by the Generation Program of Guangdong Natural Science Foundation (Grant No. 2019A1515011032) and the Ministry of Education of China and the School of Entrepreneurship Education of Heilongjiang University (Grant No. 201910212133).
1 Yi Liu, Shuyu Zhang, Chenhan Zhang, and James J.Q. Yu are with the Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China. 97liuyi@ieee.org, {11712122, zhangch}@mail.sustech.edu.cn, yujq3@sustech.edu.cn
2 Yi Liu is also with the School of Data Science and Technology, Heilongjiang University, Harbin, China.
Nevertheless, it is often overlooked that such data may contain sensitive private information (e.g., a user's travel trajectory, home address, etc.), which leads to potential privacy leakage. Therefore, different organizations should store their user data locally and avoid data exchanges to protect users' privacy, which makes it challenging to train an effective model with the valuable data. While the assumption that one organization owns all the data is widely made in the literature, the acquisition of massive user data is not possible in real applications that respect privacy. To predict traffic flow in ITS without compromising privacy, reference [4] introduced a privacy control mechanism based on "k-anonymous diffusion," which can complete taxi order scheduling without leaking user privacy. Le Ny et al. [5] proposed a differentially private real-time traffic state estimator system to predict traffic flow. However, these privacy-preserving methods cannot achieve a satisfactory trade-off between accuracy and privacy, rendering degraded system performance.
To address the data privacy leakage issue, we incorporate a privacy-preserving machine learning technique named federated learning (FL) [6] for TFP in this work. In FL, distributed organizations cooperatively train a globally shared model on their local data without exchanging the raw data. To accurately predict traffic flow, we propose an enhanced federated learning algorithm with a Gated Recurrent Unit neural network (FedGRU) in this paper. Through FL and its aggregation mechanism [7], FedGRU aggregates model parameters from geographically distributed organizations to build a global deep learning model under privacy-preserving conditions. Furthermore, owing to the outstanding data regression capability of GRU neural networks, FedGRU can achieve accurate and timely traffic flow prediction for multiple organizations.
The main contributions are summarized as follows:
• We propose a novel privacy-preserving algorithm that integrates emerging federated learning with a practical GRU neural network for traffic flow prediction. The algorithm provides reliable data privacy preservation by training models locally without raw data exchange.
• We introduce a Federated Averaging (FedAVG) algorithm in the secure parameter aggregation mechanism, which runs Stochastic Gradient Descent (SGD) on a selected subset of all organizations and aggregates their model parameters to update the global model.
• FedGRU is a standard, stable, and extensible FL framework for ITS. As a privacy-preserving framework, it not only achieves accuracy close to that of traditional models but can also be combined with other current and future state-of-the-art deep learning models.
The remainder of this paper is organized as follows.
Section II reviews the literature on short-term TFP and
privacy research in ITS. Section III defines the Centralized
TFP Learning problem and Federated TFP Learning problem
and proposes a secure parameter aggregation mechanism.
Section IV presents the FedGRU framework. Section V
discusses the experimental results. Concluding remarks are
described in Section VI.
II. RELATED WORK
A. Traffic Flow Prediction
Traffic flow prediction (TFP) has always been a hot topic in ITS, as it can improve the efficiency of real-time traffic control and urban planning. Although researchers have proposed many new models and methods, they can generally be divided into two categories: parametric and non-parametric models.
1) Parametric models: Parametric models predict future data by capturing the features of existing data within their parameters. Ahmed et al. [8] proposed the Autoregressive Integrated Moving Average (ARIMA) model in the 1970s to predict short-term freeway traffic. Since then, many researchers have proposed variants of ARIMA such as Kohonen-ARIMA (KARIMA), subset ARIMA, seasonal ARIMA, etc. These models further improve the accuracy of TFP by focusing on the statistical correlation of the data.
2) Non-parametric models: With the improvement of data storage and computing, non-parametric models have achieved great success in TFP [9]. Davis and Nihan [10] proposed a k-NN model for short-term traffic flow prediction. Lv et al. [1] first applied the stacked autoencoder (SAE) model to TFP. SAE adopts a hierarchical greedy network structure to learn non-linear features and outperforms support vector machines (SVM) [11] and feed-forward neural networks (FFNN) [12]. Considering the temporal correlation of the data, Ma et al. [13] and Tian et al. [14] applied Long Short-Term Memory (LSTM) to achieve accurate and timely TFP. Fu et al. [15] first proposed GRU neural network methods for TFP. In recent years, owing to the success of convolutional networks and graph networks, Yu et al. [16], [17] proposed a graph convolutional generative autoencoder to address the real-time traffic speed estimation problem.
B. Privacy Research in Intelligent Transportation Systems
In ITS, many models and methods rely on training data from users or organizations. However, with the increasing privacy awareness of users and organizations, direct exchanges of data between users or organizations may violate the law. He et al. [18] designed a data sharing algorithm based on the information-theoretic k-anonymity principle. However, this algorithm may still leak privacy during data sharing operations. Furthermore, the EU has promulgated the General Data Protection Regulation (GDPR), which means that as long as an organization may reveal private information in the data sharing process, such data transactions violate the law.

Although researchers have proposed some privacy-preserving methods to predict traffic flow in the literature, they still cannot meet the requirements of GDPR. In this paper, we explore a powerful privacy-preserving method with GRU for traffic flow prediction.
III. PROBLEM DEFINITION
We use the term "organization" throughout the paper to describe entities in TFP, such as urban agencies, private companies, and detector stations. We use the term "client" to describe computing nodes that correspond to one or multiple sensors in FL, and the term "device" to describe a sensor in an organization. Let $C = \{C_1, C_2, \cdots, C_n\}$ and $O = \{O_1, O_2, \cdots, O_m\}$ denote the client set and organization set in ITS, respectively. Each client has $q$ organizations. Each organization has $k_i$ devices and its respective database $D_i$. We aim to predict the number of vehicles from the historical traffic flow information of different organizations without sharing raw data or leaking privacy. We design a secure parameter aggregation mechanism as follows:
Secure Parameter Aggregation Mechanism: Detector station $O_i$ has $N$ devices, and the traffic flow data collected by the $N$ devices constitute a database $D_i$. The deep learning model constructed in $O_i$ calculates updated model parameters $p_i$ using the local training data from $D_i$. When all detector stations finish the same operation, they upload their respective $p_i$ to the cloud, where they are aggregated into a new global model.

According to the Secure Parameter Aggregation Mechanism, no traffic flow data are exchanged among different detector stations. The cloud aggregates the gradients uploaded by the organizations to obtain a new global model without exchanging data.
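As a toy illustration of this mechanism (an assumption-level sketch, not the authors' implementation), the cloud only ever receives parameter vectors, never raw traffic data:

```python
import numpy as np

def aggregate_parameters(uploaded_params):
    """Cloud-side step of the secure parameter aggregation mechanism:
    average the parameter vectors p_i uploaded by the detector stations."""
    return np.mean(np.stack(uploaded_params), axis=0)

# Each station i trains locally on its own database D_i and uploads only p_i.
p_uploads = [np.random.randn(10) for _ in range(5)]  # placeholder parameter vectors
global_params = aggregate_parameters(p_uploads)
```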
In this paper, $t$ and $v_t$ represent the $t$-th timestamp in the time series and the traffic flow at the $t$-th timestamp, respectively. Let $f(\cdot)$ be the traffic flow prediction function; the centralized and federated TFP learning problems are defined as follows:

Centralized TFP Learning: Given organizations $O$, each organization's devices $k_i$, and an aggregated database $D = D_1 \cup D_2 \cup D_3 \cup \cdots \cup D_N$, the centralized TFP problem is to calculate $v_{t+s} = f(t+s, D)$, where $s$ is the prediction window after $t$.

Federated TFP Learning: Given organizations $O$, each organization's devices $k_i$, and their respective databases $D_i$, the federated TFP problem is to calculate $v_{t+s} = f_i(t+s, D_i)$, where $f_i(\cdot,\cdot)$ is the local version of $f(\cdot,\cdot)$ and $s$ is the prediction window after $t$. Subsequently, the produced results are aggregated by a secure parameter aggregation mechanism.
IV. METHODOLOGY
A. Federated Learning and Gated Recurrent Unit
Federated Learning (FL) [6] is a distributed machine learning (ML) paradigm designed to train ML models without compromising privacy. With this scheme, different organizations can contribute to the overall model training while keeping the training data local.

In particular, the FL problem involves learning a single, globally shared prediction model from databases separately stored in dozens or even hundreds of organizations. We assume that each device $k$ stores its local dataset $D_k$ of size $|D_k|$, so the total training dataset size is $D = \sum_{k=1}^{K} |D_k|$. In a typical deep learning setting, we are given a set of input-output pairs $\{x_i, y_i\}_{i=1}^{|D_k|}$, where the input sample vector with $d$ features is $x_i \in \mathbb{R}^d$ and the labeled output value for the input sample $x_i$ is $y_i \in \mathbb{R}$. Given a training sample vector $x_i$ (e.g., the traffic flow data), we need to find the model parameter vector $\omega \in \mathbb{R}^d$ that characterizes the output $y_i$ (e.g., the predicted traffic flow value) with loss function $f_i(\omega)$ (e.g., $f_i(\omega) = \frac{1}{2}(x_i^T \omega - y_i)^2$). Our goal is to learn this model under the constraints of local data storage and processing by the devices in the organizations, with a secure parameter aggregation mechanism. For the local data of client $c$, we aim to minimize the following objective function:

$J_c(\omega) = \frac{1}{|D_k|} \sum_{i=1}^{|D_k|} f_i(\omega) + \lambda h(\omega)$,   (1)

where the local model parameter $\omega \in \mathbb{R}^d$, $\lambda \in [0,1]$, and $h(\cdot)$ is a regularizer function. This characterizes the local model in the FL setting.
At the cloud, the global prediction model problem can be represented as follows:

$\arg\min_{\omega \in \mathbb{R}^d} J(\omega), \quad J(\omega) \triangleq \sum_{k=1}^{K} \frac{|D_k|\, J_c(\omega)}{D}$,   (2)

and we recast the global prediction model problem in (2) as follows:

$\arg\min_{\omega \in \mathbb{R}^d} J(\omega) := \sum_{k=1}^{K} \frac{\sum_{i=1}^{|D_k|} f_i(\omega) + \lambda h(\omega)}{D}$.   (3)
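As a concrete illustration of the local objective in Eq. (1), the following minimal sketch (not from the paper) uses a squared-error loss and an L2 regularizer for $h(\cdot)$; all names are illustrative assumptions:

```python
import numpy as np

def local_objective(omega, X, y, lam=0.01):
    """Local objective J_c(w) of Eq. (1): mean squared-error loss over the
    local dataset D_k plus a scaled L2 regularizer h(w)."""
    residuals = X @ omega - y            # x_i^T w - y_i for every local sample
    data_loss = 0.5 * np.mean(residuals ** 2)
    regularizer = 0.5 * np.sum(omega ** 2)
    return data_loss + lam * regularizer
```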
For the TFP problem, we regard the GRU neural network model as the local model in Equation (1). Cho et al. [19] proposed the GRU neural network in 2014; it is a variant of the RNN designed to handle time-series data. GRU differs from the vanilla RNN in that it adds a "processor" to judge whether the information is useful or not; the structure of the processor is called a "cell." A typical GRU cell uses two data "gates" to control the information flow: a reset gate $r$ and an update gate $z$.

Let $X = \{x_1, x_2, \cdots, x_n\}$, $Y = \{y_1, y_2, \cdots, y_n\}$, and $H = \{h_1, h_2, \cdots, h_n\}$ be the input time series, the output time series, and the hidden states of the cells, respectively. At time step $t$, the value of the update gate $z_t$ is expressed as:

$z_t = \sigma(W^{(z)} x_t + U^{(z)} h_{t-1})$,   (4)

where $x_t$ is the input vector of the $t$-th time step, $W^{(z)}$ is the weight matrix, and $h_{t-1}$ holds the cell state of the previous time step $t-1$. The update gate aggregates $W^{(z)} x_t$ and $U^{(z)} h_{t-1}$, then maps the result into $(0,1)$ through a sigmoid activation function. The reset gate $r_t$ is computed similarly to the update gate:

$r_t = \sigma(W^{(r)} x_t + U^{(r)} h_{t-1})$.   (5)

The candidate activation $\tilde{h}_t$ is denoted as:

$\tilde{h}_t = \tanh(W x_t + r_t \odot U h_{t-1})$,   (6)

where $r_t \odot U h_{t-1}$ represents the Hadamard product of $r_t$ and $U h_{t-1}$.

The final memory of the current time step $t$ is calculated as follows:

$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t$.   (7)
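To make Eqs. (4)–(7) concrete, below is a minimal NumPy sketch (not part of the paper) of a single GRU cell forward step; the weight names and shapes are illustrative assumptions:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_cell_step(x_t, h_prev, W_z, U_z, W_r, U_r, W_h, U_h):
    """One GRU step following Eqs. (4)-(7).
    x_t: input vector (d,); h_prev: previous hidden state (m,);
    W_*: (m, d) input weights; U_*: (m, m) recurrent weights."""
    z_t = sigmoid(W_z @ x_t + U_z @ h_prev)               # update gate, Eq. (4)
    r_t = sigmoid(W_r @ x_t + U_r @ h_prev)               # reset gate, Eq. (5)
    h_tilde = np.tanh(W_h @ x_t + r_t * (U_h @ h_prev))   # candidate state, Eq. (6)
    h_t = z_t * h_prev + (1.0 - z_t) * h_tilde            # new hidden state, Eq. (7)
    return h_t
```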
Traditional learning methods typically include three steps: data processing, data fusion, and modeling. Among them, data fusion requires all parties to directly share data to obtain a global database for training. However, such a centralized learning approach faces the challenge of new data privacy laws and regulations, as organizations may disclose private information when sharing data. FL is introduced in this context to address the above challenge.
B. Privacy-preserving Traffic Flow Prediction Algorithm
We develop an FL framework, FedGRU, to fully handle the data privacy issues in the traffic flow prediction task. We first introduce the FedAVG algorithm as an implementation of the secure parameter aggregation mechanism to collect gradient information. Then, we illustrate the federated traffic flow prediction learning architecture. Finally, we present the details of the FedGRU algorithm.
Fig. 1. Federated traffic flow prediction learning architecture: clients (e.g., a government agency, a company, or a detector station) train on their local data and upload gradients to the cloud aggregator, which builds the global model.
1) FedAVG algorithm: A recognized problem in federated learning is that limited network bandwidth bottlenecks the cloud aggregation of local updates from the organizations. To reduce the communication overhead, each client uses its local data to perform gradient descent optimization on the current model, and the central cloud then performs a weighted average aggregation of the model updates uploaded by the clients.
As shown in Algorithm 1, FedAVG consists of three steps:
(i) The cloud selects volunteers from organizations $O$ to participate in this round of training and broadcasts the global model $\omega_o$ to the selected organizations;
(ii) Each organization $o$ trains on its data locally and updates $\omega_t^o$ for $E$ epochs of SGD with mini-batch size $B$ to obtain $\omega_{t+1}^o$, i.e., $\omega_{t+1}^o \leftarrow \mathrm{LocalUpdate}(o, \omega_t^o)$;
(iii) The cloud aggregates each organization's $\omega_{t+1}^o$ through the secure parameter aggregation mechanism.

Algorithm 1: Federated Averaging (FedAVG) Algorithm.
Input: Organizations $O = \{O_1, O_2, \cdots, O_N\}$; $B$ is the local mini-batch size, $E$ is the number of local epochs, $\alpha$ is the learning rate, and $\nabla L(\cdot;\cdot)$ is the gradient optimization function.
Output: Parameter $\omega$.
1: Initialize $\omega_0$ (pre-trained on a public dataset);
2: for each round $t = 1, 2, \cdots$ do
3:    $\{O_v\} \leftarrow$ select volunteers from organizations $O$ to participate in this round of training;
4:    Broadcast the global model $\omega_o$ to each organization in $\{O_v\}$;
5:    for each organization $o \in \{O_v\}$ in parallel do
6:        Initialize $\omega_t^o = \omega_o$;
7:        $\omega_{t+1}^o \leftarrow \mathrm{LocalUpdate}(o, \omega_t^o)$;
8:    $\omega_{t+1} \leftarrow \frac{1}{|\{O_v\}|} \sum_{o \in O_v} \omega_{t+1}^o$;
9: LocalUpdate$(o, \omega_t^o)$:  // Run on organization $o$
10:    $\mathcal{B} \leftarrow$ split the local dataset $S_o$ into batches of size $B$;
11:    for each local epoch $i$ from 1 to $E$ do
12:        for each batch $b \in \mathcal{B}$ do
13:            $\omega \leftarrow \omega - \alpha \cdot \nabla L(\omega; b)$;
14:    return $\omega$ to the cloud
The FedAVG algorithm is a critical mechanism in FedGRU for reducing the communication overhead of parameter transmission. The algorithm is an iterative process: in the $i$-th round of training, the models of the organizations participating in the training are updated to the new global one.
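The following Python sketch illustrates the FedAVG loop described above; it is a simplified, assumption-level outline (the model is reduced to a parameter vector and `grad_fn` is a user-supplied gradient routine), not the authors' implementation:

```python
import numpy as np

def local_update(omega, data, grad_fn, epochs, batch_size, lr):
    """LocalUpdate of Algorithm 1: E epochs of mini-batch SGD on one organization."""
    omega = omega.copy()
    for _ in range(epochs):
        np.random.shuffle(data)                       # split S_o into batches of size B
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            omega -= lr * grad_fn(omega, batch)       # w <- w - alpha * grad L(w; b)
    return omega

def fedavg_round(omega_global, org_datasets, grad_fn,
                 epochs=1, batch_size=128, lr=0.001, num_volunteers=2):
    """One FedAVG round: select volunteers, broadcast, local SGD, then average."""
    volunteers = np.random.choice(len(org_datasets), num_volunteers, replace=False)
    uploads = [local_update(omega_global, org_datasets[v], grad_fn, epochs, batch_size, lr)
               for v in volunteers]
    return np.mean(uploads, axis=0)                   # Algorithm 1, line 8
```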
2) Federated Learning-based Gated Recurrent Unit neural network algorithm: FedGRU aims to achieve accurate and timely TFP by combining FL and GRU without compromising privacy. An overview of FedGRU is shown in Fig. 1. It consists of four steps:
i) The cloud model is initialized through pre-training on domain-specific public datasets without privacy concerns;
ii) The cloud distributes a copy of the global model to all organizations, and each organization trains its copy on local data;
iii) Each organization uploads its model updates to the cloud. The entire process does not share any private data; instead, only the encrypted parameters are shared;
iv) The cloud aggregates the updated parameters uploaded by all organizations through the secure parameter aggregation mechanism to build a new global model, and then distributes the new global model to each organization.
Algorithm 2: Federated Learning-based Gated Recurrent Unit neural network (FedGRU) algorithm.
Input: $\{O_v\} \subseteq O$, $X$, $Y$, and $H$; the mini-batch size $m$, the number of iterations $n$, and the learning rate $\alpha$; the optimizer SGD.
Output: $J(\omega)$, $\omega$, and $W_v^r$, $W_v^z$, $W_v^h$.
1: According to $X$, $Y$, $H$ and Equations (8)–(12), initialize the cloud model $J(\omega_0)$, $\omega_0$, $W_v^{r_0}$, $W_v^{z_0}$, $W_v^{h_0}$, and $H_v^0$;
2: for each round $i = 1, 2, 3, \cdots$ do
3:    $\{O_v\} \leftarrow$ select volunteers from the organizations to participate in this round of training;
4:    while $g_\omega$ has not converged do
5:        for each organization $o \in O_v$ in parallel do
6:            Sample a mini-batch of input time steps $\{x_v^{(i)}\}_{i=1}^{m}$;
7:            Sample a mini-batch of true traffic flow values $\{y_v^{(i)}\}_{i=1}^{m}$;
8:            Initialize $\omega_{t+1}^o = \omega_t^o$;
9:            $g_\omega \leftarrow \nabla_\omega \frac{1}{m} \sum_{i=1}^{m} \left(f_\omega(x_v^{(i)}) - y_v^{(i)}\right)^2$;
10:           $\omega_{t+1}^o \leftarrow \omega_t^o + \alpha \cdot \mathrm{SGD}(\omega_t^o, g_\omega)$;
11:           Update the parameters $W_v^{r_0}$, $W_v^{z_0}$, $W_v^{h_0}$, and $H_v^0$;
12:           Update the reset gate $r$ and the update gate $z$;
13:    Collect all parameters from $\{O_v\}$ to update $\omega_{t+1}$ (referring to Algorithm 1);
14: return $J(\omega)$, $\omega$, and $W_v^r$, $W_v^z$, $W_v^h$
Given voluntary organizations $\{O_v\} \subseteq O$ and $o_v \in \{O_v\}$, and referring to the GRU neural network in Section IV-A, we have:

$z_v^t = \sigma(W^{(z_v)} x_v^t + U^{(z_v)} h_v^{t-1})$,   (8)

$r_v^t = \sigma(W^{(r_v)} x_v^t + U^{(r_v)} h_v^{t-1})$,   (9)

$\tilde{h}_v^t = \tanh(W x_v^t + r_v^t \odot U h_v^{t-1})$,   (10)

$h_v^t = z_v^t \odot h_v^{t-1} + (1 - z_v^t) \odot \tilde{h}_v^t$,   (11)

where $X = \{x_v^1, x_v^2, \ldots, x_v^n\}$, $Y = \{y_v^1, y_v^2, \ldots, y_v^n\}$, and $H = \{h_v^1, h_v^2, \ldots, h_v^n\}$ denote $o_v$'s input time series, output time series, and the hidden states of the cells, respectively. According to Equation (3), the objective function of FedGRU is as follows:

$\arg\min_{\omega} J(\omega) = \min \sum_{i=1}^{|D_v|} \sum_{t=1}^{T} \frac{1}{2}\left(y_d - y_v^t\right)^2$.   (12)
The pseudocode of the FedGRU framework is presented in Algorithm 2.
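As an illustration of the local update each organization performs in Algorithm 2, the following PyTorch sketch (a simplified assumption, not the authors' code) trains a GRU regressor on local traffic-flow windows and returns its parameters for aggregation:

```python
import torch
import torch.nn as nn

class GRURegressor(nn.Module):
    """A GRU followed by a linear layer that predicts the next traffic flow value."""
    def __init__(self, n_features=1, hidden_size=64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, 1)

    def forward(self, x):                 # x: (batch, window, n_features)
        _, h = self.gru(x)
        return self.out(h[-1]).squeeze(-1)

def local_gru_update(model, loader, epochs=1, lr=0.001):
    """Mini-batch SGD on one organization's local data (Algorithm 2, lines 5-12)."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    model.train()
    for _ in range(epochs):
        for x_batch, y_batch in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x_batch), y_batch)
            loss.backward()
            optimizer.step()
    # Only parameters are uploaded to the cloud for FedAVG aggregation.
    return [p.detach().clone() for p in model.parameters()]
```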
V. EXPERIMENTS
A. Dataset Pre-Processing and Evaluation Method
In this experiment, the proposed FedGRU is applied to
the real-world data collected from the Caltrans Performance
Measurement System (PeMS) [20] database for performance
demonstration. The traffic flow data in the PeMS database were collected in real time from over 39,000 individual detectors. These sensors span the freeway system across all major metropolitan areas of the State of California [1]. In this paper, traffic flow data collected during the first three months of 2013 are used for the experiments. We select the traffic flow data of the first two months as the training dataset and that of the third month as the testing dataset. Furthermore, since the traffic flow data are time-series data, we use the observations at the previous time intervals, i.e., $x_{t-1}, x_{t-2}, \cdots, x_{t-r}$, to predict the traffic flow at time interval $t$, where $r$ is the length of the history data window.
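A minimal sketch (with assumed array shapes) of how such a length-$r$ history window can be built from a univariate traffic-flow series:

```python
import numpy as np

def make_windows(flow, r):
    """Build (X, y) pairs where X[i] = [x_{t-r}, ..., x_{t-1}] and y[i] = x_t."""
    X = np.array([flow[i:i + r] for i in range(len(flow) - r)])
    y = np.array(flow[r:])
    return X, y

# Example: 5-min aggregated flow counts with a 12-step (one-hour) history window.
flow = np.random.randint(50, 300, size=1000).astype(float)
X_train, y_train = make_windows(flow, r=12)
```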
We adopt the Mean Absolute Error (MAE), the Mean Square Error (MSE), the Root Mean Square Error (RMSE), and the Mean Absolute Percentage Error (MAPE) to show the prediction accuracy, i.e., the prediction error. They are defined as:

$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i^p|$,   (13)

$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i^p)^2$,   (14)

$\mathrm{RMSE} = \left[\frac{1}{n} \sum_{i=1}^{n} (|y_i - \hat{y}_i^p|)^2\right]^{\frac{1}{2}}$,   (15)

$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left|\frac{\hat{y}_i^p - y_i}{y_i}\right|$,   (16)

where $y_i$ is the observed traffic flow and $\hat{y}_i^p$ is the predicted traffic flow.
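The four error metrics of Eqs. (13)–(16) can be computed directly; a small NumPy sketch for reference:

```python
import numpy as np

def prediction_errors(y_true, y_pred):
    """Return MAE, MSE, RMSE, and MAPE as defined in Eqs. (13)-(16)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    mape = 100.0 * np.mean(np.abs(err / y_true))  # assumes no zero observations
    return mae, mse, rmse, mape
```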
B. Experimental Setup
Without loss of generality, we assume that the detector stations are distributed and independent, and that data cannot be exchanged arbitrarily among them. In the secure parameter aggregation mechanism, the PySyft [21] framework is adopted to encrypt the parameters¹.

For the cloud and each organization, we use mini-batch SGD for model optimization. The PeMS dataset is split equally and assigned to 20 organizations. During the simulation, the learning rate $\alpha = 0.001$, the mini-batch size $m = 128$, and $|O_v| = 20$. Note that the client number $C = 2$ of the FedGRU model is the default setting in FL [6]. All experiments are conducted using TensorFlow and PyTorch on Ubuntu 18.04.

¹ https://github.com/OpenMined/PySyft
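For concreteness, these simulation settings map onto the earlier FedAVG sketch roughly as follows (hypothetical wiring, not the authors' configuration):

```python
NUM_ORGS = 20           # PeMS dataset split equally across 20 organizations
CLIENTS_PER_ROUND = 2   # default client number C = 2
ALPHA = 0.001           # learning rate
BATCH_SIZE = 128        # mini-batch size m

# e.g., one round of the simplified FedAVG loop sketched in Section IV:
# omega_global = fedavg_round(omega_global, org_datasets, grad_fn,
#                             batch_size=BATCH_SIZE, lr=ALPHA,
#                             num_volunteers=CLIENTS_PER_ROUND)
```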
Fig. 2. (a) Traffic flow prediction of the GRU model and the FedGRU model. (b) Loss of the GRU model and the FedGRU model.
C. Experimental Results
1) Traffic Flow Prediction Accuracy: We compare the performance of the proposed FedGRU model with that of GRU, SAE, LSTM, and support vector machine (SVM) under an identical simulation configuration. Among these five competing methods, FedGRU is a federated machine learning model, and the rest are centralized ones. GRU is a widely adopted baseline model with strong performance on traffic flow forecasting tasks, as mentioned in Section IV, and SVM is a popular machine learning model for general prediction applications [1]. In all investigations, we use the same PeMS dataset. The prediction results for 5-min-ahead traffic flow prediction are given in Table I.

TABLE I
PERFORMANCE COMPARISON OF MAE, MSE, RMSE, AND MAPE FOR FEDGRU, GRU, SAE, LSTM, AND SVM

Metrics                    MAE    MSE     RMSE   MAPE
FedGRU (default setting)   7.96   101.49  11.04  17.82%
GRU [15]                   7.20   99.32   9.97   17.78%
SAE [1]                    8.26   99.82   11.60  19.80%
LSTM [13]                  8.28   107.16  11.45  20.32%
SVM [22]                   8.68   115.52  13.24  22.73%

From the simulation results, it can be observed that the MAE of FedGRU is lower than those of SAE, LSTM, and SVM, but higher than that of GRU. Specifically, the MAE of FedGRU is 9.04% lower than that of the worst case (i.e., SVM) in this experiment. This result is attributable to the fact that FedGRU inherits the advantages of GRU's outstanding performance in prediction tasks.

Fig. 2(a) shows a comparison between GRU and FedGRU on a 5-min traffic flow prediction task. We can see that the prediction results of the FedGRU model are very close to those of GRU. This is because the core prediction technique of FedGRU is the GRU structure, so the performance of FedGRU is comparable to that of the GRU model. Furthermore, FedGRU can protect data privacy by keeping the training dataset local. Fig. 2(b) illustrates the loss of the GRU model and the FedGRU model. From the results, the loss of the FedGRU model is not significantly different from that of the GRU model, which demonstrates that the FedGRU model has good convergence and stability. In summary, FedGRU can achieve accurate and timely traffic flow prediction without compromising privacy.
2) Performance Comparison of the FedGRU Model Under Different Client Numbers: In Section V-C.1, the default client number is set to $C = 2$. However, it is highly plausible that traffic data are gathered by more than two entities, e.g., organizations and companies. In this experiment, we explore the impact of different client numbers (i.e., $C = 2, 4, 8, 10$) on the performance of FedGRU. The simulation results are presented in Fig. 3, where we observe that the number of clients has an adverse influence on the performance of FedGRU. The reason is that more clients introduce increasing communication overhead to the underlying communication infrastructure, which makes it more difficult for the cloud to simultaneously aggregate the gradient information. Furthermore, such overhead may cause communication failures in some clients, causing them to fail to upload gradient information and thereby reducing the accuracy of the global model.

Fig. 3. The prediction error of the FedGRU model with different client numbers.
In this paper, we initially use the FedAVG algorithm to alleviate the expensive communication overhead issue. FedAVG reduces communication overhead by i) computing the average gradient over a batch of samples on each client and ii) computing the average aggregation of the gradients from all clients. Fig. 3 shows that FedAVG performs well when the number of clients is less than 8; when the number of clients exceeds 8, the performance of FedAVG starts to decline. The reason is that, when the number of clients exceeds a certain threshold (e.g., $C = 8$), the probability of client failure increases, which causes FedAVG to calculate incorrect gradient information. Nevertheless, FedAVG remains significant for reducing communication overhead because the number of entities involved in real-life traffic flow prediction tasks is usually small.
VI. CONCLUSION
In this paper, we propose the FedGRU algorithm for traffic flow prediction with federated learning for privacy preservation. FedGRU does not directly access distributed organizational data but instead employs a secure parameter aggregation mechanism to train a global model in a distributed manner. It aggregates the gradient information uploaded by all locally trained models in the cloud to construct the global model for traffic flow forecasting. We evaluate the performance of FedGRU on the PeMS dataset and compare it with GRU, LSTM, SAE, and SVM, all of which potentially compromise user privacy during forecasting. The results show that the proposed method performs comparably to the competing methods, with minuscule accuracy degradation and privacy well preserved. In the future, we plan to apply Graph Convolutional Networks to the federated learning framework to predict traffic flow.
REFERENCES
[1] Y. Lv, Y. Duan, W. Kang, Z. Li, and F.-Y. Wang, "Traffic flow prediction with big data: A deep learning approach," IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 2, pp. 865–873, 2014.
[2] N. Zhang, F.-Y. Wang, F. Zhu, D. Zhao, S. Tang et al., "DynaCAS: Computational experiments and decision support for ITS," 2008.
[3] C. Zhang, J. J. Q. Yu, and Y. Liu, "Spatial-temporal graph attention networks: A deep learning approach for traffic forecasting," IEEE Access, vol. 7, pp. 166246–166256, 2019.
[4] S. Madan and P. Goswami, "A novel technique for privacy preservation using k-anonymization and nature inspired optimization algorithms," Available at SSRN 3357276, 2019.
[5] J. Le Ny, A. Touati, and G. J. Pappas, "Real-time privacy-preserving model-based estimation of traffic flows," in 2014 ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS), April 2014, pp. 92–102.
[6] J. Konečný, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh, and D. Bacon, "Federated learning: Strategies for improving communication efficiency," arXiv preprint arXiv:1610.05492, 2016.
[7] X. Yuan, X. Wang, C. Wang, J. Weng, and K. Ren, "Enabling secure and fast indexing for privacy-assured healthcare monitoring via compressive sensing," IEEE Transactions on Multimedia (TMM), vol. 18, no. 10, pp. 1–13, 2016.
[8] M. S. Ahmed, "Analysis of freeway traffic time series data and their application to incident detection," Equine Veterinary Education, vol. 6, no. 1, pp. 32–35, 1979.
[9] J. J. Q. Yu, A. Y. S. Lam, D. J. Hill, Y. Hou, and V. O. K. Li, "Delay aware power system synchrophasor recovery and prediction framework," IEEE Transactions on Smart Grid, vol. 10, no. 4, pp. 3732–3742, July 2019.
[10] G. A. Davis and N. L. Nihan, "Nonparametric regression and short-term freeway traffic forecasting," Journal of Transportation Engineering, vol. 117, no. 2, pp. 178–188, 1991.
[11] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," ACM Transactions on Intelligent Systems and Technology (TIST), vol. 2, no. 3, p. 27, 2011.
[12] D. Svozil, V. Kvasnicka, and J. Pospichal, "Introduction to multi-layer feed-forward neural networks," Chemometrics and Intelligent Laboratory Systems, vol. 39, no. 1, pp. 43–62, 1997.
[13] X. Ma, Z. Tao, Y. Wang, H. Yu, and Y. Wang, "Long short-term memory neural network for traffic speed prediction using remote microwave sensor data," Transportation Research Part C: Emerging Technologies, vol. 54, pp. 187–197, 2015.
[14] Y. Tian and L. Pan, "Predicting short-term traffic flow by long short-term memory recurrent neural network," in 2015 IEEE International Conference on Smart City/SocialCom/SustainCom (SmartCity), 2015, pp. 153–158.
[15] R. Fu, Z. Zhang, and L. Li, "Using LSTM and GRU neural network methods for traffic flow prediction," in 2016 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC), Nov 2016, pp. 324–328.
[16] J. J. Q. Yu, W. Yu, and J. Gu, "Online vehicle routing with neural combinatorial optimization and deep reinforcement learning," IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 10, pp. 3806–3817, Oct 2019.
[17] J. J. Q. Yu and J. Gu, "Real-time traffic speed estimation with graph convolutional generative autoencoder," IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 10, pp. 3940–3951, Oct 2019.
[18] B. Y. He and J. Y. Chow, "Optimal privacy control for transport network data sharing," Transportation Research Part C: Emerging Technologies, 2019.
[19] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, "Learning phrase representations using RNN encoder-decoder for statistical machine translation," arXiv preprint arXiv:1406.1078, 2014.
[20] C. Chao, "Freeway Performance Measurement System (PeMS)," 2003.
[21] T. Ryffel, A. Trask, M. Dahl, B. Wagner, J. Mancuso, D. Rueckert, and J. Passerat-Palmbach, "A generic framework for privacy preserving deep learning," arXiv preprint arXiv:1811.04017, 2018.
[22] M. A. Mohandes, T. O. Halawani, S. Rehman, and A. A. Hussain, "Support vector machines for wind speed prediction," Renewable Energy, vol. 29, no. 6, pp. 939–947, 2004.