FedGRU: Privacy-preserving Traffic Flow Prediction via Federated Learning
Yi Liu1,2,*, Shuyu Zhang1,*, Chenhan Zhang1, James J.Q. Yu1
Abstract— Existing traffic flow forecasting technologies achieve great success based on deep learning models trained on large amounts of data gathered by organizations. However, there are two critical challenges. One is that data exist in the form of "isolated islands". The other is the data privacy and security issue, which is becoming more significant than ever before. In this paper, we propose a Federated Learning-based Gated Recurrent Unit neural network framework (FedGRU) for traffic flow prediction (TFP) to address these challenges. Specifically, the FedGRU model differs from current centralized learning methods in that it updates a universal learning model through a secure parameter aggregation mechanism rather than sharing data among organizations. In the secure parameter aggregation mechanism, we introduce a Federated Averaging algorithm to control the communication overhead during parameter transmission. Through extensive case studies on the Performance Measurement System (PeMS) dataset, we show that the FedGRU model can achieve accurate and timely traffic prediction without compromising privacy.
I. INTRODUCTION
Urban residents, taxi drivers, business sectors, and government agencies have an immediate need for accurate and timely traffic flow information [1]. Such information can help the traffic sector alleviate congestion and control traffic lights, improve the efficiency of traffic operations, and help residents develop better travel plans [2]. Traffic flow prediction (TFP) provides such information by using historical traffic flow data to predict future traffic [3]. TFP is regarded as a critical technology in the deployment of Intelligent Transportation System (ITS) subsystems, particularly the advanced traveler information, online car-hailing, and traffic management systems.
In the previous TFP literature, Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and their variants have achieved gratifying results in predicting traffic flow. Such centralized machine learning methods are typically used to predict traffic flow by training on sufficient sensor data from mobile phones, cameras, radars, etc. In this context, these methods generally require data aggregation among public agencies and private companies. Indeed, the general public has witnessed partnerships between public agencies and mobile service providers such as DiDi Chuxing, Uber, and Hellobike in recent years. These partnerships extend the capability and services of companies that provide real-time traffic flow forecasting, traffic management, car sharing, and personal travel applications.

*Equal contributions. This work is supported by the Generation Program of Guangdong Natural Science Foundation (Grant No. 2019A1515011032) and the Ministry of Education of China and the School of Entrepreneurship Education of Heilongjiang University (Grant No. 201910212133).
1 Yi Liu, Shuyu Zhang, Chenhan Zhang, and James J.Q. Yu are with the Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China. 97liuyi@ieee.org, {11712122, zhangch}@mail.sustech.edu.cn, yujq3@sustech.edu.cn
2 Yi Liu is also with the School of Data Science and Technology, Heilongjiang University, Harbin, China.
Nevertheless, it is often overlooked that such data may contain sensitive private information (e.g., a user's travel trajectory, home address, etc.), which leads to potential privacy leakage. Therefore, different organizations should store their user data locally and avoid data exchanges to protect users' privacy, which makes it challenging to train an effective model with the valuable data. While the assumption that one organization owns all the data is widely made in the literature, the acquisition of massive user data is not possible in real applications that respect privacy. To predict traffic flow in ITS without compromising privacy, reference [4] introduced a privacy control mechanism based on "k-anonymous diffusion," which can complete taxi order scheduling without leaking user privacy. Le Ny et al. [5] proposed a differentially private real-time traffic state estimator system to predict traffic flow. However, these privacy-preserving methods cannot achieve a satisfactory trade-off between accuracy and privacy, rendering degraded system performance.
To address the data privacy leakage issue, we incorporate a privacy-preserving machine learning technique named federated learning (FL) [6] for TFP in this work. In FL, distributed organizations cooperatively train a globally shared model on their local data without exchanging the raw data. To accurately predict traffic flow, we propose an enhanced federated learning algorithm with a Gated Recurrent Unit neural network (FedGRU) in this paper. Through FL and its aggregation mechanism [7], FedGRU aggregates model parameters from geographically distributed organizations to build a global deep learning model under privacy-preserving conditions. Furthermore, owing to the outstanding data regression capability of GRU neural networks, FedGRU can achieve accurate and timely traffic flow prediction for multiple organizations.
The main contributions are summarized as follows:
• We propose a novel privacy-preserving algorithm that integrates emerging federated learning with a practical GRU neural network for traffic flow prediction. The algorithm provides reliable data privacy preservation by training models locally without raw data exchange.
• We introduce a Federated Averaging (FedAVG) algorithm in the secure parameter aggregation mechanism, which runs Stochastic Gradient Descent (SGD) on a selected subset of all organizations and aggregates their model parameters to update the global model.
• FedGRU is a standard, stable, and extensible FL framework for ITS. As a privacy-preserving framework, it not only achieves accuracy close to that of traditional models but can also be combined with other current and future state-of-the-art deep learning models.
The remainder of this paper is organized as follows.
Section II reviews the literature on short-term TFP and
privacy research in ITS. Section III defines the Centralized
TFP Learning problem and Federated TFP Learning problem
and proposes a secure parameter aggregation mechanism.
Section IV presents the FedGRU framework. Section V
discusses the experimental results. Concluding remarks are
described in Section VI.
II. RELATED WORK
A. Traffic Flow Prediction
Traffic flow prediction (TFP) has always been a hot topic in ITS, as it can improve the efficiency of real-time traffic control and urban planning. Although researchers have proposed many new models and methods, they can generally be divided into two categories: parametric and non-parametric models.
1) Parametric models: Parametric models predict future data by capturing the features of existing data within their parameters. Ahmed et al. [8] proposed the Autoregressive Integrated Moving Average (ARIMA) model in the 1970s to predict short-term freeway traffic. Since then, many researchers have proposed variants of ARIMA such as Kohonen-ARIMA (KARIMA), subset ARIMA, seasonal ARIMA, etc. These models further improve the accuracy of TFP by focusing on the statistical correlation of the data.
2) Non-parametric models: With the improvement of data storage and computing, non-parametric models have achieved great success in TFP [9]. Davis and Nihan [10] proposed a k-NN model for short-term traffic flow prediction. Lv et al. [1] first applied the stacked autoencoder (SAE) model to TFP. SAE adopts a hierarchical greedy network structure to learn non-linear features and outperforms support vector machines (SVM) [11] and feed-forward neural networks (FFNN) [12]. Considering the temporal correlation of the data, Ma et al. [13] and Tian et al. [14] applied Long Short-Term Memory (LSTM) to achieve accurate and timely TFP. Fu et al. [15] first proposed GRU neural network methods for TFP. In recent years, owing to the success of convolutional networks and graph networks, Yu et al. [16], [17] proposed a graph convolutional generative autoencoder to address the real-time traffic speed estimation problem.
B. Privacy Research in Intelligent Transportation Systems
In ITS, many models and methods rely on training data from users or organizations. However, with the increasing privacy awareness of users and organizations, direct exchanges of data between users or organizations may violate the law. He et al. [18] designed a data sharing algorithm based on the information-theoretic k-anonymity principle. However, this algorithm may still leak privacy during data sharing operations. Furthermore, the EU has promulgated the General Data Protection Regulation (GDPR), which means that as long as an organization may reveal private information in the data sharing process, such data transactions violate the law.

Although researchers have proposed some privacy-preserving methods to predict traffic flow in the literature, they still cannot meet the requirements of GDPR. In this paper, we explore a powerful privacy-preserving method with GRU for traffic flow prediction.
III. PROBLEM DEFINITION
We use the term "organization" throughout the paper to describe entities in TFP, such as urban agencies, private companies, and detector stations. We use the term "client" to describe computing nodes that correspond to one or multiple sensors in FL, and the term "device" to describe a sensor in an organization. Let $C = \{C_1, C_2, \cdots, C_n\}$ and $O = \{O_1, O_2, \cdots, O_m\}$ denote the client set and organization set in ITS, respectively. Each client has $q$ organizations. Each organization has $k_i$ devices and its respective database $D_i$. We aim to predict the number of vehicles from the historical traffic flow information of different organizations without sharing raw data or leaking privacy. We design a secure parameter aggregation mechanism as follows:
Secure Parameter Aggregation Mechanism: Detector station $O_i$ has $N$ devices, and the traffic flow data collected by the $N$ devices constitute a database $D_i$. The deep learning model constructed in $O_i$ calculates updated model parameters $p_i$ using the local training data from $D_i$. When all detector stations finish the same operation, they upload their respective $p_i$ to the cloud, where they are aggregated into a new global model.

According to the Secure Parameter Aggregation Mechanism, no traffic flow data are exchanged among different detector stations. The cloud aggregates the gradients uploaded by the organizations to obtain a new global model without exchanging data.
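As a toy illustration of this mechanism (an assumption-level sketch, not the authors' implementation), the cloud only ever receives parameter vectors, never raw traffic data:

```python
import numpy as np

def aggregate_parameters(uploaded_params):
    """Cloud-side step of the secure parameter aggregation mechanism:
    average the parameter vectors p_i uploaded by the detector stations."""
    return np.mean(np.stack(uploaded_params), axis=0)

# Each station i trains locally on its own database D_i and uploads only p_i.
p_uploads = [np.random.randn(10) for _ in range(5)]  # placeholder parameter vectors
global_params = aggregate_parameters(p_uploads)
```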
In this paper, $t$ and $v_t$ represent the $t$-th timestamp in the time series and the traffic flow at the $t$-th timestamp, respectively. Let $f(\cdot)$ be the traffic flow prediction function; the centralized and federated TFP learning problems are defined as follows:

Centralized TFP Learning: Given organizations $O$, each organization's devices $k_i$, and an aggregated database $D = D_1 \cup D_2 \cup D_3 \cup \cdots \cup D_N$, the centralized TFP problem is to calculate $v_{t+s} = f(t+s, D)$, where $s$ is the prediction window after $t$.

Federated TFP Learning: Given organizations $O$, each organization's devices $k_i$, and their respective databases $D_i$, the federated TFP problem is to calculate $v_{t+s} = f_i(t+s, D_i)$, where $f_i(\cdot,\cdot)$ is the local version of $f(\cdot,\cdot)$ and $s$ is the prediction window after $t$. Subsequently, the produced results are aggregated by a secure parameter aggregation mechanism.
IV. METHODOLOGY
A. Federated Learning and Gated Recurrent Unit
Federated Learning (FL) [6] is a distributed machine learning (ML) paradigm designed to train ML models without compromising privacy. With this scheme, different organizations can contribute to the overall model training while keeping the training data local.

In particular, the FL problem involves learning a single, globally shared prediction model from databases separately stored in dozens or even hundreds of organizations. We assume that each device $k$ stores its local dataset $D_k$ of size $|D_k|$, so the total training dataset size is $D = \sum_{k=1}^{K} |D_k|$. In a typical deep learning setting, we are given a set of input-output pairs $\{x_i, y_i\}_{i=1}^{|D_k|}$, where the input sample vector with $d$ features is $x_i \in \mathbb{R}^d$ and the labeled output value for the input sample $x_i$ is $y_i \in \mathbb{R}$. Given a training sample vector $x_i$ (e.g., the traffic flow data), we need to find the model parameter vector $\omega \in \mathbb{R}^d$ that characterizes the output $y_i$ (e.g., the predicted traffic flow value) with loss function $f_i(\omega)$ (e.g., $f_i(\omega) = \frac{1}{2}(x_i^T \omega - y_i)^2$). Our goal is to learn this model under the constraints of local data storage and processing by the devices in the organizations, with a secure parameter aggregation mechanism. For the local data of client $c$, we aim to minimize the following objective function:

$J_c(\omega) = \frac{1}{|D_k|} \sum_{i=1}^{|D_k|} f_i(\omega) + \lambda h(\omega)$,   (1)

where the local model parameter $\omega \in \mathbb{R}^d$, $\lambda \in [0,1]$, and $h(\cdot)$ is a regularizer function. This characterizes the local model in the FL setting.
At the cloud, the global prediction model problem can be represented as follows:

$\arg\min_{\omega \in \mathbb{R}^d} J(\omega), \quad J(\omega) \triangleq \sum_{k=1}^{K} \frac{|D_k|\, J_c(\omega)}{D}$,   (2)

and we recast the global prediction model problem in (2) as follows:

$\arg\min_{\omega \in \mathbb{R}^d} J(\omega) := \sum_{k=1}^{K} \frac{\sum_{i=1}^{|D_k|} f_i(\omega) + \lambda h(\omega)}{D}$.   (3)
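As a concrete illustration of the local objective in Eq. (1), the following minimal sketch (not from the paper) uses a squared-error loss and an L2 regularizer for $h(\cdot)$; all names are illustrative assumptions:

```python
import numpy as np

def local_objective(omega, X, y, lam=0.01):
    """Local objective J_c(w) of Eq. (1): mean squared-error loss over the
    local dataset D_k plus a scaled L2 regularizer h(w)."""
    residuals = X @ omega - y            # x_i^T w - y_i for every local sample
    data_loss = 0.5 * np.mean(residuals ** 2)
    regularizer = 0.5 * np.sum(omega ** 2)
    return data_loss + lam * regularizer
```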
For the TFP problem, we regard the GRU neural network model as the local model in Equation (1). Cho et al. [19] proposed the GRU neural network in 2014; it is a variant of the RNN designed to handle time-series data. GRU differs from the vanilla RNN in that it adds a "processor" to judge whether the information is useful or not; the structure of the processor is called a "cell." A typical GRU cell uses two data "gates" to control the information flow: a reset gate $r$ and an update gate $z$.

Let $X = \{x_1, x_2, \cdots, x_n\}$, $Y = \{y_1, y_2, \cdots, y_n\}$, and $H = \{h_1, h_2, \cdots, h_n\}$ be the input time series, the output time series, and the hidden states of the cells, respectively. At time step $t$, the value of the update gate $z_t$ is expressed as:

$z_t = \sigma(W^{(z)} x_t + U^{(z)} h_{t-1})$,   (4)

where $x_t$ is the input vector of the $t$-th time step, $W^{(z)}$ is the weight matrix, and $h_{t-1}$ holds the cell state of the previous time step $t-1$. The update gate aggregates $W^{(z)} x_t$ and $U^{(z)} h_{t-1}$, then maps the result into $(0,1)$ through a sigmoid activation function. The reset gate $r_t$ is computed similarly to the update gate:

$r_t = \sigma(W^{(r)} x_t + U^{(r)} h_{t-1})$.   (5)

The candidate activation $\tilde{h}_t$ is denoted as:

$\tilde{h}_t = \tanh(W x_t + r_t \odot U h_{t-1})$,   (6)

where $r_t \odot U h_{t-1}$ represents the Hadamard product of $r_t$ and $U h_{t-1}$.

The final memory of the current time step $t$ is calculated as follows:

$h_t = z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t$.   (7)
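To make Eqs. (4)–(7) concrete, below is a minimal NumPy sketch (not part of the paper) of a single GRU cell forward step; the weight names and shapes are illustrative assumptions:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_cell_step(x_t, h_prev, W_z, U_z, W_r, U_r, W_h, U_h):
    """One GRU step following Eqs. (4)-(7).
    x_t: input vector (d,); h_prev: previous hidden state (m,);
    W_*: (m, d) input weights; U_*: (m, m) recurrent weights."""
    z_t = sigmoid(W_z @ x_t + U_z @ h_prev)               # update gate, Eq. (4)
    r_t = sigmoid(W_r @ x_t + U_r @ h_prev)               # reset gate, Eq. (5)
    h_tilde = np.tanh(W_h @ x_t + r_t * (U_h @ h_prev))   # candidate state, Eq. (6)
    h_t = z_t * h_prev + (1.0 - z_t) * h_tilde            # new hidden state, Eq. (7)
    return h_t
```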
Traditional learning methods typically include three steps: data processing, data fusion, and modeling. Among them, data fusion requires all parties to directly share data to obtain a global database for training. However, such a centralized learning approach faces the challenge of new data privacy laws and regulations, as organizations may disclose private information when sharing data. FL is introduced in this context to address the above challenge.
B. Privacy-preserving Traffic Flow Prediction Algorithm
We develop an FL framework, FedGRU, to fully handle the data privacy issues in the traffic flow prediction task. We first introduce the FedAVG algorithm as an implementation of the secure parameter aggregation mechanism to collect gradient information. Then, we illustrate the federated traffic flow prediction learning architecture. Finally, we present the details of the FedGRU algorithm.
Fig. 1. Federated traffic flow prediction learning architecture: clients (e.g., a government agency, a company, or a detector station) train on their local data and upload gradients to the cloud aggregator, which builds the global model.
1) FedAVG algorithm: A recognized problem in federated learning is that limited network bandwidth bottlenecks the cloud aggregation of local updates from the organizations. To reduce the communication overhead, each client uses its local data to perform gradient descent optimization on the current model, and the central cloud then performs a weighted average aggregation of the model updates uploaded by the clients.
As shown in Algorithm 1, FedAVG consists of three steps:
(i) The cloud selects volunteers from organizations $O$ to participate in this round of training and broadcasts the global model $\omega_o$ to the selected organizations;
(ii) Each organization $o$ trains on its data locally and updates $\omega_t^o$ for $E$ epochs of SGD with mini-batch size $B$ to obtain $\omega_{t+1}^o$, i.e., $\omega_{t+1}^o \leftarrow \mathrm{LocalUpdate}(o, \omega_t^o)$;
(iii) The cloud aggregates each organization's $\omega_{t+1}^o$ through the secure parameter aggregation mechanism.

Algorithm 1: Federated Averaging (FedAVG) Algorithm.
Input: Organizations $O = \{O_1, O_2, \cdots, O_N\}$; $B$ is the local mini-batch size, $E$ is the number of local epochs, $\alpha$ is the learning rate, and $\nabla L(\cdot;\cdot)$ is the gradient optimization function.
Output: Parameter $\omega$.
1: Initialize $\omega_0$ (pre-trained on a public dataset);
2: for each round $t = 1, 2, \cdots$ do
3:    $\{O_v\} \leftarrow$ select volunteers from organizations $O$ to participate in this round of training;
4:    Broadcast the global model $\omega_o$ to each organization in $\{O_v\}$;
5:    for each organization $o \in \{O_v\}$ in parallel do
6:        Initialize $\omega_t^o = \omega_o$;
7:        $\omega_{t+1}^o \leftarrow \mathrm{LocalUpdate}(o, \omega_t^o)$;
8:    $\omega_{t+1} \leftarrow \frac{1}{|\{O_v\}|} \sum_{o \in O_v} \omega_{t+1}^o$;
9: LocalUpdate$(o, \omega_t^o)$:  // Run on organization $o$
10:    $\mathcal{B} \leftarrow$ split the local dataset $S_o$ into batches of size $B$;
11:    for each local epoch $i$ from 1 to $E$ do
12:        for each batch $b \in \mathcal{B}$ do
13:            $\omega \leftarrow \omega - \alpha \cdot \nabla L(\omega; b)$;
14:    return $\omega$ to the cloud
The FedAVG algorithm is a critical mechanism in FedGRU for reducing the communication overhead of parameter transmission. The algorithm is an iterative process: in the $i$-th round of training, the models of the organizations participating in the training are updated to the new global one.
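The following Python sketch illustrates the FedAVG loop described above; it is a simplified, assumption-level outline (the model is reduced to a parameter vector and `grad_fn` is a user-supplied gradient routine), not the authors' implementation:

```python
import numpy as np

def local_update(omega, data, grad_fn, epochs, batch_size, lr):
    """LocalUpdate of Algorithm 1: E epochs of mini-batch SGD on one organization."""
    omega = omega.copy()
    for _ in range(epochs):
        np.random.shuffle(data)                       # split S_o into batches of size B
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            omega -= lr * grad_fn(omega, batch)       # w <- w - alpha * grad L(w; b)
    return omega

def fedavg_round(omega_global, org_datasets, grad_fn,
                 epochs=1, batch_size=128, lr=0.001, num_volunteers=2):
    """One FedAVG round: select volunteers, broadcast, local SGD, then average."""
    volunteers = np.random.choice(len(org_datasets), num_volunteers, replace=False)
    uploads = [local_update(omega_global, org_datasets[v], grad_fn, epochs, batch_size, lr)
               for v in volunteers]
    return np.mean(uploads, axis=0)                   # Algorithm 1, line 8
```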
2) Federated Learning-based Gated Recurrent Unit neural network algorithm: FedGRU aims to achieve accurate and timely TFP by combining FL and GRU without compromising privacy. An overview of FedGRU is shown in Fig. 1. It consists of four steps:
i) The cloud model is initialized through pre-training on domain-specific public datasets without privacy concerns;
ii) The cloud distributes a copy of the global model to all organizations, and each organization trains its copy on local data;
iii) Each organization uploads its model updates to the cloud. The entire process does not share any private data; instead, only the encrypted parameters are shared;
iv) The cloud aggregates the updated parameters uploaded by all organizations through the secure parameter aggregation mechanism to build a new global model, and then distributes the new global model to each organization.
Algorithm 2: Federated Learning-based Gated Recurrent Unit neural network (FedGRU) algorithm.
Input: $\{O_v\} \subseteq O$, $X$, $Y$, and $H$; the mini-batch size $m$, the number of iterations $n$, and the learning rate $\alpha$; the optimizer SGD.
Output: $J(\omega)$, $\omega$, and $W_v^r$, $W_v^z$, $W_v^h$.
1: According to $X$, $Y$, $H$ and Equations (8)–(12), initialize the cloud model $J(\omega_0)$, $\omega_0$, $W_v^{r_0}$, $W_v^{z_0}$, $W_v^{h_0}$, and $H_v^0$;
2: for each round $i = 1, 2, 3, \cdots$ do
3:    $\{O_v\} \leftarrow$ select volunteers from the organizations to participate in this round of training;
4:    while $g_\omega$ has not converged do
5:        for each organization $o \in O_v$ in parallel do
6:            Sample a mini-batch of input time steps $\{x_v^{(i)}\}_{i=1}^{m}$;
7:            Sample a mini-batch of true traffic flow values $\{y_v^{(i)}\}_{i=1}^{m}$;
8:            Initialize $\omega_{t+1}^o = \omega_t^o$;
9:            $g_\omega \leftarrow \nabla_\omega \frac{1}{m} \sum_{i=1}^{m} \left(f_\omega(x_v^{(i)}) - y_v^{(i)}\right)^2$;
10:           $\omega_{t+1}^o \leftarrow \omega_t^o + \alpha \cdot \mathrm{SGD}(\omega_t^o, g_\omega)$;
11:           Update the parameters $W_v^{r_0}$, $W_v^{z_0}$, $W_v^{h_0}$, and $H_v^0$;
12:           Update the reset gate $r$ and the update gate $z$;
13:    Collect all parameters from $\{O_v\}$ to update $\omega_{t+1}$ (referring to Algorithm 1);
14: return $J(\omega)$, $\omega$, and $W_v^r$, $W_v^z$, $W_v^h$
Given voluntary organizations $\{O_v\} \subseteq O$ and $o_v \in \{O_v\}$, and referring to the GRU neural network in Section IV-A, we have:

$z_v^t = \sigma(W^{(z_v)} x_v^t + U^{(z_v)} h_v^{t-1})$,   (8)

$r_v^t = \sigma(W^{(r_v)} x_v^t + U^{(r_v)} h_v^{t-1})$,   (9)

$\tilde{h}_v^t = \tanh(W x_v^t + r_v^t \odot U h_v^{t-1})$,   (10)

$h_v^t = z_v^t \odot h_v^{t-1} + (1 - z_v^t) \odot \tilde{h}_v^t$,   (11)

where $X = \{x_v^1, x_v^2, \ldots, x_v^n\}$, $Y = \{y_v^1, y_v^2, \ldots, y_v^n\}$, and $H = \{h_v^1, h_v^2, \ldots, h_v^n\}$ denote $o_v$'s input time series, output time series, and the hidden states of the cells, respectively. According to Equation (3), the objective function of FedGRU is as follows:

$\arg\min_{\omega} J(\omega) = \min \sum_{i=1}^{|D_v|} \sum_{t=1}^{T} \frac{1}{2}\left(y_d - y_v^t\right)^2$.   (12)
The pseudocode of the FedGRU framework is presented in Algorithm 2.
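As an illustration of the local update each organization performs in Algorithm 2, the following PyTorch sketch (a simplified assumption, not the authors' code) trains a GRU regressor on local traffic-flow windows and returns its parameters for aggregation:

```python
import torch
import torch.nn as nn

class GRURegressor(nn.Module):
    """A GRU followed by a linear layer that predicts the next traffic flow value."""
    def __init__(self, n_features=1, hidden_size=64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, 1)

    def forward(self, x):                 # x: (batch, window, n_features)
        _, h = self.gru(x)
        return self.out(h[-1]).squeeze(-1)

def local_gru_update(model, loader, epochs=1, lr=0.001):
    """Mini-batch SGD on one organization's local data (Algorithm 2, lines 5-12)."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    model.train()
    for _ in range(epochs):
        for x_batch, y_batch in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x_batch), y_batch)
            loss.backward()
            optimizer.step()
    # Only parameters are uploaded to the cloud for FedAVG aggregation.
    return [p.detach().clone() for p in model.parameters()]
```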
V. EXPERIMENTS
A. Dataset Pre-Processing and Evaluation Method
In this experiment, the proposed FedGRU is applied to
the real-world data collected from the Caltrans Performance
Measurement System (PeMS) [20] database for performance
demonstration. The traffic flow data in the PeMS database were collected in real time from over 39,000 individual detectors. These sensors span the freeway system across all major metropolitan areas of the State of California [1]. In this paper, traffic flow data collected during the first three months of 2013 are used for the experiments. We select the traffic flow data of the first two months as the training dataset and that of the third month as the testing dataset. Furthermore, since the traffic flow data are time-series data, we use the observations at the previous time intervals, i.e., $x_{t-1}, x_{t-2}, \cdots, x_{t-r}$, to predict the traffic flow at time interval $t$, where $r$ is the length of the history data window.
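A minimal sketch (with assumed array shapes) of how such a length-$r$ history window can be built from a univariate traffic-flow series:

```python
import numpy as np

def make_windows(flow, r):
    """Build (X, y) pairs where X[i] = [x_{t-r}, ..., x_{t-1}] and y[i] = x_t."""
    X = np.array([flow[i:i + r] for i in range(len(flow) - r)])
    y = np.array(flow[r:])
    return X, y

# Example: 5-min aggregated flow counts with a 12-step (one-hour) history window.
flow = np.random.randint(50, 300, size=1000).astype(float)
X_train, y_train = make_windows(flow, r=12)
```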
We adopt the Mean Absolute Error (MAE), the Mean Square Error (MSE), the Root Mean Square Error (RMSE), and the Mean Absolute Percentage Error (MAPE) to show the prediction accuracy, i.e., the prediction error. They are defined as:

$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i^p|$,   (13)

$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i^p)^2$,   (14)

$\mathrm{RMSE} = \left[\frac{1}{n} \sum_{i=1}^{n} (|y_i - \hat{y}_i^p|)^2\right]^{\frac{1}{2}}$,   (15)

$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left|\frac{\hat{y}_i^p - y_i}{y_i}\right|$,   (16)

where $y_i$ is the observed traffic flow and $\hat{y}_i^p$ is the predicted traffic flow.
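The four error metrics of Eqs. (13)–(16) can be computed directly; a small NumPy sketch for reference:

```python
import numpy as np

def prediction_errors(y_true, y_pred):
    """Return MAE, MSE, RMSE, and MAPE as defined in Eqs. (13)-(16)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    mape = 100.0 * np.mean(np.abs(err / y_true))  # assumes no zero observations
    return mae, mse, rmse, mape
```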
B. Experimental Setup
Without loss of generality, we assume that the detector stations are distributed and independent, and that data cannot be exchanged arbitrarily among them. In the secure parameter aggregation mechanism, the PySyft [21] framework is adopted to encrypt the parameters¹.

For the cloud and each organization, we use mini-batch SGD for model optimization. The PeMS dataset is split equally and assigned to 20 organizations. During the simulation, the learning rate $\alpha = 0.001$, the mini-batch size $m = 128$, and $|O_v| = 20$. Note that the client number $C = 2$ of the FedGRU model is the default setting in FL [6]. All experiments are conducted using TensorFlow and PyTorch on Ubuntu 18.04.

¹ https://github.com/OpenMined/PySyft
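For concreteness, these simulation settings map onto the earlier FedAVG sketch roughly as follows (hypothetical wiring, not the authors' configuration):

```python
NUM_ORGS = 20           # PeMS dataset split equally across 20 organizations
CLIENTS_PER_ROUND = 2   # default client number C = 2
ALPHA = 0.001           # learning rate
BATCH_SIZE = 128        # mini-batch size m

# e.g., one round of the simplified FedAVG loop sketched in Section IV:
# omega_global = fedavg_round(omega_global, org_datasets, grad_fn,
#                             batch_size=BATCH_SIZE, lr=ALPHA,
#                             num_volunteers=CLIENTS_PER_ROUND)
```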
Fig. 2. (a) Traffic flow prediction of the GRU model and the FedGRU model. (b) Loss of the GRU model and the FedGRU model.
C. Experimental Results
1) Traffic Flow Prediction Accuracy: We compare the performance of the proposed FedGRU model with that of GRU, SAE, LSTM, and support vector machine (SVM) under an identical simulation configuration. Among these five competing methods, FedGRU is a federated machine learning model, and the rest are centralized ones. GRU is a widely adopted baseline model with strong performance on traffic flow forecasting tasks, as mentioned in Section IV, and SVM is a popular machine learning model for general prediction applications [1]. In all investigations, we use the same PeMS dataset. The prediction results for 5-min-ahead traffic flow prediction are given in Table I.

TABLE I
PERFORMANCE COMPARISON OF MAE, MSE, RMSE, AND MAPE FOR FEDGRU, GRU, SAE, LSTM, AND SVM

Metrics                    MAE    MSE     RMSE   MAPE
FedGRU (default setting)   7.96   101.49  11.04  17.82%
GRU [15]                   7.20   99.32   9.97   17.78%
SAE [1]                    8.26   99.82   11.60  19.80%
LSTM [13]                  8.28   107.16  11.45  20.32%
SVM [22]                   8.68   115.52  13.24  22.73%

From the simulation results, it can be observed that the MAE of FedGRU is lower than those of SAE, LSTM, and SVM, but higher than that of GRU. Specifically, the MAE of FedGRU is 9.04% lower than that of the worst case (i.e., SVM) in this experiment. This result is attributable to the fact that FedGRU inherits the advantages of GRU's outstanding performance in prediction tasks.

Fig. 2(a) shows a comparison between GRU and FedGRU on a 5-min traffic flow prediction task. We can see that the prediction results of the FedGRU model are very close to those of GRU. This is because the core prediction technique of FedGRU is the GRU structure, so the performance of FedGRU is comparable to that of the GRU model. Furthermore, FedGRU can protect data privacy by keeping the training dataset local. Fig. 2(b) illustrates the loss of the GRU model and the FedGRU model. From the results, the loss of the FedGRU model is not significantly different from that of the GRU model, which demonstrates that the FedGRU model has good convergence and stability. In summary, FedGRU can achieve accurate and timely traffic flow prediction without compromising privacy.
2) Performance Comparison of the FedGRU Model Under Different Client Numbers: In Section V-C.1, the default client number is set to $C = 2$. However, it is highly plausible that traffic data are gathered by more than two entities, e.g., organizations and companies. In this experiment, we explore the impact of different client numbers (i.e., $C = 2, 4, 8, 10$) on the performance of FedGRU. The simulation results are presented in Fig. 3, where we observe that the number of clients has an adverse influence on the performance of FedGRU. The reason is that more clients introduce increasing communication overhead to the underlying communication infrastructure, which makes it more difficult for the cloud to simultaneously aggregate the gradient information. Furthermore, such overhead may cause communication failures in some clients, causing them to fail to upload gradient information and thereby reducing the accuracy of the global model.

Fig. 3. The prediction error of the FedGRU model with different client numbers.
In this paper, we initially use the FedAVG algorithm to alleviate the expensive communication overhead issue. FedAVG reduces communication overhead by i) computing the average gradient over a batch of samples on each client and ii) computing the average aggregation of the gradients from all clients. Fig. 3 shows that FedAVG performs well when the number of clients is less than 8; when the number of clients exceeds 8, the performance of FedAVG starts to decline. The reason is that, when the number of clients exceeds a certain threshold (e.g., $C = 8$), the probability of client failure increases, which causes FedAVG to calculate incorrect gradient information. Nevertheless, FedAVG remains significant for reducing communication overhead because the number of entities involved in real-life traffic flow prediction tasks is usually small.
VI. CONCLUSION
In this paper, we propose the FedGRU algorithm for traffic flow prediction with federated learning for privacy preservation. FedGRU does not directly access distributed organizational data but instead employs a secure parameter aggregation mechanism to train a global model in a distributed manner. It aggregates the gradient information uploaded by all locally trained models in the cloud to construct the global model for traffic flow forecasting. We evaluate the performance of FedGRU on the PeMS dataset and compare it with GRU, LSTM, SAE, and SVM, all of which potentially compromise user privacy during forecasting. The results show that the proposed method performs comparably to the competing methods, with minuscule accuracy degradation and privacy well preserved. In the future, we plan to apply Graph Convolutional Networks to the federated learning framework to predict traffic flow.
REFERENCES
[1] Y. Lv, Y. Duan, W. Kang, Z. Li, and F.-Y. Wang, "Traffic flow prediction with big data: A deep learning approach," IEEE Transactions on Intelligent Transportation Systems, vol. 16, no. 2, pp. 865–873, 2014.
[2] N. Zhang, F.-Y. Wang, F. Zhu, D. Zhao, S. Tang et al., "DynaCAS: Computational experiments and decision support for ITS," 2008.
[3] C. Zhang, J. J. Q. Yu, and Y. Liu, "Spatial-temporal graph attention networks: A deep learning approach for traffic forecasting," IEEE Access, vol. 7, pp. 166246–166256, 2019.
[4] S. Madan and P. Goswami, "A novel technique for privacy preservation using k-anonymization and nature inspired optimization algorithms," Available at SSRN 3357276, 2019.
[5] J. Le Ny, A. Touati, and G. J. Pappas, "Real-time privacy-preserving model-based estimation of traffic flows," in 2014 ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS), April 2014, pp. 92–102.
[6] J. Konečný, H. B. McMahan, F. X. Yu, P. Richtárik, A. T. Suresh, and D. Bacon, "Federated learning: Strategies for improving communication efficiency," arXiv preprint arXiv:1610.05492, 2016.
[7] X. Yuan, X. Wang, C. Wang, J. Weng, and K. Ren, "Enabling secure and fast indexing for privacy-assured healthcare monitoring via compressive sensing," IEEE Transactions on Multimedia (TMM), vol. 18, no. 10, pp. 1–13, 2016.
[8] M. S. Ahmed, "Analysis of freeway traffic time series data and their application to incident detection," Equine Veterinary Education, vol. 6, no. 1, pp. 32–35, 1979.
[9] J. J. Q. Yu, A. Y. S. Lam, D. J. Hill, Y. Hou, and V. O. K. Li, "Delay aware power system synchrophasor recovery and prediction framework," IEEE Transactions on Smart Grid, vol. 10, no. 4, pp. 3732–3742, July 2019.
[10] G. A. Davis and N. L. Nihan, "Nonparametric regression and short-term freeway traffic forecasting," Journal of Transportation Engineering, vol. 117, no. 2, pp. 178–188, 1991.
[11] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," ACM Transactions on Intelligent Systems and Technology (TIST), vol. 2, no. 3, p. 27, 2011.
[12] D. Svozil, V. Kvasnicka, and J. Pospichal, "Introduction to multi-layer feed-forward neural networks," Chemometrics and Intelligent Laboratory Systems, vol. 39, no. 1, pp. 43–62, 1997.
[13] X. Ma, Z. Tao, Y. Wang, H. Yu, and Y. Wang, "Long short-term memory neural network for traffic speed prediction using remote microwave sensor data," Transportation Research Part C: Emerging Technologies, vol. 54, pp. 187–197, 2015.
[14] Y. Tian and L. Pan, "Predicting short-term traffic flow by long short-term memory recurrent neural network," in 2015 IEEE International Conference on Smart City/SocialCom/SustainCom (SmartCity), 2015, pp. 153–158.
[15] R. Fu, Z. Zhang, and L. Li, "Using LSTM and GRU neural network methods for traffic flow prediction," in 2016 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC), Nov 2016, pp. 324–328.
[16] J. J. Q. Yu, W. Yu, and J. Gu, "Online vehicle routing with neural combinatorial optimization and deep reinforcement learning," IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 10, pp. 3806–3817, Oct 2019.
[17] J. J. Q. Yu and J. Gu, "Real-time traffic speed estimation with graph convolutional generative autoencoder," IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 10, pp. 3940–3951, Oct 2019.
[18] B. Y. He and J. Y. Chow, "Optimal privacy control for transport network data sharing," Transportation Research Part C: Emerging Technologies, 2019.
[19] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, "Learning phrase representations using RNN encoder-decoder for statistical machine translation," arXiv preprint arXiv:1406.1078, 2014.
[20] C. Chao, "Freeway Performance Measurement System (PeMS)," 2003.
[21] T. Ryffel, A. Trask, M. Dahl, B. Wagner, J. Mancuso, D. Rueckert, and J. Passerat-Palmbach, "A generic framework for privacy preserving deep learning," arXiv preprint arXiv:1811.04017, 2018.
[22] M. A. Mohandes, T. O. Halawani, S. Rehman, and A. A. Hussain, "Support vector machines for wind speed prediction," Renewable Energy, vol. 29, no. 6, pp. 939–947, 2004.