Dynamic Traffic Flow Prediction Based on Long-Short Term Memory Framework With Feature Organization

Abstract

Traffic flow is an important piece of information for traffic management and control. In particular, the dynamic prediction of traffic flow provides the basis for efficient control measures. The existing studies focus on improving the prediction accuracy by integrating the long short-term memory (LSTM) into various complex frameworks without paying attention to the feature engineering, which has a significant impact on the performance of machine learning methods. In this article, we propose a dynamic traffic flow prediction approach based on the LSTM framework with different feature organizations: feature division modes and feature selection. The feature division modes consider the periodicity of traffic flow by intervals (e.g., 5 min) and periods (e.g., daily). The feature selection determines different types of features as inputs to the prediction model. The impact of different feature organization strategies on the prediction accuracy is investigated using field data collected by the Caltrans Performance Measurement System. Two types of LSTM frameworks, the fully connected LSTM and the sequence-to-sequence LSTM (seq2seq-LSTM), are used to evaluate the performance of the proposed prediction approach. The results show that the seq2seq-LSTM model with optimized feature organization can significantly improve the prediction performance.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
IEEE INTELLIGENT TRANSPORTATION SYSTEMS MAGAZINE 2 MONTH 2021 1939-1390/21©2021IEEE
Digital Object Identifier 10.1109/MITS.2021.3116156
Date of current version: 10 November 2021
Jing Liu, Fangfang Zheng*, and Xiaobo Liu
Are with the School of Transportation and Logistics, National Engineering Laboratory of
Integrated Transportation Big Data Application Technology, National United
Engineering Laboratory of Integrated and Intelligent Transportation,
Southwest Jiaotong University, Chengdu, 611756, China.
Email: jing.liu@my.swjtu.edu.cn; fzheng@swjtu.cn; xiaobo.liu@swjtu.cn.
Ge Guo
Is with the State Key Laboratory of Synthetical Automation for Process Industries,
Northeastern University, Shenyang 110819, China, and also with the School of Control
Engineering, Northeastern University at Qinhuangdao, Qinhuangdao, 066004, China.
Email: geguo@yeah.net.
*Corresponding author
Authorized licensed use limited to: SOUTHWEST JIAOTONG UNIVERSITY. Downloaded on December 12,2021 at 03:04:37 UTC from IEEE Xplore. Restrictions apply.
Traffic conditions keep changing constantly due to the
complex relationship between the supply and demand
of traffic networks over space and time. The dynamic
prediction of traffic conditions, e.g., the traffic flow,
can provide future information about traffic states based
on past information. These predictions can be used as in-
put for intelligent transportation systems to further pro-
vide information for travelers to make better choices (e.g.,
regarding route, mode, or departure time) as well as for
road operators to take effective management and control
measures to improve traffic conditions for road networks.
Basically, dynamic traffic flow prediction approaches
can be divided into three categories: statistical model-
based approaches, simulation-based approaches, and ma-
chine learning-based approaches. Statistical model-based
approaches assume the system is stochastic, e.g., apart
from the periodic changes of the traffic demand pattern,
the fluctuation of traffic flow in the network is due to random
events. The basic idea behind this type of method is to
determine corresponding state variables, decompose them
into time series, and derive the model for periodic patterns
[1]. Some widely used models include autoregressive integrated
moving average models [2], (state space) Kalman
filter models [3], and Bayesian network models [4], [5]. The
simulation-based approaches [6] replicate a real situation
by constructing scenarios using computers and applying
traffic dynamics models to describe macro- or microchar-
acteristics of the traffic flow assuming that the traffic de-
mand is known. These models are often used to evaluate
the impacts of different management or control measures
on traffic conditions.
Machine learning-based approaches have received a
lot of attention in recent years. This type of approach does
not specify the relationship among variables explicitly. Instead,
a model framework is designed where the parameters
of the model are updated iteratively by using historical
data. Once trained, the model acquires a certain prediction
ability. Machine learning-based approaches can be divided
into traditional machine learning-based approaches [7], [8]
and deep learning-based approaches [9]–[17], according
to different prediction frameworks. Generally, traditional
machine learning-based approaches have fewer parame-
ters, require less data, and have better interpretability but
are more dependent on features.
Deep learning is a special kind of machine learning
that can achieve more power and flexibility by learning
to represent the world as a nested hierarchy of concepts.
Each concept is defined in relation to simpler concepts,
and more abstract representations are computed in terms
of less abstract ones [18]. Though the prediction framework
of the deep learning approach is more complicated, it can
provide a higher prediction accuracy. Deep learning-based
approaches are widely used to predict traffic conditions,
where a recurrent neural network (RNN) module and
convolutional neural network (CNN) module are usually
applied to process time series and spatial information, respectively
[9], [10]. For the prediction of a network traffic
flow, a graph neural network is used to process the spatial
information of the traffic network, which is abstracted into
a graph, and the relationship between nodes is not affected
by their physical distance [11], [12].
In the literature, RNNs are widely used to describe the
temporal characteristics of traffic flow. Among them, a
variant of RNN, called the long short-term memory (LSTM)
module, has been extensively applied to make predictions
[9], [14]–[17]. When dealing with long-term sequences, the
RNN is prone to vanishing or exploding gradients.
The LSTM module can process information with a
longer step by introducing memory units; thus, it is more
suitable for traffic prediction. Wu and Tan [9] propose a
deep learning architecture that combines the CNN and the
LSTM to forecast future traffic flow. They apply two LSTMs
to mine the short-term variability and the periodicity of
traffic flow using near-term data and the traffic flow data
of the previous day and the previous week. Wu et al. [13]
also put forward a deep neural network-based traffic flow
model in which the attention mechanism is introduced into
the traffic flow prediction. Traffic flow weighted by the at-
tention matrix extracted by speed data is taken to extract
spatiotemporal characteristics. Li et al. [14] combine a
graph convolutional network (GCN) and the LSTM to predict
traffic flow, where the GCN is used to extract the input
spatiotemporal features of the LSTM, and only short-term
features are considered. Zhao et al. [15] propose a 2D LSTM
network by modifying the structure of the LSTM. Their
proposed model can extract spatiotemporal characteristics
at the same time, but only short-term features are consid-
ered. The input and temporal features of several LSTM-
based traffic flow prediction models are listed in Table 1.
As can be seen from Table 1, among the prediction mod-
els based on the LSTM, most of the effort has been put into
designing the deep learning framework, while feature en-
gineering is largely ignored. The reason is that deep learn-
ing was originally applied in computer science, including
computer vision and natural language processing, where
it is difficult to derive suitable features with physical significance.
Thus, various complex deep learning structures
[19], [20] are developed to extract abstract features. By contrast,
traffic flow dynamics can be characterized by spatiotemporally
correlated features with physical meaning (e.g.,
traffic volume, speed, density, and so on). This allows feature
engineering (or feature organization) to provide the
additional physical relationship of traffic flow data sets,
which can be used as input to LSTM models to improve
prediction accuracy.
For traffic prediction, we can consider two aspects of
feature organization, namely, feature division and feature
selection. The former deals with the periodic properties of
traffic, and the latter determines proper feature variables,
which can reflect the physical characteristics of traffic.
Regarding feature division, most of the existing research
directly provides the feature division mode without analyzing
the impact of different feature division modes on traffic
prediction performance [9], [21]. The feature selection could
also affect the performance of prediction models since the
feature parameters describing traffic dynamics are corre-
lated and have physical meaning. However, the existing deep
learning-based models use only data observed by detectors
(such as traffic volume, density, and mean speed) as input
features without considering the inherent physical relation-
ship between each feature [22], [23]. Thus, more complex
structures are needed for those models to improve prediction
performance at the cost of increased computation time.
To fill the gap, we develop a short-term traffic flow
prediction approach based on LSTM prediction frame-
works considering different feature aspects: feature
division and feature selection. Three feature division
modes are proposed to derive the input features considering
the periodic properties of traffic flow. The feature
selection focuses on selecting different types of features
as the input to the prediction framework. Apart from the
features directly observed by detectors, such as observed
traffic flows, we derive two additional types of traffic
state features by applying traffic flow models. The first
type of feature is the traffic equilibrium state estimated
by the fundamental diagram (FD); an FD-based traffic
state estimation model is proposed accordingly. The second
type of feature is the traffic state transition index,
which is inspired by the Lighthill-Whitham-Richards
model [24]–[26]. The traffic equilibrium state feature
reflects the traffic characteristics and the general law
of traffic flow parameters. The state transition index
reflects the traffic state changes between adjacent time
intervals. These two types of features provide traffic
state information over space and time. The detector data
of an expressway are collected by the Caltrans Perfor-
mance Measurement System (PeMS) and used to validate
the proposed approach under different feature organi-
zations. Two types of LSTM frameworks, the fully con-
nected LSTM (FC-LSTM) and the sequence-to-sequence
LSTM (seq2seq-LSTM), are used to evaluate the perfor-
mance of the proposed prediction approach.
Methodology
In this section, we introduce the traffic flow prediction
approach including the LSTM prediction frameworks,
the FD-based traffic state estimation model, and the
traffic state transition index. The LSTM frameworks
are widely used methods in traffic flow prediction. The
FD-based state estimation model and the transition index
are applied to derive input features, where the former
is used to obtain equilibrium state features and the
latter is used to extract state transition features. These
features are used as inputs for the LSTM prediction
framework.
LSTM Prediction Framework
The prediction framework describes the connections be-
tween different modules in the prediction model. The
LSTM framework refers to the framework containing the
LSTM module, which is composed of LSTM layers first
proposed by Hochreiter and Schmidhuber [19]. The LSTM
layer is a special RNN structure that mainly deals with
time series. The memory unit is introduced into the basic
RNN structure so that the model can stay sensitive to the
information of distant time steps, which alleviates the
vanishing-gradient problem that arises in the learning
process when the time series is long.
In this section, we introduce two commonly used LSTM
prediction frameworks: the FC-LSTM and the seq2seq-
LSTM. The prediction models based on these frameworks
are used to analyze the influence of different feature organizations
on the prediction accuracy.
FC-LSTM Framework
The FC-LSTM framework is a well-known network architecture
that is powerful in capturing sequential dependency [27].
It is a basic LSTM framework that consists of
one LSTM layer that extracts information from time series
data and one dense layer that converts the outputs into the
target dimension. The FC-LSTM architecture is shown in
Figure 1.
Authors | Model Structure | Input Features | Temporal Features for LSTM | Periodicity
Wu and Tan [9] | CNN + LSTM | Traffic flow | Traffic flow of near term, previous day and previous weekday | Daily and weekly
Li et al. [14] | GCN + LSTM + Attention | Traffic flow | Features extracted by GCN of near term | Not considered
Zhao et al. [15] | 2D LSTM | Traffic flow | Origin destination correlation matrix of traffic flow of near term | Not considered
Luo et al. [16] | k-nearest neighbor + LSTM | Traffic flow | Traffic flow of near term | Not considered
Zhaowei et al. [17] | Probability mapping + deep bidirectional LSTM + LSTM | Traffic flow | Features derived from traffic flow through probability distribution mapping of near term | Not considered
Table 1. LSTM-based traffic flow prediction research.
First, we distinguish the following concepts:
LSTM cell: the area enclosed by a solid red box in Figure
1, which is used to process the information of a single
time step
LSTM layer: the blue dashed box in Figure 1, which is a
layer with series-connected LSTM cells. Each cell
accepts the information from the previous cell. The
length of the series is the number of input time steps,
which is also equal to the number of outputs.
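In an LSTM cell, a single time-step operation is performed
according to:
$$\Gamma_f^{\langle t \rangle} = \sigma\left(W_f \left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_f\right) \tag{1}$$
$$\Gamma_u^{\langle t \rangle} = \sigma\left(W_u \left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_u\right) \tag{2}$$
$$\tilde{c}^{\langle t \rangle} = \tanh\left(W_c \left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_c\right) \tag{3}$$
$$c^{\langle t \rangle} = \Gamma_u^{\langle t \rangle} \ast \tilde{c}^{\langle t \rangle} + \Gamma_f^{\langle t \rangle} \ast c^{\langle t-1 \rangle} \tag{4}$$
$$\Gamma_o^{\langle t \rangle} = \sigma\left(W_o \left[a^{\langle t-1 \rangle}, x^{\langle t \rangle}\right] + b_o\right) \tag{5}$$
$$a^{\langle t \rangle} = \Gamma_o^{\langle t \rangle} \ast \tanh\left(c^{\langle t \rangle}\right) \tag{6}$$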
where the superscript $\langle t \rangle$ corresponds to the tth time step.
$a$ and $c$ represent the hidden state variable and the memory
variable, respectively. These two variables run through
all time steps and are used for state connection and memory
storage. $\tilde{c}$ is the candidate memory variable with tanh
as the activation function and $W_c$ and $b_c$ as the weight and
bias. $\tilde{c}^{\langle t \rangle}$ generates the current memory variable
$c^{\langle t \rangle}$ together with the previous memory variable
$c^{\langle t-1 \rangle}$. The generation process is controlled by the
update gate and the forget gate. $\Gamma_f$, $\Gamma_u$, and $\Gamma_o$
represent the forget gate, the update gate, and the output gate,
respectively. The weight $W$ and the bias $b$ with f, u, or o as
subscripts are the parameters of the corresponding gate. These
gates can be regarded as filtering devices, and the output value
is controlled between (0,1) by the $\sigma$ activation function,
which can be understood as the passing rate of the controlled tensor.
An LSTM layer is formed by connecting multiple LSTM
cells in series. The hidden state a and memory c transfer
information between adjacent time steps. The number of
LSTM cells in an LSTM layer is the input sequence length
during prediction.
The dense layer converts model outputs into the target
dimension according to
$$\hat{y} = \sigma\left(W_d \ast a^{\langle t \rangle} + b_d\right), \tag{7}$$
where $\hat{y}$ represents the prediction state. $W_d$ and $b_d$
represent the weights and the bias of the dense layer, the dimension
of which determines the prediction horizon T.
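As an illustration of (1)-(7), the following is a minimal pure-Python sketch of one LSTM cell step, an LSTM layer that chains the cells over the input sequence, and the dense output layer. It is not the authors' implementation: the function names, the list-based matrix layout, the zero initial states, and the constant toy weights are all assumptions made for this example, and the product in (7) is read here as an ordinary matrix-vector product.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Return W v + b for a list-of-rows matrix W and vectors b, v."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]

def lstm_cell(x_t, a_prev, c_prev, p):
    """One time step following (1)-(6); p maps hypothetical names to weights/biases."""
    z = a_prev + x_t  # list concatenation [a<t-1>, x<t>]
    gamma_f = [sigmoid(v) for v in affine(p["Wf"], p["bf"], z)]    # (1) forget gate
    gamma_u = [sigmoid(v) for v in affine(p["Wu"], p["bu"], z)]    # (2) update gate
    c_tilde = [math.tanh(v) for v in affine(p["Wc"], p["bc"], z)]  # (3) candidate memory
    c_t = [u * ct + f * cp                                         # (4) new memory
           for u, ct, f, cp in zip(gamma_u, c_tilde, gamma_f, c_prev)]
    gamma_o = [sigmoid(v) for v in affine(p["Wo"], p["bo"], z)]    # (5) output gate
    a_t = [o * math.tanh(c) for o, c in zip(gamma_o, c_t)]         # (6) hidden state
    return a_t, c_t

def fc_lstm_forward(xs, p, Wd, bd):
    """Run the cells over the sequence xs, then apply the dense layer (7)."""
    n_hidden = len(p["bf"])
    a, c = [0.0] * n_hidden, [0.0] * n_hidden  # zero initial states (assumption)
    for x_t in xs:
        a, c = lstm_cell(x_t, a, c, p)
    return [sigmoid(v) for v in affine(Wd, bd, a)]  # one output per predicted step

def constant_params(n_in, n_hidden, value):
    """Toy parameters: every weight equals `value`, every bias is zero."""
    p = {}
    for g in "fuco":
        p["W" + g] = [[value] * (n_hidden + n_in) for _ in range(n_hidden)]
        p["b" + g] = [0.0] * n_hidden
    return p

# Toy usage: 1 input feature, 2 hidden units, prediction horizon T = 3.
params = constant_params(n_in=1, n_hidden=2, value=0.1)
Wd = [[0.5, 0.5] for _ in range(3)]
bd = [0.0, 0.0, 0.0]
y_hat = fc_lstm_forward([[0.2], [0.4], [0.6]], params, Wd, bd)
print(y_hat)
```

Because the dense layer in (7) ends in a sigmoid, the outputs lie in (0,1), so a sketch like this presumes that the target flows have been scaled to that range.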
FIG 1 The FC-LSTM framework. The inputs x<1>, ..., x<t> pass through an LSTM layer of chained LSTM cells (each with a forget gate, an update gate, and an output gate), and a dense layer maps a<t> to the outputs y<t + 1>, ..., y<t + T>.
seq2seq-LSTM Framework
The seq2seq-LSTM is a combination framework of the seq2seq
architecture and the LSTM module. The seq2seq is
an architecture represented by the encoder and the decoder,
where the encoder receives an input sequence, extracts
the information in the sequence, and transmits it to
the decoder, which converts the information into an output
sequence. When using the seq2seq-LSTM framework
to predict traffic flow, the encoding and the decoding processes
are completed by the LSTM module. The framework
architecture is shown in Figure 2.
FD-Based Traffic State Estimation Model
The FD-based traffic state estimation model is used to
estimate the equilibrium state corresponding to the mea-
surement data based on the FD. The estimation model and
the detailed estimation procedure based on Smulders’ fundamental
diagram (SFD) are introduced in the following
two sections.
Equilibrium State Estimation Based on the FD
The FD describes the general law of traffic flow parameters
(the traffic flow, density, and speed) in the
stationary state (or equilibrium state). Given a set
of detector data, we can derive two groups of states,
namely, the detection state and the equilibrium state.
The detection state can be represented by the detection
volume $q_d$, the detection speed $v_d$, and the detection
density $k_d$, which are directly obtained from the detector
data. The equilibrium state refers to the state in
the FD, which is a stationary state, including the equilibrium
volume $q_e$, the equilibrium speed $v_e$, and the
equilibrium density $k_e$.
In the literature, it is usually assumed that the detection
density is equal to the equilibrium density, and then the equilibrium
volume and speed are obtained according to the FD
(see, e.g., [28]). However, the detector data are likely to be erroneous.
The assumption that the detection density is equal
to the equilibrium density can cause an inaccurate estimation
of the equilibrium state. In this article, we use the Euclidean
distance as a measure to convert the detection state to
the equilibrium state. Specifically, the point that is closest to
the detection point on the flow-density relationship curve is
used as the estimation of the equilibrium state given by
$$\arg\min_{k_t^e,\, q_t^e} \left(k_t^d - k_t^e\right)^2 + \left(q_t^d - q_t^e\right)^2 \tag{8}$$
s.t. the K-Q relationship determined by the FD,
where $k_t^e$ and $q_t^e$ represent the equilibrium density and
volume at time t, respectively; the equilibrium speed $v_t^e$
can be derived according to (9):
$$q_t^e = k_t^e \cdot v_t^e. \tag{9}$$
Traffic State Estimation Based on SFD
In this section, we discuss the detailed estimation proce-
dure based on the SFD, which is a two-stage FD model for
expressway sections proposed by Smulders [29]. The formulation
of the SFD model is given by
$$v(k) = \begin{cases} v_f \left(1 - k/k_j\right), & k < k_c \\ v_f k_c \left(1/k - 1/k_j\right), & k \ge k_c \end{cases}, \tag{10}$$
vf
represents the free flow speed,
kj
is the jam den-
sity, and
kc
is the density at capacity. This model assumes
a linear relationship between the speed and the density in
the free flow state and a linear relationship between the
flow and the density in the congestion state. These param-
eters can be estimated from field detector observations
ĊĊ ĊĊ
Encoder Decoder
LSTM ModuleLSTM Module
x<1> x<2> x<t>y<t + 1> y<t + 2> y<t + T>
"
"
"
FIG 2 The seq2seq-LSTM framework.
Authorized licensed use limited to: SOUTHWEST JIAOTONG UNIVERSITY. Downloaded on December 12,2021 at 03:04:37 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
IEEE INTELLIGENT TRANSPORTATION SYSTEMS MAGAZINE 7 MONTH 2021IEEE INTELLIGENT TRANSPORTATION SYSTEMS MAGAZINE 6 MONTH 2021
by the least-squa res method, and the corresponding flow
state can be obtained by
.qvk)
=
In this study, we use the
field detector data to calibrate the linear relationship.
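To make the two-stage model in (10) concrete, the sketch below evaluates the equilibrium speed and the corresponding flow q = v * k. The function names and the parameter values (a free flow speed of 100 km/h, a density at capacity of 25 vehicles/km, and a jam density of 125 vehicles/km) are hypothetical and chosen only so that the arithmetic is easy to follow; they are not calibrated values from the study.

```python
def sfd_speed(k, v_f, k_c, k_j):
    """Equilibrium speed v(k) of the two-stage model (10), for 0 <= k <= k_j."""
    if k < k_c:
        # Free flow branch: speed decreases linearly with density.
        return v_f * (1.0 - k / k_j)
    # Congested branch: chosen so that flow is linear in density.
    return v_f * k_c * (1.0 / k - 1.0 / k_j)

def sfd_flow(k, v_f, k_c, k_j):
    """Equilibrium flow q = v(k) * k."""
    return k * sfd_speed(k, v_f, k_c, k_j)

# Hypothetical parameters: km/h, vehicles/km, vehicles/km.
V_F, K_C, K_J = 100.0, 25.0, 125.0

q_capacity = sfd_flow(K_C, V_F, K_C, K_J)    # flow at the density at capacity
q_congested = sfd_flow(75.0, V_F, K_C, K_J)  # a point on the congested branch
q_jam = sfd_flow(K_J, V_F, K_C, K_J)         # flow vanishes at the jam density
print(q_capacity, q_congested, q_jam)
```

With these values the two branches meet at the density at capacity, the flow-density curve is quadratic below it, and the flow falls linearly to zero at the jam density, which matches the description of the SFD in the text.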
The point with the shortest distance from the detection
state point to the FD function refers to the equilibrium state
estimated by the FD-based traffic state estimation model
proposed in the preceding section. The flow-density FD
under Smulders’ model is composed of a quadratic relation
in the free flow state and a linear relation in the congested
state. To find the estimated equilibrium state from the detection
state, it is necessary to compare the shortest distance
from the detection state point to the linear function and to
the quadratic function of the SFD model, respectively.
There are five possible equilibrium state points for the
two-stage FD model: $(k_l, q_l)$, $(k_s, q_s)$, $(k_0, q_0)$, $(k_1, q_1)$, and
$(k_c, q_c)$, indicated with red dots in Figure 3. $(k_d, q_d)$ refers
to the detection state point. $(k_l, q_l)$ and $(k_s, q_s)$ are the feet
of the perpendiculars from the detection state point to the linear
and quadratic branches of the two-stage FD function, respectively.
$(k_0, q_0)$, $(k_1, q_1)$, and $(k_c, q_c)$
represent the state points corresponding to the free flow
state when the road is empty, the congested state with the
maximum density, and the capacity state, respectively.
$d_l$, $d_s$, $d_0$, $d_1$, and $d_c$ are the Euclidean distances from the
detection state point to the corresponding equilibrium state
point. The estimation procedure and the illustration of the
results are given in Table 2 and Figure 4.
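The procedure in Table 2 can be sketched as follows. This is one possible reading rather than the authors' code: the nearest point on the quadratic branch is approximated on a density grid instead of being solved in closed form, the three fixed points (empty road, jam density, capacity) are treated as always admissible, distances are taken in the raw units of density and volume, and the parameter values are hypothetical.

```python
import math

def estimate_equilibrium(k_d, q_d, v_f, k_c, k_j, n_grid=20000):
    """Return the (k_e, q_e) selected by the Table 2 procedure for one detection point."""

    def q_free(k):
        # Quadratic free flow branch of the flow-density FD.
        return v_f * k * (1.0 - k / k_j)

    # The congested branch is the line q = b + m * k.
    m, b = -v_f * k_c / k_j, v_f * k_c
    k_l = (k_d + m * (q_d - b)) / (1.0 + m * m)  # foot of the perpendicular

    # Nearest point on the quadratic branch, approximated on a grid (assumption).
    grid = (k_j * i / n_grid for i in range(n_grid + 1))
    k_s = min(grid, key=lambda k: (k - k_d) ** 2 + (q_free(k) - q_d) ** 2)

    candidates = {
        "l": (k_l, b + m * k_l),
        "s": (k_s, q_free(k_s)),
        "0": (0.0, 0.0),          # empty road
        "1": (k_j, 0.0),          # jam density
        "c": (k_c, q_free(k_c)),  # capacity
    }

    def distance(name):
        k, q = candidates[name]
        return math.hypot(k - k_d, q - q_d)  # step 1

    while candidates:
        name = min(candidates, key=distance)  # step 2
        k_e, q_e = candidates[name]
        if name == "l":
            admissible = k_e >= k_c           # step 3, linear branch
        elif name == "s":
            admissible = k_e < k_c            # step 3, quadratic branch
        else:
            admissible = True                 # fixed points (assumption)
        if admissible:
            return k_e, q_e                   # step 4
        del candidates[name]                  # step 5
    raise RuntimeError("no admissible equilibrium point")

# Hypothetical parameters: v_f = 100 km/h, k_c = 25 and k_j = 125 vehicles/km.
free_point = estimate_equilibrium(10.0, 920.0, 100.0, 25.0, 125.0)
congested_point = estimate_equilibrium(75.0, 1000.0, 100.0, 25.0, 125.0)
noisy_point = estimate_equilibrium(12.0, 1200.0, 100.0, 25.0, 125.0)
print(free_point, congested_point, noisy_point)
```

The first two detection points already lie on the FD, so they should be returned essentially unchanged; the third lies off the curve and is moved to a nearby point on the free flow branch.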
Traffic State Transition Index
The equilibrium state estimated by the FD-based traffic
state estimation model reflects the spatial characteristics
of traffic flow at the given road segment. In this section, we
introduce the traffic state transition index, which is used
to describe the temporal characteristics of the traffic state.
In the kinematic wave model, the shock-wave speed describes
the transition between adjacent traffic states in space.
Inspired by the traffic wave model, we propose a traffic state
transition index, as described by (11). This index considers
the relative speed of traffic flow in different time periods and
is used to describe the change of the traffic state over time:
$$st_t^i = \frac{q_t^e - q_{t-i}^e}{k_t^e - k_{t-i}^e}, \tag{11}$$
where $st_t^i$ is the traffic state transition index between time
$t - i$ and time $t$. It can be seen from (11) that, if both states
are under the free flow condition, the st index is approximately
equal to the free flow speed; if both states are under the
congested condition, the st index corresponds to the transition
speed of congestion over time. If the two states are in different
traffic conditions, the st index fluctuates significantly.
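A short sketch of (11) is given below. The function name and the toy series are hypothetical, and returning None when the equilibrium density does not change between the two times is an assumption of this example, since (11) is undefined in that case.

```python
def transition_index(k_e, q_e, i=1):
    """State transition index (11) for every t >= i.

    k_e and q_e are equal-length series of equilibrium density and volume.
    An entry is None when the density is unchanged (undefined ratio; assumption).
    """
    out = []
    for t in range(i, len(k_e)):
        dk = k_e[t] - k_e[t - i]
        dq = q_e[t] - q_e[t - i]
        out.append(dq / dk if dk != 0 else None)
    return out

# Toy free flow series: volume grows by 100 vehicles/h per vehicle/km.
st_free = transition_index([5.0, 6.0, 8.0], [500.0, 600.0, 800.0])
# Toy congested series: volume drops as density rises.
st_congested = transition_index([50.0, 60.0], [1500.0, 1300.0])
# Unchanged density: the index is left undefined.
st_flat = transition_index([20.0, 20.0], [1500.0, 1500.0])
print(st_free, st_congested, st_flat)
```

In the free flow example the index equals the slope of the flow-density relation, which is close to the free flow speed at low density; in the congested example it is negative, consistent with congestion propagating against the direction of travel.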
Feature Organization
The feature organization includes the feature division
mode and the feature selection. In this section, we first
introduce the feature division mode for a single feature.
The FD-based traffic state estimation model and the traffic
state transition index proposed in the “Methodology” section
are applied to construct the prediction features.
Feature Division Modes
Traffic data intrinsically tend to have daily periodicity (the
same time of previous days) and weekly periodicity (the
same day of previous weeks). Figure 5 illustrates a group
of traffic volume features. As shown in Figure 5(a), if we
would like to predict the traffic volume from interval $i_0$
to interval $i_1$ on day 8, three important parts of historical
data can be considered: historical data of the past intervals
in the current period (the green dashed box 1), historical
data of the past interval in the past periods (the blue box 2),
and historical data of the same interval in the past periods
(the orange dot-and-dashed box 3).
FIG 3 The estimation of the possible equilibrium state points for the two-stage FD model (axes: density k, volume q).
Step 1 | Calculate the Euclidean distance from the detection state point to the possible equilibrium state points.
Step 2 | Find the point closest to $(k_d, q_d)$ from the possible equilibrium state points on the FD.
Step 3 | Determine whether the following conditions are satisfied: $(k_l \ge k_c)$ if the closest point is $(k_l, q_l)$ $\vee$ $(k_s < k_c)$ if the closest point is $(k_s, q_s)$.
Step 4 | If the condition in step 3 is true, the closest point is an estimate of the equilibrium state.
Step 5 | If not, delete the current closest point in the possible equilibrium state points and go back to step 2.
Table 2. The equilibrium state estimation procedure.
Considering these three parts of information, we propose
three feature division modes:
Division mode 1: $f(i_0 - \Delta i,\ i_0)$
Division mode 2: $\bigcup_{n=1}^{p} f(i_0 - nd,\ i_1 - nd)$
Division mode 3: $\left[f(i_0 - \Delta i,\ i_0),\ \bigcup_{n=1}^{p} f(i_0 - nd - \Delta i,\ i_1 - nd)\right]$,
where $f$ is the feature considered in the prediction, e.g.,
the traffic volume. $f(a, b)$ represents the value of $f$ from
interval a to interval b. $\Delta i$ is an integer indicating the
number of adjacent intervals. d represents the number of
intervals included in one day and p is the number of days
included in one period. We consider two periods when dividing
features: the daily period and weekly period, which are
determined by the characteristics of traffic data. Division
mode 1 considers the past $\Delta i$ intervals’ data for the current
period (e.g., day 8) focusing on the short-term variation [as
shown in Figure 5(b) with the dashed box]. Division mode
2 considers the same interval in the past periods (e.g., from
day 1 to day 7) and mainly pays attention to the periodic
variation [as shown in Figure 5(c) with the dashed box].
Division mode 3 not only considers the short-term variation
of the current period, but also considers the periodic
variation in the same interval and the previous intervals in
the past periods [as shown in Figure 5(d) with the dashed
box]. It is worth mentioning that most of the input of a
single-layer LSTM is organized using division mode 1 without
considering the periodic patterns. If the periodic information
is needed for prediction, a multilayer LSTM model is
usually adopted. The multilayer LSTM model is a complex
model, which requires more training samples as well as
computational effort to obtain a high prediction accuracy.
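The three division modes can be sketched as slicing operations on one feature series indexed by interval. The helper names, the flat list layout, and the half-open reading of f(a, b) (interval a up to, but not including, interval b, so that mode 1 returns exactly the past intervals before i0) are assumptions of this example. The toy setting uses 288 five-minute intervals per day and a weekly period of p = 7 days.

```python
def f(series, a, b):
    """Feature values from interval a up to (not including) b; half-open by assumption."""
    if a < 0 or b > len(series):
        raise ValueError("window outside the available history")
    return series[a:b]

def mode1(series, i0, delta_i):
    """Division mode 1: the delta_i intervals just before i0 in the current period."""
    return f(series, i0 - delta_i, i0)

def mode2(series, i0, i1, d, p):
    """Division mode 2: the target window [i0, i1) shifted back by 1..p days."""
    return [f(series, i0 - n * d, i1 - n * d) for n in range(1, p + 1)]

def mode3(series, i0, i1, delta_i, d, p):
    """Division mode 3: mode 1 plus, for each past day, the window extended by delta_i."""
    past = [f(series, i0 - n * d - delta_i, i1 - n * d) for n in range(1, p + 1)]
    return mode1(series, i0, delta_i), past

# Toy data: 8 days of 5-min intervals; each value is its own interval index.
D, P = 288, 7
series = list(range(8 * D))
i0 = 7 * D + 100   # interval 100 of day 8
i1 = i0 + 12       # predict the next 12 steps (1 h)

recent = mode1(series, i0, delta_i=6)
periodic = mode2(series, i0, i1, D, P)
recent3, periodic3 = mode3(series, i0, i1, 6, D, P)
print(recent, periodic[0], periodic3[-1])
```

Using the interval index as the feature value makes it easy to check which intervals each mode selects before applying the same slicing to real volume, density, or speed series.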
Feature Selection
In this section, we introduce the features selected to pre-
dict the traffic volume. We mainly consider three types of
features in this study:
Detection features: These features are directly collect-
ed by detectors with simple preprocessing and include
the detection volume, the detection speed, and the detection
density.
Equilibrium features: These features are estimated from
the detection features based on the FD-based traffic state
estimation model; they include the equilibrium volume
and the equilibrium density.
State transition features: These features are the traffic
state transition indexes between adjacent intervals cal-
culated by the traffic state transition model.
Among these three types of features, the detection fea-
tures are likely to be disturbed by randomness, while the
equilibrium features can reflect the (aggregated) traffic
characteristics of the road, and the state transition features
characterize the stability of traffic flow between time inter-
vals. These features describe the physical relationship of
traffic states and can provide more predictive information.
Case Study
In this section, we analyze the influence of different feature
organization aspects on the prediction accuracy, based on
the LSTM frameworks introduced in the “LSTM Prediction
Framework” section and using field data. We first specify
the prediction models by determining the framework parameters.
The prediction performance of the FC-LSTM
model with feature organization is analyzed, and a further
contrastive analysis is carried out with the more sophisti-
cated seq2seq-LSTM model. The best feature organization
method is derived for the final traffic flow prediction under
different prediction models.
Data Acquisition
To validate the proposed traffic prediction approach, we
use real traffic data of the state of California collected by
the Caltrans PeMS, which provides the historical traffic
data of more than 39,000 detectors for more than 10 years
on the expressway. In this article, we selected the detector
data of the I80-E expressway from 5 January 2020 to 5 July
2020.
FIG 4 An illustration of the estimation results. State estimation in the (a) free flow condition and (b) congested condition. Both panels plot volume (vehicles/h) against density (vehicles/km); the legend lists the fundamental diagram, the detection state point, the possible equilibrium state point, and the estimated equilibrium point.
Since the original data from the PeMS are updated every 30 s, we preprocessed the data in the following aspects before organizing the features:
Short-term missing data (< 3 h) were filled with the historical mean combined with the values before and after the missing data.
Long-term missing data (≥ 3 h) were filled with the historical mean.
The traffic count and occupancy were aggregated by 5 min, and the traffic count was converted into hourly traffic volume.
The space mean speed was aggregated into 5-min intervals.
The density was estimated using the traffic flow and the space mean speed based on the fundamental relation among these three parameters.
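The aggregation steps above can be sketched in a few lines. This is a minimal illustration and not the authors' code: the function name `aggregate_5min` is hypothetical, the 30-s speeds are averaged arithmetically as a stand-in for the space mean speed, and the speed is assumed to be in km/h so that the density comes out in vehicles/km.

```python
from statistics import mean

SAMPLES_PER_5_MIN = 10  # ten 30-s records make up one 5-min interval


def aggregate_5min(counts_30s, speeds_30s):
    """Aggregate 30-s records into 5-min (volume, speed, density) tuples.

    Hypothetical helper: volume is in vehicles/h, speed in km/h (assumed),
    and density in vehicles/km.
    """
    rows = []
    last_start = len(counts_30s) - SAMPLES_PER_5_MIN
    for start in range(0, last_start + 1, SAMPLES_PER_5_MIN):
        window = slice(start, start + SAMPLES_PER_5_MIN)
        count_5min = sum(counts_30s[window])
        # There are 12 five-minute intervals in an hour.
        volume = count_5min * 12
        # Assumption: a plain average stands in for the space mean speed.
        speed = mean(speeds_30s[window])
        # Fundamental relation q = k * v, rearranged to k = q / v.
        density = volume / speed if speed > 0 else 0.0
        rows.append((volume, speed, density))
    return rows
```

For example, a constant count of 3 vehicles per 30 s at 60 km/h gives 360 vehicles/h and a density of 6 vehicles/km for every 5-min interval.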
To facilitate the comparison, we selected the data of the
last 30 days (from 5 June 2020 to 5 July 2020) for validation,
and the previous five months’ data (from 5 January to 4 June
2020) were used to organize the features in different modes,
which could be used as the input to predict the traffic volume
in the next 1 h (5 min is one step, and the next 12 steps are
predicted). To eliminate the randomness, the prediction was
repeated 20 times for all scenarios, and the average predic-
tion error was taken for analysis.
Parameter Determination of the Prediction Framework
The parameters of the prediction framework are shown in
Table 3, where time_step is the length of a single sample
determined by feature division, and n_features is the fea-
ture number determined by feature selection. We specify
the number of LSTM hidden units to be 12. The prediction
horizon is set to be 1 h, so the output length of each predic-
tion model is 12.
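To give a sense of the model size implied by these settings, the following sketch counts the trainable parameters of the FC-LSTM layers listed in Table 3. It assumes a standard LSTM cell with four gates and bias terms and a fully connected output layer with bias; the function names are hypothetical, and the article does not report parameter counts.

```python
def lstm_params(n_inputs, units):
    """Parameters of a standard LSTM layer (assumed: four gates, with bias)."""
    # Each gate has input weights, recurrent weights, and one bias per unit.
    return 4 * (units * (n_inputs + units) + units)


def dense_params(n_inputs, n_outputs):
    """Parameters of a fully connected layer with bias."""
    return n_inputs * n_outputs + n_outputs


def fc_lstm_params(n_features, units=12, output_len=12):
    """LSTM(12) followed by Dense(12), as in the FC-LSTM rows of Table 3."""
    return lstm_params(n_features, units) + dense_params(units, output_len)
```

Under these assumptions, the count depends on n_features but not on time_step, so the division mode changes the sample length without changing the model size, while adding features enlarges only the input weights of the LSTM layer.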
Prediction Performance of the FC-LSTM Model
With Feature Organization
Performance of Different Division Modes With One Feature
FIG 5 A schematic diagram of feature division. (a) Historical data regionalization, (b) division mode 1, (c) division mode 2, and (d) division mode 3. Panel labels: Day 1 to Day 8; axis: time interval.
The detection volume was selected as the prediction feature. We set the parameters $i_T = 288$, $d = 288$, and $p = 7$, and the feature was organized according to the three division modes, as described in the “Feature Division Modes”
section. The mean absolute error (MAE) and the mean
absolute percentage error (MAPE) were chosen as the per-
formance indexes. Table 4 and Figure 6 show the average
prediction results of 20 simulation runs.
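For reference, the two performance indexes can be computed as follows. This is a minimal sketch with hypothetical function names; skipping intervals whose observed volume is zero in the MAPE is an assumption, since the article does not state how such intervals are handled.

```python
def mae(actual, predicted):
    """Mean absolute error, in the unit of the data (here vehicles/h)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    # Assumption: zero-volume intervals are skipped to avoid division by zero.
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def average_over_runs(values):
    """Average one error index over repeated runs (20 in the case study)."""
    return sum(values) / len(values)
```

With observed volumes of 1,000 and 2,000 vehicles/h and predictions of 900 and 2,200 vehicles/h, the MAE is 150 vehicles/h and the MAPE is 10%.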
Table 4 and Figure 6 show that the division mode significantly affects the prediction accuracy of the FC-LSTM model. Among the three division modes, division mode 3 gives the highest accuracy for all prediction horizons, followed by division mode 1. Division mode 2 provides the worst prediction performance. As the prediction horizon increases, the accuracy of division modes 1 and 3 gradually declines, whereas division mode 2 provides a relatively stable prediction performance.
Different Feature Selection Schemes With One Division Mode
We combine the three types of features (detection features,
equilibrium features, and state transition features) pro-
posed in the “Feature Selection” section and obtain eight
feature selection schemes considering single and multiple
detection features, equilibrium features, and state transi-
tion features, as shown in Table 5. Division mode 1 was
chosen to derive the input of different types of features for the prediction model with $i_T = 288$. The predicted results are shown in Figure 7.
Figure 7 illustrates the comparison of the prediction
accuracy for the eight schemes. As can be seen from the
figure, on one hand, the prediction accuracy deteriorates with the increase of the prediction horizon for all
schemes; on the other hand, the prediction performance
is inconsistent among the eight schemes at different
prediction horizons (from 5 min to 1 h). For shorter pre-
diction horizons (<20 min), the performance difference
among the eight schemes is marginal. However, scheme
1, which only takes the detection flow as the input fea-
ture, performs better for the longer prediction horizons
(>30 min). This indicates that the FC-LSTM model does
not fully utilize the feature information. To better verify
the effectiveness of feature selection, we further ana-
lyze the prediction performance of the seq2seq-LSTM
model with a more complex structure under different
feature organizations.
Table 3. Prediction features determination.

Features of the FC-LSTM Model
| Layer | Output Shape |
| InputLayer | [(None, time_step = Di, n_features)] |
| LSTM | (None, 12) |
| Dense | (None, output length = 12) |

Features of the seq2seq-LSTM Model
| Layer | Output Shape |
| InputLayer | [(None, time_step = Di, n_features)] |
| LSTM (Encoder) | (None, 12) |
| Dense | (None, 12) |
| RepeatVector | (None, 12, 12) |
| LSTM (Decoder) | (None, 12) |
| Dense | (None, output length = 12) |
Table 4. One-hour (12-step) prediction performance of three division modes for FC-LSTM.

MAPE
| Prediction horizon | Mode 1 | Mode 2 | Mode 3 |
| 5 min | 10.34% | 14.70% | 9.88% |
| 10 min | 10.75% | 14.18% | 10.11% |
| 15 min | 11.17% | 13.82% | 10.37% |
| 20 min | 11.52% | 13.62% | 10.58% |
| 25 min | 11.94% | 13.40% | 10.77% |
| 30 min | 12.18% | 13.47% | 10.92% |
| 35 min | 12.43% | 13.42% | 11.01% |
| 40 min | 12.95% | 13.56% | 11.18% |
| 45 min | 13.20% | 13.71% | 11.37% |
| 50 min | 13.66% | 13.91% | 11.46% |
| 55 min | 13.94% | 14.09% | 11.75% |
| 60 min | 14.31% | 14.53% | 12.07% |

MAE (vehicles/h)
| Prediction horizon | Mode 1 | Mode 2 | Mode 3 |
| 5 min | 180.305 | 276.958 | 173.33 |
| 10 min | 188.93 | 268.388 | 180.066 |
| 15 min | 198.179 | 263.647 | 187.004 |
| 20 min | 206.237 | 259.282 | 192.157 |
| 25 min | 213.404 | 256.887 | 196.055 |
| 30 min | 219.239 | 257.378 | 199.206 |
| 35 min | 225.175 | 257.323 | 202.078 |
| 40 min | 232.073 | 259.246 | 204.941 |
| 45 min | 237.735 | 261.507 | 208.512 |
| 50 min | 243.232 | 264.232 | 211.502 |
| 55 min | 249.495 | 269.192 | 216.077 |
| 60 min | 255.932 | 275.693 | 220.796 |
Prediction Performance of the seq2seq-LSTM Model
With Feature Organization
We set the same parameters $i_T = 288$, $d = 288$, and $p = 7$ as in the “Prediction Performance of the FC-LSTM Model With Feature Organization” section. The prediction performance of the seq2seq-LSTM for the three division modes evaluated by MAPE is shown in Figure 8. It can be clearly
seen that division mode 3 performs best. The influence of
the different division modes on the prediction performance
of FC-LSTM and seq2seq-LSTM is similar.
Figure 9 compares the prediction accuracy of the seq2seq-LSTM model for the eight schemes. As can be seen from the figure, the prediction results in terms of MAPE at different prediction horizons are rather consistent among all schemes. The prediction accuracy of three detection features $(q_d, v_d, k_d)$ is higher than that of a single detection feature $(q_d)$ in terms of MAPE. This indicates that incorporating more traffic parameters (e.g., speed and density) into the seq2seq-LSTM model can significantly improve the prediction accuracy. Moreover, three detection features combined with the equilibrium features or the state transition features can further improve the prediction accuracy. For instance, the prediction accuracy of schemes 6–8 is higher than that of scheme 5. Among all schemes, scheme 6 with both detection features and equilibrium features performs best at different prediction horizons (from 5 min to 1 h).
Optimal Feature Organization for the seq2seq-LSTM Model
The prediction results from the “Prediction Performance of the FC-LSTM Model With Feature Organization” section show that the FC-LSTM model, because of its simple structure, fails to extract consistent information from features of the given data set. Thus, we mainly analyze the optimal feature organization for the seq2seq-LSTM model.
Based on the analysis from the “Prediction Performance of the seq2seq-LSTM Model With Feature Organization” section, we can observe that division mode 3 and scheme 6 are the optimal division mode and the optimal feature selection scheme for the seq2seq-LSTM model for the given data set. Since scheme 1 contains the simplest input feature (only detection flow), which is the most common input feature in traffic flow prediction, and scheme 5 contains only detection features (detection flow, density, and speed), where additional features based on the traffic flow
models are not considered, we chose these two as the basic schemes for comparison purposes. Meanwhile, division mode 1 is considered the most commonly used feature division mode in the literature. Therefore, we set the following three combinations:
Basic combination 1: scheme 1 and division mode 1
Basic combination 2: scheme 5 and division mode 1
Optimal combination: scheme 6 and division mode 3.

FIG 6 A comparison of the 1-h (12-step) prediction performance of the three division modes with a single feature for FC-LSTM. (a) MAE and (b) MAPE. Both panels plot the error (MAE in vehicles/h, MAPE in %) against the prediction horizon (min) for division modes 1, 2, and 3.

Table 5. Eight feature selection schemes for traffic flow prediction.
| Scheme | Detection Features | Equilibrium Features | State Transition Features |
| Scheme 1 | $q_d$ | | |
| Scheme 2 | $q_d$ | $q_e$, $k_e$ | |
| Scheme 3 | $q_d$ | | St1 |
| Scheme 4 | $q_d$ | $q_e$, $k_e$ | St1 |
| Scheme 5 | $q_d$, $v_d$, $k_d$ | | |
| Scheme 6 | $q_d$, $v_d$, $k_d$ | $q_e$, $k_e$ | |
| Scheme 7 | $q_d$, $v_d$, $k_d$ | | St1 |
| Scheme 8 | $q_d$, $v_d$, $k_d$ | $q_e$, $k_e$ | St1 |
The comparison results among these three combinations are shown in Figure 10 and Table 6. Figure 10(a)–(c) shows the comparison of basic combination 1 and the optimal combination. It can be clearly seen that the prediction error of the optimal combination is significantly lower than that of basic combination 1 at different prediction horizons. Similar results can be observed from Figure 10(d)–(f) with the comparison between basic combination 2 and the optimal combination. Compared with basic combination 1 and basic combination 2, the improvement of the prediction accuracy for the optimal combination is, respectively, 4.72% and 4.36% at the 5-min horizon and 10.75% and 7.39% at the 1-h horizon, as illustrated in Table 6.
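The accuracy improvement figures in Table 6 are consistent with a relative reduction of the MAE. The short sketch below reproduces them from the tabulated MAE values; the function name is hypothetical, and the formula is inferred from the numbers in Table 6 rather than stated in the article.

```python
def accuracy_improvement(mae_basic, mae_optimal):
    """Relative MAE reduction of the optimal combination, in percent."""
    return 100.0 * (mae_basic - mae_optimal) / mae_basic


# MAE values (vehicles/h) taken from Table 6 at the 5-min and 60-min horizons.
vs_basic_1 = [accuracy_improvement(182.021, 173.434),
              accuracy_improvement(245.669, 219.252)]
vs_basic_2 = [accuracy_improvement(181.342, 173.434),
              accuracy_improvement(236.759, 219.252)]
```

Rounded to two decimals, these values match the 4.72% and 10.75% reported against basic combination 1 and the 4.36% and 7.39% reported against basic combination 2.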
The prediction results for the three combinations using the seq2seq-LSTM model are shown in Figure 11. It can be observed from the figure that the traffic volume of the selected road section does not include obvious morning and evening peak hours but presents a single peak around noon. According to the characteristics of the data set, we chose 9 a.m.–9 p.m. as the high-volume period and 9 p.m.–9 a.m. as the low-volume period to analyze the effect of feature organization under different traffic conditions.
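The split into the two periods can be expressed on the 5-min interval index. The sketch below is illustrative only: it assumes that interval 0 of each day starts at midnight and that 9 a.m. belongs to the high-volume period while 9 p.m. belongs to the low-volume period; the function names are hypothetical.

```python
INTERVALS_PER_DAY = 288  # 24 h of 5-min intervals


def is_high_volume(interval_index):
    """Return True if the interval falls within 9 a.m. to 9 p.m."""
    # Assumption: interval 0 of each day starts at midnight.
    minute_of_day = (interval_index % INTERVALS_PER_DAY) * 5
    return 9 * 60 <= minute_of_day < 21 * 60


def split_by_period(errors):
    """Split a list of per-interval errors into (high, low) volume lists."""
    high = [e for i, e in enumerate(errors) if is_high_volume(i)]
    low = [e for i, e in enumerate(errors) if not is_high_volume(i)]
    return high, low
```

With this convention, each day contributes 144 intervals to each period, so the two error averages are computed over equally sized samples.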
Table 7 illustrates the prediction performance with a prediction horizon of 5 min (one step). It can be clearly seen that the prediction error in terms of the MAPE is lower in the high-volume period than in the low-volume period. Among these three combinations, the optimal combination provides the best performance with the lowest MAPE of 6.66% and 12.30% for the high-volume period and the low-volume period, respectively. According to Table 7, basic combination 2 performs slightly better than basic combination 1 in the high-volume period (6.95% versus 7.31%) but slightly worse in the low-volume period (13.57% versus 13.18%).
Figure 12 further compares the prediction accuracy improvement for different volume levels under various prediction horizons. The seq2seq-LSTM model with the optimal combination performs best in terms of the MAE and MAPE under different prediction horizons in both the high-volume and low-volume conditions. The prediction accuracy decreases with the increase of the prediction horizon (from 5 min to 1 h) for the high-volume period. A similar phenomenon can be observed for the low-volume condition except for some performance fluctuations. The overall prediction accuracy in terms of MAPE for the high-volume period is higher than that for the low-volume period.

FIG 7 The prediction accuracy of the eight schemes with division mode 1 for a 1-h time horizon (12 steps). Prediction time horizon between (a) 5 and 20 min, (b) 25 and 40 min, and (c) 45 and 60 min. Each panel plots MAPE (%) against the prediction horizon (min) for schemes 1–8.

FIG 8 A comparison of the 1-h (12-step) prediction performance of the three division modes with a single feature for the seq2seq-LSTM. The plot shows MAPE (%) against the prediction horizon (min) for division modes 1, 2, and 3.

FIG 9 The prediction accuracy of the eight feature selection schemes with division mode 1 for the seq2seq-LSTM. Prediction time horizon between (a) 5 and 20 min, (b) 25 and 40 min, and (c) 45 and 60 min. Each panel plots MAPE (%) against the prediction horizon (min) for schemes 1–8.

FIG 10 The prediction accuracy of the seq2seq-LSTM model with different feature organizations. (a)–(c) Basic combination 1 versus optimal combination: Prediction time horizon between (a) 5 and 20 min, (b) 25 and 40 min, and (c) 45 and 60 min. (d)–(f) Basic combination 2 versus optimal combination: Prediction time horizon between (d) 5 and 20 min, (e) 25 and 40 min, and (f) 45 and 60 min. Each panel plots MAE (vehicles/h) against the prediction horizon (min).
Conclusion
In this article, we discuss the influence of feature organization in terms of feature division and feature selection on the accuracy of different LSTM prediction frameworks (FC-LSTM and seq2seq-LSTM) for traffic flow prediction. We propose three feature division modes that consider the periodicity of traffic flow by intervals (e.g., 5 min) and periods (e.g., daily). Three types of features, the detection features, equilibrium state features, and state transition features, are selected as inputs to the prediction model. The detection features, such as traffic volumes and speeds, are obtained directly from the detectors. The equilibrium state features (e.g., the equilibrium flow, speed, and density) are estimated by the FD-based traffic state estimation model. The state transition index is derived to describe the state changes at adjacent time intervals.
We apply real traffic data collected by the California state PeMS to validate the performance of the proposed approach. The results show that a simple structure (e.g., FC-LSTM) of the prediction model with feature organization cannot improve the prediction performance. A possible reason is that a model with a too-simple structure has a limited ability to extract information from external features, which could be regarded as disturbance. An increase of the model structure complexity (e.g., seq2seq-LSTM) can help improve the ability of feature extraction. Therefore, the prediction performance is improved significantly for the seq2seq-LSTM model with feature organization compared with that without feature organization. By enumerating and analyzing the proposed combinations of feature organization in terms of feature division modes and feature selection for the seq2seq-LSTM model using the PeMS data set, an optimal combination, composed of division mode 3 (considering both short-term and periodic variation) and scheme 6 (considering both detection state features and equilibrium state features), is derived and gives the best prediction accuracy. The results also show that the optimal combination can improve the prediction accuracy in both high-volume and low-volume conditions.

Table 6. A comparison of the prediction accuracy for the seq2seq-LSTM model.

Basic Combination 1 Versus Optimal Combination
| Prediction horizon | MAE (vehicles/h), Scheme 1 and Mode 1 | MAE (vehicles/h), Scheme 6 and Mode 3 | Accuracy Improvement |
| 5 min | 182.021 | 173.434 | 4.72% |
| 10 min | 189.64 | 180.525 | 4.81% |
| 15 min | 197.441 | 186.718 | 5.43% |
| 20 min | 203.864 | 190.988 | 6.32% |
| 25 min | 210.133 | 194.375 | 7.5% |
| 30 min | 215.74 | 198.038 | 8.21% |
| 35 min | 220.592 | 200.197 | 9.25% |
| 40 min | 225.5 | 203.125 | 9.92% |
| 45 min | 230.826 | 206.319 | 10.62% |
| 50 min | 235.703 | 210.171 | 10.83% |
| 55 min | 239.967 | 214.533 | 10.6% |
| 60 min | 245.669 | 219.252 | 10.75% |

Basic Combination 2 Versus Optimal Combination
| Prediction horizon | MAE (vehicles/h), Scheme 5 and Mode 1 | MAE (vehicles/h), Scheme 6 and Mode 3 | Accuracy Improvement |
| 5 min | 181.342 | 173.434 | 4.36% |
| 10 min | 188.313 | 180.525 | 4.14% |
| 15 min | 195.181 | 186.718 | 4.34% |
| 20 min | 200.633 | 190.988 | 4.81% |
| 25 min | 206.53 | 194.375 | 5.89% |
| 30 min | 210.8 | 198.038 | 6.05% |
| 35 min | 214.979 | 200.197 | 6.88% |
| 40 min | 218.72 | 203.125 | 7.13% |
| 45 min | 223.828 | 206.319 | 7.82% |
| 50 min | 227.762 | 210.171 | 7.72% |
| 55 min | 232.365 | 214.533 | 7.67% |
| 60 min | 236.759 | 219.252 | 7.39% |

FIG 11 The one-step prediction result of one week with different combinations. The plot shows volume (vehicles/h) against the time interval for the actual data, basic combination 1, basic combination 2, and the optimal combination.

Table 7. The 5-min (one-step) prediction accuracy with different combinations under different volume levels.
| MAPE | Basic Combination 1 | Basic Combination 2 | Optimal Combination |
| High-volume period | 7.31% | 6.95% | 6.66% |
| Low-volume period | 13.18% | 13.57% | 12.30% |
This article provides a prelimi-
nary attempt to integrate a physical
model into a deep learning frame-
work for traffic prediction. Future
research can be carried out from the
following aspects:
One can develop a traffic prediction approach that can deal with different traffic scenarios, e.g., regular and irregular variations in traffic states. Since deep learning mainly depends on historical data to find underlying patterns, it is difficult to describe irregular variations in traffic states. Because the traffic flow model is established based on the inherent evolution of the traffic state, it has better potential to predict traffic flow under abnormal conditions. Integrating these two types of models by classification is expected to make better traffic predictions.
Though the construction of state transition features is inspired by the traffic wave model, the state transition features do not have a physical meaning equivalent to the traffic wave parameters. In a follow-up study, a model that can describe the temporal state transition relationship is expected to be developed to better extract the prediction features.
In this article, features based on traffic flow theory are used in traffic flow prediction. The results show that these features have the ability to improve the prediction accuracy. In the future, we would like to analyze the sensitivity of features on the performance of traffic flow prediction models. We would also like to validate the approach through more field data sets.
The tasks of traffic flow prediction can be classified
as pointwise prediction and network-wise prediction,
and the prediction features can be divided into tem-
poral features and spatial features. In this article, we
consider only the impact of temporal feature organi-
zation on the performance of pointwise traffic flow
prediction. We would like to investigate the organi-
zation of temporal features suitable for network-wise
prediction in the future.
Acknowledgments
This work was funded by the National Science Foundation of China under project codes 61673321 and 52072315, the Department of Science and Technology of Sichuan Province under project codes 2019JDTD0002 and 2020JDJQ0034, and the Chengdu Science and Technology Bureau under grant project code 2019-YF05-02657-SN.
FIG 12 The 1-h (12-step) prediction accuracy with different combinations under various volume levels. MAE of the (a) high-volume period and (b) low-volume period. MAPE of the (c) high-volume period and (d) low-volume period. Each panel plots the error (MAE in vehicles/h, MAPE in %) against the prediction horizon (min) for basic combination 1, basic combination 2, and the optimal combination.
About the Authors
Jing Liu (jing.liu@my.swjtu.edu.cn) earned her M.S. degree in transportation engineering from Southwest Jiaotong University, Chengdu, China, in 2019. She is currently pursuing a Ph.D. degree in traffic engineering at Southwest Jiaotong University, Chengdu, 611756, China. Her research interests include traffic prediction and traffic anomaly detection.
Fangfang Zheng (fzheng@swjtu.cn) earned her Ph.D. degree in transport and planning from Delft University of Technology, Delft, The Netherlands, in 2011. She is a professor with the School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, 611756, China. Her research interests include urban traffic flow theory and modeling, intelligent transportation systems, and traffic control. She has published more than 50 articles in peer-reviewed journals and conference proceedings.
Xiaobo Liu (xiaobo.liu@swjtu.cn) earned his Ph.D. degree from the New Jersey Institute of Technology, Newark, New Jersey, in 2004. He is currently a professor with the School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, 611756, China. His research focuses on transportation system analysis in connected and autonomous vehicle environments and on intelligent logistics analysis. He received the George Krambles Transportation Scholarship, 2003; the Most Outstanding Student Paper Award by the Institute of Transportation Engineers Metropolitan Section of New York and New Jersey, 2004; and the Stella Dafermos Best Paper Award by the Transportation Research Board Transportation Network Modeling Committee, 2018.
Ge Guo (geguo@yeah.net) received his Ph.D. degree from Northeastern University, Shenyang, China, in 1998. From 2000 to 2003, he was with Lanzhou University of Technology, China, as the director of the Institute of Intelligent Control and Robots and dean of the Department of Electric Engineering and then a professor from July 2004 to May 2005. He then joined Dalian Maritime University, China, as a professor with the Department of Automation. He is currently a professor with Northeastern University, Qinhuangdao, 066004, China. He has published more than 100 international journal articles within his areas of interest, which include intelligent transportation systems, cyberphysical systems, and networked control. He is an associate editor of Information Sciences, IEEE Intelligent Transportation Systems Magazine, and Acta Automatica Sinica. He was an honoree of the New Century Excellent Talents in University, Ministry of Education, in 2004, a nominee for Gansu Top Ten Excellent Youths by the Gansu Provincial Government, and a Chinese Association of Automation Young Scientist Award winner. He is a Senior Member of IEEE.
References
[1] J. Liu, F. Zheng, H. van Zuylen, J. Li, a nd J. Luo, “An anomaly detec-
tion-based dy namic OD prediction framework for urban networks,
in P roc. Forum Int egrated Sustain . Tran sp. Syst . (FISTS), Delft, South
Holland Province, Netherlands, Nov. 2 020, pp. 135141. doi: 10.1109/
FISTS46898.2020.9264855.
[2] B. M. Williams and L. A. Hoel, “Model ing and forecasting veh icular
tra ffic f low as a seasonal ARI MA process: Theoretical basis and em-
pirical results,J. Tr ansp. En g., vol. 129, no. 6, pp. 664 672, Nov. 2003.
doi: 10.1061/(ASCE)0733-947X(2003)129:6(664).
[3] I. Okuta ni and Y. J. Stephanedes, “Dynamic predict ion of tra ffic vol-
ume t hroug h Kalman fi lteri ng theory,Transp. Res. B, Meth odol., vol .
18, no. 1, pp. 1–11, Feb. 198 4. doi: 10.1016/0191-2615(84)900 02-X.
[4] W. Zheng, D.-H. Lee, and Q. Shi, “Short-term freeway traffic flow
predict ion: Bayesian combined neural network approach,” J. Tr ansp.
Eng., vol. 132, no. 2, pp. 114 121, Feb. 2 006. doi: 10.1061/(ASCE)0733-
947X (20 0 6)13 2: 2(114).
[5] Z. Zhu, B. Peng, C. Xiong, and L . Zhang, “Shor t-ter m traffic f low
pred iction w ith linear c ondit ional Gau ssian Bayesia n network: Traf-
fic flow prediction, Bayesia n network, linea r conditional Gaussian,
J. Adv. T ransp., vol. 50, no. 6, pp. 11111123, Oct . 2016. doi: 10.10 02/
at r.1392 .
[6] M. Ben-Akiva, M. Bierlaire, H. Kout sopoulos, and R. Mishalani,
DynaMIT: A simulat ion-based syst em for tra ffic prediction,” 1998,
pp. 1–12.
[7] M. Castro-Neto, Y.-S. Jeong, M.-K. Jeong, and L . D. Han, “Online-SVR
for shor t-ter m traf fic flow prediction under typical and atypical traff ic
conditions,” Expert Syst. Appl., vol. 36, no. 3, pp. 6164617 3, Ap r. 20 09.
doi: 10.1016/j.eswa.2008.07.069.
[8] J. Wang and Q. Shi, “Short-term traffic speed forecast ing hybr id model
based on chaos–wavelet analysi s-supp ort vect or machi ne theory,”
Tra nsp. Res. C, Emerg. Technol ., vol. 27, pp. 219232, Feb. 2013. doi:
10.1016/j.trc.2012.08.00 4.
[9] Y. Wu and H. Tan, “Short-term traffic flow for ecast ing with spat ial-
temporal cor relation in a hybr id deep lea rni ng framework,” Dec.
2016 . Accessed: Dec. 12, 202 0. [Onl ine]. Avai lable: http://a rxiv.org/
abs/1612.01022
[10] Z. Lv, J. Xu , K. Zheng, H. Yin, P. Zhao, a nd X. Zhou, “LC-R NN: A deep
lear ning model for traffic speed predict ion,” in Proc. 27th Int . Joint
Conf. Ar tif. Int ell., 2 018, pp. 34703476.
[11] L. Zhao et al., “T-G CN: A temporal graph convolut ional network for
traffic pred iction,” IEEE Tr ans. Intell. T rans p. Syst., vol. 21, no. 9, pp.
38483858, Sept. 2020. doi: 10.1109/T ITS.2019. 2935152.
[12] B. Yu, Y. Lee, and K. Sohn, “ Foreca sting road tra ffic speeds by consid-
eri ng area-w ide spatiot emporal dep endencies based on a grap h convo-
lutional neural network (GCN),Tran sp. Res. C, Emerg. Tec hnol., vol.
114, pp. 18 9204, 2020. doi: 10.1016/j.trc.2020.02.013.
[13] Y. Wu, H. Tan, L . Qin, B. Ran, and Z. Jia ng, “A hybrid deep learning
based traf fic f low pred iction method and its understanding,” Tra nsp.
Res. C, Emerg. Technol., vol. 90, pp. 166 180, May 2018. doi: 10 .1016/j.
trc.2018.03.001.
[14] Z. Li et al., “A hybr id deep lea rning approach with G CN and LSTM
for traffic flow pr edict ion,” in P roc. IE EE Intel l. Tra nsp. Syst. Conf.
(IT SC), Auckl and, New Zea land, Oct. 2019, pp. 1929 1933. doi: 10.1109/
ITSC.2019.8916778.
[15] Z. Zhao, W. Chen, X. Wu, P. C. Y. Chen, a nd J. Liu, “LST M network:
a deep lea rni ng approach for short -term t raf fic forecast,” IET Intell.
Tra nsp. Syst., vol. 11, no. 2, pp. 6875, Ma r. 2017. doi: 10 .1049/iet-
its.2016.0208.
Authorized licensed use limited to: SOUTHWEST JIAOTONG UNIVERSITY. Downloaded on December 12,2021 at 03:04:37 UTC from IEEE Xplore. Restrictions apply.
This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
IEEE INTELLIGENT TRANSPORTATION SYSTEMS MAGAZINE 17 MONTH 2021IEEE INTELLIGENT TRANSPORTATION SYSTEMS MAGAZINE 16 MONTH 2021
[16] X. Luo, D. Li, Y. Yang , and S. Zhang, “Spatiotemp oral traff ic flow pre-
dict ion wit h KNN and LST M,J. Adv. Tran sp., vol. 2019, pp. 1–10, Feb.
2019. doi: 10.1155/2019/4145353.
[17 ] Q. Zhaowei, L. Haitao, L. Zhihui, and Z. Tao, “Short-term traff ic flow
forecasting method w ith M- B-LSTM hybrid network,” IEEE Trans. In-
tell. Tran sp. Syst., pp. 1–11, 2020. doi: 10.1109/TITS.2020.3009725.
[18] F. Shaikh, “Deep learn ing vs. machine learning—The essential differ-
ences you need to know,” Analy tics Vidhya, A pr. 8, 2017.
[19] S. Hochreiter a nd J. Schm idhuber, “Long short-term memory,”
Neural Comput ., vol. 9, no. 8, pp. 17 35 178 0, 19 97. doi: 10.1162/
neco.1997.9. 8.1735.
[20] P. Vel ičković, G. Cucurull, A. Casanova, A. Romero, P. L, and Y.
Bengio, “Graph attention networks,” Feb. 2018 . arXiv:1710.10903 [cs,
stat]. Accessed: Dec. 1 2, 2020. [Online]. Avai lable: http://arxiv.org/
abs/1710.10903
[21] M. Fouladgar, M. Parchami, R. Elmasri, and A. Ghaderi, “Scalable deep traffic flow neural networks for urban traffic congestion prediction,” in Proc. Int. Joint Conf. Neural Netw. (IJCNN), May 2017, pp. 2251–2258. doi: 10.1109/IJCNN.2017.7966128.
[22] B. L. Smith, B. M. Williams, and R. Keith Oswald, “Comparison of parametric and nonparametric models for traffic flow forecasting,” Transp. Res. C, Emerg. Technol., vol. 10, no. 4, pp. 303–321, Aug. 2002. doi: 10.1016/S0968-090X(02)00009-8.
[23] H. M. Zhang, “Recursive prediction of traffic conditions with neural network models,” J. Transp. Eng., vol. 126, no. 6, p. 472, Dec. 2000. doi: 10.1061/(ASCE)0733-947X(2000)126:6(472).
[24] M. J. Lighthill, “On kinematic waves. I. Flood movement in long rivers,” vol. 229, 1955, p. 36.
[25] “On kinematic waves II. A theory of traffic flow on long crowded roads,” Proc. R. Soc. Lond. A, vol. 229, no. 1178, pp. 317–345, May 1955. doi: 10.1098/rspa.1955.0089.
[26] P. G. Michalopoulos, D. E. Beskos, and Y. Yamauchi, “Multilane traffic flow dynamics: Some macroscopic considerations,” Transp. Res. B, Methodol., vol. 18, no. 4–5, pp. 377–395, Aug. 1984. doi: 10.1016/0191-2615(84)90019-5.
[27] F. Li, J. Feng, H. Yan, G. Jin, D. Jin, and Y. Li, “Dynamic graph convolutional recurrent network for traffic prediction: benchmark and solution,” May 2021, arXiv:2104.14917 [cs]. Accessed: Jun. 8, 2021. [Online]. Available: http://arxiv.org/abs/2104.14917
[28] S. Smulders, “Control of freeway traffic flow by variable speed signs,” Transp. Res. B, Methodol., vol. 24, no. 2, pp. 111–132, Apr. 1990. doi: 10.1016/0191-2615(90)90023-R.
[29] S. A. Smulders, “Control of freeway traffic flow,” Transp. Res. B, Methodol., vol. 24, no. 2, pp. 111–132, 1990. doi: 10.1016/0191-2615(90)90023-R.
... FDs serve as essential inputs for continuous traffic flow models (Makridis et al., 2020) and find extensive applications in areas such as traffic control (Wang et al., 2014; Heydecker and Addison, 2011; D. Frejo et al., 2019), capacity analysis (Qin and Wang, 2023), traffic state estimation (Thodi et al., 2022; Zheng et al., 2018), prediction (Liu et al., 2021; Yang et al., 2021), and identification (Kalair and Connaughton, 2021). ...
Preprint
We consider the role of non-localities in speed-density data used to fit fundamental diagrams from vehicle trajectories. We demonstrate that the use of anticipated densities results in a clear classification of speed-density data into stationary and non-stationary points, namely, acceleration and deceleration regimes and their separating boundary. The separating boundary represents a locus of stationary traffic states, i.e., the fundamental diagram. To fit fundamental diagrams, we develop an enhanced cross entropy minimization method that honors equilibrium traffic physics. We illustrate the effectiveness of our proposed approach by comparing it with the traditional approach that uses local speed-density states and least squares estimation. Our experiments show that the separating boundary in our approach is invariant to varying trajectory samples within the same spatio-temporal region, providing further evidence that the separating boundary is indeed a locus of stationary traffic states.
... A chaotic particle swarm optimization (CPSO) algorithm dynamically optimizes the hidden layer of the LSTM so as to achieve robust performance. Feature selection is combined with the LSTM model in [112], while feature extraction in space and time is performed in [113] for forecasting with a bi-directional LSTM. Table 3 summarizes the key aspects of the miscellaneous variants of the LSTM algorithm ([92]-[113]). ...
Article
This paper surveys short-term road traffic forecast algorithms based on the long short-term memory (LSTM) model of deep learning. The algorithms developed in the last three years are studied and analyzed. This provides an in-depth and thorough description of the algorithms, rather than the marginal description found in existing surveys that focus on general deep learning algorithms. The chosen algorithms are classified according to how LSTM is combined with other techniques for processing input data features towards a final traffic forecast. The operational strategies of the algorithms are described along with their merits and limitations. Moreover, a comparative analysis of the classes of algorithms is provided. These strategies are helpful in selecting the right algorithms and classes for diverse traffic conditions and in guiding future investigation for improvement. In addition, the applications of these classes of algorithms to traffic forecasting in various networks over the last decade are graphically depicted, and applications of the LSTM in other fields involving forecasting are provided. Finally, the challenges associated with short-term traffic forecasting using the LSTM are described, and strategies are highlighted for their future investigation.
... Traffic prediction has received significant research attention [24], [25], [26], [27], and predicted traffic data have been integrated into route planning [28], [29], [30], which can provide better insights for route guidance because being aware of potential future traffic conditions can help drivers take the right path to prevent congestion. Song et al. [28] considered the comprehensive impact of real-time traffic data and predicted traffic data and proposed a hierarchical route computation algorithm for retrieving the fastest congestion-avoidance driving routes. ...
Article
Traffic congestion has become a major concern in most cities all over the world. The proper guidance of cars with an effective route planning method has become a fundamental and smart way to alleviate congestion under existing urban road facilities. Current route planning methods mainly focus on a single car, but ignoring the dynamic effect between cars may lead to severe congestion during the actual driving guidance. In this article, we extend the study of route planning to the case of multiple cars and present a novel multicar shortest travel-time routing problem. The objective is to minimize the average travel time by considering the dynamic effect of the induced traffic congestion on travel speed, while ensuring that each car’s travel distance is within an acceptable range. We construct a time-hierarchical graph model for structuring the spatiotemporal dynamic properties of the urban road network and then develop a two-level multicar route planning optimization method for complex problem solving. The experimental results show that our path recommendations reduce the average travel time by 51.74% and 38.87% on average compared to two representative methods. Our research will become more important in the years ahead as self-driving cars become more commonplace.
... To mine the temporal correlations, researchers introduced recurrent neural networks such as LSTM into traffic prediction and showed promising performance [22]. Liu et al. introduced a prediction framework that utilizes LSTM with feature partitioning and feature selection, which successfully enhanced prediction performance by incorporating feature engineering techniques [23]. The traffic state in a region is probably impacted by its surrounding regions as well as distant regions (for instance, there exists a subway between two regions or the functionality of two regions is relevant). ...
Article
Traffic prediction plays a significant part in creating intelligent cities such as traffic management, urban computing, and public safety. Nevertheless, the complex spatio-temporal linkages and dynamically shifting patterns make it somewhat challenging. Existing mainstream traffic prediction approaches heavily rely on graph convolutional networks and sequence prediction methods to extract complicated spatio-temporal patterns statically. However, they neglect to account for dynamic underlying correlations and thus fail to produce satisfactory prediction results. Therefore, we propose a novel Self-Adaptive Spatio-Temporal Graph Convolutional Network (SASTGCN) for traffic prediction. A self-adaptive calibrator, a spatio-temporal feature extractor, and a predictor comprise the bulk of the framework. To extract the distribution bias of the input in the self-adaptive calibrator, we employ a self-supervisor made of an encoder–decoder structure. The concatenation of the bias and the original characteristics are provided as input to the spatio-temporal feature extractor, which leverages a transformer and graph convolution structures to learn the spatio-temporal pattern, and then applies a predictor to produce the final prediction. Extensive trials on two public traffic prediction datasets (METR-LA and PEMS-BAY) demonstrate that SASTGCN surpasses the most recent techniques in several metrics.
Article
Cooperative computing is promising to enhance the performance and safety of autonomous vehicles benefiting from the increase in the amount, diversity as well as scope of data resources. However, effective and privacy-preserving utilization of multi-modal and multi-source data remains an open challenge during the construction of cooperative mechanisms. Recently, Transformers have demonstrated their potential in the unified representation of multi-modal features, which provides a new perspective for effective representation and fusion of diverse inputs of intelligent vehicles. Federated learning proposes a distributed learning scheme and is hopeful to achieve privacy-secure sharing of data resources among different vehicles. Towards privacy-preserving computing and cooperation in autonomous driving, this paper reviews recent progress of Transformers, federated learning as well as cooperative perception, and proposes a hierarchical structure of Transformers for intelligent vehicles which is comprised of Vehicular Transformers, Federated Vehicular Transformers and the Federation of Vehicular Transformers to exploit their potential in privacy-preserving collaboration.
Article
The traffic state in an urban transportation network is determined via spatio-temporal traffic propagation. In early traffic forecasting studies, time-series models were adopted to accommodate autocorrelations between traffic states. The incorporation of spatial correlations into the forecasting of traffic states, however, involved a computational burden. Deep learning technologies were recently introduced to traffic forecasting in order to accommodate the spatio-temporal dependencies among traffic states. In the present study, we devised a novel graph-based neural network that expanded the existing graph convolutional neural network (GCN). The proposed model allowed us to differentiate the intensity of connecting to neighbor roads, unlike existing GCNs that give equal weight to each neighbor road. A plausible model architecture that mimicked real traffic propagation was established based on the graph convolution. The domain knowledge was efficiently incorporated into a neural network architecture. The present study also employed a generative adversarial framework to ensure that a forecasted traffic state could be as realistic as possible considering the joint probabilistic density of real traffic states. The forecasting performance of the proposed model surpassed that of the original GCN model, and the estimated adjacency matrices revealed the hidden nature of real traffic propagation.
Article
Accurate and real-time traffic forecasting plays an important role in the intelligent traffic system and is of great significance for urban traffic planning, traffic management, and traffic control. However, traffic forecasting has always been considered an "open" scientific issue, owing to the constraints of urban road network topological structure and the law of dynamic change with time. To capture the spatial and temporal dependences simultaneously, we propose a novel neural network-based traffic forecasting method, the temporal graph convolutional network (T-GCN) model, which combines the graph convolutional network (GCN) and the gated recurrent unit (GRU). Specifically, the GCN is used to learn complex topological structures for capturing spatial dependence, and the GRU is used to learn dynamic changes of traffic data for capturing temporal dependence. The T-GCN model is then applied to traffic forecasting based on the urban road network. Experiments demonstrate that our T-GCN model can obtain the spatio-temporal correlation from traffic data and that its predictions outperform state-of-the-art baselines on real-world traffic datasets. Our TensorFlow implementation of the T-GCN is available at https://github.com/lehaifeng/T-GCN.
Article
Traffic flow prediction is becoming increasingly crucial in Intelligent Transportation Systems. Accurate prediction results are a precondition for traffic guidance, management, and control. To improve the prediction accuracy, a spatiotemporal traffic flow prediction method is proposed that combines k-nearest neighbor (KNN) and the long short-term memory network (LSTM), called the KNN-LSTM model in this paper. KNN is used to select the neighboring stations most related to the test station and to capture spatial features of traffic flow. LSTM is utilized to mine the temporal variability of traffic flow, and a two-layer LSTM network is applied to predict traffic flow at each selected station. The final prediction results are obtained by result-level fusion with a rank-exponent weighting method. The prediction performance is evaluated with real-time traffic flow data provided by the Transportation Research Data Lab (TDRL) at the University of Minnesota Duluth (UMD) Data Center. Experimental results indicate that the proposed model achieves better performance than well-known prediction models including autoregressive integrated moving average (ARIMA), support vector regression (SVR), wavelet neural network (WNN), deep belief networks combined with support vector regression (DBN-SVR), and LSTM models, with an average accuracy improvement of 12.59%.
Article
Traffic prediction is the cornerstone of intelligent transportation systems. Accurate traffic forecasting is essential for the applications of smart cities, i.e., intelligent traffic management and urban planning. Although various methods have been proposed for spatio-temporal modeling, they ignore the dynamic characteristics of correlations among locations on the road network. Meanwhile, most Recurrent Neural Network (RNN) based works are not efficient enough due to their recurrent operations. Additionally, there is a severe lack of fair comparison among different methods on the same datasets. To address the above challenges, in this paper, we propose a novel traffic prediction framework, named Dynamic Graph Convolutional Recurrent Network (DGCRN). In DGCRN, hyper-networks are designed to leverage and extract dynamic characteristics from node attributes, while the parameters of dynamic filters are generated at each time step. We filter the node embeddings and then use them to generate a dynamic graph, which is integrated with the pre-defined static graph. As far as we know, we are the first to employ a generation method to model the fine topology of the dynamic graph at each time step. Further, to enhance efficiency and performance, we employ a training strategy for DGCRN that restricts the iteration number of the decoder during forward and backward propagation. Finally, a reproducible standardized benchmark and a brand new representative traffic dataset are opened for fair comparison and further research. Extensive experiments on three datasets demonstrate that our model outperforms 15 baselines consistently. Source codes are available at https://github.com/tsinghua-fib-lab/Traffic-Benchmark.
Article
Deep learning has recently achieved good performance in short-term traffic forecasting. However, stochasticity and distribution imbalance are main characteristics of traffic flow; they introduce uncertainty and induce network overfitting during deep learning. To deal with these problems, a new end-to-end hybrid deep learning network model, named M-B-LSTM, is proposed for short-term traffic flow forecasting in this paper. In the M-B-LSTM model, an online self-learning network is constructed as a data mapping layer to learn and equalize the traffic flow statistic distribution, reducing the effect of distribution imbalance and overfitting during network learning. In addition, the deep bidirectional long short-term memory network (DBLSTM) is introduced in the stochasticity reducing layer to reduce uncertainty through a forward and reverse context approximation process, and the long short-term memory network (LSTM) is then used to forecast the next traffic flow state in the forecasting layer. Furthermore, comparative experiments have been conducted, and the results show that the proposed model is better able to address uncertainty and overfitting than the state-of-the-art methods.
Conference Paper
Traffic speed prediction is known as an important but challenging problem. In this paper, we propose a novel model, called LC-RNN, to achieve more accurate traffic speed prediction than existing solutions. It takes advantage of both RNN and CNN models by a rational integration of them, so as to learn more meaningful time-series patterns that can adapt to the traffic dynamics of surrounding areas. Furthermore, since traffic evolution is restricted by the underlying road network, a network embedded convolution structure is proposed to capture topology aware features. The fusion with other information, including periodicity and context factors, is also considered to further improve accuracy. Extensive experiments on two real datasets demonstrate that our proposed LC-RNN outperforms six well-known existing methods.
Article
We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. By stacking layers in which nodes are able to attend over their neighborhoods' features, we enable (implicitly) specifying different weights to different nodes in a neighborhood, without requiring any kind of costly matrix operation (such as inversion) or depending on knowing the graph structure upfront. In this way, we address several key challenges of spectral-based graph neural networks simultaneously, and make our model readily applicable to inductive as well as transductive problems. Our GAT models have achieved state-of-the-art results across three established transductive and inductive graph benchmarks: the Cora and Citeseer citation network datasets, as well as a protein-protein interaction dataset (wherein test graphs are entirely unseen during training).