60 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 10, NO. 1, MARCH 2009
An Aggregation Approach to Short-Term
Traffic Flow Prediction
Man-Chun Tan, S. C. Wong, Jian-Min Xu, Zhan-Rong Guan, and Peng Zhang
Abstract—In this paper, an aggregation approach is proposed
for traffic flow prediction that is based on the moving average
(MA), exponential smoothing (ES), autoregressive MA (ARIMA),
and neural network (NN) models. The aggregation approach as-
sembles information from relevant time series. The source time
series is the traffic flow volume that is collected 24 h/day over
several years. The three relevant time series are a weekly similarity
time series, a daily similarity time series, and an hourly time series,
which can be directly generated from the source time series. The
MA, ES, and ARIMA models are selected to give predictions of
the three relevant time series. The predictions that result from the
different models are used as the basis of the NN in the aggregation
stage. The output of the trained NN serves as the final prediction.
To assess the performance of the different models, the naïve,
ARIMA, nonparametric regression, NN, and data aggregation
(DA) models are applied to the prediction of a real vehicle traffic
flow, from which data have been collected at a data-collection point
that is located on National Highway 107, Guangzhou, Guangdong,
China. The outcome suggests that the DA model obtains a more
accurate forecast than any individual model alone. The aggrega-
tion strategy can offer substantial benefits in terms of improving
operational forecasting.
Index Terms—Autoregressive moving average (ARIMA) model,
data aggregation (DA), exponential smoothing (ES), moving aver-
age (MA), neural network (NN), time series, traffic flow prediction.
I. INTRODUCTION
TRAFFIC flow forecasting is an essential part of transporta-
tion planning, traffic control, and intelligent transporta-
tion systems [1]–[19]. In particular, short-term traffic volume
forecasts support proactive dynamic traffic control. As a result,
forecasting technologies have attracted the attention of traffic
engineers and researchers.

Manuscript received January 9, 2007; revised June 3, 2007 and February 14,
2008. Current version published February 27, 2009. This work was supported in
part by the Research Grants Council of the Hong Kong Special Administrative
Region, China, under Project HKU 7176/07E, by the University of Hong Kong
under Grant 10207394, by the National Natural Science Foundation of China
under Grant 50578064 and Grant 70629001, by the Natural Science Foundation
of Guangdong Province, China, under Grant 06025219, and by the National
Basic Research Program of China under Grant 2006CB705500. The Associate
Editor for this paper was H. Mahmassani.
M.-C. Tan and Z.-R. Guan are with the Department of Mathematics, College
of Information Science and Technology, Jinan University, Guangzhou 510632,
China (e-mail: tanmc@jnu.edu.cn; jameswingkwan@163.com).
S. C. Wong is with the Department of Civil Engineering, University of
Hong Kong, Hong Kong (e-mail: hhecwsc@hkucc.hku.hk).
J.-M. Xu is with the College of Traffic and Communication, South China
University of Technology, Guangzhou 510641, China (e-mail: aujmxu@scut.edu.cn).
P. Zhang is with the Shanghai Institute of Applied Mathematics and
Mechanics, Shanghai University, Shanghai 200072, China (e-mail: pzhang@mail.shu.edu.cn).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TITS.2008.2011693

A wide variety of techniques has been
applied in the context of short-term traffic flow forecasting,
depending upon the type of data that are available and the
potential end use of the forecast. These techniques include time
series analysis [6], [7], Bayesian networks [8], neural networks
(NNs) [9]–[12], fuzzy NNs [13], [14], nonparametric regression
(NP) [15], [16], and intelligence computation [17]–[19].
It is almost universally agreed in the forecasting literature
that no single method is the best in every situation. Since the
early work of Edgerton and Kolbe [20] and Bates and Granger
[21], the literature on this topic has significantly expanded.
Numerous researchers have demonstrated that combining the
predictions of several models frequently results in prediction
accuracy higher than that of the individual models [22], [23].
Using a hybrid model has become a common practice to im-
prove forecasting accuracy [24], [25], and a combination of sev-
eral models is employed in traffic flow forecasting [26]–[29].
The moving average (MA) and exponential smoothing (ES)
models are popular in time series forecasting, and their strength
lies in their good short-term accuracy combined with quick
low-cost updating. However, the MA and ES models do not
handle trend or seasonality well [30]. The autoregressive MA
(ARIMA) model and artificial NNs are often compared, with
mixed conclusions regarding their forecasting performance.
The ARIMA model and the Box–Jenkins methodology are
quite flexible in several typical time series such as the pure
autoregressive (AR), pure MA, and combined AR and MA
(ARIMA) models. The major limitation of the ARIMA model
is the preassumed linear form of the model. Artificial NNs
were introduced as efficient tools for modeling and forecasting
approximately two decades ago. The major advantage of NNs is
their flexible nonlinear modeling capability. Zhang [24] pointed
out that neither the ARIMA model nor NNs are adequate to
model and forecast time series because the ARIMA model can-
not deal with nonlinear relationships, and the NN model alone
is not able to handle linear and nonlinear patterns equally well.
It is desirable to exploit the strengths of each individual
approach, which should, in turn, produce a better overall result.
In this paper, a data aggregation (DA) approach for traffic flow
forecasting is presented that is based on an NN. The objective
of DA is to maximize useful information content by combining
data and knowledge from different models.
II. DATA STRUCTURE AND AGGREGATION STRATEGY
In this section, we describe the DA approach for vehicle
traffic flow forecasting. The traffic flow data are collected from
certain data collection points and are aggregated in 1-h periods,
24 h/day. Let q(t) be the 1-h traffic flow that is collected
within the time interval (t − 1, t] (or t for short), where t is an
integer. q(t) is the source time series. By analyzing the observed
traffic flow data, it can be found that the traffic flow pattern is
almost cyclical every week and that it is similar every weekday
(Monday to Friday) and similar every weekend (Saturday and
Sunday). Thus, three relevant time series are constructed for the
DA approach. They are the daily similarity time series s1(t),
the weekly similarity time series s2(t), and the hourly time
series s3(t).
1) s1(t) is a set that includes the previous traffic flow records
within the same time interval on the k1 days before q(t)

s1(t) = {q(t − 24k1), q(t − 24(k1 − 1)), . . . , q(t − 24)}.

2) s2(t) is a set that includes the traffic flow records in the
sequential k2 weeks before q(t). The data in time series
s2(t) will be on the same weekday or weekend

s2(t) = {q(t − 7 × 24 × k2), q(t − 7 × 24 × (k2 − 1)), . . . , q(t − 7 × 24)}.

To forecast the traffic flow at time interval t on a certain
Thursday, for example, we also select the data at time
interval t on the previous k2 Thursdays.
3) s3(t) is a set that includes the previous k3 traffic flow data
before q(t)

s3(t) = {q(t − k3), q(t − k3 + 1), . . . , q(t − 1)}.
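For concreteness, the three relevant series can be generated directly from the source series, as noted above. The following is a minimal Python sketch under our own naming; function and variable names are illustrative and not the authors' implementation:

```python
# Minimal sketch of the three relevant time series (illustrative only;
# function and variable names are ours, not the authors').
# q is a list of hourly flows indexed by the integer interval t.

def s1(q, t, k1):
    """Daily similarity: flow at the same hour on the k1 preceding days."""
    return [q[t - 24 * j] for j in range(k1, 0, -1)]

def s2(q, t, k2):
    """Weekly similarity: flow at the same hour on the k2 preceding
    occurrences of the same weekday (7 x 24 hours apart)."""
    return [q[t - 7 * 24 * j] for j in range(k2, 0, -1)]

def s3(q, t, k3):
    """Hourly series: the k3 flows immediately before interval t."""
    return [q[t - j] for j in range(k3, 0, -1)]

# Example: three weeks of synthetic hourly data.
q = [100 + (h % 24) for h in range(24 * 21)]
t = 24 * 21 - 1
print(s1(q, t, 3), s3(q, t, 2))
```

Each list is ordered from the oldest to the most recent observation, matching the set definitions above.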
Different models can be selected to forecast the three relevant
time series. Let q̂i(t) be the forecast value that results from
model i for time series si(t), i = 1, 2, 3.
In the DA stage, an NN model is used to produce the final
predictions

q̂DA(t) = f(q̂1(t), q̂2(t), q̂3(t))    (1)

where f(·) is the nonlinear function that is determined by the
trained NN.
There are many popular models, including the naïve, MA,
ES, nonseasonal ARIMA, and seasonal ARIMA (SARIMA)
models, that can be applied to time series prediction [16], [24].
Choosing a proper model to forecast each of the time series
s1(t), s2(t), and s3(t) is the primary task. Two important fac-
tors are considered in the model-selection stage: effectiveness
and simplicity.
As the MA and ES models cannot handle trend or seasonality
well [31] and the ARIMA model is often used to forecast hourly
time series [6], [7], we use the ARIMA model to forecast time
series s3(t). The MA and ES models are chosen to forecast
s1(t) or s2(t). It is found in the case study that choosing the
MA model for s1(t) and the ES model for s2(t), or exchanging
them does not significantly affect the final prediction, and there
is no evidence that either method is superior.
A notable characteristic of traffic flow is that it shows a very
repeatable pattern in time. Time series that exhibit a repeatable
pattern are modeled through the use of seasonal differencing
and seasonal parameters. Such models are called SARIMA
models. We try to replace the nonseasonal ARIMA model with
Fig. 1. Framework of the DA approach.
the SARIMA model to forecast s3(t). It is found that using the
SARIMA model to forecast s3(t) produces a better prediction
than using the nonseasonal ARIMA model, but the final predic-
tion accuracy of the DA model shows no improvement in many
cases with the replacement of the nonseasonal ARIMA model
with the SARIMA model. This is because the daily similarity of
time series s1(t) and the weekly similarity of time series s2(t)
have already captured the repeatable pattern of traffic flow, and
consequently, the trained NNs in the DA stage have the ability
to reveal the repeatable pattern.
The naïve model has the simplest form and is used to serve as
the worst case approach [16]. We use the naïve and SARIMA
models for comparison purposes in the case study.
Finally, the process to produce the forecast value q̂DA(t) by
the DA approach is summarized as follows (see Fig. 1).
Step 1) Select time series s1(t), and use the MA model to
produce the forecast value q̂1(t) (see Section III-A1).
Step 2) Select time series s2(t), and use the ES model to
produce the forecast value q̂2(t) (see Section III-A2).
Step 3) Select time series s3(t), and use the nonseasonal
ARIMA model to produce the forecast value q̂3(t)
(see Section III-A3).
Step 4) In the DA stage, an NN model is used to produce the
final forecast value by (1) (see Section III-B).
III. FORMULATION OF DA FOR TIME
SERIES FORECASTING
In this section, yt denotes the actual value at period t, ŷt+1 is
the forecast value for the next period, and ŷt+p is the forecast
for p periods into the future.
A. Individual Submodels
1) MA Model: An MA of order k is computed by

ŷt+1 = (yt + yt−1 + yt−2 + · · · + yt−k+1)/k    (2)

where k is the number of terms in the MA [31].
The MA technique deals only with the latest k periods of
known data; the number of data points in each average does not
change as time continues. The MA model does not handle trend
or seasonality well.
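Equation (2) translates directly into code. A minimal Python sketch (the function name is ours):

```python
def ma_forecast(y, k):
    """One-step MA forecast per (2): the mean of the latest k observations."""
    if len(y) < k:
        raise ValueError("need at least k observations")
    return sum(y[-k:]) / k

# Example: forecast the next hourly flow from the last three observations.
history = [420, 450, 435, 465]
print(ma_forecast(history, 3))  # mean of 450, 435, 465
```

As the window slides forward, each new observation displaces the oldest one, so the number of data points in each average stays fixed at k.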
2) ES Model: ES is a forecasting method that seeks to
isolate trends or seasonality from irregular variation. It has been
found to be most effective when the components that describe
the time series change slowly over time [32].
Holt developed an ES method, Holt's two-parameter method
[31], which allows for evolving local linear trends in a time
series

Ht = αyt + (1 − α)(Ht−1 + bt−1)    (3)
bt = γ(Ht − Ht−1) + (1 − γ)bt−1    (4)
ŷt+p = Ht + bt p    (5)

where Ht is the new smoothed value, α is the smoothing
constant for the data (0 ≤ α ≤ 1), γ is the smoothing constant
for the trend estimate (0 ≤ γ ≤ 1), bt is the trend estimate, and p
is the number of periods to be forecast into the future. The weights can
be selected by minimizing the mean square error (MSE).
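The recursions (3)–(5) can be sketched in a few lines of Python. The initialization below (level set to the first observation, trend to the first difference) is our assumption; the paper does not state its choice:

```python
def holt_forecast(y, alpha, gamma, p):
    """Holt's two-parameter ES per (3)-(5): smoothed level H, trend b.
    Initialization (H = y[0], b = y[1] - y[0]) is our assumption."""
    H, b = y[0], y[1] - y[0]
    for t in range(1, len(y)):
        H_prev = H
        H = alpha * y[t] + (1 - alpha) * (H + b)    # (3)
        b = gamma * (H - H_prev) + (1 - gamma) * b  # (4)
    return H + b * p                                 # (5)

# A perfectly linear series is tracked exactly, so the p-step forecast
# simply extrapolates the trend.
print(holt_forecast([10, 12, 14, 16, 18], 0.5, 0.5, 1))
```

In practice, the pair (α, γ) would be chosen by a grid search that minimizes the in-sample MSE, as described in Section IV.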
3) ARIMA Model: A general ARIMA model of order
(r, d, s) representing the time series can be written as

φ(B)∇^d yt = θ(B)et    (6)

where et represents the random error term at time t, B is
a backward-shift operator defined by Byt = yt−1 and related
to ∇ by ∇ = 1 − B, ∇^d = (1 − B)^d, and d is the order of
differencing. φ(B) and θ(B) are the AR and MA operators of
orders r and s, respectively, which are defined as

φ(B) = 1 − φ1B − φ2B^2 − · · · − φrB^r    (7)
θ(B) = 1 − θ1B − θ2B^2 − · · · − θsB^s    (8)

where φi (i = 1, 2, . . . , r) are the AR coefficients, and θj (j =
1, 2, . . . , s) are the MA coefficients.
Box and Jenkins [33] developed a practical approach to
building ARIMA models that has had a fundamental im-
pact on time series analysis and forecasting applications.
The Box–Jenkins methodology includes model identification,
parameter estimation, diagnostic checking, and model forecast-
ing [24] and consists of the following three iterative steps. First,
we determine whether the time series is stationary or nonsta-
tionary. If it is nonstationary, it is transformed into a stationary
time series by applying a suitable degree of differencing. This
gives the value of d. Then, appropriate values of rand sare
found by examining the autocorrelation function (ACF) and
partial ACF (PACF) of the time series. Having determined r,
d, and s, the coefficients of the AR and MA terms are estimated
using a nonlinear least squares method. In this paper, the time
series was analyzed using statistical software.
B. DA Based on an NN
The NN model that is employed in this paper consists of an
input layer with three neurons, a hidden layer with n1neurons,
Fig. 2. Feedforward BP network.
Fig. 3. Data collection point at Xia Yuan, Guangzhou.
and an output layer with one neuron (see Fig. 2). The NN model
maps the input vector (q̂1(t), q̂2(t), q̂3(t)) to the output
q̂DA(t), in which the hyperbolic tangent sigmoid transfer func-
tion is used in the hidden layer and the linear transfer function
is used in the output layer. In the training or learning stage
of the NN model, the weights or parameters of the network
are iteratively modified on the basis of a set of input–output
patterns known as a training set to minimize the deviance
or error between the output that is obtained by the network
and the observed output. The weights are initialized to small
values based on the technique of Nguyen and Widrow [37].
We normalize the data to a value between 0 and 1. The number
of hidden units and the learning rule are chosen through system-
atic experimentation. The learning rule that is commonly used
in this type of network is the backpropagation (BP) algorithm
or gradient descent method. Several different modifications of
the BP learning algorithm are tried in the training course.
To obtain a network that is capable of generalizing and
performing well with new cases, data samples are usually
subdivided into three sets: 1) a training set, 2) a validation set,
and 3) a test set [38]. During the learning stage of the network,
an excessive number of parameters or weights in relation to the
problem at hand and to the number of training data may lead
to overfitting. This phenomenon occurs when the model fits
the irrelevant features that are present in the training data too
closely instead of fitting the underlying function that relates the
inputs and outputs. This will result in the loss of the capacity to
generalize learning to new cases [35].
In this paper, cross validation and an early-stopping tech-
nique are used in the optimizing training process to avoid
Fig. 4. Observed traffic flow at time intervals 7, 8, 9, 10, 11, and 12 in November 2005.
the overtraining (or overfitting) phenomenon of NNs. Early
stopping means that the termination of training is controlled by
the error of the validation sets, rather than by the error of the
training set.
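The early-stopping logic is independent of any particular NN library. The following schematic sketch shows the control flow only; the `patience` threshold is our assumption, as the paper does not report one:

```python
def train_with_early_stopping(step, val_error, max_epochs=5000, patience=5):
    """Schematic early stopping: `step` runs one training epoch and
    `val_error` returns the current validation-set error.  Training stops
    once the validation error has not improved for `patience` epochs.
    (The patience value is our assumption, not the authors'.)"""
    best, best_epoch = float("inf"), 0
    for epoch in range(1, max_epochs + 1):
        step()
        e = val_error()
        if e < best:
            best, best_epoch = e, epoch
        elif epoch - best_epoch >= patience:
            break  # validation error stopped improving
    return best, best_epoch

# Demo with a synthetic validation-error curve that bottoms out early.
curve = iter([5.0, 4.0, 3.0, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2])
state = {"e": None}
best, at = train_with_early_stopping(lambda: state.update(e=next(curve)),
                                     lambda: state["e"], max_epochs=100)
print(best, at)
```

Note that termination is controlled by the validation error, not the training error, which is exactly what protects against overfitting.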
IV. CASE STUDY
A. Study Area
A data set from January 1, 2005 to December 30, 2005 was
collected from a detector on National Highway 107, Xia Yuan,
Huangpu, Guangzhou, Guangdong, China (see Fig. 3). The
traffic flow data were aggregated and averaged into 1-h periods,
24 h/day.
B. Goodness-of-Fit Statistics
Three goodness-of-fit statistics are used to assess the forecast
accuracy of the results.
1) The root MSE (RMSE) is calculated as

RMSE = [ (1/N) Σ_{n=1}^{N} (yn − ŷn)^2 ]^{1/2}.    (9)

2) The percentage absolute error (PAE) is calculated as

PAE(n) = |yn − ŷn| / yn × 100%.    (10)

3) The mean absolute percentage error (MAPE) is cal-
culated as

MAPE = (1/N) Σ_{n=1}^{N} |yn − ŷn| / yn.    (11)

Here, yn and ŷn are the observed and the forecast values
of observation n, respectively, and N is the total number of
observations.
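The three statistics in (9)–(11) translate directly into code; a minimal Python sketch (function names are ours):

```python
import math

def rmse(y, yhat):
    """Root mean square error per (9)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))

def pae(y_n, yhat_n):
    """Percentage absolute error of one observation per (10), in percent."""
    return abs(y_n - yhat_n) / y_n * 100.0

def mape(y, yhat):
    """Mean absolute percentage error per (11).  As written in (11) this
    is a fraction; the comparison tables report it in percent."""
    return sum(abs(a - b) / a for a, b in zip(y, yhat)) / len(y)

obs, pred = [100, 200, 400], [110, 190, 400]
print(rmse(obs, pred), pae(100, 110), mape(obs, pred))
```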
C. Selection of Models and Parameters
In the case study, we first construct the three relevant time
series. Then, individual submodels are built as follows.
1) For time series s1(t), we try different orders 1 ≤ k1 ≤
10. That is, we use the traffic flow at time interval t
on the previous k successive days to forecast the traffic flow
at time interval t on the current day. The MA model is
used to forecast these time series.
For example, Fig. 4 shows the traffic flow at time
intervals 7, 8, . . . , 12 for 30 days in November 2005. We
vary the order k in (2) in the MA model. Fig. 5 shows the
RMSEs of the MA models with different values of k for
time interval 11, in which the RMSE is calculated based
on 30 − k observations (days) because the traffic flow on
the first k days in November cannot be predicted using
the MA model of order k. Based on the results, we choose
k = 3, for which the RMSE is the smallest. Fig. 6 shows the
observed and forecast traffic flows at time interval 11 for
k = 3 in the MA model.
Fig. 5. RMSEs of the MA models varying with order k for s1(11).
Fig. 6. Observed traffic flow and predictions that result from the MA model
for s1(11).
2) For time series s2(t), we select as much data as possible
from January 2005 forward to forecast the same time
interval for the same weekday or weekend. Fig. 7 shows
six series at time intervals 7, 8, . . . , 12 on all Tuesdays
in 2005. Holt's ES method is employed to forecast these
time series.
For example, to forecast the traffic flow at time
interval 7 on Tuesday, November 29, 2005, we use the
data at time interval 7 on all Tuesdays from January 5 to
November 22 in 2005. Fig. 8 shows the traffic flow s2(7)
at time interval 7 on all Tuesdays in 2005. To choose the
proper Holt's ES model, we try all of the combinations
of the smoothing constants α = 0, 0.1, . . . , 0.9, 1 and
γ = 0, 0.1, . . . , 0.9, 1. We select the combination (α = 0.1,
γ = 0.1) to fit time series s2(7), for which the sum-of-
square error is the smallest. The predictions that are
produced by the ES method are also shown in Fig. 8.
Note that time series s1(t) and s2(t) can be generated
directly from the source data. A computer program can
automatically produce the predictions from the MA and
ES models if the parameters are specified, such as the
order k = 3 in the MA model and the smoothing con-
stants α = 0.1 and γ = 0.1 in the ES model (see Figs. 6
and 8).
3) For time series s3(t), by experimentation, we select k3 =
48. That is, we use the traffic flow data in the previous
48 h (two days) to forecast the traffic flow at the current
time, because Hanke et al. [31] have pointed out that more
than 40 observations are required to develop the ARIMA
model. The Box–Jenkins methodology of ARIMA mod-
eling is employed.
We observe that time series s3(t) is nonstationary. We
try to transform it into a stationary time series by applying
a suitable degree of differencing d. Then, we examine the
ACF and PACF of the time series to find the appropriate
values of r and s. Here, we denote the ARIMA model
with the parameters r, d, and s as ARIMA(r, d, s).
For example, we take the 48 hourly traffic flow data on
November 21 and 22, 2005. This time series is differ-
enced once, and the differenced data vary about a fixed
level, i.e., zero. After checking the ACF and PACF of the
differenced data, the parameters can be chosen as r, s = 0
or 1 and d = 1. Table I shows the p-value and resid-
ual mean square for the three models ARIMA(1, 1, 0),
ARIMA(0, 1, 1), and ARIMA(1, 1, 1).
In Table I, the p-value = 0.809 in model ARIMA(1,
1, 1) indicates that the coefficient θ1 is not significantly
different from zero at the 5% level, so θ1 can be dropped
from the model. Thus, model ARIMA(1, 1, 1) is not
chosen. Of the other two models, the residual mean
square of the ARIMA(1, 1, 0) model is the smallest. The
p-value = 0.002 in model ARIMA(1, 1, 0) indicates that
the coefficient φ1 is significantly different from zero.
Therefore, this model is the best fitting model and is
chosen to forecast time series s3(t). The resultant model
can be explicitly written as

q̂3(t) = q(t − 1) + 0.4351 (q(t − 1) − q(t − 2)).

Fig. 9 shows a comparison of the observed and predicted traffic
flows from the ARIMA(1, 1, 0) model for the chosen
time series.
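Once fitted, the ARIMA(1, 1, 0) forecast is a one-line computation. The coefficient 0.4351 below is the value fitted in the text; the function name is ours:

```python
def arima110_forecast(q_tm1, q_tm2, phi1=0.4351):
    """One-step ARIMA(1, 1, 0) forecast in the explicit form fitted above:
    q-hat(t) = q(t-1) + phi1 * (q(t-1) - q(t-2)).
    phi1 = 0.4351 is the coefficient reported in the text."""
    return q_tm1 + phi1 * (q_tm1 - q_tm2)

# Example: the forecast extrapolates a damped fraction of the last change.
print(arima110_forecast(500, 480))
```

Because φ1 > 0, the model continues a fraction of the most recent hourly change rather than simply repeating the last observation, which is what distinguishes it from the naïve model in Section IV-E.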
D. Design of NNs
In designing NN models, the number of neurons in the
hidden layer is an important feature that needs to be carefully
chosen. To avoid the overtraining or overfitting problem of
NNs, we use three data sets: 1) a training set, 2) a validation
set, and 3) a testing set. Although there is no precise rule on
the optimum size of these data sets, it is recommended that the
training set should be the largest [38].
For example, the 2005 data (q̂1(t), q̂2(t), q̂3(t), q(t)) from
Sunday, November 6 to Saturday, November 12 are cho-
sen as the training set, in which (q̂1(t), q̂2(t), q̂3(t)) is the
input to the NN and q(t) is the output. The 2005 data
(q̂1(t), q̂2(t), q̂3(t), q(t)) on November 13 and 14 are chosen
as the validation set. The 2005 data on Sunday, November 20,
are chosen as testing set 1, and the data on Wednesday,
November 23, as testing set 2.
A series of NNs with different numbers of neurons in the
hidden layer are trained. The number of neurons varies from
Fig. 7. Observed traffic flow on all Tuesdays in 2005.
Fig. 8. Observed traffic flow and predictions that result from the ES model for
s2(7) on Tuesday.
TABLE I
RESULTS FROM DIFFERENT ARIMA MODELS
3 to 20, and the RMSEs are calculated for both the training
and the testing set. In terms of generalization ability
on the testing set, the lower the RMSE is, the
better the network model is. Fig. 10 shows the curve of the
Fig. 9. Observed traffic flow and predictions that result from the ARIMA(1,
1, 0) model for s3(t).
RMSE versus the number of hidden-layer neurons when the
Levenberg–Marquardt (LM) BP algorithm is used. In Fig. 10,
we find that the best number of hidden-layer neurons is 12.
Therefore, a 3-12-1 NN model is selected for further study.
The curves of the RMSEs for the training and validation
sets versus the learning epochs that use the technique of early-
stopping training are shown in Fig. 11, which shows that the
RMSE swiftly decreases in both the training and validation
sets when the epoch is less than eight. The RMSE achieves the
lowest value when the epoch is eight and remains almost steady
when the epoch increases.
Fig. 10. Hidden-node numbers versus the MSE on the training and testing
sets.
Fig. 11. RMSE of the training and testing sets versus the learning epochs.
TABLE II
LEARNING ALGORITHM COMPARISON
LM: Levenberg-Marquardt backpropagation; RP: resilient backpropagation;
GD: gradient descent backpropagation; GDA: gradient descent with
adaptive learning rate back-propagation; GDM: gradient descent
with momentum backpropagation; GDX: gradient descent with
momentum and adaptive learning rate backpropagation.
Several BP learning rules are selected in the training course.
Let the maximum training epoch be 5000 and the MSE con-
vergence goal be 0.001. The learning epochs and the RMSEs
of the outputs of the trained NN on testing sets 1 and 2 are
listed in Table II, in which the early-stopping technique is
not applied. Table II shows that the convergence goal is met
at epoch 10 by the LM algorithm and at epoch 4293 by the
resilient backpropagation (RP) algorithm, but the maximum
training epoch is reached and the convergence goal is not met
by the other algorithms. The RMSE by the LM algorithm is
similar to that by the RP algorithm, while the LM algorithm
has the quickest convergence. The LM algorithm is the best
learning rule in this case. Similar studies have been carried out
in traffic flow forecasting on other data sets. It is found that the
number of neurons in the hidden layers of the NN is between 4
and 24.
E. Comparison of Results
The DA approach, denoted by the DA model, integrates
three submodels (MA, ES, and ARIMA) using an NN. The
predictions from the MA, ES, and ARIMA models are used as
input to the NN in the DA stage, and the output of the trained
NN is the final prediction.
Several single-source models, including the naïve, ARIMA,
NP, and NN models, are applied to time series s3(t).The
ARIMA model is the same as that in the DA approach. We
compare their predictions on the testing sets with those of the
DA model in the case study.
1) The naïve (or no-change) model for traffic flow forecast-
ing has the simplest form

q̂Na(t) = q(t − 1)

where q̂Na(t) is the forecast value at time interval t.
2) NP is a potential approach to traffic flow forecasting [16],
[36]. Kernel estimation is a nonparametric estimation
technique, and the applicability of this technique requires
that both the kernel function and bandwidth be suitably
chosen [15]. Smith et al. [16] provided an extensive
comparison of the forecasting performance of six non-
parametric methods, which are called "straight average,"
"weighted by inverse of distance," "adjusted by V(t),"
"adjusted by both V(t) and Vhist(t)," "adjusted by both
weight and distance," and so forth. It is demonstrated
that among these models, the "adjusted by V(t)" model
has the simplest form and relatively satisfactory perfor-
mance, i.e.,

V̂(t + 1) = (1/K) Σ_{i=1}^{K} Vi(t + 1) Vc(t) / Vi(t)

where V(t) is the traffic flow at time interval t, Vc(t) is an
element in the current state, and Vhist(t) is the historical
average volume at the time of day and day of the week
that are associated with time interval t (see [16] for a
detailed description of the model). In this paper, forecasts
are generated using K-nearest neighbor forecasts for
values of K between 5 and 40, and it is found that when
K = 25, the model produces the best prediction on the
basis of MAPEs.
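The "adjusted by V(t)" forecast can be sketched as a K-nearest-neighbor average. The sketch below simplifies the state matching in [16] to a distance on the current volume alone; the data layout and names are our assumptions:

```python
def knn_adjusted_forecast(history, current, K):
    """Sketch of the 'adjusted by V(t)' nonparametric forecast.
    history: list of (V_i(t), V_i(t+1)) pairs from past data (our layout;
    [16] matches on a richer state vector than the single volume used here).
    current: the current volume V_c(t).
    Each of the K nearest neighbors, by |V_i(t) - V_c(t)|, contributes its
    next-interval volume scaled by V_c(t) / V_i(t)."""
    neighbors = sorted(history, key=lambda p: abs(p[0] - current))[:K]
    return sum(v_next * current / v_now for v_now, v_next in neighbors) / K

# Example: two close neighbors dominate; the distant states are ignored.
history = [(400, 420), (500, 510), (410, 430), (900, 880)]
print(knn_adjusted_forecast(history, 405, 2))
```

The scaling term Vc(t)/Vi(t) adjusts each neighbor's successor volume to the current traffic level, which is what distinguishes this variant from the plain "straight average" method.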
3) An artificial NN as a single-source model is used for com-
parison with the DA approach. To show the difference, the
NN in the DA approach is denoted by NN1 and the NN
for the purpose of comparison is denoted by NN2.
As described above, the NN1 model is trained to fit the
nonlinear function in (1). The NN2 model is used to fit the
nonlinear relationship

q̂NN(t) = f1(q(t − 1), q(t − 2), . . . , q(t − l)).

The NN1 and NN2 models have different inputs. The inputs
of the NN1 model are the predictions from the MA, ES, and
ARIMA models. The inputs of the NN2 model are the traffic
flow records at the previous l successive time intervals, and its
output is the prediction of traffic flow at time interval t.
The NN2 model is constructed with one input layer, one
hidden layer, and one output layer. The number of inputs l and the
number of hidden neurons in the NN2 model are also optimized
by experimentation.
Note that all of the models (naïve, ARIMA, NP, and NN2) are
employed to produce forecasts for time series s3(t). To produce
the prediction of s3(t), each model may need a different length
of historical data. Hence, the sample size should be properly
chosen for each model. For example, the length of time of the
training data for the naïve model is no more than one day before
the forecasting time, while that for the training data for the NP
model covers several weeks before. For each model, we choose
the parameters by observing the best fitting or forecasting. We
compare their forecasts on the same test sets.
Testing set 1 is the traffic record on Sunday, November 20,
2005. Testing set 2 is the traffic record on Wednesday,
November 23, 2005. The training sample for the above models
covers the data from January 1, 2005 to November 12, 2005.
The validation set for the NN2 model is the observed traffic
flow record on November 13 and 14, 2005.
The models above and the DA model are evaluated for their
out-of-sample forecasting performance using a recursive train-
ing sample (see [39] for a detailed discussion on this method).
Let T be the forecasting origin, and we generate forecasts for
time periods T + 1, T + 2, . . . , T + N. The procedure of the
N-step-ahead forecast is given as follows.
1) Select a training set. Let T0 (< T) be the index of the
final traffic flow record in the training set.
2) Identify and estimate each model (naïve, ARIMA, NP,
and NN2) using the training set.
3) Identify and estimate the models (MA, ES, ARIMA, and
NN1) in the DA approach, where the ARIMA model is
the same as that in step 2), using the methods that are
described in Sections IV-B, C, and D. In this step, we use
the historical data before T that cover the training set in
step 1), as we may need much more data to construct time
series s1(t) and s2(t).
4) Compute the N-step-ahead forecasts for the models
(naïve, ARIMA, NP, NN2, and DA).
5) Advance the time index by one and increase the training
sample by one, i.e., set T0 = T0 + 1, and go to step 4).
Estimate the same models and iterate over all of the
values of the testing data set.
6) Compute the MAPEs between the N-step-ahead predic-
tions and the observed values.
As we advance the time index, we do not identify the
parameters for the models (naïve, ARIMA, NP, NN2, and DA).
TABLE III
MAPEs (IN PERCENTAGE)(IN VEHICLES PER HOUR)OF MODELS
WITH DIFFERENT TIME HORIZONS (ON TESTING SET 1)
TAB L E IV
MAPEs (IN PERCENTAGE)(IN VEHICLES PER HOUR)OF MODELS
WITH DIFFERENT TIME HORIZONS (ON TESTING SET 2)
We use the same model parameters that are identified using
the training sample. The advantages and disadvantages of this
evaluation procedure have been discussed in [40]. When evaluating
the DA model, the N-step-ahead forecast may need the
one-step-ahead forecasts from time series s1(t) and s2(t). For
example, to produce the two-step-ahead forecast q̂DA(t+2)
by the DA model, we need q̂1(t+2), q̂2(t+2), and q̂3(t+2), in
which q̂1(t+2) is the one-step-ahead forecast from time series
s1(t+2), q̂2(t+2) is the one-step-ahead forecast from time
series s2(t+2), and q̂3(t+2) is the two-step-ahead forecast
from time series s3(t).
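The multi-step combination just described can be sketched as follows; the function names and the toy averaging network in the usage example are illustrative assumptions, not the paper's trained NN1:

```python
import numpy as np

def da_forecast(nn1, f_weekly, f_daily, f_hourly, t, n):
    """Hypothetical sketch: aggregate three component forecasts with NN1.

    f_weekly and f_daily return one-step-ahead forecasts from the
    similarity series s1 and s2, f_hourly returns an n-step-ahead
    forecast from the hourly series s3, and nn1 maps the three
    component forecasts to the final DA prediction.
    """
    q1 = f_weekly(t + n)   # one-step-ahead forecast, weekly similarity series
    q2 = f_daily(t + n)    # one-step-ahead forecast, daily similarity series
    q3 = f_hourly(t, n)    # n-step-ahead forecast, hourly series
    return float(nn1(np.array([q1, q2, q3])))
```

With a toy aggregator that simply averages its three inputs, `da_forecast(np.mean, lambda u: 100.0, lambda u: 110.0, lambda t, n: 120.0, 0, 2)` returns 110.0.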
Tables III and IV show the MAPEs of the two testing sets
with the different forecast horizons N=1, 2, or 3. In this paper,
the NN2 model has three inputs, the number of neurons in the
hidden layer is 16, and the LM algorithm is used for its training.
The bold numbers in the tables indicate the best performance in
each column.
The MAPE statistics in Tables III and IV show that the
predictions that result from the NP and DA models are better
than the predictions that result from the naïve, ARIMA, and
NN2 models. The predictions that result from the NN2 model
are better than the predictions that result from the ARIMA
model. The DA approach provides the best one-step-ahead
forecasting results compared with the other models. The NP
and DA models have almost the same performance in two-step-ahead
and three-step-ahead forecasting. Although ARIMA
models are quite flexible for many time series, their results are
not ideal when the time series is highly nonlinear. As the NN1
model in the DA approach has one input from the ARIMA
model, the poor performance of the ARIMA model affects the
prediction accuracy of the DA model. Fig. 12 shows the PAEs
that result from the NN1 model in the DA approach on the
training, validation, and testing sets. The PAE is not always
small in the application of the DA approach because of the
highly nonlinear characteristic of traffic flow.
V. CONCLUSION
Combining different modeling schemes for the improvement
of prediction capability has become a common practice in
Fig. 12. PAEs of the DA model for the training, validation, and testing sets.
different application fields, but the approaches vary, and thus
specific applications are worthy of investigation. In this paper,
to exploit the repeatable pattern of traffic flow, three relevant
time series have been constructed, and three models have been
used to forecast these series. The DA strategy that has been
proposed uses NN technology and aggregates the forecasting
values that result from the multiple models. The weekly simi-
larity time series and daily similarity time series can be easily
constructed from the source time series, and the forecasting
value of the MA and ES models can be automatically obtained
by a computer program once the parameters of the models
are specified. The predictions of the hourly time series by the
ARIMA model can easily be obtained with statistical software.
Therefore, the proposed DA approach is not time-consuming
but, rather, feasible in practice.
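As a minimal illustration of how the MA and ES forecasts can be produced automatically once the model parameters are specified, the following sketch may help; the names `k` (window length) and `alpha` (smoothing constant) are placeholders, not the paper's notation:

```python
def ma_forecast(history, k):
    """Moving-average forecast: the mean of the last k observations."""
    return sum(history[-k:]) / k

def es_forecast(history, alpha):
    """Simple exponential smoothing: the last smoothed value is the forecast."""
    s = history[0]
    for x in history[1:]:
        s = alpha * x + (1.0 - alpha) * s
    return s
```

For example, `ma_forecast([30.0, 40.0, 50.0, 60.0], 2)` returns 55.0, and smoothing a constant series returns that constant.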
The relevant time series make full use of the information in
the source time series that is collected at a single detector. By
analyzing the forecasting performance of the naïve, ARIMA,
NP, NN, and DA models, we have shown that the DA model
can provide results that are more accurate than those of the other
models.
If an accident or nonrecurring congestion happens, the repeatable
pattern of traffic flow is lost, and the prediction accuracy
will be affected. In such a case, one solution is to install
sufficient detectors upstream to track the traffic flow.
How to use the DA approach in a multiple-detector environment
deserves further study. The DA method that is based on NNs
may have considerable potential in forecasting technology.
ACKNOWLEDGMENT
The authors would like to thank the four anonymous refer-
ees for their helpful suggestions and critical and constructive
comments on an earlier version of the paper.
REFERENCES
[1] B. Abdulhai, H. Porwal, and W. Recker, “Short-term traffic flow predic-
tion using neuro-genetic algorithms,” J. Intell. Transp. Syst., vol. 7, no. 1,
pp. 3–41, Jan. 2002.
[2] M. C. Tan, J. M. Xu, and Z. Y. Mao, “Traffic flow modeling and on-
ramp optimal control in freeway,” Chin. J. Highw. Transp., vol. 13, no. 4,
pp. 83–85, 2000.
[3] M. C. Tan, C. O. Tong, and J. M. Xu, “Study and implementation of a
decision support system for urban mass transit service planning,” J. Inf.
Technol. Manage., vol. XV, no. 1/2, pp. 14–32, 2004.
[4] M. C. Tan, C. O. Tong, S. C. Wong, and J. M. Xu, “An algorithm for
finding reasonable paths in transit networks,” J. Adv. Transp., vol. 41,
no. 3, pp. 285–305, 2007.
[5] M. C. Tan, L. B. Feng, and J. M. Xu, “Traffic flow prediction based on
hybrid ARIMA and ANN model,” Chin. J. Highw. Transp., vol. 20, no. 4,
pp. 118–121, 2007.
[6] B. M. Williams, “Multivariate vehicular traffic flow prediction: Evalua-
tion of ARIMAX modeling,” Transp. Res. Rec., vol. 1776, pp. 194–200,
2001.
[7] B. M. Williams and L. A. Hoel, “Modeling and forecasting vehicular
traffic flow as a seasonal ARIMA process: Theoretical basis and empirical
results,” J. Transp. Eng., vol. 129, no. 6, pp. 664–672, Nov./Dec. 2003.
[8] S. Sun, C. Zhang, and G. Yu, “A Bayesian network approach to traffic flow
forecasting,” IEEE Trans. Intell. Transp. Syst., vol. 7, no. 1, pp. 124–131,
Mar. 2006.
[9] M. S. Dougherty and M. R. Cobbett, “Short-term inter-urban traffic fore-
casts using neural networks,” Int. J. Forecast., vol. 13, no. 1, pp. 21–31,
Mar. 1997.
[10] C. Ledoux, “An urban traffic flow model integrating neural networks,”
Transp. Res., Part C–Emerg. Technol., vol. 5, no. 5, pp. 287–300,
Oct. 1997.
[11] D. Shmueli, “Applications of neural networks in transportation planning,”
Prog. Plann., vol. 50, no. 3, pp. 141–204, Oct. 1998.
[12] H. Dia, “An object-oriented neural network approach to short-term
traffic forecasting,” Eur. J. Oper. Res., vol. 131, no. 2, pp. 253–261,
Jun. 2001.
[13] H. B. Yin, S. C. Wong, J. M. Xu, and C. K. Wong, “Urban traffic flow
prediction using a fuzzy-neural approach,” Transp. Res., Part C–Emerg.
Technol., vol. 10, no. 2, pp. 85–98, Apr. 2002.
[14] L. W. Lan and Y. C. Huang, “A rolling-trained fuzzy neural network
approach for freeway incident detection,” Transportmetrica, vol. 2, no. 1,
pp. 11–29, 2006.
[15] N.-E. El Faouzi, “Nonparametric traffic flow prediction using kernel
estimator,” in Proc. 13th ISTTT, J.-B. Lesort, Ed., Lyon, France, 1996,
pp. 41–54.
[16] B. L. Smith, B. M. Williams, and R. K. Oswald, “Comparison of paramet-
ric and nonparametric models for traffic flow forecasting,” Transp. Res.,
Part C–Emerg. Technol., vol. 10, no. 4, pp. 303–321, Aug. 2002.
[17] H. Chen and S. Grant-Muller, “Use of sequential learning for short-term
traffic flow forecasting,” Transp. Res., Part C–Emerg. Technol.,vol.9,
no. 5, pp. 319–336, Oct. 2001.
[18] G. Huisken, “Soft-computing techniques applied to short-term traffic flow
forecasting,” Syst. Anal. Model. Simul., vol. 43, no. 2, pp. 165–173,
Feb. 2003.
[19] E. I. Vlahogianni, M. G. Karlaftis, and J. C. Golias, “Optimized and
meta-optimized neural networks for short-term traffic flow prediction: A
genetic approach,” Transp. Res., Part C–Emerg. Technol., vol. 13, no. 3,
pp. 211–234, Jun. 2005.
[20] H. A. Edgerton and L. E. Kolbe, “The method of minimum variation for
the combination of criteria,” Psychometrika, vol. 1, no. 3, pp. 183–187,
Sep. 1936.
[21] J. M. Bates and C. W. J. Granger, “The combination of forecasts,” Oper.
Res. Q., vol. 20, pp. 451–468, 1969.
[22] M. J. Lawerence, R. H. Edmundson, and M. J. O’Connor, “The accuracy
of combining judgmental and statistical forecasts,” Manag. Sci., vol. 32,
no. 12, pp. 1521–1532, 1986.
[23] S. Makridakis, “Why combining works?” Int. J. Forecast., vol. 5, no. 4,
pp. 601–603, 1989.
[24] G. P. Zhang, “Time series forecasting using a hybrid ARIMA and neural
network model,” Neurocomput., vol. 50, pp. 159–175, Jan. 2003.
[25] Y. Gao and M. J. Er, “NARMAX time series model prediction: Feedfor-
ward and recurrent fuzzy neural network approaches,” Fuzzy Sets Syst.,
vol. 150, no. 2, pp. 331–350, Mar. 2005.
[26] Y. Jiang, “Prediction of freeway traffic flows using Kalman predictor in
combination with time series,” Transp. Q., vol. 57, no. 2, pp. 99–118,
2003.
[27] N. E. El Faouzi, “Combining predictive schemes in short-term traffic
forecasting,” in Proc. 14th ISTTT, A. Ceder, Ed., Jerusalem, Israel, 1999,
pp. 471–487.
[28] W. Zheng, D. Lee, and Q. Shi, “Short-term freeway traffic flow prediction:
Bayesian combined neural network approach,” J. Transp. Eng., vol. 132,
no. 2, pp. 114–121, Feb. 2006.
[29] M. V. D. Voort, M. Dougherty, and S. Watson, “Combining Kohonen maps
with ARIMA time series models to forecast traffic flow,” Transp. Res.,
Part C–Emerg. Technol., vol. 4, no. 5, pp. 307–318, Oct. 1996.
[30] B. L. Bowerman and R. T. O’Connell, Forecasting and Time Series: An
Applied Approach, 3rd ed. Florence, Italy: Brooks/Cole, 1993.
[31] J. E. Hanke, A. G. Reitsch, and D. W. Wichern, Business Forecasting,
7th ed. Boston, MA: Prentice–Hall, 2001.
[32] R. A. Yaffee and M. McGee, Introduction to Time Series Analysis and
Forecasting. San Diego, CA: Academic, 2000.
[33] G. E. P. Box and G. M. Jenkins, Time Series Analysis, Forecasting and
Control, 2nd ed. San Francisco, CA: Holden-Day, 1970.
[34] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning internal
representations by error propagation,” in Parallel Distributed Processing,
D. E. Rumelhart and J. L. McClelland, Eds. Cambridge, MA: MIT Press,
1986, pp. 318–362.
[35] E. B. Baum and D. Haussler, “What size net gives valid generalization?”
Neural Comput., vol. 1, no. 1, pp. 151–160, 1989.
[36] E. Matzner-Løbber, A. Gannoun, and J. G. De Gooijer, “Nonparametric
forecasting: A comparison of three kernel-based methods,” Commun.
Stat., Theory Methods, vol. 27, no. 7, pp. 1593–1617, 1998.
[37] D. Nguyen and B. Widrow, “Improving the learning speed of 2-layer
neural networks by choosing initial values of the adaptive weights,” in
Proc. Int. Joint Conf. Neural Netw., 1990, vol. 3, pp. 21–26.
[38] C. M. Bishop, Neural Network for Pattern Recognition. Oxford, U.K.:
Clarendon, 1995.
[39] L. J. Tashman, “Out-of-sample tests of forecasting accuracy: An analysis
and review,” Int. J. Forecast., vol. 16, no. 4, pp. 437–450, Oct. 2000.
[40] D. D. Thomakos and J. B. Guerard, Jr., “Naïve, ARIMA, nonparametric,
transfer function and VAR models: A comparison of forecasting perfor-
mance,” Int. J. Forecast., vol. 20, no. 1, pp. 53–67, Jan.–Mar. 2004.
Man-Chun Tan received the M.S. degree in math-
ematics from Jinan University, Guangzhou, China,
in 1995 and the Ph.D. degree in control theory and
control engineering from the South China University
of Technology, Guangzhou, in 2000.
From July 2000 to September 2000, he was a
Research Assistant with the Department of Civil
Engineering, University of Hong Kong, Hong Kong.
From June 2001 to May 2002, he was a Research
Associate with the University of Hong Kong. He is
currently a Professor with the Department of Math-
ematics, College of Information Science and Technology, Jinan University.
His research interests include traffic flow prediction and control, transportation
planning, stability analysis of dynamic systems, and artificial neural networks.
S. C. Wong received the B.Sc.(Eng) and M.Phil.
degrees from the University of Hong Kong, Hong
Kong, and the Ph.D. degree from University College
London, London, U.K.
He is currently a Professor with the Department
of Civil Engineering, University of Hong Kong.
His research interests include optimization of traf-
fic signal settings, continuum modeling for traffic
equilibrium problems, land use and transportation
problems, dynamic highway and transit assignment
problems, urban taxi services, and road safety.
Jian-Min Xu received the Ph.D. degree from the
South China University of Technology, Guangzhou,
China.
He is currently a Professor and a Doctoral Advisor
with the College of Traffic and Communication,
South China University of Technology. As a Project
Principal, he has accomplished many national or
ministerial/provincial projects, some of which are
National 863 High-Tech Programs or National Fund
Projects. He is the author or a coauthor of more than
100 journal papers. He is the holder of three patents.
His research interests include intelligent transportation systems and intelligent
control theory and its application.
Zhan-Rong Guan received the B.S. degree in man-
agement information system from the College of In-
formation Science and Technology, Jinan University,
Guangzhou, China, in July 2005. From September
2005 to June 2007, he worked towards the M.S.
degree in applied mathematics with the Department
of Mathematics, Jinan University.
His research interests include traffic flow forecast-
ing and design of information systems.
Peng Zhang received the B.Sc. degree from Sichuan
University, Sichuan, China, and the M.Sc. and Ph.D.
degrees from the University of Science and Technol-
ogy of China, Hefei, China.
He is currently a Professor with the Shanghai
Institute of Applied Mathematics and Mechanics,
Shanghai University, Shanghai, China. His research
interests include traffic flow theory and computa-
tional methods.