60 IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 10, NO. 1, MARCH 2009
An Aggregation Approach to Short-Term
Traffic Flow Prediction
Man-Chun Tan, S. C. Wong, Jian-Min Xu, Zhan-Rong Guan, and Peng Zhang
Abstract—In this paper, an aggregation approach is proposed
for traffic flow prediction that is based on the moving average
(MA), exponential smoothing (ES), autoregressive MA (ARIMA),
and neural network (NN) models. The aggregation approach as-
sembles information from relevant time series. The source time
series is the traffic flow volume that is collected 24 h/day over
several years. The three relevant time series are a weekly similarity
time series, a daily similarity time series, and an hourly time series,
which can be directly generated from the source time series. The
MA, ES, and ARIMA models are selected to give predictions of
the three relevant time series. The predictions that result from the
different models are used as the basis of the NN in the aggregation
stage. The output of the trained NN serves as the final prediction.
To assess the performance of the different models, the naïve,
ARIMA, nonparametric regression, NN, and data aggregation
(DA) models are applied to the prediction of a real vehicle traffic
flow, from which data have been collected at a data-collection point
that is located on National Highway 107, Guangzhou, Guangdong,
China. The outcome suggests that the DA model obtains a more
accurate forecast than any individual model alone. The aggrega-
tion strategy can offer substantial benefits in terms of improving
operational forecasting.
Index Terms—Autoregressive moving average (ARIMA) model,
data aggregation (DA), exponential smoothing (ES), moving aver-
age (MA), neural network (NN), time series, traffic flow prediction.
I. INTRODUCTION
TRAFFIC flow forecasting is an essential part of transporta-
tion planning, traffic control, and intelligent transporta-
tion systems [1]–[19]. In particular, short-term traffic volume
forecasts support proactive dynamic traffic control. As a result,
forecasting technologies have attracted the attention of traffic
engineers and researchers.

Manuscript received January 9, 2007; revised June 3, 2007 and February 14,
2008. Current version published February 27, 2009. This work was supported in
part by the Research Grants Council of the Hong Kong Special Administrative
Region, China, under Project HKU 7176/07E, by the University of Hong Kong
under Grant 10207394, by the National Natural Science Foundation of China
under Grant 50578064 and Grant 70629001, by the Natural Science Foundation
of Guangdong Province, China, under Grant 06025219, and by the National
Basic Research Program of China under Grant 2006CB705500. The Associate
Editor for this paper was H. Mahmassani.
M.-C. Tan and Z.-R. Guan are with the Department of Mathematics, College
of Information Science and Technology, Jinan University, Guangzhou 510632,
China (e-mail: tanmc@jnu.edu.cn; jameswingkwan@163.com).
S. C. Wong is with the Department of Civil Engineering, University of
Hong Kong, Hong Kong (e-mail: hhecwsc@hkucc.hku.hk).
J.-M. Xu is with the College of Traffic and Communication, South China
University of Technology, Guangzhou 510641, China (e-mail: aujmxu@scut.edu.cn).
P. Zhang is with the Shanghai Institute of Applied Mathematics and
Mechanics, Shanghai University, Shanghai 200072, China (e-mail: pzhang@mail.shu.edu.cn).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TITS.2008.2011693

A wide variety of techniques has been
applied in the context of short-term traffic flow forecasting,
depending upon the type of data that are available and the
potential end use of the forecast. These techniques include time
series analysis [6], [7], Bayesian networks [8], neural networks
(NNs) [9]–[12], fuzzy NNs [13], [14], nonparametric regression
(NP) [15], [16], and intelligence computation [17]–[19].
It is almost universally agreed in the forecasting literature
that no single method is the best in every situation. Since the
early work of Edgerton and Kolbe [20] and Bates and Granger
[21], the literature on this topic has significantly expanded.
Numerous researchers have demonstrated that combining the
predictions of several models frequently results in prediction
accuracy higher than that of the individual models [22], [23].
Using a hybrid model has become a common practice to im-
prove forecasting accuracy [24], [25], and a combination of sev-
eral models is employed in traffic flow forecasting [26]–[29].
The moving average (MA) and exponential smoothing (ES)
models are popular in time series forecasting, and their strength
lies in their good short-term accuracy combined with quick
low-cost updating. However, the MA and ES models do not
handle trend or seasonality well [30]. The autoregressive MA
(ARIMA) model and artificial NNs are often compared, with
mixed conclusions regarding their forecasting performance.
The ARIMA model and the Box–Jenkins methodology are
quite flexible in several typical time series such as the pure
autoregressive (AR), pure MA, and combined AR and MA
(ARIMA) models. The major limitation of the ARIMA model
is the preassumed linear form of the model. Artificial NNs
were introduced as efficient tools for modeling and forecasting
approximately two decades ago. The major advantage of NNs is
their flexible nonlinear modeling capability. Zhang [24] pointed
out that neither the ARIMA model nor NNs are adequate to
model and forecast time series because the ARIMA model can-
not deal with nonlinear relationships, and the NN model alone
is not able to handle linear and nonlinear patterns equally well.
It is desirable to exploit the strengths of each individual
approach, which should, in turn, produce a better overall result.
In this paper, a data aggregation (DA) approach for traffic flow
forecasting is presented that is based on an NN. The objective
of DA is to maximize useful information content by combining
data and knowledge from different models.
II. DATA STRUCTURE AND AGGREGATION STRATEGY
In this section, we describe the DA approach for vehicle
traffic flow forecasting. The traffic flow data are collected from
certain data collection points and are aggregated in 1-h periods,
24 h/day. Let q(t) be the 1-h traffic flow that is collected
within the time interval (t − 1, t] (or t for short), where t is an
integer. q(t) is the source time series. By analyzing the observed
traffic flow data, it can be found that the traffic flow pattern is
almost cyclical every week and that it is similar every weekday
(Monday to Friday) and similar every weekend (Saturday and
Sunday). Thus, three relevant time series are constructed for the
DA approach. They are the daily similarity time series s1(t),
the weekly similarity time series s2(t), and the hourly time
series s3(t).
1) s1(t) is a set that includes the previous traffic flow records
within the same time interval on the k1 days before q(t)

s1(t) = {q(t − 24k1), q(t − 24(k1 − 1)), . . . , q(t − 24)}.

2) s2(t) is a set that includes the traffic flow records in the
sequential k2 weeks before q(t). The data in time series
s2(t) will be on the same weekday or weekend

s2(t) = {q(t − 7 × 24 × k2), q(t − 7 × 24 × (k2 − 1)), . . . , q(t − 7 × 24)}.

To forecast the traffic flow at time interval t on a certain
Thursday, for example, we also select the data at time
interval t on the previous k2 Thursdays.
3) s3(t) is a set that includes the previous k3 traffic flow data
before q(t)

s3(t) = {q(t − k3), q(t − k3 + 1), . . . , q(t − 1)}.
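For concreteness, the three relevant series can be generated directly from the source series, as noted above. The following is a minimal Python sketch under our own naming; function and variable names are illustrative and not the authors' implementation:

```python
# Minimal sketch of the three relevant time series (illustrative only;
# function and variable names are ours, not the authors').
# q is a list of hourly flows indexed by the integer interval t.

def s1(q, t, k1):
    """Daily similarity: flow at the same hour on the k1 preceding days."""
    return [q[t - 24 * j] for j in range(k1, 0, -1)]

def s2(q, t, k2):
    """Weekly similarity: flow at the same hour on the k2 preceding
    occurrences of the same weekday (7 x 24 hours apart)."""
    return [q[t - 7 * 24 * j] for j in range(k2, 0, -1)]

def s3(q, t, k3):
    """Hourly series: the k3 flows immediately before interval t."""
    return [q[t - j] for j in range(k3, 0, -1)]

# Example: three weeks of synthetic hourly data.
q = [100 + (h % 24) for h in range(24 * 21)]
t = 24 * 21 - 1
print(s1(q, t, 3), s3(q, t, 2))
```

Each list is ordered from the oldest to the most recent observation, matching the set definitions above.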
Different models can be selected to forecast the three relevant
time series. Let q̂i(t) be the forecast value that results from
model i for time series si(t), i = 1, 2, 3.
In the DA stage, an NN model is used to produce the final
predictions

q̂DA(t) = f(q̂1(t), q̂2(t), q̂3(t))    (1)

where f(·) is the nonlinear function that is determined by the
trained NN.
There are many popular models, including the naïve, MA,
ES, nonseasonal ARIMA, and seasonal ARIMA (SARIMA)
models, that can be applied to time series prediction [16], [24].
Choosing a proper model to forecast each of the time series
s1(t), s2(t), and s3(t) is the primary task. Two important fac-
tors are considered in the model-selection stage: effectiveness
and simplicity.
As the MA and ES models cannot handle trend or seasonality
well [31] and the ARIMA model is often used to forecast hourly
time series [6], [7], we use the ARIMA model to forecast time
series s3(t). The MA and ES models are chosen to forecast
s1(t) or s2(t). It is found in the case study that choosing the
MA model for s1(t) and the ES model for s2(t), or exchanging
them does not significantly affect the final prediction, and there
is no evidence that either method is superior.
A notable characteristic of traffic flow is that it shows a very
repeatable pattern in time. Time series that exhibit a repeatable
pattern are modeled through the use of seasonal differencing
and seasonal parameters. Such models are called SARIMA
models. We try to replace the nonseasonal ARIMA model with
Fig. 1. Framework of the DA approach.
the SARIMA model to forecast s3(t). It is found that using the
SARIMA model to forecast s3(t) produces a better prediction
than using the nonseasonal ARIMA model, but the final predic-
tion accuracy of the DA model shows no improvement in many
cases with the replacement of the nonseasonal ARIMA model
with the SARIMA model. This is because the daily similarity of
time series s1(t) and the weekly similarity of time series s2(t)
have already captured the repeatable pattern of traffic flow, and
consequently, the trained NNs in the DA stage have the ability
to reveal the repeatable pattern.
The naïve model has the simplest form and is used to serve as
the worst case approach [16]. We use the naïve and SARIMA
models for comparison purposes in the case study.
Finally, the process to produce the forecast value q̂DA(t) by
the DA approach is summarized as follows (see Fig. 1).
Step 1) Select time series s1(t), and use the MA model to
produce the forecast value q̂1(t) (see Section III-A1).
Step 2) Select time series s2(t), and use the ES model to
produce the forecast value q̂2(t) (see Section III-A2).
Step 3) Select time series s3(t), and use the nonseasonal
ARIMA model to produce the forecast value q̂3(t)
(see Section III-A3).
Step 4) In the DA stage, an NN model is used to produce the
final forecast value by (1) (see Section III-B).
III. FORMULATION OF DA FOR TIME
SERIES FORECASTING
In this section, yt denotes the actual value at period t, ŷt+1 is
the forecast value for the next period, and ŷt+p is the forecast
for p periods into the future.
A. Individual Submodels
1) MA Model: An MA of order k is computed by

ŷt+1 = (yt + yt−1 + yt−2 + · · · + yt−k+1)/k    (2)

where k is the number of terms in the MA [31].
The MA technique deals only with the latest k periods of
known data; the number of data points in each average does not
change as time continues. The MA model does not handle trend
or seasonality well.
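Equation (2) translates directly into code. A minimal Python sketch (the function name is ours):

```python
def ma_forecast(y, k):
    """One-step MA forecast per (2): the mean of the latest k observations."""
    if len(y) < k:
        raise ValueError("need at least k observations")
    return sum(y[-k:]) / k

# Example: forecast the next hourly flow from the last three observations.
history = [420, 450, 435, 465]
print(ma_forecast(history, 3))  # mean of 450, 435, 465
```

As the window slides forward, each new observation displaces the oldest one, so the number of data points in each average stays fixed at k.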
2) ES Model: ES is a forecasting method that seeks to
isolate trends or seasonality from irregular variation. It has been
found to be most effective when the components that describe
the time series change slowly over time [32].
Holt developed an ES method, Holt's two-parameter method
[31], which allows for evolving local linear trends in a time
series

Ht = αyt + (1 − α)(Ht−1 + bt−1)    (3)
bt = γ(Ht − Ht−1) + (1 − γ)bt−1    (4)
ŷt+p = Ht + bt p    (5)

where Ht is the new smoothed value, α is the smoothing
constant for the data (0 ≤ α ≤ 1), γ is the smoothing constant
for the trend estimate (0 ≤ γ ≤ 1), bt is the trend estimate, and p
is the number of periods to be forecast into the future. The weights can
be selected by minimizing the mean square error (MSE).
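The recursions (3)–(5) can be sketched in a few lines of Python. The initialization below (level set to the first observation, trend to the first difference) is our assumption; the paper does not state its choice:

```python
def holt_forecast(y, alpha, gamma, p):
    """Holt's two-parameter ES per (3)-(5): smoothed level H, trend b.
    Initialization (H = y[0], b = y[1] - y[0]) is our assumption."""
    H, b = y[0], y[1] - y[0]
    for t in range(1, len(y)):
        H_prev = H
        H = alpha * y[t] + (1 - alpha) * (H + b)    # (3)
        b = gamma * (H - H_prev) + (1 - gamma) * b  # (4)
    return H + b * p                                 # (5)

# A perfectly linear series is tracked exactly, so the p-step forecast
# simply extrapolates the trend.
print(holt_forecast([10, 12, 14, 16, 18], 0.5, 0.5, 1))
```

In practice, the pair (α, γ) would be chosen by a grid search that minimizes the in-sample MSE, as described in Section IV.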
3) ARIMA Model: A general ARIMA model of order
(r, d, s) representing the time series can be written as

φ(B)∇^d yt = θ(B)et    (6)

where et represents the random error term at time t, B is
a backward-shift operator defined by Byt = yt−1 and related
to ∇ by ∇ = 1 − B, ∇^d = (1 − B)^d, and d is the order of
differencing. φ(B) and θ(B) are the AR and MA operators of
orders r and s, respectively, which are defined as

φ(B) = 1 − φ1B − φ2B^2 − · · · − φrB^r    (7)
θ(B) = 1 − θ1B − θ2B^2 − · · · − θsB^s    (8)

where φi (i = 1, 2, . . . , r) are the AR coefficients, and θj (j =
1, 2, . . . , s) are the MA coefficients.
Box and Jenkins [33] developed a practical approach to
building ARIMA models that has had a fundamental im-
pact on time series analysis and forecasting applications.
The Box–Jenkins methodology includes model identification,
parameter estimation, diagnostic checking, and model forecast-
ing [24] and consists of the following three iterative steps. First,
we determine whether the time series is stationary or nonsta-
tionary. If it is nonstationary, it is transformed into a stationary
time series by applying a suitable degree of differencing. This
gives the value of d. Then, appropriate values of rand sare
found by examining the autocorrelation function (ACF) and
partial ACF (PACF) of the time series. Having determined r,
d, and s, the coefficients of the AR and MA terms are estimated
using a nonlinear least squares method. In this paper, the time
series was analyzed using statistical software.
B. DA Based on an NN
The NN model that is employed in this paper consists of an
input layer with three neurons, a hidden layer with n1neurons,
Fig. 2. Feedforward BP network.
Fig. 3. Data collection point at Xia Yuan, Guangzhou.
and an output layer with one neuron (see Fig. 2). The NN model
maps the input vector (q̂1(t), q̂2(t), q̂3(t)) to the output
q̂DA(t), in which the hyperbolic tangent sigmoid transfer func-
tion is used in the hidden layer and the linear transfer function
is used in the output layer. In the training or learning stage
of the NN model, the weights or parameters of the network
are iteratively modified on the basis of a set of input–output
patterns known as a training set to minimize the deviance
or error between the output that is obtained by the network
and the observed output. The weights are initialized to small
values based on the technique of Nguyen and Widrow [37].
We normalize the data to a value between 0 and 1. The number
of hidden units and the learning rule are chosen through system-
atic experimentation. The learning rule that is commonly used
in this type of network is the backpropagation (BP) algorithm
or gradient descent method. Several different modifications of
the BP learning algorithm are tried in the training course.
To obtain a network that is capable of generalizing and
performing well with new cases, data samples are usually
subdivided into three sets: 1) a training set, 2) a validation set,
and 3) a test set [38]. During the learning stage of the network,
an excessive number of parameters or weights in relation to the
problem at hand and to the number of training data may lead
to overfitting. This phenomenon occurs when the model fits
the irrelevant features that are present in the training data too
closely instead of fitting the underlying function that relates the
inputs and outputs. This will result in the loss of the capacity to
generalize learning to new cases [35].
In this paper, cross validation and an early-stopping tech-
nique are used in the optimizing training process to avoid
Fig. 4. Observed traffic flow at time intervals 7, 8, 9, 10, 11, and 12 in November 2005.
the overtraining (or overfitting) phenomenon of NNs. Early
stopping means that the termination of training is controlled by
the error of the validation sets, rather than by the error of the
training set.
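The early-stopping logic is independent of any particular NN library. The following schematic sketch shows the control flow only; the `patience` threshold is our assumption, as the paper does not report one:

```python
def train_with_early_stopping(step, val_error, max_epochs=5000, patience=5):
    """Schematic early stopping: `step` runs one training epoch and
    `val_error` returns the current validation-set error.  Training stops
    once the validation error has not improved for `patience` epochs.
    (The patience value is our assumption, not the authors'.)"""
    best, best_epoch = float("inf"), 0
    for epoch in range(1, max_epochs + 1):
        step()
        e = val_error()
        if e < best:
            best, best_epoch = e, epoch
        elif epoch - best_epoch >= patience:
            break  # validation error stopped improving
    return best, best_epoch

# Demo with a synthetic validation-error curve that bottoms out early.
curve = iter([5.0, 4.0, 3.0, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2])
state = {"e": None}
best, at = train_with_early_stopping(lambda: state.update(e=next(curve)),
                                     lambda: state["e"], max_epochs=100)
print(best, at)
```

Note that termination is controlled by the validation error, not the training error, which is exactly what protects against overfitting.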
IV. CASE STUDY
A. Study Area
A data set from January 1, 2005 to December 30, 2005 was
collected from a detector on National Highway 107, Xia Yuan,
Huangpu, Guangzhou, Guangdong, China (see Fig. 3). The
traffic flow data were aggregated and averaged into 1-h periods,
24 h/day.
B. Goodness-of-Fit Statistics
Three goodness-of-fit statistics are used to assess the forecast
accuracy of the results.
1) The root MSE (RMSE) is calculated as

RMSE = [ (1/N) Σ_{n=1}^{N} (yn − ŷn)^2 ]^{1/2}.    (9)

2) The percentage absolute error (PAE) is calculated as

PAE(n) = |yn − ŷn| / yn × 100%.    (10)

3) The mean absolute percentage error (MAPE) is cal-
culated as

MAPE = (1/N) Σ_{n=1}^{N} |yn − ŷn| / yn.    (11)

Here, yn and ŷn are the observed and the forecast values
of observation n, respectively, and N is the total number of
observations.
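The three statistics in (9)–(11) translate directly into code; a minimal Python sketch (function names are ours):

```python
import math

def rmse(y, yhat):
    """Root mean square error per (9)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))

def pae(y_n, yhat_n):
    """Percentage absolute error of one observation per (10), in percent."""
    return abs(y_n - yhat_n) / y_n * 100.0

def mape(y, yhat):
    """Mean absolute percentage error per (11).  As written in (11) this
    is a fraction; the comparison tables report it in percent."""
    return sum(abs(a - b) / a for a, b in zip(y, yhat)) / len(y)

obs, pred = [100, 200, 400], [110, 190, 400]
print(rmse(obs, pred), pae(100, 110), mape(obs, pred))
```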
C. Selection of Models and Parameters
In the case study, we first construct the three relevant time
series. Then, individual submodels are built as follows.
1) For time series s1(t), we try different orders 1 ≤ k1 ≤
10. That is, we use the traffic flow at time interval t
on the previous k successive days to forecast the traffic flow
at time interval t on the current day. The MA model is
used to forecast these time series.
For example, Fig. 4 shows the traffic flow at time
intervals 7, 8, . . . , 12 for 30 days in November 2005. We
vary the order k in (2) in the MA model. Fig. 5 shows the
RMSEs of the MA models with different values of k for
time interval 11, in which the RMSE is calculated based
on 30 − k observations (days) because the traffic flow on
the first k days in November cannot be predicted using
the MA model of order k. Based on the results, we choose
k = 3, for which the RMSE is the smallest. Fig. 6 shows the
observed and forecast traffic flows at time interval 11 for
k = 3 in the MA model.
Fig. 5. RMSEs of the MA models varying with order k for s1(11).
Fig. 6. Observed traffic flow and predictions that result from the MA model
for s1(11).
2) For time series s2(t), we select as much data as possible
from January 2005 forward to forecast the same time
interval for the same weekday or weekend. Fig. 7 shows
six series at time intervals 7, 8, . . . , 12 on all Tuesdays
in 2005. Holt's ES method is employed to forecast these
time series.
For example, to forecast the traffic flow at time
interval 7 on Tuesday, November 29, 2005, we use the
data at time interval 7 on all Tuesdays from January 5 to
November 22 in 2005. Fig. 8 shows the traffic flow s2(7)
at time interval 7 on all Tuesdays in 2005. To choose the
proper Holt's ES model, we try all of the combinations
of the smoothing constants α = 0, 0.1, . . . , 0.9, 1 and
γ = 0, 0.1, . . . , 0.9, 1. We select the combination (α = 0.1,
γ = 0.1) to fit time series s2(7), for which the sum-of-
square error is the smallest. The predictions that are
produced by the ES method are also shown in Fig. 8.
Note that time series s1(t) and s2(t) can be generated
directly from the source data. A computer program can
automatically produce the predictions from the MA and
ES models if the parameters are specified, such as the
order k = 3 in the MA model and the smoothing con-
stants α = 0.1 and γ = 0.1 in the ES model (see Figs. 6
and 8).
3) For time series s3(t), by experimentation, we select k3 =
48. That is, we use the traffic flow data in the previous
48 h (two days) to forecast the traffic flow at the current
time, because Hanke et al. [31] have pointed out that more
than 40 observations are required to develop the ARIMA
model. The Box–Jenkins methodology of ARIMA mod-
eling is employed.
We observe that time series s3(t) is nonstationary. We
try to transform it into a stationary time series by applying
a suitable degree of differencing d. Then, we examine the
ACF and PACF of the time series to find the appropriate
values of r and s. Here, we denote the ARIMA model
with the parameters r, d, and s as ARIMA(r, d, s).
For example, we take the 48 hourly traffic flow data on
November 21 and 22, 2005. This time series is differ-
enced once, and the differenced data vary about a fixed
level, i.e., zero. After checking the ACF and PACF of the
differenced data, the parameters can be chosen as r, s = 0
or 1 and d = 1. Table I shows the p-value and resid-
ual mean square for the three models ARIMA(1, 1, 0),
ARIMA(0, 1, 1), and ARIMA(1, 1, 1).
In Table I, the p-value = 0.809 in model ARIMA(1,
1, 1) indicates that the coefficient θ1 is not significantly
different from zero at the 5% level, so θ1 can be dropped
from the model. Thus, model ARIMA(1, 1, 1) is not
chosen. Of the other two models, the residual mean
square of the ARIMA(1, 1, 0) model is the smallest. The
p-value = 0.002 in model ARIMA(1, 1, 0) indicates that
the coefficient φ1 is significantly different from zero.
Therefore, this model is the best fitting model and is
chosen to forecast time series s3(t). The resultant model
can be explicitly written as

q̂3(t) = q(t − 1) + 0.4351 (q(t − 1) − q(t − 2)).

Fig. 9 shows a comparison of the observed and predicted traffic
flows from the ARIMA(1, 1, 0) model for the chosen
time series.
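Once fitted, the ARIMA(1, 1, 0) forecast is a one-line computation. The coefficient 0.4351 below is the value fitted in the text; the function name is ours:

```python
def arima110_forecast(q_tm1, q_tm2, phi1=0.4351):
    """One-step ARIMA(1, 1, 0) forecast in the explicit form fitted above:
    q-hat(t) = q(t-1) + phi1 * (q(t-1) - q(t-2)).
    phi1 = 0.4351 is the coefficient reported in the text."""
    return q_tm1 + phi1 * (q_tm1 - q_tm2)

# Example: the forecast extrapolates a damped fraction of the last change.
print(arima110_forecast(500, 480))
```

Because φ1 > 0, the model continues a fraction of the most recent hourly change rather than simply repeating the last observation, which is what distinguishes it from the naïve model in Section IV-E.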
D. Design of NNs
In designing NN models, the number of neurons in the
hidden layer is an important feature that needs to be carefully
chosen. To avoid the overtraining or overfitting problem of
NNs, we use three data sets: 1) a training set, 2) a validation
set, and 3) a testing set. Although there is no precise rule on
the optimum size of these data sets, it is recommended that the
training set should be the largest [38].
For example, the 2005 data (q̂1(t), q̂2(t), q̂3(t), q(t)) from
Sunday, November 6 to Saturday, November 12 are cho-
sen as the training set, in which (q̂1(t), q̂2(t), q̂3(t)) is the
input to the NN and q(t) is the output. The 2005 data
(q̂1(t), q̂2(t), q̂3(t), q(t)) on November 13 and 14 are chosen
as the validation set. The 2005 data on Sunday, November 20,
are chosen as testing set 1, and the data on Wednesday,
November 23, as testing set 2.
A series of NNs with different numbers of neurons in the
hidden layer are trained. The number of neurons varies from
Fig. 7. Observed traffic flow on all Tuesdays in 2005.
Fig. 8. Observed traffic flow and predictions that result from the ES model for
s2(7) on Tuesday.
TABLE I
RESULTS FROM DIFFERENT ARIMA MODELS
3 to 20, and the RMSEs are calculated for both the training
and the testing set. In terms of generalization ability
on the testing set, the lower the RMSE is, the
better the network model is. Fig. 10 shows the curve of the
Fig. 9. Observed traffic flow and predictions that result from the ARIMA(1,
1, 0) model for s3(t).
RMSE versus the number of hidden-layer neurons when the
Levenberg–Marquardt (LM) BP algorithm is used. In Fig. 10,
we find that the best number of hidden-layer neurons is 12.
Therefore, a 3-12-1 NN model is selected for further study.
The curves of the RMSEs for the training and validation
sets versus the learning epochs that use the technique of early-
stopping training are shown in Fig. 11, which shows that the
RMSE swiftly decreases in both the training and validation
sets when the epoch is less than eight. The RMSE achieves the
lowest value when the epoch is eight and remains almost steady
when the epoch increases.
Fig. 10. Hidden-node numbers versus the MSE on the training and testing
sets.
Fig. 11. RMSE of the training and testing sets versus the learning epochs.
TABLE II
LEARNING ALGORITHM COMPARISON
LM: Levenberg-Marquardt backpropagation; RP: resilient backpropagation;
GD: gradient descent backpropagation; GDA: gradient descent with
adaptive learning rate back-propagation; GDM: gradient descent
with momentum backpropagation; GDX: gradient descent with
momentum and adaptive learning rate backpropagation.
Several BP learning rules are selected in the training course.
Let the maximum training epoch be 5000 and the MSE con-
vergence goal be 0.001. The learning epochs and the RMSEs
of the outputs of the trained NN on testing sets 1 and 2 are
listed in Table II, in which the early-stopping technique is
not applied. Table II shows that the convergence goal is met
at epoch 10 by the LM algorithm and at epoch 4293 by the
resilient backpropagation (RP) algorithm, but the maximum
training epoch is reached and the convergence goal is not met
by the other algorithms. The RMSE by the LM algorithm is
similar to that by the RP algorithm, while the LM algorithm
has the quickest convergence. The LM algorithm is the best
learning rule in this case. Similar studies have been carried out
in traffic flow forecasting on other data sets. It is found that the
number of neurons in the hidden layers of the NN is between 4
and 24.
E. Comparison of Results
The DA approach, denoted by the DA model, integrates
three submodels (MA, ES, and ARIMA) using an NN. The
predictions from the MA, ES, and ARIMA models are used as
input to the NN in the DA stage, and the output of the trained
NN is the final prediction.
Several single-source models, including the naïve, ARIMA,
NP, and NN models, are applied to time series s3(t).The
ARIMA model is the same as that in the DA approach. We
compare their predictions on the testing sets with those of the
DA model in the case study.
1) The naïve (or no-change) model for traffic flow forecast-
ing has the simplest form

q̂Na(t) = q(t − 1)

where q̂Na(t) is the forecast value at time interval t.
2) NP is a potential approach to traffic flow forecasting [16],
[36]. Kernel estimation is a nonparametric estimation
technique, and the applicability of this technique requires
that both the kernel function and bandwidth be suitably
chosen [15]. Smith et al. [16] provided an extensive
comparison of the forecasting performance of six non-
parametric methods, which are called "straight average,"
"weighted by inverse of distance," "adjusted by V(t),"
"adjusted by both V(t) and Vhist(t)," "adjusted by both
weight and distance," and so forth. It is demonstrated
that among these models, the "adjusted by V(t)" model
has the simplest form and relatively satisfactory perfor-
mance, i.e.,

V̂(t + 1) = (1/K) Σ_{i=1}^{K} Vi(t + 1) Vc(t) / Vi(t)

where V(t) is the traffic flow at time interval t, Vc(t) is an
element in the current state, and Vhist(t) is the historical
average volume at the time of day and day of the week
that are associated with time interval t (see [16] for a
detailed description of the model). In this paper, forecasts
are generated using K-nearest neighbor forecasts for
values of K between 5 and 40, and it is found that when
K = 25, the model produces the best prediction on the
basis of MAPEs.
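The "adjusted by V(t)" forecast can be sketched as a K-nearest-neighbor average. The sketch below simplifies the state matching in [16] to a distance on the current volume alone; the data layout and names are our assumptions:

```python
def knn_adjusted_forecast(history, current, K):
    """Sketch of the 'adjusted by V(t)' nonparametric forecast.
    history: list of (V_i(t), V_i(t+1)) pairs from past data (our layout;
    [16] matches on a richer state vector than the single volume used here).
    current: the current volume V_c(t).
    Each of the K nearest neighbors, by |V_i(t) - V_c(t)|, contributes its
    next-interval volume scaled by V_c(t) / V_i(t)."""
    neighbors = sorted(history, key=lambda p: abs(p[0] - current))[:K]
    return sum(v_next * current / v_now for v_now, v_next in neighbors) / K

# Example: two close neighbors dominate; the distant states are ignored.
history = [(400, 420), (500, 510), (410, 430), (900, 880)]
print(knn_adjusted_forecast(history, 405, 2))
```

The scaling term Vc(t)/Vi(t) adjusts each neighbor's successor volume to the current traffic level, which is what distinguishes this variant from the plain "straight average" method.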
3) An artificial NN as a single-source model is used for com-
parison with the DA approach. To show the difference, the
NN in the DA approach is denoted by NN1 and the NN
for the purpose of comparison is denoted by NN2.
As described above, the NN1 model is trained to fit the
nonlinear function in (1). The NN2 model is used to fit the
nonlinear relationship

q̂NN(t) = f1(q(t − 1), q(t − 2), . . . , q(t − l)).

The NN1 and NN2 models have different inputs. The inputs
of the NN1 model are the predictions from the MA, ES, and
ARIMA models. The inputs of the NN2 model are the traffic
flow records at the previous l successive time intervals, and its
output is the prediction of traffic flow at time interval t.
The NN2 model is constructed with one input layer, one
hidden layer, and one output layer. The number of inputs l and the
number of hidden neurons in the NN2 model are also optimized
by experimentation.
Note that all of the models (naïve, ARIMA, NP, and NN2) are
employed to produce forecasts for time series s3(t). To produce
the prediction of s3(t), each model may need a different length
of historical data. Hence, the sample size should be properly
chosen for each model. For example, the length of time of the
training data for the naïve model is no more than one day before
the forecasting time, while that for the training data for the NP
model covers several weeks before. For each model, we choose
the parameters by observing the best fitting or forecasting. We
compare their forecasts on the same test sets.
Testing set 1 is the traffic record on Sunday, November 20,
2005. Testing set 2 is the traffic record on Wednesday,
November 23, 2005. The training sample for the above models
covers the data from January 1, 2005 to November 12, 2005.
The validation set for the NN2 model is the observed traffic
flow record on November 13 and 14, 2005.
The models above and the DA model are evaluated for their
out-of-sample forecasting performance using a recursive train-
ing sample (see [39] for a detailed discussion on this method).
Let T be the forecasting origin, and we generate forecasts for
time periods T + 1, T + 2, . . . , T + N. The procedure of the
N-step-ahead forecast is given as follows.
1) Select a training set. Let T0 (< T) be the index of the
final traffic flow record in the training set.
2) Identify and estimate each model (naïve, ARIMA, NP,
and NN2) using the training set.
3) Identify and estimate the models (MA, ES, ARIMA, and
NN1) in the DA approach, where the ARIMA model is
the same as that in step 2), using the methods that are
described in Sections IV-B, C, and D. In this step, we use
the historical data before T that cover the training set in
step 1), as we may need much more data to construct time
series s1(t) and s2(t).
4) Compute the N-step-ahead forecasts for the models
(naïve, ARIMA, NP, NN2, and DA).
5) Advance the time index by one and increase the training
sample by one, i.e., set T0 = T0 + 1, and go to step 4).
Estimate the same models and iterate over all of the
values of the testing data set.
6) Compute the MAPEs between the N-step-ahead predic-
tions and the observed values.
As we advance the time index, we do not identify the
parameters for the models (naïve, ARIMA, NP, NN2, and DA).
TABLE III
MAPEs (IN PERCENTAGE)(IN VEHICLES PER HOUR)OF MODELS
WITH DIFFERENT TIME HORIZONS (ON TESTING SET 1)
TAB L E IV
MAPEs (IN PERCENTAGE)(IN VEHICLES PER HOUR)OF MODELS
WITH DIFFERENT TIME HORIZONS (ON TESTING SET 2)
We use the same model parameters that are identified using
the training sample. The advantages and disadvantages of this
evaluation procedure have been discussed in [40]. When evaluating
the DA model, the N-step-ahead forecast may need the
one-step-ahead forecasts from time series s1(t) and s2(t). For
example, to produce the two-step-ahead forecast q̂DA(t+2)
by the DA model, we need q̂1(t+2), q̂2(t+2), and q̂3(t+2), in
which q̂1(t+2) is the one-step-ahead forecast from time series
s1(t+2), q̂2(t+2) is the one-step-ahead forecast from time
series s2(t+2), and q̂3(t+2) is the two-step-ahead forecast
from time series s3(t).
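The multi-step combination just described can be sketched as follows; the function names and the toy averaging network in the usage example are illustrative assumptions, not the paper's trained NN1:

```python
import numpy as np

def da_forecast(nn1, f_weekly, f_daily, f_hourly, t, n):
    """Hypothetical sketch: aggregate three component forecasts with NN1.

    f_weekly and f_daily return one-step-ahead forecasts from the
    similarity series s1 and s2, f_hourly returns an n-step-ahead
    forecast from the hourly series s3, and nn1 maps the three
    component forecasts to the final DA prediction.
    """
    q1 = f_weekly(t + n)   # one-step-ahead forecast, weekly similarity series
    q2 = f_daily(t + n)    # one-step-ahead forecast, daily similarity series
    q3 = f_hourly(t, n)    # n-step-ahead forecast, hourly series
    return float(nn1(np.array([q1, q2, q3])))
```

With a toy aggregator that simply averages its three inputs, `da_forecast(np.mean, lambda u: 100.0, lambda u: 110.0, lambda t, n: 120.0, 0, 2)` returns 110.0.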
Tables III and IV show the MAPEs of the two testing sets
with the different forecast horizons N=1, 2, or 3. In this paper,
the NN2 model has three inputs, the number of neurons in the
hidden layer is 16, and the LM algorithm is used for its training.
The bold numbers in the tables indicate the best performance in
each column.
The MAPE statistics in Tables III and IV show that the
predictions that result from the NP and DA models are better
than the predictions that result from the naïve, ARIMA, and
NN2 models. The predictions that result from the NN2 model
are better than the predictions that result from the ARIMA
model. The DA approach provides the best one-step-ahead
forecasting results compared with the other models. The NP
and DA models have almost the same performance in two-step-ahead
and three-step-ahead forecasting. Although ARIMA
models are quite flexible for many time series, their results are
not ideal when the time series is highly nonlinear. As the NN1
model in the DA approach has one input from the ARIMA
model, the poor performance of the ARIMA model affects the
prediction accuracy of the DA model. Fig. 12 shows the PAEs
that result from the NN1 model in the DA approach on the
training, validation, and testing sets. The PAE is not always
small in the application of the DA approach because of the
highly nonlinear characteristic of traffic flow.
V. CONCLUSION
Combining different modeling schemes for the improvement
of prediction capability has become a common practice in
Fig. 12. PAEs of the DA model for the training, validation, and testing sets.
different application fields, but the approaches vary, and thus
specific applications are worthy of investigation. In this paper,
to exploit the repeatable pattern of traffic flow, three relevant
time series have been constructed, and three models have been
used to forecast these series. The DA strategy that has been
proposed uses NN technology and aggregates the forecasting
values that result from the multiple models. The weekly simi-
larity time series and daily similarity time series can be easily
constructed from the source time series, and the forecasting
value of the MA and ES models can be automatically obtained
by a computer program once the parameters of the models
are specified. The predictions of the hourly time series by the
ARIMA model can easily be obtained with statistical software.
Therefore, the proposed DA approach is not time-consuming
but, rather, feasible in practice.
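As a minimal illustration of how the MA and ES forecasts can be produced automatically once the model parameters are specified, the following sketch may help; the names `k` (window length) and `alpha` (smoothing constant) are placeholders, not the paper's notation:

```python
def ma_forecast(history, k):
    """Moving-average forecast: the mean of the last k observations."""
    return sum(history[-k:]) / k

def es_forecast(history, alpha):
    """Simple exponential smoothing: the last smoothed value is the forecast."""
    s = history[0]
    for x in history[1:]:
        s = alpha * x + (1.0 - alpha) * s
    return s
```

For example, `ma_forecast([30.0, 40.0, 50.0, 60.0], 2)` returns 55.0, and smoothing a constant series returns that constant.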
The relevant time series make full use of the information in
the source time series that is collected at a single detector. By
analyzing the forecasting performance of the naïve, ARIMA,
NP, NN, and DA models, we have shown that the DA model
can provide results that are more accurate than those of the other
models.
If an accident or nonrecurring congestion happens, the repeatable
pattern of traffic flow is lost, and the prediction accuracy
will be affected. In such a case, one solution is to install
sufficient detectors upstream to track the traffic flow.
How to use the DA approach in a multiple-detector environment
deserves further study. The DA method that is based on NNs
may have considerable potential in forecasting technology.
ACKNOWLEDGMENT
The authors would like to thank the four anonymous refer-
ees for their helpful suggestions and critical and constructive
comments on an earlier version of the paper.
REFERENCES
[1] B. Abdulhai, H. Porwal, and W. Recker, “Short-term traffic flow predic-
tion using neuro-genetic algorithms,” J. Intell. Transp. Syst., vol. 7, no. 1,
pp. 3–41, Jan. 2002.
[2] M. C. Tan, J. M. Xu, and Z. Y. Mao, “Traffic flow modeling and on-
ramp optimal control in freeway,” Chin. J. Highw. Transp., vol. 13, no. 4,
pp. 83–85, 2000.
[3] M. C. Tan, C. O. Tong, and J. M. Xu, “Study and implementation of a
decision support system for urban mass transit service planning,” J. Inf.
Technol. Manage., vol. XV, no. 1/2, pp. 14–32, 2004.
[4] M. C. Tan, C. O. Tong, S. C. Wong, and J. M. Xu, “An algorithm for
finding reasonable paths in transit networks,” J. Adv. Transp., vol. 41,
no. 3, pp. 285–305, 2007.
[5] M. C. Tan, L. B. Feng, and J. M. Xu, “Traffic flow prediction based on
hybrid ARIMA and ANN model,” Chin. J. Highw. Transp., vol. 20, no. 4,
pp. 118–121, 2007.
[6] B. M. Williams, “Multivariate vehicular traffic flow prediction: Evalua-
tion of ARIMAX modeling,” Transp. Res. Rec., vol. 1776, pp. 194–200,
2001.
[7] B. M. Williams and L. A. Hoel, “Modeling and forecasting vehicular
traffic flow as a seasonal ARIMA process: Theoretical basis and empirical
results,” J. Transp. Eng., vol. 129, no. 6, pp. 664–672, Nov./Dec. 2003.
[8] S. Sun, C. Zhang, and G. Yu, “A Bayesian network approach to traffic flow
forecasting,” IEEE Trans. Intell. Transp. Syst., vol. 7, no. 1, pp. 124–131,
Mar. 2006.
[9] M. S. Dougherty and M. R. Cobbett, “Short-term inter-urban traffic fore-
casts using neural networks,” Int. J. Forecast., vol. 13, no. 1, pp. 21–31,
Mar. 1997.
[10] C. Ledoux, “An urban traffic flow model integrating neural networks,”
Transp. Res., Part C–Emerg. Technol., vol. 5, no. 5, pp. 287–300,
Oct. 1997.
[11] D. Shmueli, “Applications of neural networks in transportation planning,”
Prog. Plann., vol. 50, no. 3, pp. 141–204, Oct. 1998.
[12] H. Dia, “An object-oriented neural network approach to short-term
traffic forecasting,” Eur. J. Oper. Res., vol. 131, no. 2, pp. 253–261,
Jun. 2001.
[13] H. B. Yin, S. C. Wong, J. M. Xu, and C. K. Wong, “Urban traffic flow
prediction using a fuzzy-neural approach,” Transp. Res., Part C–Emerg.
Technol., vol. 10, no. 2, pp. 85–98, Apr. 2002.
[14] L. W. Lan and Y. C. Huang, “A rolling-trained fuzzy neural network
approach for freeway incident detection,” Transportmetrica, vol. 2, no. 1,
pp. 11–29, 2006.
[15] N.-E. El Faouzi, “Nonparametric traffic flow prediction using kernel
estimator,” in Proc. 13th ISTTT, J.-B. Lesort, Ed., Lyon, France, 1996,
pp. 41–54.
[16] B. L. Smith, B. M. Williams, and R. K. Oswald, “Comparison of paramet-
ric and nonparametric models for traffic flow forecasting,” Transp. Res.,
Part C–Emerg. Technol., vol. 10, no. 4, pp. 303–321, Aug. 2002.
[17] H. Chen and S. Grant-Muller, “Use of sequential learning for short-term
traffic flow forecasting,” Transp. Res., Part C–Emerg. Technol.,vol.9,
no. 5, pp. 319–336, Oct. 2001.
[18] G. Huisken, “Soft-computing techniques applied to short-term traffic flow
forecasting,” Syst. Anal. Model. Simul., vol. 43, no. 2, pp. 165–173,
Feb. 2003.
[19] E. I. Vlahogianni, M. G. Karlaftis, and J. C. Golias, “Optimized and
meta-optimized neural networks for short-term traffic flow prediction: A
genetic approach,” Transp. Res., Part C–Emerg. Technol., vol. 13, no. 3,
pp. 211–234, Jun. 2005.
[20] H. A. Edgerton and L. E. Kolbe, “The method of minimum variation for
the combination of criteria,” Psychometrika, vol. 1, no. 3, pp. 183–187,
Sep. 1936.
[21] J. M. Bates and C. W. J. Granger, “The combination of forecasts,” Oper.
Res. Q., vol. 20, pp. 451–468, 1969.
[22] M. J. Lawerence, R. H. Edmundson, and M. J. O’Connor, “The accuracy
of combining judgmental and statistical forecasts,” Manag. Sci., vol. 32,
no. 12, pp. 1521–1532, 1986.
[23] S. Makridakis, “Why combining works?” Int. J. Forecast., vol. 5, no. 4,
pp. 601–603, 1989.
[24] G. P. Zhang, “Time series forecasting using a hybrid ARIMA and neural
network model,” Neurocomput., vol. 50, pp. 159–175, Jan. 2003.
[25] Y. Gao and M. J. Er, “NARMAX time series model prediction: Feedfor-
ward and recurrent fuzzy neural network approaches,” Fuzzy Sets Syst.,
vol. 150, no. 2, pp. 331–350, Mar. 2005.
[26] Y. Jiang, “Prediction of freeway traffic flows using Kalman predictor in
combination with time series,” Transp. Q., vol. 57, no. 2, pp. 99–118,
2003.
[27] N. E. El Faouzi, “Combining predictive schemes in short-term traffic
forecasting,” in Proc. 14th ISTTT, A. Ceder, Ed., Jerusalem, Israel, 1999,
pp. 471–487.
[28] W. Zheng, D. Lee, and Q. Shi, “Short-term freeway traffic flow prediction:
Bayesian combined neural network approach,” J. Transp. Eng., vol. 132,
no. 2, pp. 114–121, Feb. 2006.
[29] M. V. D. Voort, M. Dougherty, and S. Watson, “Combining Kohonen maps
with ARIMA time series models to forecast traffic flow,” Transp. Res.,
Part C–Emerg. Technol., vol. 4, no. 5, pp. 307–318, Oct. 1996.
[30] B. L. Bowerman and R. T. O’Connell, Forecasting and Time Series: An
Applied Approach, 3rd ed. Florence, Italy: Brooks/Cole, 1993.
[31] J. E. Hanke, A. G. Reitsch, and D. W. Wichern, Business Forecasting,
7th ed. Boston, MA: Prentice–Hall, 2001.
[32] R. A. Yaffee and M. McGee, Introduction to Time Series Analysis and
Forecasting. San Diego, CA: Academic, 2000.
[33] G. E. P. Box and G. M. Jenkins, Time Series Analysis, Forecasting and
Control, 2nd ed. San Francisco, CA: Holden-Day, 1970.
[34] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning internal
representations by error propagation,” in Parallel Distributed Processing,
D. E. Rumelhart and J. L. McClelland, Eds. Cambridge, MA: MIT Press,
1986, pp. 318–362.
[35] E. B. Baum and D. Haussler, “What size net gives valid generalization?”
Neural Comput., vol. 1, no. 1, pp. 151–160, 1989.
[36] E. Matzner-Løbber, A. Gannoun, and J. G. De Gooijer, “Nonparametric
forecasting: A comparison of three kernel-based methods,” Commun.
Stat., Theory Methods, vol. 27, no. 7, pp. 1593–1617, 1998.
[37] D. Nguyen and B. Widrow, “Improving the learning speed of 2-layer
neural networks by choosing initial values of the adaptive weights,” in
Proc. Int. Joint Conf. Neural Netw., 1990, vol. 3, pp. 21–26.
[38] C. M. Bishop, Neural Network for Pattern Recognition. Oxford, U.K.:
Clarendon, 1995.
[39] L. J. Tashman, “Out-of-sample tests of forecasting accuracy: An analysis
and review,” Int. J. Forecast., vol. 16, no. 4, pp. 437–450, Oct. 2000.
[40] D. D. Thomakos and J. B. Guerard, Jr., “Naïve, ARIMA, nonparametric,
transfer function and VAR models: A comparison of forecasting perfor-
mance,” Int. J. Forecast., vol. 20, no. 1, pp. 53–67, Jan.–Mar. 2004.
Man-Chun Tan received the M.S. degree in math-
ematics from Jinan University, Guangzhou, China,
in 1995 and the Ph.D. degree in control theory and
control engineering from the South China University
of Technology, Guangzhou, in 2000.
From July 2000 to September 2000, he was a
Research Assistant with the Department of Civil
Engineering, University of Hong Kong, Hong Kong.
From June 2001 to May 2002, he was a Research
Associate with the University of Hong Kong. He is
currently a Professor with the Department of Math-
ematics, College of Information Science and Technology, Jinan University.
His research interests include traffic flow prediction and control, transportation
planning, stability analysis of dynamic systems, and artificial neural networks.
S. C. Wong received the B.Sc.(Eng) and M.Phil.
degrees from the University of Hong Kong, Hong
Kong, and the Ph.D. degree from University College
London, London, U.K.
He is currently a Professor with the Department
of Civil Engineering, University of Hong Kong.
His research interests include optimization of traf-
fic signal settings, continuum modeling for traffic
equilibrium problems, land use and transportation
problems, dynamic highway and transit assignment
problems, urban taxi services, and road safety.
Jian-Min Xu received the Ph.D. degree from the
South China University of Technology, Guangzhou,
China.
He is currently a Professor and a Doctoral Advisor
with the College of Traffic and Communication,
South China University of Technology. As a Project
Principal, he has accomplished many national or
ministerial/provincial projects, some of which are
National 863 High-Tech Programs or National Fund
Projects. He is the author or a coauthor of more than
100 journal papers. He is the holder of three patents.
His research interests include intelligent transportation systems and intelligent
control theory and its application.
Zhan-Rong Guan received the B.S. degree in man-
agement information system from the College of In-
formation Science and Technology, Jinan University,
Guangzhou, China, in July 2005. From September
2005 to June 2007, he worked towards the M.S.
degree in applied mathematics with the Department
of Mathematics, Jinan University.
His research interests include traffic flow forecast-
ing and design of information systems.
Peng Zhang received the B.Sc. degree from Sichuan
University, Sichuan, China, and the M.Sc. and Ph.D.
degrees from the University of Science and Technol-
ogy of China, Hefei, China.
He is currently a Professor with the Shanghai
Institute of Applied Mathematics and Mechanics,
Shanghai University, Shanghai, China. His research
interests include traffic flow theory and computa-
tional methods.