
Short-Term Traffic Prediction Using Long Short-Term Memory Neural Networks

July 2018

DOI:10.1109/BigDataCongress.2018.00015

Conference: 2018 IEEE International Congress on Big Data (BigData Congress)

Authors:

Zainab Abbas

KTH Royal Institute of Technology

Ahmad Al-Shishtawy

Swedish Institute of Computer Science

Sarunas Girdzijauskas

KTH Royal Institute of Technology

Vladimir Vlassov

KTH Royal Institute of Technology


Short-Term Trafﬁc Prediction Using Long Short-Term Memory Neural Networks

Zainab Abbas∗, Ahmad Al-Shishtawy†, Sarunas Girdzijauskas∗†, Vladimir Vlassov∗

∗KTH Royal Institute of Technology, Stockholm, Sweden. Email: {zainabab, sarunasg, vladv}@kth.se

†RISE SICS, Stockholm, Sweden. Email: {ahmad.al-shishtawy, sarunas.girdzijauskas}@ri.se

Abstract—Short-term trafﬁc prediction allows Intelligent

Transport Systems to proactively respond to events before they

happen. With the rapid increase in the amount, quality, and

detail of trafﬁc data, new techniques are required that can

exploit the information in the data in order to provide better

results while being able to scale and cope with increasing

amounts of data and growing cities. We propose and compare

three models for short-term road trafﬁc density prediction

based on Long Short-Term Memory (LSTM) neural networks.

We have trained the models using real traffic data collected by the Motorway Control System in Stockholm, which monitors highways and collects flow and speed data per lane every minute from radar sensors. In order to deal with the challenge

of scale and to improve prediction accuracy, we propose to

partition the road network into road stretches and junctions,

and to model each of the partitions with one or more LSTM

neural networks. Our evaluation results show that partitioning the roads improves the prediction accuracy, reducing the root mean square error by a factor of 5. We also show that we can reduce the complexity of the LSTM network by limiting the number of input sensors, on average to 35% of the original number, without compromising the prediction accuracy.

Keywords-LSTM; neural networks; trafﬁc prediction

I. INTRODUCTION

Smooth road trafﬁc ﬂow in urban cities is an ongoing

research challenge as the demand for the road infrastructure

increases faster than the speed at which cities can expand

it. The increase in demand leads to trafﬁc congestion that

has a direct negative impact on the society in many aspects

such as reduced trafﬁc safety, increased pollution, wasted

fuel and time, increased cost for businesses, etc.

Road transport administrations of developed large cities

acquire real-time trafﬁc data from multiple sources such

as infrastructure sensors, mobile data, bluetooth sensors,

and trafﬁc cameras and apply state-of-the-art techniques

to monitor and analyze road traffic. Together, these systems and techniques have become known as Intelligent Transportation Systems (ITS), which aim at providing innovative solutions to tackle the traffic management problem and achieve smarter utilization of the transport network.

Short-term trafﬁc prediction is a vital component in any

ITS. Being able to accurately predict the state of the trafﬁc

in the near future enables ITS to be proactive rather than

reactive by actively mitigating potential problems before

they happen.

The common schools of thought on studying trafﬁc pre-

diction involve trafﬁc ﬂow theory-based models [1], [2],

statistical techniques [3]–[5] that commonly use regression

[6]–[8] and neural networks [9]. One of the limitations of

conventional statistical methods is the increase in complexity

when modelling spatial dependencies, which involve the effect on traffic flow from the surrounding points of interest. To

tackle this, multivariate methods were used that capture

the effect of correlated regions of interest [10], [11]. In

parallel to this, neural networks (NNs) for short-term trafﬁc

prediction were being explored [12]–[14].

Simple NNs are too shallow in structure to capture spatio-

temporal data dependencies efﬁciently. Deep learning has

proven to provide more accurate results in terms of learning

the complex and deep dependencies for the trafﬁc data [15].

For example, deep architectures were employed to predict

trafﬁc ﬂow in [16]. Similarly, [17] used deep architectures to

predict congestion. Deep Belief Networks were introduced

in [18] for trafﬁc ﬂow prediction. Stacked encoders were

used to learn trafﬁc ﬂow features [19] and for trafﬁc data

imputation [20]. Deep convolution NNs were used for trafﬁc

speed prediction in [21]. The authors of [22] introduced the use of Long Short-Term Memory (LSTM) networks for traffic prediction and showed that LSTMs are more accurate than the other models considered in [22].

Considering these deep learning approaches, our work

is related to [22]–[25], which use LSTMs. It is unique in that we further exploit LSTM capabilities that were not fully utilized in the current approaches. We use

more ﬁne-grained and high resolution data, which makes

training of a single LSTM based model over the whole

highway network challenging because the model parameters

can increase signiﬁcantly. Therefore, we provide a way

to partition the road network and train LSTMs with data

streaming from sensors in those partitions. We also reduce

the complexity of our model by using only strategically

important sensors for prediction.

In this paper, we take a data driven approach to provide

accurate and scalable short-term trafﬁc density prediction for

Motorway Control Systems (MCS). An MCS is part of an

ITS that focuses on monitoring and controlling highways

in a city due to their importance in keeping a smooth

ﬂow in the city. We use data from the MCS in Stockholm

which monitors major highways and provides ﬂow and speed

information per lane every minute measured from radar

sensors (Figure 1) spread only around 150-400 meters apart

from each other (Figure 6).

Figure 1: Trafﬁc sensors placed on Stockholm highway.

However, even for a relatively small city such as Stockholm, a deep neural network can quite rapidly get very complex as the number of inputs (sensors) increases. This complexity,

in turn, might lead to infeasibility due to time and resource

requirements for training, updating, and real-time prediction.

We propose, evaluate, and discuss various design choices

and architectures, ﬁrst, to improve the accuracy of predic-

tion, and second, to reduce the complexity (and potentially

the cost) while maintaining the accuracy.

We propose a novel way of exploiting LSTM networks

by partitioning the road network into smaller sections con-

taining on average 20-30 sensors and applying LSTMs on

each section. In particular, we show that after training the

LSTMs with the data from all the sensors, in the operational

phase our technique is capable of successfully predicting

short-term trafﬁc by using only a small fraction of the initial

sensors (on average up to 35% of sensors). This makes it possible to significantly reduce the costs for ITS by deploying only a

small number of permanent sensors and relying on tempo-

rary (mobile) sensors for the training phases only, instead of

deploying a dense network of sensors permanently.

The main contributions of the paper are:

• We provide different prediction models based on a deep learning approach using LSTMs. These include a single-sensor model that is trained only for one sensor and two multi-sensor models that take into account various adjacently placed sensors. Moreover, we compare their prediction quality and execution time.

• We show that taking into account the spatio-temporal dependencies by using neighbouring sensors in our multi-sensor models improves the prediction accuracy.

• We exploit the potential of our deep neural network in the last model by training it in a way that can help us predict the traffic of an area by using only a small fraction of the total sensors deployed in that area.

The rest of the paper is organized as follows: Section II

explains the background work. Section III introduces our

models. Section IV contains the experimental methodology.

Section V explains the models in detail and the experimental results. Finally, the conclusion and future work are in Section VI.

II. BACKGROUND

A. Trafﬁc Data

There exist different methods, such as mathematical and

statistical models, simulations and visualization, to study,

understand, and analyze road trafﬁc in order to plan, design

and operate transportation systems. The analysis can be

done at a microscopic scale where individual vehicles are

modelled or at a macroscopic scale where the aggregated

trafﬁc behaviour is being modelled.

The common factor of all methods is that they require

measured trafﬁc data. The main reason is that road trafﬁc

depends on the collective human behaviour, interactions, and

habits which differ widely between different areas because

of various reasons such as the characteristics of the road

users (e.g., age, driving experience), type of vehicles (e.g., cars or trucks) and their physical properties, environmental aspects that affect behaviour (e.g., weather, road shape and type, nearby points of interest), etc. All this makes analyzing road traffic more challenging, as the behaviour also evolves over time.

Road trafﬁc data consists of a large number of space-

time parameters. In its most basic form, it consists of trafﬁc

counters which count the number of vehicles passing at

speciﬁc points (ﬂow) on the road. Trafﬁc data typically

include other parameters such as speed, vehicle mix (e.g.,

car/truck ratio), road occupancy, origin-destination, vehicle

trajectory. Trafﬁc data can beneﬁt from auxiliary data such as

information about accidents, road work, events and holidays,

weather, and road properties (lanes, type, speed limits).

There exists a variety of sensing techniques used to

collect trafﬁc data. Infrastructure or road-side sensors such

as inductive loops and radars are used to collect macroscopic

ﬂow data at ﬁxed points on the road. GPS and cellular

network data (known as ﬂoating car data) are used to

get vehicle trajectory for microscopic analysis. Bluetooth

sensors and automatic number plate recognition can be used

to obtain origin-destination and trip time information. Many

other techniques such as audio/video based vehicle detection

are also used to obtain trafﬁc data.

Floating car data (FCD) is obtained mainly from partici-

pating passengers carrying cell phones in the vehicle. FCD

can provide a good estimate of the trafﬁc speed but might

fail at providing an accurate estimate of the trafﬁc ﬂow and

density. The main advantages of FCD are the wide coverage

and small cost. Infrastructure sensors are more expensive to

install and maintain and they measure data at a ﬁxed location

limiting their coverage. However, data from infrastructure

sensors are more accurate and complete as they measure

and count all vehicles that pass them in real-time. Because

of the improved accuracy that comes at an increased cost,

infrastructure sensors are typically deployed only on critical

road sections such as highways.

Macroscopic traffic data comes at different aggregation levels. The main parameters of the aggregation are: 1) the frequency of aggregation (e.g., flow and average speed per minute vs. per hour); 2) aggregation over lanes (i.e., data per lane or across all lanes); and 3) spacing between sensors (e.g., every kilometer).

B. Elements of Trafﬁc Flow Theory

Trafﬁc ﬂow theory is the study of dynamic trafﬁc be-

haviour over the roads. It depends upon the driver’s reaction

towards different trafﬁc conditions [26], [27]. It is a common

practice to describe traffic behaviour using three traffic variables, namely flow q (vehicles per unit time), density k (vehicles per unit distance) and speed v (distance per unit time). The relation between these variables can be represented by the following equation:

$$q = k \times v \qquad (1)$$

Figure 2: The fundamental trafﬁc ﬂow diagram.

Figure 2 plots the relation between q and k. At low density, the speed does not depend on the density, and vehicles move with the free flow speed v_f. As the density increases, the flow reaches its maximum value q_max, which is determined by the road capacity. The density at this point is called the critical density k_critical. Beyond this point, the speed decreases because it becomes difficult for vehicles to overtake. Finally, the density reaches k_jam, where the maximum number of vehicles that can fit on the road are stuck in a traffic jam. This makes density k an important parameter for indicating congestion.

C. Long Short Term Memory Networks

Recurrent Neural Networks (RNNs) recently became pop-

ular for learning and capturing latent patterns and behaviour

in the sequential data. In contrast to classic NN, the output

of RNN depends not only on the current input but also

on the previous state of the network, which acts as a

memory. Such conﬁguration makes RNNs naturally suitable

for modelling tasks involving sequential data and time series,

such as: handwriting recognition, natural language process-

ing, speech recognition, machine translation etc. However,

RNNs have major limitations: in practice they fail to remember longer dependencies and are difficult to train due to the vanishing gradient problem [28].

In our work, we use Long Short-Term Memory Networks

(LSTMs) [29], which are a variant of RNNs. LSTMs are capable of remembering long-term information by computing the hidden state of the network differently. The hidden state

of LSTMs contains the chain of memory blocks which have

special gates to control the information maintained in each

cell of the memory block, effectively allowing LSTMs to

selectively decide what to keep or erase from the memory.

The outputs of LSTM are calculated by combining the

memory together with the previous state of the network as

well as the current input.

Complexity: The basic LSTM architecture (Figure 4(a))

consists of three layers: input, LSTM, and output layers.

Data from the input layer is fed to the LSTM layer, where

it recurrently ﬂows and the memory cells are updated with

values based on the input, output and forget gates. Next,

data from the output unit is sent to the output layer.
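As an illustration of how the gates update the memory cells, the following is a minimal NumPy sketch of one time step of a standard LSTM cell with input, forget and output gates. It is not the implementation used in this work: the function name `lstm_step`, the weight layout (`W`, `U`, `b` stacked gate by gate) and the random toy inputs are assumptions made for the example, and the sketch omits peephole connections.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One time step of a standard LSTM cell (illustrative, no peepholes).

    Shapes: W is (4*n_c, n_i), U is (4*n_c, n_c), b is (4*n_c,).
    Rows are stacked in the order: input gate, forget gate,
    output gate, candidate cell value.
    """
    n_c = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:n_c])              # input gate: what to write
    f = sigmoid(z[n_c:2 * n_c])        # forget gate: what to erase
    o = sigmoid(z[2 * n_c:3 * n_c])    # output gate: what to expose
    g = np.tanh(z[3 * n_c:4 * n_c])    # candidate cell value
    c = f * c_prev + i * g             # updated memory cell
    h = o * np.tanh(c)                 # new hidden state (layer output)
    return h, c


# Toy run: 3 inputs, 4 memory cells, 10 time steps of random data.
rng = np.random.default_rng(0)
n_i, n_c = 3, 4
W = rng.normal(size=(4 * n_c, n_i))
U = rng.normal(size=(4 * n_c, n_c))
b = np.zeros(4 * n_c)
h, c = np.zeros(n_c), np.zeros(n_c)
for x in rng.normal(size=(10, n_i)):
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape, c.shape)
```

The cell state `c` carries information across time steps, while the gates decide at each step how much of it is kept, overwritten, or exposed as output.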

The computational complexity of an LSTM network per time step and weight is O(1) [29]. Therefore, its learning complexity is O(W), where W is the number of weights in the network, which can be computed by the equation [30]:

$$W = n_c^2 \times 4 + n_i \times n_c \times 4 + n_c \times n_o + n_c \times 3 \qquad (2)$$

Here, n_c is the number of memory cells, n_i is the number of inputs fed into the LSTM layer and n_o is the number of outputs from the LSTM layer.
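To make Eq. 2 concrete, the short sketch below evaluates it for the Area 1 values listed in Table I. It assumes a single LSTM layer and treats the number of memory blocks as n_c; the function name `lstm_weight_count` is ours, and the counts are only indicative because the models in this work stack two LSTM layers.

```python
def lstm_weight_count(n_c, n_i, n_o):
    """Number of weights W of one LSTM layer according to Eq. 2."""
    return n_c * n_c * 4 + n_i * n_c * 4 + n_c * n_o + n_c * 3


# (memory blocks, input units, output units) for Area 1, taken from Table I.
configs = {
    "single sensor (1-1)": (50, 1, 1),
    "multi-sensor (n-n)": (200, 33, 33),
    "multi-sensor (m-n)": (150, 10, 33),
}

for name, (n_c, n_i, n_o) in configs.items():
    print(name, lstm_weight_count(n_c, n_i, n_o))
# single sensor (1-1) 10400
# multi-sensor (n-n) 193600
# multi-sensor (m-n) 101400
```

Under these assumptions, the (m-n) configuration needs roughly half as many weights per layer as the (n-n) configuration, mostly because the quadratic term in n_c dominates.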

III. TRAFFIC PREDICTION

In this section, we talk about the input data and the

prediction models used in our work.

A. Time Series Data

Our input data consist of different traffic parameters measured by the sensors placed on the lanes of highways. These sensors record the flow q and speed v of vehicles passing the sensors per minute. The density k is computed from these parameters using Equation 1. The density can be presented in the form of a time series, as shown in Figure 3 (a) and (b).

(a) One month data. (b) One week data.

Figure 3: Density values for a sensor per minute.
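The density computation from Equation 1 can be sketched as follows. This is an illustration only: the units are assumptions (flow as vehicles counted per minute, speed in km/h), as is the function name `density_veh_per_km`; the handling of zero speed is also our choice, since the rearranged equation is undefined when no vehicle passes.

```python
def density_veh_per_km(flow_veh_per_min, speed_km_per_h):
    """Rearrange q = k * v into k = q / v (vehicles per km).

    Assumed units: flow in vehicles per minute, speed in km/h.
    """
    if speed_km_per_h <= 0:
        raise ValueError("speed must be positive to derive density")
    flow_veh_per_h = flow_veh_per_min * 60.0  # per-minute count -> hourly rate
    return flow_veh_per_h / speed_km_per_h


# 20 vehicles counted in one minute, travelling at 60 km/h on average.
print(density_veh_per_km(20, 60.0))  # 20.0 vehicles per km
```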

We can see there is a pattern in weekly data shown in

Figure 3 (b) from a single sensor, where the high peaks

are weekdays and low ones are weekends. Similarly, there

is a pattern with respect to the time of the day. We want

our neural network to learn this density pattern in previous

time stamps and make predictions for future timestamps. To

achieve this, we use LSTMs to remember the pattern in data.

Since the trafﬁc contains spatial dependency, we take

neighbouring sensors into account for learning the trafﬁc

behaviour. In order to do this, we partition the highway net-

work into areas containing long road stretches and junctions.

Next, we deploy our models over these partitions. Details

about choosing the neighbouring sensors and highway par-

titioning are given in Section IV.

B. Model Design

(a) Normal LSTM architecture. (b) Stacked LSTM architecture.

Figure 4: LSTM architectures.

We propose three prediction models: 1) the (1-1) single-sensor model that takes into account only one sensor and predicts traffic for the location of that sensor; 2) the (n-n) multi-sensor model that considers n sensors on a given area of road and gives predictions for the locations of all n sensors; and 3) the (m-n) multi-sensor model that uses only m significant sensors from an area to make predictions for all n sensors. The detailed working of these models is explained in Section V.

Our models use deep RNNs to capture the complex non-

linear relation in the data more efﬁciently by making use

of the hierarchical layers compared to simple RNN [31].

Stacked LSTMs refers to the architecture where multiple

layers of LSTMs are placed over each other, as shown in

Figure 4 (b), to give a more powerful and deep network

compared to the conventional architecture in Figure 4 (a).

In order to estimate the trafﬁc density, we empirically found

that the stacked architecture improves our results compared

to the normal architecture. We used two layers in our model; more than two layers did not improve the accuracy due to over-fitting.

The input density data that we feed into the network is represented in the form of a space-time window. Consider an area over the highway containing n sensors, labelled as S_1, S_2, S_3, ..., S_n. We take a look-back of L time stamps. If t is the current time stamp, then a look-back of L time stamps means the density values at t, t−1, t−2, ..., t−L. Figure 5 shows the input data representation, where each entry k_{t,s} denotes the density value of a sensor s, i.e., S_1, S_2, S_3, ..., S_n, at time t, i.e., t, t−1, t−2, ..., t−L.

Figure 5: Input data representation.

The neural network is trained to predict the density of the respective sensors corresponding to time stamps t+1, t+2, t+3, ..., t+P, where P controls the prediction interval. After experimenting with different values of L, we chose a look-back of 10 min. Shorter look-backs provided too little information and resulted in lower accuracy. Longer look-backs increased the input size without improving the results.
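The space-time window described above can be built with a simple sliding window over the density matrix. The sketch below is one possible way to do this in NumPy, not the code used in this work; the function name `make_windows` and the toy series are assumptions, and `lookback` here counts the number of past rows in each window.

```python
import numpy as np


def make_windows(density, lookback, horizon):
    """Slice a (T, n) density matrix into supervised samples.

    Returns X with shape (N, lookback, n) holding past values and
    y with shape (N, horizon, n) holding the values to predict.
    """
    density = np.asarray(density, dtype=float)
    n_samples = density.shape[0] - lookback - horizon + 1
    if n_samples <= 0:
        raise ValueError("series too short for the requested window sizes")
    X = np.stack([density[i:i + lookback] for i in range(n_samples)])
    y = np.stack([density[i + lookback:i + lookback + horizon]
                  for i in range(n_samples)])
    return X, y


# Toy series: 100 one-minute readings from 3 sensors.
series = np.arange(100 * 3, dtype=float).reshape(100, 3)
X, y = make_windows(series, lookback=10, horizon=10)
print(X.shape, y.shape)  # (81, 10, 3) (81, 10, 3)
```

Each sample pairs a window of past density values for all sensors with the block of future values that immediately follows it, which matches the (samples, time steps, features) layout that recurrent layers in common deep learning libraries expect.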

IV. EXPERIMENTAL METHODOLOGY

This experimental work is focused on evaluating different

prediction models that we have proposed. Our experiments

are based on answering the following general questions:

• Accuracy: How accurately can the road traffic be estimated using neural networks?

• Accuracy Refinement: Can the accuracy be improved by considering the neighbouring sensors?

• Execution Time: Can the execution time (the training time and prediction time) be improved by reducing the complexity of a neural network?

• Scalability: How can the prediction models be deployed over the highway network?

We next describe the dataset we use, followed by the implementation and the metrics that we measure.

A. Dataset

We use a real-world traffic data set from the Swedish Transport Administration [32] that consists of readings from sensors placed on Stockholm highways. Each lane of the highway contains sensors that are separated by a few hundred meters. We have used one month of data, which consists of sensor readings per minute during that month, i.e., a total of 44640 one-minute readings, as shown in Figure 3 (a). The data is

further split into 70% training, 15% validation and 15% test

data. The entire highway network of Stockholm, for which

we have the sensor data, is shown in Figure 6 (a). This

highway network consists of long road stretches connected

together by different junction points. We took one of the long stretches and one complicated junction for our experiments. Figure 6 (b) contains the area with the long stretch and Figure 6 (c) contains the triangular junction.

(a) Stockholm highway. (b) Area 1: Long road stretch. (c) Area 2: Triangular junction.

Figure 6: Sensors placed on Stockholm highways.
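The 70/15/15 split of the 44640 one-minute readings can be sketched as below. The text does not state how the split was drawn; this example assumes a chronological split (earliest readings for training, latest for testing), which is a common choice for time series. The function name `chronological_split` is illustrative.

```python
def chronological_split(n_samples, train_pct=70, val_pct=15):
    """Return index slices for train, validation and test sets.

    Assumes samples are ordered in time; the test set takes whatever
    remains after the training and validation portions.
    """
    n_train = n_samples * train_pct // 100
    n_val = n_samples * val_pct // 100
    train = slice(0, n_train)
    val = slice(n_train, n_train + n_val)
    test = slice(n_train + n_val, n_samples)
    return train, val, test


train, val, test = chronological_split(44640)
sizes = [s.stop - s.start for s in (train, val, test)]
print(sizes)  # [31248, 6696, 6696]
```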

B. Implementation

In our experiments we used a system with Intel(R)

Core(TM) i7-4980HQ CPU @ 2.80GHz processor, 16 GB

RAM and macOS 10.13 High Sierra. We built our machine

learning model using Python version 3.6.1. The libraries

used are Keras 2.0.9 and TensorFlow 1.3.0.

C. Metrics

We evaluate the following metrics for our models:

• Accuracy: We evaluate the accuracy of prediction models by computing the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) between the predicted and actual traffic density time series.

• Execution Time: We evaluate the execution time of models by measuring their training time and prediction time.

• Estimation Interval: We evaluate the change in accuracy of prediction models for different time intervals.
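For reference, the two accuracy metrics follow their standard definitions and can be computed as in the sketch below. The function names and the toy density values are illustrative and do not come from the dataset.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mae(actual, predicted):
    """Mean Absolute Error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy density values in vehicles per km (made up for the example).
actual = [20.0, 25.0, 30.0, 35.0]
predicted = [22.0, 24.0, 27.0, 35.0]
print(rmse(actual, predicted), mae(actual, predicted))
```

RMSE penalizes large errors more heavily than MAE because the errors are squared before averaging, so a model can rank differently under the two metrics.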

V. PREDICTION MODELS

In this work, we propose different models for short-term

traffic prediction. For every model we discuss the following parameters: 1) the number of sensors a model covers for prediction, i.e., the output units of the model; 2) the computational complexity of the model; 3) the input units used by the model; and 4) the number of memory blocks for the LSTM network required by the model. Table I contains

values of these parameters for Area 1 (long road stretch) and

Area 2 (triangular junction) shown in Figure 6 (b) and (c).

The numbers of memory blocks listed in Table I are for a single LSTM layer, and our models have two stacked LSTM layers. These numbers are chosen empirically: we pick the number of memory blocks beyond which the accuracy stops increasing.

We categorize our models into three types based on the parameters described above.

A. Single Sensor (1-1) Model

This model predicts the traffic density for a single sensor. Because both the input and the output of this model are the traffic density time series from a single sensor, the model is less complicated: the LSTM network has to deal with only one time series. Figure 7 shows the single sensor model, where the input density is taken from one sensor S_1 to estimate the future density. In this simple model, the prediction depends only upon the readings of the single sensor, without considering any information from neighbouring sensors.

| Model | Memory blocks | Area 1 input units | Area 1 output units | Area 2 input units | Area 2 output units |
|---|---|---|---|---|---|
| Single Sensor (1-1) | 50 | 1 | 1 | 1 | 1 |
| Multi-Sensor (n-n) | 200 | 33 | 33 | 20 | 20 |
| Multi-Sensor (m-n) | 150 | 10 | 33 | 8 | 20 |

Table I: Parameters for different prediction models.

Figure 7: Single Sensor (1-1) Model.

Experimental Setup: We consider a random sample of

sensors from Area 1 and Area 2 shown in Figure 6 (b)

and (c) for the single sensor model. The model is used for

each sensor and the execution time in terms of its training

and prediction time is measured. Next, the accuracy for

different estimation intervals, i.e., 10 min, 20 min and 30 min

is computed. We measure the accuracy as the Root Mean

Square Error (RMSE) and Mean Absolute Error (MAE). We

compare our model (LSTM-2 with two stacked layers) with

other classical baseline statistical models that include Auto

Regression (AR), Autoregressive Integrated Moving Average

(ARIMA) [3], Support Vector Regression (SVR) [33], and

neural network based models that include, Recurrent Neural

Network (RNN) with two layers, Feed Forward Neural

Network (FFN) with two layers and LSTM-1 with a single

LSTM layer.

Experimental Results: Table II shows the RMSE and MAE values for different time intervals. As the results indicate, the stacked LSTM neural network (named LSTM-2 in the table) achieves the lowest RMSE at every interval, although RNN and FFN have a slightly lower MAE at 30 min. The error increases with the estimation interval. Next, we want to evaluate whether the accuracy improves when multiple sensors are taken into account during prediction.

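| Model | RMSE (10 min) | MAE (10 min) | RMSE (20 min) | MAE (20 min) | RMSE (30 min) | MAE (30 min) |
|---|---|---|---|---|---|---|
| AR | 6.87 | 5.9 | 7.46 | 6.31 | 8.09 | 6.85 |
| SVR | 8.30 | 7.68 | 9.19 | 8.71 | 10.61 | 10.19 |
| ARIMA | 7.67 | 6.74 | 9.40 | 8.34 | 10.86 | 9.81 |
| RNN | 5.60 | 2.63 | 6.65 | 3.32 | 7.75 | 3.39 |
| FFN | 5.62 | 2.46 | 6.87 | 3.46 | 7.86 | 3.34 |
| LSTM-1 | 5.63 | 2.41 | 7.13 | 3.65 | 7.71 | 3.73 |
| LSTM-2 | 5.49 | 2.41 | 6.62 | 3.07 | 7.60 | 3.45 |

Table II: Accuracy of different models for a single sensor.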

B. Multi-Sensor (n-n) Model

In the multi-sensor model, we consider an area over the highway and predict the density values for the sensors that fall in that area. In this case, the prediction is done by taking the neighbouring sensors into account. The neighbouring sensors provide more data for prediction. Figure 8 (a) and (b) show a road stretch with 10 sensors on it; all these sensors are taken as input for this model.

This model is complex because the number of inputs and the number of outputs are both equal to the total number of sensors that fall in that area. The more sensors there are, the more memory blocks are required and the greater the complexity of the model according to Eq. 2.

(a) Road Stretch.

(b) Neural Network.

Figure 8: Multi-sensor (n-n) model.

Experimental Setup: We take sensors that fall in Area 1

(long road stretch) and Area 2 (triangular junction) shown

in Figure 6 (b) and (c). For Area 1 we consider the highway

path going towards North. Area 2 is complicated because it

consists of vehicles going in different directions, making it

hard for the neural network to learn the relation between

sensors. Our experiments show an RMSE of up to 10 without partitioning Area 2, which is reduced by a factor of 5 after partitioning, i.e., RMSE ≈ 2. Therefore, we partition this area into paths consisting of cars going in the same direction. For example, one such path is shown in Figure 9. The red path is for cars going towards North from West and East.

We compare our model (LSTM-2 with two stacked layers)

with neural network based models that include: Recurrent

Neural Network (RNN) with two layers, Feed Forward

Neural Network (FFN) with two layers and LSTM-1 with a single LSTM layer. We did not use statistical models because of their poor accuracy in the previous experiment (Section V-A) and the complexity of implementing a multivariate model.

Figure 9: Area 2: Path in the triangular junction.

The road section of the considered paths can be divided further into three sections: 1) the entrance, which consists of the first two groups of sensors (a group contains sensors from all the lanes placed at the same distance reference); 2) the exit, which consists of the last two groups; and 3) the middle, which contains all the remaining sensors. We evaluate our model over these road sections.
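The division into entrance, middle and exit sections can be expressed as a small helper. This sketch assumes the sensor groups are given as a list ordered along the direction of travel; the function name `split_sections` and the group labels `G1` to `G8` are hypothetical.

```python
def split_sections(groups, boundary=2):
    """Split ordered sensor groups into entrance, middle and exit.

    The first `boundary` groups form the entrance, the last `boundary`
    groups form the exit, and everything in between is the middle.
    """
    if len(groups) <= 2 * boundary:
        raise ValueError("need more groups than the two boundary sections")
    return {
        "entrance": groups[:boundary],
        "middle": groups[boundary:-boundary],
        "exit": groups[-boundary:],
    }


# Hypothetical path with eight sensor groups along the road.
groups = [f"G{i}" for i in range(1, 9)]
sections = split_sections(groups)
print(sections)
```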

Experimental Results: Tables III and IV show the average RMSE and MAE values for Area 1 (long road stretch)

and Area 2 (triangular junction) over different estimation

intervals. According to our results, stacked LSTM with two

layers (LSTM-2) has better accuracy in most of the cases

compared to other models.

In order to check the accuracy distribution along areas, we

measure the prediction accuracy at the entrance, middle and

exit sections of the areas. Figures 10 and 11 show the RMSE for

Area 1 (long road stretch) and Area 2 (triangular junction)

over 10 min, 20 min and 30 min estimation intervals. For

both areas, the error is higher at the entrance of an area,

followed by the middle section of the highway area; whereas,

the exit section has the lowest error. Furthermore, the error

is increasing with the increase in estimation interval. This

increase is more for the entrance section compared to other

sections. The exit section has the lowest prediction error because the model has more information for prediction towards the end of the area. The stacked LSTM model (LSTM-2) has lower error than the others.

| Model | RMSE (10 min) | MAE (10 min) | RMSE (20 min) | MAE (20 min) | RMSE (30 min) | MAE (30 min) |
|---|---|---|---|---|---|---|
| RNN | 3.33 | 2.35 | 4.6 | 3.41 | 5.0 | 3.49 |
| FFN | 3.36 | 2.36 | 4.4 | 3.39 | 5.2 | 3.44 |
| LSTM-1 | 3.14 | 2.22 | 3.8 | 2.67 | 4.1 | 3.29 |
| LSTM-2 | 2.94 | 2.06 | 2.94 | 2.06 | 3.22 | 2.24 |

Table III: Prediction accuracy for multiple sensors using different models in Area 1 (long road stretch).

| Model | RMSE (10 min) | MAE (10 min) | RMSE (20 min) | MAE (20 min) | RMSE (30 min) | MAE (30 min) |
|---|---|---|---|---|---|---|
| RNN | 2.48 | 1.49 | 2.52 | 1.51 | 2.67 | 1.07 |
| FFN | 2.48 | 1.45 | 2.56 | 1.52 | 3.04 | 1.53 |
| LSTM-1 | 2.38 | 1.40 | 2.46 | 1.45 | 2.57 | 1.55 |
| LSTM-2 | 2.35 | 1.43 | 2.45 | 1.49 | 2.51 | 1.52 |

Table IV: Prediction accuracy for multiple sensors using different models in Area 2 (triangular junction).

C. Multi-Sensor (m-n) Model

The multi-sensor (m-n) model is a variant of the multi-sensor (n-n) model introduced in Section V-B using the stacked LSTM (LSTM-2) model. Instead of all n sensors that fall in the area under consideration, we take only m sensors from those n sensors as input and predict the output for all n sensors. The sensors in the m set include boundary sensors and sensors located at exits and entry points to the highway. The reason to include those sensors is that they are more important in terms of affecting the traffic flow. Intuitively, if we know the behaviour of cars entering and exiting the highway, we can infer what happens inside the highway. Therefore, we consider the entry and exit sensors as inputs for our neural networks to predict density for all the sensors. Figure 12 (a) and (b) show a road stretch with 10 sensors on it; only the boundary sensors S_1, S_2, S_9, S_10 and the sensors located at entry and exit points, S_3 and S_8, are taken as input for this model. In this way, we reduce the complexity of the neural network by reducing the number of input units and memory blocks based on Eq. 2.
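In terms of data preparation, the (m-n) model only changes which sensor columns are fed to the network; the targets still cover all n sensors. The sketch below illustrates this for the 10-sensor example, assuming the input tensor has the layout (samples, time steps, sensors) and that column j holds sensor S_(j+1). The names `INPUT_SENSORS` and `select_inputs` are ours.

```python
import numpy as np

# 0-based columns of S1, S2, S3, S8, S9, S10 in the 10-sensor example.
INPUT_SENSORS = [0, 1, 2, 7, 8, 9]


def select_inputs(X, sensor_idx):
    """Keep only the chosen sensor columns: (N, L, n) -> (N, L, m)."""
    return X[:, :, sensor_idx]


# Toy input: 5 samples, 10 time steps, 10 sensors.
X_all = np.arange(5 * 10 * 10, dtype=float).reshape(5, 10, 10)
X_m = select_inputs(X_all, INPUT_SENSORS)
print(X_all.shape, "->", X_m.shape)  # (5, 10, 10) -> (5, 10, 6)
```

The targets used for training are left untouched, so the network learns to map the m selected inputs to the density at all n sensor locations.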

Experimental Setup: The experimental setup for the multi-sensor (m-n) model is similar to the one used for the multi-sensor (n-n) model in Section V-B. The purpose of this

experiment is to evaluate if we can reduce the complexity of

the LSTM network by limiting the number of input sensors

without compromising the prediction accuracy.

Experimental Results: Figure 13 (a) and (b) show RMSE

of the (m-n) model for Area 1 and Area 2 over 10 min, 20

min and 30 min estimation intervals. The error increases with the estimation interval. The entrance has the highest error compared to the other sections.

Congestion Detection: Our density predictions are useful for detecting congestion in the road traffic. From our experiments we find the critical density, k_critical (see Figure 2), to be between 35 and 40 vehicles per km. Using our model, we were able to correctly predict congestion, i.e., density values close to k_jam (see Figure 2), 94% of the time.
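One simple way to turn density predictions into a congestion signal is to compare them against a density threshold, as sketched below. The threshold of 40 vehicles per km is an assumption taken from the upper end of the critical density range reported above; the exact congestion criterion and the way the 94% figure was computed are not specified here, so the function names and the hit-rate definition are illustrative only.

```python
# Assumed threshold in vehicles per km (upper end of the reported
# critical density range); not the criterion used in the experiments.
CONGESTION_THRESHOLD = 40.0


def is_congested(density, threshold=CONGESTION_THRESHOLD):
    return density >= threshold


def congestion_hit_rate(actual, predicted, threshold=CONGESTION_THRESHOLD):
    """Fraction of actually congested time steps that are also
    flagged as congested by the prediction. Returns None if the
    actual series contains no congested time step."""
    pairs = [(a, p) for a, p in zip(actual, predicted)
             if is_congested(a, threshold)]
    if not pairs:
        return None
    hits = sum(1 for _, p in pairs if is_congested(p, threshold))
    return hits / len(pairs)


# Made-up density values in vehicles per km.
actual = [20.0, 45.0, 60.0, 38.0, 52.0]
predicted = [22.0, 43.0, 39.0, 36.0, 55.0]
rate = congestion_hit_rate(actual, predicted)
print(rate)  # 2 of the 3 congested time steps are detected
```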

D. Comparison

Now that we know the accuracy of our models, we want to know how fast they are. For this reason, we compared the execution time of our models by measuring the training time and prediction time, shown in Figure 14. The

single sensor (1-1) model is fast because it is considering

one sensor at a time. It might take longer execution time if

we run several such models together for multiple predictions

over limited resources of a system. The (m-n) multi-sensor

model takes less training and prediction time compared to

the (n-n) multi-sensor model. This is because the (m-n) model has fewer input and memory units, which reduces its complexity and improves its execution time.

E. Discussion

Our experimental results show that using neighbourhood sensor information gives higher prediction accuracy than using data from a single sensor. This is because the neural network is fed with more information. It learns the behaviour of traffic better by using sensors placed together along a path of the highway. The readings of sensors placed at the entrance of the highway indicate the traffic conditions that will propagate towards the middle and exit sensors. In other words, the model has more information available for the middle and exit sections. Therefore, the prediction for these sections is better than for the entry section. Additionally, we observed that reducing the complexity of a model by reducing its input units and memory units improves its execution time. Such a lower-complexity model has strong potential to be applied within the edge computing domain in the future.
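The effect of fewer input and memory units on model size can be illustrated by counting trainable weights. The Python sketch below assumes a standard LSTM layer without peephole connections or projection, for which each of the four gates has input weights, recurrent weights and a bias, followed by a dense output layer. The layer sizes (12 sensors, 4 significant sensors, two stacked layers) are hypothetical and do not correspond to the configurations used in our experiments.

```python
def lstm_layer_params(input_size, hidden_size):
    """Trainable weights of a standard LSTM layer (no peepholes, no projection):
    four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)


def stacked_lstm_params(input_size, hidden_sizes, output_size):
    """Trainable weights of stacked LSTM layers plus a dense output layer."""
    total = 0
    previous = input_size
    for hidden in hidden_sizes:
        total += lstm_layer_params(previous, hidden)
        previous = hidden
    total += previous * output_size + output_size  # dense weights and biases
    return total


# Hypothetical (n-n) model: 12 input sensors, 12 outputs, 24 memory units per layer.
n_n_params = stacked_lstm_params(input_size=12, hidden_sizes=[24, 24], output_size=12)

# Hypothetical (m-n) model: 4 input sensors, 12 outputs, 8 memory units per layer.
m_n_params = stacked_lstm_params(input_size=4, hidden_sizes=[8, 8], output_size=12)
```

Because the recurrent weights grow quadratically with the number of memory units, shrinking the layers reduces the weight count, and with it the work per training and prediction step, considerably.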

VI. CONCLUSION AND FUTURE WORK

Our work comprises three prediction models for estimating traffic density using stacked LSTM neural networks. We have implemented and compared these models over different sections of Stockholm highways using real datasets. Our multi-sensor (m-n) model, which uses input readings from only m significant sensors rather than all n sensors, predicts density for all n sensors with acceptable accuracy comparable to the multi-sensor (n-n) model, which takes input from all n sensors. Initially, all sensors are required to train the model; after training, only the significant sensors need to be kept to predict over all sensors with acceptable accuracy. To train the model, temporary sensors can be deployed together with the significant sensors, and then the former can be removed or shut down. This reduces the number of sensors and saves infrastructure cost.
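The idea of training on all sensors while feeding the network only the significant ones can be sketched as a data-preparation step. The Python example below builds supervised samples whose inputs contain a window of past readings from m selected sensors and whose targets contain the readings of all n sensors at a later time step. The function `make_samples`, its parameters `lookback` and `horizon`, and the toy readings are hypothetical names and values introduced for illustration only.

```python
def make_samples(readings, input_sensors, lookback, horizon):
    """Build (inputs, targets) for an (m-n) style model.

    readings: list of time steps, each a list with one density per sensor (n values).
    input_sensors: indices of the m significant sensors used as model input.
    lookback: number of past time steps in each input window.
    horizon: how many steps after the end of the window the target lies.
    """
    inputs, targets = [], []
    last_start = len(readings) - lookback - horizon
    for start in range(last_start + 1):
        window = readings[start:start + lookback]
        # Inputs keep only the significant sensors.
        inputs.append([[row[s] for s in input_sensors] for row in window])
        # Targets keep all n sensors.
        targets.append(list(readings[start + lookback - 1 + horizon]))
    return inputs, targets


# Toy data: 6 time steps, n = 4 sensors.
readings = [[t, t + 10, t + 20, t + 30] for t in range(6)]

# m = 2 significant sensors (indices 0 and 3), 2-step window, 1 step ahead.
inputs, targets = make_samples(readings, input_sensors=[0, 3], lookback=2, horizon=1)
```

Only the target construction needs the full set of sensors, which is why the non-significant sensors are required during training but not afterwards.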

Figure 10: RMSE of (n-n) models for Area 1: long road stretch. Panels include (a) RNN, (b) FNN and (d) LSTM-2; each plots RMSE against Prediction Interval (10 min, 20 min, 30 min) for the Entrance, Middle and Exit sections.

Figure 11: RMSE of (n-n) models for Area 2: triangular junction. Panels include (a) RNN, (b) FNN and (d) LSTM-2; each plots RMSE against Prediction Interval (10 min, 20 min, 30 min) for the Entrance, Middle and Exit sections.

Figure 12: Multi-sensor (m-n) model. (a) Road Stretch. (b) Neural Network.

Our future work includes investigating how accuracy depends on the size of road segments and the number of sensors. We will also study how aggregation levels impact accuracy. We expect that the fine-grained aggregation used in this paper captures more detail but is more challenging to predict due to high noise levels, compared to a smoother coarse-grained aggregation that captures only general trends. We intend to develop a method to optimally partition the road network and to place sensors in order to achieve high prediction accuracy while lowering the infrastructure cost.
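The trade-off between aggregation levels can be illustrated with a small Python sketch that averages per-minute readings over non-overlapping windows. The function `aggregate`, the window length and the sample values are hypothetical and serve only as an example of coarser aggregation smoothing out minute-to-minute noise.

```python
def aggregate(series, window):
    """Average consecutive non-overlapping windows of the given length.
    A trailing incomplete window is dropped."""
    return [
        sum(series[i:i + window]) / window
        for i in range(0, len(series) - window + 1, window)
    ]


# Hypothetical per-minute densities (vehicles per km) with visible noise.
per_minute = [10.0, 14.0, 9.0, 15.0, 20.0, 24.0, 19.0, 25.0, 30.0]

# Coarser 4-minute aggregation keeps the rising trend and hides the noise.
per_four_minutes = aggregate(per_minute, window=4)
```

The per-minute series fluctuates around a rising trend, while the aggregated series contains only the trend, here the two window means 12.0 and 22.0.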

Figure 13: RMSE of (m-n) model for different road sections of (Area 1: long road stretch and Area 2: triangular junction). Panels: (a) RMSE (Area 1), (b) RMSE (Area 2); each plots RMSE against Prediction Interval (10 min, 20 min, 30 min) for the Entrance, Middle and Exit sections.

Figure 14: Execution Time Comparison. Prediction and training time, Time (s), for the 1-1, m-n and n-n models.

ACKNOWLEDGMENT

This work was supported by the project BADA: Big Auto-

motive Data Analytics in the funding program FFI: Strategic

Vehicle Research and Innovation (grant 2015-00677) ad-

ministrated by VINNOVA the Swedish government agency

for innovation systems, by the project BIDAF: Big Data

Analytics Framework for a Smart Society (grant 20140221)

funded by KKS the Swedish Knowledge Foundation, and by

the Erasmus Mundus Joint Doctorate in Distributed Com-

puting (EMJD-DC) programme funded by the Education,

Audiovisual and Culture Executive Agency (EACEA) of the

European Commission under FPA 2012-0030.

REFERENCES

[1] C. F. Daganzo, “The cell transmission model: A dynamic

representation of highway trafﬁc consistent with the hydro-

dynamic theory,” Transportation Research Part B: Method-

ological, vol. 28, no. 4, pp. 269–287, 1994.

[2] A. Skabardonis and N. Geroliminis, “Real-time estimation of

travel times on signalized arterials,” Tech. Rep., 2005.

[3] M. S. Ahmed and A. R. Cook, “Analysis of freeway trafﬁc

time-series data by using box-jenkins techniques,” Trans-

portation Research Record Journal of the Transportation

Research Board, no. 722, 1979.

[4] B. M. Williams and L. A. Hoel, “Modeling and forecasting

vehicular trafﬁc ﬂow as a seasonal arima process: Theoretical

basis and empirical results,” Journal of transportation engi-

neering, vol. 129, no. 6, pp. 664–672, 2003.

[5] N. Juri, A. Unnikrishnan, and S. Waller, “Integrated trafﬁc

simulation-statistical analysis framework for online prediction

of freeway travel time,” Transportation Research Record:

Journal of the Transportation Research Board, no. 2039, pp.

24–31, 2007.

[6] P. E. Pfeifer and S. J. Deutsch, “A three-stage iterative procedure for space-time modeling,” Technometrics, vol. 22, no. 1, pp. 35–47, 1980.

[7] S. Clark, “Trafﬁc prediction using multivariate nonparametric

regression,” Journal of transportation engineering, vol. 129,

no. 2, pp. 161–168, 2003.

[8] H. Sun, H. X. Liu, H. Xiao, R. R. He, and B. Ran, “Short term

trafﬁc forecasting using the local linear regression model,” in

82nd Annual Meeting of the Transportation Research Board,

Washington, DC, 2003.

[9] M. G. Karlaftis and E. I. Vlahogianni, “Statistical methods

versus neural networks in transportation research: Differ-

ences, similarities and some insights,” Transportation Re-

search Part C: Emerging Technologies, vol. 19, no. 3, pp.

387–399, 2011.

[10] A. Stathopoulos and M. G. Karlaftis, “A multivariate state

space approach for urban trafﬁc ﬂow modeling and predic-

tion,” Transportation Research Part C: Emerging Technolo-

gies, vol. 11, no. 2, pp. 121–135, 2003.

[11] B. Williams, “Multivariate vehicular trafﬁc ﬂow prediction:

evaluation of arimax modeling,” Transportation Research

Record: Journal of the Transportation Research Board, no.

1776, pp. 194–200, 2001.

[12] M. S. Dougherty, H. R. Kirby, and R. D. Boyle, “The use of

neural networks to recognise and predict trafﬁc congestion,”

Trafﬁc engineering & control, vol. 34, no. 6, 1993.

[13] P. Vythoulkas, “Alternative approaches to short term trafﬁc

forecasting for use in driver information systems,” Trans-

portation and trafﬁc theory, vol. 12, pp. 485–506, 1993.

[14] H. Zhang, “Recursive prediction of trafﬁc conditions with

neural network models,” Journal of Transportation Engineer-

ing, vol. 126, no. 6, pp. 472–481, 2000.

[15] Y. Bengio et al., “Learning deep architectures for ai,” Foun-

dations and trends® in Machine Learning, vol. 2, no. 1, pp.

1–127, 2009.

[16] N. G. Polson and V. O. Sokolov, “Deep learning for short-

term trafﬁc ﬂow prediction,” Transportation Research Part C:

Emerging Technologies, vol. 79, pp. 1–17, 2017.

[17] X. Ma, H. Yu, Y. Wang, and Y. Wang, “Large-scale trans-

portation network congestion evolution prediction using deep

learning theory,” PloS one, vol. 10, no. 3, p. e0119044, 2015.

[18] W. Huang, G. Song, H. Hong, and K. Xie, “Deep architecture

for trafﬁc ﬂow prediction: deep belief networks with multitask

learning,” IEEE Trans. on Intelligent Transportation Systems,

vol. 15, no. 5, pp. 2191–2201, 2014.

[19] Y. Lv, Y. Duan, W. Kang, Z. Li, and F.-Y. Wang, “Trafﬁc ﬂow

prediction with big data: a deep learning approach,” IEEE

Transactions on Intelligent Transportation Systems, vol. 16,

no. 2, pp. 865–873, 2015.

[20] Y. Duan, Y. Lv, W. Kang, and Y. Zhao, “A deep learning

based approach for trafﬁc data imputation,” in Intelligent

Transportation Systems (ITSC), IEEE 17th International Con-

ference on. IEEE, 2014, pp. 912–917.

[21] X. Ma, Z. Dai, Z. He, J. Ma, Y. Wang, and Y. Wang, “Learn-

ing trafﬁc as images: a deep convolutional neural network for

large-scale transportation network speed prediction,” Sensors,

vol. 17, no. 4, p. 818, 2017.

[22] X. Ma, Z. Tao, Y. Wang, H. Yu, and Y. Wang, “Long short-

term memory neural network for trafﬁc speed prediction using

remote microwave sensor data,” Transportation Research Part

C: Emerging Technologies, vol. 54, pp. 187–197, 2015.

[23] Z. Zhao, W. Chen, X. Wu, P. C. Chen, and J. Liu, “Lstm

network: a deep learning approach for short-term trafﬁc

forecast,” IET Intelligent Transport Systems, vol. 11, no. 2,

pp. 68–75, 2017.

[24] Y. Wu and H. Tan, “Short-term trafﬁc ﬂow forecasting with

spatial-temporal correlation in a hybrid deep learning frame-

work,” arXiv preprint arXiv:1612.01022, 2016.

[25] M. Fouladgar, M. Parchami, R. Elmasri, and A. Ghaderi,

“Scalable deep trafﬁc ﬂow neural networks for urban trafﬁc

congestion prediction,” in International Joint Conference on

Neural Networks (IJCNN). IEEE, 2017, pp. 2251–2258.

[26] G. Whitham, “On kinematic waves ii. a theory of trafﬁc ﬂow

on long crowded roads,” in Proc. R. Soc. Lond. A, vol. 229,

no. 1178. The Royal Society, 1955, pp. 317–345.

[27] P. I. Richards, “Shock waves on the highway,” Operations

research, vol. 4, no. 1, pp. 42–51, 1956.

[28] Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term

dependencies with gradient descent is difﬁcult,” IEEE trans.

on neural networks, vol. 5, no. 2, pp. 157–166, 1994.

[29] S. Hochreiter and J. Schmidhuber, “Long short-term mem-

ory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997.

[30] H. Sak, A. Senior, and F. Beaufays, “Long short-term memory

recurrent neural network architectures for large scale acoustic

modeling,” in Fifteenth annual conference of the international

speech communication association, 2014.

[31] M. Hermans and B. Schrauwen, “Training and analysing deep

recurrent neural networks,” in Advances in neural information

processing systems, 2013, pp. 190–198.

[32] Trafikverket, https://www.trafikverket.se/, 2010.

[33] H. Drucker, C. J. Burges, L. Kaufman, A. J. Smola, and

V. Vapnik, “Support vector regression machines,” in Advances

in neural information processing systems, 1997, pp. 155–161.
