Understanding urban bus travel time: Statistical analysis and a deep learning prediction

World Scientific
International Journal of Modern Physics B
November 8, 2022 11:56 IJMPB S0217979223500340 page 1
FA
International Journal of Modern Physics B
Vol. 37, No. 4 (2023) 2350034 (23 pages)
©World Scientific Publishing Company
DOI: 10.1142/S0217979223500340
Understanding urban bus travel time: Statistical analysis
and a deep learning prediction
Yanjun Liu, Hui Zhang, Jianmin Jia and Baiying Shi
School of Transportation Engineering,
Shandong Jianzhu University, Jinan 250101, P. R. China
zhanghui@sdjzu.edu.cn
Wei Wang
School of Economics, Ocean University of China,
Qingdao 266100, P. R. China
walker@ouc.edu.cn
Received 11 June 2022
Revised 22 July 2022
Accepted 5 August 2022
Published 7 September 2022
Travel time reliability plays a key role in bus scheduling and service quality. Owing to
various stochastic factors, buses often suffer from traffic congestion, delay and
bunching, which disturb travel time. Automatic vehicle location (AVL) systems record the
spatiotemporal information of buses, making it possible to understand the status of bus
service. In this paper, we analyze the statistical characteristics of travel time based
on historic AVL data. Moreover, a Kalman filter-LSTM deep learning model is proposed to
estimate bus travel time. Numerical tests indicate that the travel time of bus routes is
right-skewed, with a long right tail, and is well fitted by the lognormal distribution.
Bus service reliability fluctuates strongly in the peak hours, especially the morning
peak. Bus bunching and large time headways occur easily, and once they occur, they
persist until the destination. The Kalman filter-LSTM model outperforms the ensemble
learning methods in predicting travel time. This study could provide implications for
transit schedule optimization to improve bus service quality.
Keywords: Public transport service; AVL data; travel time prediction; deep learning.
PACS numbers: 05.45.Tp, 07.07.Df, 89.90.+n
1. Introduction
During the past two decades, China has experienced fast urbanization, which has brought
explosive population growth to many cities. Traffic congestion and air pollution
have become two critical problems in urban development. Public transport is
recognized as one of the most effective ways to alleviate traffic congestion and air
pollution. Bus priority policies have been adopted in many large cities to promote
the use of bus systems.1–3 However, the proportion of bus users remains low
because of poor service quality.4,5 Influenced by many factors, such as traffic
jams, traffic signals and weather conditions, bus travel time is unstable. Passengers
may spend more time traveling during the peak hours, which leads to transfer failure or
delay. Operators therefore benefit from knowing the travel time in advance when
arranging vehicles.
Typically, bus travel time contains two parts: travel time between stops
and dwell time. The travel time between stops is affected by road traffic conditions
and traffic signals, while dwell time is mainly influenced by the number of
boarding and alighting passengers. A large body of research has focused
on travel time analysis.6–9 Bus travel reliability is one of the most critical
problems in bus operation. Bus bunching is a classic unstable state in which
two or more buses of a route run too close together, for reasons such as heavy boarding
demand or bad traffic conditions. Bus bunching can lead to long waiting times
and wasted capacity, which reduces passengers’ willingness to use bus systems.
Many researchers have studied measures to mitigate bus bunching, such as stop-skipping,
speed guidance and signal adjustment.10–12 Recently, applications of big spatiotemporal
traffic data have drawn much attention.13–15 Big data technologies provide
powerful tools to understand the mechanism of bus travel with a large amount
of historic automatic vehicle location (AVL) data. Kathuria et al.16 analyzed the
travel time variability of a bus rapid transit system with GPS data. Chepuri et al.17
proposed new reliability measures for bus routes with trajectory data. With the
help of AVL data, various strategies have been proposed to enhance the reliability
of buses with simulations and models.18–20 Nowadays, machine learning and deep
learning provide better ways to forecast travel time than traditional models.
Serin et al.21 proposed a hybrid time series forecasting method to predict bus
travel time and found that it outperforms traditional approaches. Yuan et al.22
proposed RNN and DNN deep learning methods to evaluate bus travel time. During
the past decades, many kinds of machine learning and deep learning methods have
been used to forecast bus travel time, such as recurrent neural networks,23 the Kalman
filter,24 LSTM,25 etc.
Accuracy is one of the most significant measures for assessing a prediction method.
Studies suggest that hybrid deep learning methods tend to outperform traditional
models.26 Inspired by that, we propose a Kalman filter-LSTM model to predict
bus travel time. Categorical Boosting (CatBoost), Extreme Gradient Boosting
(XGBoost) and Light Gradient Boosting Machine (LightGBM) are adopted as comparisons.
In addition, this paper analyzes the whole process of travel time, including
travel time between stops, dwell time and travel reliability, based on historic AVL
data. This study could provide implications for transit authorities to promote the use
of public transport.
The main contributions are as follows.
(1) To make better use of AVL data and analyze the whole process of bus operation
from a microscopic point of view, we propose a statistical framework to analyze
bus travel time and reliability.
(2) To assist bus companies in dispatching vehicles, we propose a Kalman filter-LSTM
deep learning method to forecast bus travel time.
The remainder of the paper is organized as follows. Section 2 presents the literature
review. Section 3 gives the methodology. Section 4 describes the data used in this
paper. Section 5 illustrates the results. Section 6 concludes the paper.
2. Literature Review
2.1. Bus service reliability
Most studies explore public transit reliability with qualitative or quantitative
analysis methods. Hu et al.27 designed questionnaires to examine how
bus service performance affects passengers’ trip mode choice, revealing
that public transit reliability is more influential than other factors. Chakrabarti
and Giuliano28 showed that service reliability is a determinant of transit patronage,
and that the impact is more pronounced during the weekday peak. Yuan et al.,29 taking
the elderly as their subjects, found that improving bus service reliability through
better bus operation and management is essential to raise expectations of and
satisfaction with bus transit. These studies indicate that bus service reliability is a
crucial criterion in evaluating bus transit operation. Some indexes have been
established in terms of convenience and comfort from the passengers’ perspective to
evaluate transit reliability.30 Shen et al.31 defined the occupancy rate as the ratio
of the actual number of passengers to the rated capacity of the bus in China. They
recorded the occupancy rate and used a questionnaire to rate passengers’ comfort at
five-minute intervals, revealing that passenger load factors and in-vehicle time affect
passengers’ choice of bus.
An increasing number of studies have used AVL data to investigate transit service
reliability. Ricard et al.32 found that a log-logistic distribution gave the best
estimate of the true conditional probability density function. Albadvi et al.33
used principal component analysis to weigh the reliability metrics systematically,
indicating that on-time performance, time headway balance, the standard deviation
of bus travel time and the 50th percentile travel time are representative evaluation
indicators. Yan et al.34 proposed two different sets of evaluation metrics to assess
the operational performance of bus routes, one for operators and one for regulators.
Kathuria et al.16 analyzed trip time variation at the segment, route and network
levels from the operator’s perspective. Zhang et al.35 analyzed
the spatial-temporal characteristics of the bus network using GPS data, IC card
data and route data. Jenelius36 introduced two indices, the empirical service
reliability gap (ESRG) and the reliability buffer time (RBT), by assuming virtual probe
travelers and combining travel conditions with AVL and APC data. Özuysal and
Çalişkanelli37 developed a multivariate adaptive regression splines (MARS) model to
predict the reliability of bus routes considering the route layout and traffic
conditions.
2.2. Bus travel time prediction
Numerous studies have predicted bus travel time using AVL data. Tang
et al.38 estimated the route travel time distribution by convolution theory, using
a Gaussian mixture model to divide it into several traffic states. Zhong et al.39
proposed ensemble learning methods to improve the prediction accuracy of bus travel
time. Bachu et al.40 developed spatial support vector machines (SVMs) and temporal
SVMs to predict travel time under high-variance conditions. Kumar et al.41
developed a Kalman filter method for bus arrival time prediction and compared the
algorithm with other prediction methods; the Kalman filter-based prediction
outperformed the other models. Mazloumi et al.42 established a neural network bus
arrival time prediction model, taking road traffic volume, total vehicle load rate and
schedule adherence as input variables. Zhou et al.43 predicted bus arrival time with
recurrent neural networks, taking dynamic factors such as passenger numbers, stay time
and bus efficiency as input variables, and introduced attention mechanisms to
adaptively select the most relevant input factors. Alam et al.44 found that the
prediction accuracy of the long short-term memory (LSTM) recurrent neural network is
better than that of the artificial neural network (ANN), Support Vector Regression
(SVR), autoregressive integrated moving average (ARIMA) and historical averages.
Kawatani et al.45 applied Gradient Boosting Decision Trees (GBDT) to extract valid
features from probe data, and experiments showed that the travel time over the
previous interval yielded the lowest MAE among the candidate features.
3. Methodology
In this part, we first introduce several indicators that are correlated with travel
time, including dwell time, route time between adjacent bus stops, time headway,
coefficient of time headway variation, travel time and one-way punctuality index
(OWPI). Then, the prediction model is proposed.
3.1. Measures based on stops
Bus stops are essential public transport facilities in cities and are designated places
for passengers to get on and off. We define the route time between two adjacent bus
stops as the time spent on the road between the two stops, containing running time
and waiting time at signals. The dwell time is the time spent at stops, including
door opening, door closing, boarding and alighting. Time headway is defined as the time
between two adjacent buses that belong to the same route. Time headway is one
of the most useful indicators of operational reliability. Figure 1 shows a
sketch map of the indicators.
Fig. 1. (Color online) The sketch map of bus operation.
(1) Dwell time
Dwell time covers all the time that it takes for the bus to slow down and enter
the station, stop for passengers to get on and off, accelerate away, and so on. It can
be given as
$$\mathrm{DWT}_n^m = \mathrm{DT}_n^m - \mathrm{AT}_n^m, \tag{1}$$
where $\mathrm{DWT}_n^m$ is the dwell time of bus $m$ at station $n$, and
$\mathrm{DT}_n^m$ and $\mathrm{AT}_n^m$ denote the departure and arrival times of bus
$m$ at stop $n$, respectively.
(2) Route time between adjacent bus stops
It is defined as the running time from a bus stop to the next stop, and it is
calculated as the difference between the arrival time at the next bus stop and the
departure time at the previous one:
$$\mathrm{RT}_{n,n+1}^m = \mathrm{AT}_{n+1}^m - \mathrm{DT}_n^m, \tag{2}$$
where $\mathrm{RT}_{n,n+1}^m$ is the route time of bus $m$ between the two adjacent
stations $n$ and $n+1$.
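As a minimal sketch of Eqs. (1) and (2), the snippet below computes dwell and route times for the first four stops of the sample trip in Table 1. The helper name `to_seconds` is hypothetical, and the sketch assumes that all clock strings fall within a single day (no trip crosses midnight).

```python
from datetime import datetime

def to_seconds(clock: str) -> int:
    """Convert an 'H:MM:SS' clock string to seconds since midnight."""
    t = datetime.strptime(clock, "%H:%M:%S")
    return t.hour * 3600 + t.minute * 60 + t.second

# (stop number, arrival, departure) for bus 7390 on route K115, from Table 1.
stops = [
    (1, "9:17:02", "9:19:38"),
    (2, "9:20:28", "9:20:51"),
    (3, "9:22:20", "9:22:31"),
    (4, "9:24:17", "9:24:30"),
]

# Eq. (1): dwell time = departure time - arrival time at the same stop.
dwell = {n: to_seconds(dep) - to_seconds(arr) for n, arr, dep in stops}

# Eq. (2): route time = arrival at stop n+1 - departure from stop n.
route = {
    (n, n_next): to_seconds(arr_next) - to_seconds(dep)
    for (n, _, dep), (n_next, arr_next, _) in zip(stops, stops[1:])
}
```

All values are in seconds. The long dwell at stop 1 reflects the bus waiting at the origin for its departure time.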
(3) Time headway
The time headway is defined as the time interval between two adjacent vehicles
on a single route, based on their arrival times at a bus stop.46 The phenomenon in
which several buses arrive at a stop almost simultaneously is called bus bunching, which
dramatically reduces the level of service on the bus route. The time headway is
calculated as
$$H_n^m = \mathrm{AT}_n^{m+1} - \mathrm{AT}_n^m, \tag{3}$$
where $H_n^m$ is the time headway at station $n$ between vehicle $m$ and the following
vehicle $m+1$ of the same route.
(4) Coefficient of time headway variation
To measure how much the actual time headways deviate from
the scheduled time headway, we use the coefficient of variation, i.e., the ratio of the
standard deviation of the time headway to its mean value. A large value indicates
high variability. The formula can be written as
$$\mathrm{CVH} = \frac{H_{\mathrm{Std}}}{H_{\mathrm{Mean}}}, \tag{4}$$
where CVH is the coefficient of variation of time headway, and $H_{\mathrm{Std}}$ and
$H_{\mathrm{Mean}}$ are the standard deviation and the mean value of the time headway,
respectively.
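The headway and its coefficient of variation can be illustrated with a short sketch. The arrival times below are hypothetical, the function names are illustrative, and the population standard deviation is assumed because the text does not state which estimator is used.

```python
from statistics import mean, pstdev

def headways(arrivals):
    """Gaps (s) between consecutive bus arrivals at one stop."""
    ordered = sorted(arrivals)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]

def headway_cv(gaps):
    """Eq. (4): standard deviation of the headways divided by their mean."""
    return pstdev(gaps) / mean(gaps)

# Hypothetical arrival times (seconds since midnight) of five buses at one stop.
regular = [28800, 29400, 30000, 30600, 31200]  # evenly spaced, 600 s apart
bunched = [28800, 29700, 29760, 30900, 31200]  # same span, two buses bunched

regular_cv = headway_cv(headways(regular))
bunched_cv = headway_cv(headways(bunched))
```

Both series have the same mean headway of 600 s, so the difference in CVH comes only from the irregular spacing.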
3.2. Measures based on routes
Beyond the spatial-temporal characteristics of bus operations at the stop level,
managers need to know the situation of the entire route, which can
help them to better dispatch buses, arrange a reasonable departure timetable and
reduce waste of resources.
(1) Travel Time
A reliable assessment of travel time not only helps bus managers to
keep track of bus operation patterns but is also essential for predicting bus arrival
times. Travel time refers to the total time from the start to the end of a trip,
including the route time between bus stops and the dwell time at stops. It can be
expressed as
$$\mathrm{TT}^m = \sum_{n=1}^{N} \mathrm{DWT}_n^m + \sum_{n=1}^{N-1} \mathrm{RT}_{n,n+1}^m, \tag{5}$$
where $\mathrm{TT}^m$ is the travel time of bus $m$ over the route, $N$ is the total
number of stations, $\mathrm{DWT}_n^m$ is the dwell time at station $n$, and
$\mathrm{RT}_{n,n+1}^m$ is the route time between stations $n$ and $n+1$.
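A small sketch of Eq. (5) follows. The dwell and route times are the values in seconds derived from the first four stops of the Table 1 sample, treated here as if they formed a complete four-stop trip, which is an assumption made only for illustration.

```python
def travel_time(dwell_times, route_times):
    """Eq. (5): sum of N dwell times and N-1 inter-stop route times (s)."""
    if len(route_times) != len(dwell_times) - 1:
        raise ValueError("expected one fewer route time than dwell times")
    return sum(dwell_times) + sum(route_times)

# Dwell times at stops 1-4 and route times between them (s), from Table 1.
dwell_times = [156, 23, 11, 13]
route_times = [50, 89, 106]
total = travel_time(dwell_times, route_times)
```

The result equals the elapsed time between arrival at stop 1 (9:17:02) and departure from stop 4 (9:24:30), which is a quick consistency check on the decomposition.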
(2) One-way Punctuality Index
The one-way punctuality index (OWPI) measures the punctuality of a one-way bus
journey. It is defined as one minus the relative error between the actual travel
time and the designed standard travel time. It describes the deviation of a bus travel
time in one direction from the designed standard one-way time and ranges from 0 to 1.
A value closer to 1 indicates a higher degree of punctuality in that direction.
$$\mathrm{RE} = \begin{cases} \dfrac{|\mathrm{TT}_i^m - \mathrm{SDTT}_t^m|}{\mathrm{SDTT}_t^m}, & |\mathrm{TT}_i^m - \mathrm{SDTT}_t^m| < \mathrm{SDTT}_t^m, \\ 1, & \text{otherwise}, \end{cases} \tag{6}$$
$$\mathrm{OWPI}^m = 1 - \mathrm{RE}, \tag{7}$$
where RE is the relative error, $\mathrm{TT}_i^m$ is the travel time of vehicle $i$ on
route $m$, and $\mathrm{SDTT}_t^m$ is the designed standard one-way time of route $m$
in period $t$.
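Eqs. (6) and (7) translate directly into a few lines of code. In the sketch below the function name `owpi` and the example travel times are hypothetical; both arguments must use the same time unit.

```python
def owpi(actual, standard):
    """Eqs. (6)-(7): one-way punctuality index in [0, 1]."""
    deviation = abs(actual - standard)
    # Eq. (6): the relative error is capped at 1 for very large deviations.
    relative_error = deviation / standard if deviation < standard else 1.0
    # Eq. (7)
    return 1.0 - relative_error

# Hypothetical trips against a designed standard one-way time of 3000 s.
on_time = owpi(3000, 3000)     # no deviation
slightly_late = owpi(3300, 3000)
very_late = owpi(7000, 3000)   # deviation exceeds the standard time
```

Because the deviation enters through its absolute value, a trip that is 300 s early scores the same as one that is 300 s late.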
3.3. Prediction model
The reliability of bus travel time prediction is an essential consideration for
travelers choosing public transport. Applying accurate forecasting in intelligent
transport systems can improve the level of public transport management and attract more
potential passengers, which will increase the economic benefits of bus operation and
reduce government financial subsidies. In this paper, we propose a Kalman filter-LSTM
model to forecast the travel time of bus routes with improved accuracy. We also
introduce three ensemble machine learning methods for comparison with the proposed model.
3.3.1. Kalman filter-LSTM model
In time series forecasting, dirty data can affect the final prediction results, because
time dependence plays a crucial role in time series. Noise and outliers must
therefore be handled with care. The Kalman filter is essentially
an efficient recursive method that combines data from different sensors, with
possibly different units but the same purpose, to obtain a more accurate measurement.
The Kalman filter algorithm recursively calculates the
optimal estimate at each step based on the historical time series.
The best prediction is then given according to the previous estimate
and the most recent observation. This process eliminates noise to a certain
extent. Therefore, the algorithm shows good adaptability when applied to
systems with large changes in external factors. We use it to smooth
the raw data and mitigate noise. Bus travel time is a typical time series. In the long
term, there are similar trends in the daily travel time variations, and in the short
term, each moment has its own unique changes.
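The recursive predict-and-update idea can be sketched with a scalar Kalman filter. This is not necessarily the configuration used in the paper: the random-walk state model, the noise variances `process_var` and `measurement_var`, and the sample travel times are all assumptions chosen for illustration.

```python
def kalman_filter_1d(observations, process_var=1.0, measurement_var=25.0):
    """Filter a 1-D series with a scalar random-walk Kalman filter."""
    estimate = observations[0]   # initial state estimate
    error_var = measurement_var  # initial uncertainty of the estimate
    filtered = [estimate]
    for z in observations[1:]:
        # Predict: the state is assumed to persist, so only uncertainty grows.
        error_var += process_var
        # Update: blend the prediction and the new observation via the gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered

# Hypothetical route travel times (min) with one outlier at index 3.
raw = [50.0, 52.0, 51.0, 80.0, 53.0, 52.0]
smooth = kalman_filter_1d(raw)
```

A larger `measurement_var` relative to `process_var` makes the filter trust new observations less, so the outlier is damped more strongly.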
The Long Short-Term Memory network, usually just called LSTM, is a special kind
of RNN capable of learning long-term dependencies. Compared with the RNN, the
LSTM adds the concept of a cell state. The cell state carries information selected from
earlier inputs, whereas a plain RNN relies mainly on the most recent state. LSTM can
predict time series better and mitigates the vanishing and exploding gradient
problems in training on long series. LSTM is a complicated model, which is shown
in Fig. 2, containing an input gate $i$, an output gate $o$, a forget gate $f$ and a
cell state $c$. Structures called gates carefully regulate the removal or addition of
information to the cell state.
Fig. 2. (Color online) The Kalman filter-LSTM model.
The first step in LSTM is to decide what information from the output at the last
moment is retained in the current cell state. This decision is made by the forget gate.
It looks at $h_{t-1}$ and $x_t$, and outputs a number between 0 and 1 for each number
in the cell state $c_{t-1}$. One stands for keeping it all, and zero stands for getting
rid of it completely. Then, the input gate decides which values are to be updated, and
a tanh layer creates a vector of new candidate values $\tilde{c}_t$ to be added to the
cell state. The second step combines these two to update the cell state. Finally, the
output gate decides what parts of the cell state to output; its value is multiplied by
the tanh of the cell state, which lies between $-1$ and 1. Therefore, only the parts we
decided on are output. The LSTM equations are given below.
$$f_t = \sigma_g(W_f x_t + U_f h_{t-1} + b_f), \tag{8}$$
$$i_t = \sigma_g(W_i x_t + U_i h_{t-1} + b_i), \tag{9}$$
$$\tilde{c}_t = \sigma_c(W_c x_t + U_c h_{t-1} + b_c), \tag{10}$$
$$c_t = f_t \circ c_{t-1} + i_t \circ \tilde{c}_t, \tag{11}$$
$$o_t = \sigma_g(W_o x_t + U_o h_{t-1} + b_o), \tag{12}$$
$$h_t = o_t \circ \sigma_h(c_t), \tag{13}$$
where the subscript $t$ indexes the time step, $f_t$, $i_t$ and $o_t$ are the three
gates, $\tilde{c}_t$ is the cell input state, $c_t$ is the cell output state, $c_{t-1}$
is the former cell output state, and $h_t$ is the hidden cell output state. $\sigma_g$,
$\sigma_c$ and $\sigma_h$ represent the sigmoid function, the hyperbolic tangent
function and the hyperbolic tangent function, respectively. $W$, $U$ and $b$ are the
weight matrices and bias vectors that are learned during training.
The operator $\circ$ denotes the Hadamard product.
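To make Eqs. (8)-(13) concrete, the following sketch implements a single LSTM time step with NumPy. The dictionary layout of the parameters, the layer sizes and the random inputs are assumptions for illustration; a trained model would learn `W`, `U` and `b` from data.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b are dicts keyed by 'f', 'i', 'c', 'o'."""
    f_t = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])      # Eq. (8)
    i_t = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])      # Eq. (9)
    c_tilde = np.tanh(W["c"] @ x_t + U["c"] @ h_prev + b["c"])  # Eq. (10)
    c_t = f_t * c_prev + i_t * c_tilde                          # Eq. (11)
    o_t = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])      # Eq. (12)
    h_t = o_t * np.tanh(c_t)                                    # Eq. (13)
    return h_t, c_t

# Hypothetical sizes and randomly initialized (untrained) parameters.
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
W = {k: rng.normal(scale=0.1, size=(n_hidden, n_in)) for k in "fico"}
U = {k: rng.normal(scale=0.1, size=(n_hidden, n_hidden)) for k in "fico"}
b = {k: np.zeros(n_hidden) for k in "fico"}

# Run the cell over a hypothetical 5-step input sequence.
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_in)):
    h, c = lstm_step(x_t, h, c, W, U, b)
```

The elementwise `*` on NumPy arrays plays the role of the Hadamard product in Eqs. (11) and (13).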
3.3.2. Ensemble learning model
Ensemble learning combines multiple learners to form an ensemble system with strong
performance. In general, an ensemble learning model is built in two steps:
constructing a series of individual learners, and then defining
a combination strategy to integrate these individual learners. Owing to these
characteristics, the model can be applied to both classification and regression
problems, with the advantages of more accurate and more stable forecasting.
The GBDT model and a series of its improved variants have been used frequently
and with great success in machine learning competitions. Three such
models, Categorical Boosting (CatBoost),47 Extreme Gradient Boosting
(XGBoost)48 and Light Gradient Boosting Machine (LightGBM),49 which improve the
prediction accuracy and computational speed of GBDT, are used in our study for
prediction.
CatBoost uses oblivious trees as base learners, requires few parameters,
supports categorical variables and achieves high accuracy. XGBoost uses
classification and regression trees (CART) as base learners and iteratively generates
new trees to fit the residuals of the previous trees. LightGBM
combines a histogram algorithm with a leaf-wise growth strategy subject to a
maximum depth limit.
There are two main differences between the three boosting algorithms.
The first is how the trees are constructed: CatBoost
uses a symmetric tree structure, XGBoost uses a level-wise
construction strategy and LightGBM uses a leaf-wise construction strategy; in all three,
the decision trees are binary trees. The second is the processing of categorical
features. CatBoost handles categorical features efficiently
by means of feature encoding such as target variable
statistics. XGBoost does not automatically process categorical features, which must
be manually transformed into numerical values before entering the model. In LightGBM,
categorical features need to be specified, after which the algorithm processes them
automatically.
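The idea shared by all three boosters, that each new tree is fitted to the residuals left by the trees before it, can be shown with a deliberately simple sketch. It uses one-split regression stumps and squared loss in plain Python; it is not the CatBoost, XGBoost or LightGBM algorithm, and the data, function names and hyperparameters are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split of a 1-D feature under squared error.

    Assumes the feature takes at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def boost(xs, ys, n_trees=50, learning_rate=0.3):
    """Gradient boosting for squared loss: each stump fits current residuals."""
    base = sum(ys) / len(ys)
    trees, predictions = [], [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        predictions = [p + learning_rate * tree(x)
                       for x, p in zip(xs, predictions)]
    return lambda x: base + learning_rate * sum(tree(x) for tree in trees)

# Hypothetical data: departure hour vs. route travel time (min).
hours = [6, 7, 8, 9, 10, 12, 14, 16, 17, 18, 19, 21]
minutes = [45, 58, 66, 60, 50, 48, 49, 55, 63, 61, 52, 44]
model = boost(hours, minutes)

mean_y = sum(minutes) / len(minutes)
baseline_mse = sum((mean_y - y) ** 2 for y in minutes) / len(minutes)
boosted_mse = sum((model(x) - y) ** 2
                  for x, y in zip(hours, minutes)) / len(hours)
```

The error is measured on the training data only, so it shows that residual fitting reduces the fit error, not that the model generalizes.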
4. Data Description
All the buses in Jinan have been equipped with GPS devices. The GPS device of
each vehicle records data such as location, time, vehicle number, line number,
whether the vehicle is at a station, direction, etc. We can obtain the bus arrival and
departure information at stations and the running time of routes from the original
data. The AVL data used in this paper were provided by the Jinan Bus Company. A total
of 1.28 million records of four routes in Jinan (K115, K116, K117, K202) in December
2018 are selected for analysis. The lengths of the four bus routes are 19.5,
16.3, 12.8 and 29 km, respectively. Table 1 shows a data sample extracted from
the original data, including line number (LINE NO), bus number (BUS NO), direction
(IS UP DOWN), stop number (LABEL NO), arrival time (REACH TIME),
departure time (DEPART TIME), date, longitude of station and latitude of station.
The direction “0” represents upstream and “1” represents downstream. It is
worth mentioning that we analyze only one direction of the data, i.e., upstream.
Figure 3 shows the area where the routes are located. As can be seen, the four
routes traverse Jingshi Road, the longest urban trunk road in China. Jingshi
Road is the central axis of urban development in Jinan, a public transport corridor
connecting the city center with the eastern and western urban areas, and a framework
for urban development. Many buildings such as large shopping malls, office
areas and schools are built around it. The urban spatial morphology of Jinan is
gradually developing into a belt-shaped structure based on this road. Jingshi Road is
an important window serving the public and one of the icons of Jinan. The primary
reason for studying these four routes is that the findings are useful for improving the
quality of service supply along this road and expanding the service area. In addition,
the four lines cross and overlap one another, all spanning 2–4 administrative regions
and passing through public areas with high traffic.
Table 1. Data sample.
LINE NO  BUS NO  IS UP DOWN  LABEL NO  REACH TIME  DEPART TIME  DATE        LNG       LAT
K115     7390    0           1         9:17:02     9:19:38      2018/12/12  117.2132  36.6734
K115     7390    0           2         9:20:28     9:20:51      2018/12/12  117.2097  36.6724
K115     7390    0           3         9:22:20     9:22:31      2018/12/12  117.1992  36.6745
K115     7390    0           4         9:24:17     9:24:30      2018/12/12  117.1948  36.6739
K115     7390    0           5         9:25:45     9:26:12      2018/12/12  117.1881  36.6724
K115     7390    0           6         9:28:18     9:28:53      2018/12/12  117.1827  36.6712
K115     7390    0           7         9:31:36     9:32:17      2018/12/12  117.1731  36.6693
K115     7390    0           8         9:32:55     9:33:23      2018/12/12  117.1675  36.6685
K115     7390    0           9         9:34:04     9:34:10      2018/12/12  117.1611  36.6681
K115     7390    0           10        9:36:19     9:36:42      2018/12/12  117.1521  36.6664
......   ......  ......      ......    ......      ......       ......      ......    ......
Fig. 3. (Color online) The routes of K115, K116, K117 and K202.
There are some errors in the data due to many reasons, such as device failure
or signal interruption during data transmission. Therefore, we remove unreasonable
data before analysis, such as the data whose arrival time is later than departure
time.
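The cleaning rule mentioned above can be sketched as a simple filter. The record layout and field names (`stop`, `reach`, `depart`) are hypothetical, and times are assumed to be seconds since midnight; the second record is deliberately inconsistent.

```python
def is_valid(record):
    """Keep records whose arrival time is not later than the departure time."""
    return record["reach"] <= record["depart"]

# Hypothetical records; stop 2 has its arrival after its departure.
records = [
    {"stop": 1, "reach": 33422, "depart": 33578},
    {"stop": 2, "reach": 33651, "depart": 33628},
    {"stop": 3, "reach": 33740, "depart": 33751},
]
cleaned = [r for r in records if is_valid(r)]
```

Other plausibility checks, such as dropping duplicated or missing stop records, would follow the same pattern but are not described in the text.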
5. Results
5.1. Measures based on stop
(1) Dwell time
Dwell time is an essential component of travel time, reflecting the number of
passengers boarding and alighting at bus stops. It includes decelerating into the
station, stopping in the station and accelerating out of the station; if multiple
vehicles are entering the same station, dwell time also includes the queuing time
outside the station.
Figure 4 shows the dwell time for the four routes over the five weekdays from
December 10 to 14, with the red line connecting the mean at each stop and the
black line connecting the median at each stop. The vertical extent of each violin
represents the range of dwell times, and its width represents the number of
observations at that value. The first and last stops are not taken into account in this
analysis, because buses may stop there for a period of time to wait for the departure
time. The dwell time varies from station to station, with little difference in
fluctuations at most bus stops. Operators need to pay attention to several stations
with long dwell times, such as station 7 of K115, stations 2, 8 and 21 of K116, station
18 of K117, and stations 7 and 21 of K202. Station 7 of K115, station 18 of K117 and
station 21 of K202 represent the same stop, Jingshi Road Shanshidong Road. This stop is
surrounded by a university and three hospitals and therefore serves more travelers. The
other stops with long dwell times are near a big shopping mall or are served by many
different bus lines.
Fig. 4. (Color online) Violin plots of dwell time of each station.
Figure 5 is a bubble chart of the average dwell time at each
station for the four lines, with the bubble area representing the length of the dwell
time. It is worth noting that the average dwell times of the four lines at common
stops on Jingshi Road are very similar. Multiple buses may arrive at the same time
or very close to each other due to delays. When the traffic volume in
adjacent lanes is heavy, the bus behind cannot depart until the one in front leaves the
stop. Vehicles of different bus routes at a common stop thus affect each other,
resulting in the above phenomenon. It also means that once congestion occurs
at a place with many passengers, it can easily cause a chain reaction leading to
excessively long dwell times. Regulators and operators should set the types
of bus stops reasonably and rationalize the departure schedule for bus stops served by
multiple lines.
Fig. 5. (Color online) Bubble chart of average dwell times of each station.
(2) Route time between two adjacent bus stops
Route time between two adjacent bus stops is a significant component of bus
travel time, and its fluctuation can have a substantial impact on the level
of bus service. Figure 6 shows violin plots of the route time between adjacent
bus stops for the four routes on December 12, with the means and medians of
individual sections connected by red and black curves, respectively. Compared with the
dwell time, the running time between two consecutive stops fluctuates more, which is
related to many factors such as the distance between bus stops, road congestion and the
number of intersections. The route time between adjacent stations on the four lines
is mostly within 2 to 5 min, while the route time from station 21 to 22 of K116 has the
largest span, with a maximum value of 648 s. This section of road is located at Shuntai
Square, the largest office complex in Shandong Province, and is often congested.
Fig. 6. (Color online) Violin plots of section running time of each station.
Fig. 7. (Color online) The time-space diagrams on December 12.
Analysis of route time can help managers identify congestion-prone road segments
better, and then managers can control traffic signals to address this situation.
(3) Time headway
Time headway regularity implies regulating the transit fleet in order
to avoid too large or too small a gap between two consecutive transit vehicles.
Specifically, if the time headway of two consecutive transit vehicles is close to zero,
the phenomenon is called bunching, which broadly undermines the service
level of the bus routes. Thus, measures should be taken if the time headway
is too large or too small. Figure 7 shows the time-space trajectories of the four
routes on December 12. It shows that bus bunching occurs readily during
operation, as marked by the red circles. Once bus bunching occurs, it tends to
continue until the buses reach their destination. It is likely to leave the
trailing bus short of passengers, wasting capacity, and to make passengers on the
route wait longer for the next bus.
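A simple way to flag bunching in arrival data is to look for headways below a small threshold. The sketch below does this for one stop; the 60 s threshold, the bus labels and the arrival times are assumptions, since the text only says that the headway is "close to zero".

```python
def bunching_events(arrivals_by_bus, threshold=60):
    """Return (leader, follower, headway) for gaps shorter than `threshold` s."""
    ordered = sorted(arrivals_by_bus.items(), key=lambda item: item[1])
    events = []
    for (lead, t_lead), (follow, t_follow) in zip(ordered, ordered[1:]):
        gap = t_follow - t_lead
        if gap < threshold:
            events.append((lead, follow, gap))
    return events

# Hypothetical arrivals (seconds since midnight) of four buses at one stop.
arrivals = {"A": 30000, "B": 30580, "C": 30610, "D": 31300}
events = bunching_events(arrivals)
```

Applying the same check at every stop along a route would show whether a bunched pair stays bunched downstream, as the time-space diagrams suggest.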
Figure 8 shows the coefficient of time headway variation for the four lines, and
the horizontal plane is its projection. It can be seen that the reliability is highest
at the first bus stop, and the deviation increases as the bus runs stop by stop, with
the last stops showing the largest fluctuations. This indicates that
reliability decreases along the trip as a result of bus bunching and large
intervals. Moreover, the coefficient fluctuates more in the peak hours than in the flat
(off-peak) hours because of heavy traffic volume and frequent congestion. The indicator
fluctuates more in the morning peak than in the evening peak, as trips are more similar
in purpose and time in the morning. Compared with the other lines, the K202
line fluctuates widely, with values between 0.06 and 1.31, because K202
is the longest of the four routes: it departs from Jinan West Station and ends in the
eastern part of Jinan, with stations far apart and subject to more uncontrollable
factors.
Fig. 8. (Color online) Coefficient of variation in time headway.
5.2. Measures based on routes
(1) Travel time
Travel time reliability measures the gap between the expected and actual
travel time. Reliability is low if this gap is large. Travel time reliability is one
of the most important measures of reliability in the transportation system for
both service providers and travelers.
Figure 9 shows the average travel time for each of the four bus routes in December
2018, using 30-min intervals of arrival time as time windows. Each point
Fig. 9. (Color online) Travel time for every 30 min arrival time window.
represents the whole travel time from origin to destination. The travel time is highly
volatile throughout the day. The 90th percentile T90, the mean and the 10th
percentile T10 of the travel times are also marked on the graph, which excludes
the extreme values and visualizes the fluctuations under different conditions. The
travel time of all four routes takes a bimodal shape over the day, and the lines
connecting T10, the mean and T90 run almost parallel. In addition, the travel time
differs across periods with the level of congestion. Travel time in the morning
peak is higher than in the evening peak because the purpose and access of people's
trips are more concentrated in the morning peak than in the evening peak.
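The windowed statistics described above can be sketched as follows. This is a minimal illustration under our own assumptions: the trips are made-up (arrival time, travel time) pairs, the helper names are hypothetical, and percentiles use linear interpolation, which may differ from the rule used to produce Fig. 9.

```python
import math
from collections import defaultdict
from statistics import mean

def percentile(values, p):
    """Linear-interpolation percentile (p in [0, 100]) of a non-empty list."""
    s = sorted(values)
    k = (len(s) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)

def window_stats(trips, width_min=30):
    """Group (arrival_minute_of_day, travel_time_min) pairs into time windows.

    Returns {window_start_minute: (T10, mean, T90)}.
    """
    groups = defaultdict(list)
    for arrival_min, tt in trips:
        groups[(arrival_min // width_min) * width_min].append(tt)
    return {w: (percentile(v, 10), mean(v), percentile(v, 90))
            for w, v in sorted(groups.items())}

# Made-up trips: (arrival time in minutes after midnight, travel time in min)
trips = [(485, 68), (490, 72), (500, 75), (505, 70),
         (610, 58), (620, 60), (625, 62)]
for start, (t10, avg, t90) in window_stats(trips).items():
    print(f"{start // 60:02d}:{start % 60:02d}  "
          f"T10={t10:.1f}  mean={avg:.1f}  T90={t90:.1f}")
```

Plotting the three values of each window against the window start time gives the three nearly parallel lines discussed for Fig. 9.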
Figure 10 presents a heat map of the coefficient of travel time variation for each
time window for the four routes. The smaller the coefficient of variation is, the
more concentrated and stable the travel times in that period are. The heat map
gives operators and regulators a visual indication of the periods that need to be
adjusted.
The working hours of the four lines are divided into flat or peak periods of
variable length according to Figs. 9 and 10. Statistical analysis of each period is
shown in Tables 2–5.
The average travel time of K116 during peak periods is close to its average
off-peak travel time, which is not the case for the other lines, indicating that
Fig. 10. (Color online) Coefficient of the travel time variation.
Table 2. Descriptive statistics for within-the-day travel time of K115.
| Descriptive statistics | 5:30–6:30 | 6:30–9:00 | 9:00–16:30 | 16:30–18:30 | 18:30–22:00 |
|---|---|---|---|---|---|
| Count | 270 | 731 | 1352 | 614 | 474 |
| Avg TT (min) | 56.18 | 70.82 | 59.82 | 69.82 | 53.4 |
| Avg speed (km/h) | 20.83 | 16.52 | 19.56 | 16.76 | 21.91 |
| Standard deviation | 4.5 | 10.19 | 4.18 | 8.74 | 3.95 |
| Cv (%) | 8.01 | 14.39 | 6.99 | 12.52 | 7.4 |
| T50 (min) | 56 | 70 | 59 | 69 | 53 |
| T95 (min) | 65 | 87 | 67 | 85 | 60 |
| (T90 − T10)/T50 (%) | 17.86 | 37.14 | 16.95 | 31.88 | 18.87 |
Table 3. Descriptive statistics for within-the-day travel time of K116.
| Descriptive statistics | 6:30–7:00 | 7:00–9:00 | 9:00–16:30 | 16:30–18:30 | 18:30–22:00 |
|---|---|---|---|---|---|
| Count | 159 | 537 | 872 | 384 | 127 |
| Avg TT (min) | 53.11 | 61.1 | 53.04 | 61.38 | 54.83 |
| Avg speed (km/h) | 18.41 | 16.01 | 18.44 | 15.93 | 17.84 |
| Standard deviation | 4.16 | 8.21 | 3.32 | 6.22 | 3.77 |
| Cv (%) | 7.83 | 13.44 | 6.26 | 10.13 | 6.88 |
| T50 (min) | 52 | 61 | 52 | 61 | 55 |
| T95 (min) | 61 | 74.1 | 59 | 72 | 61 |
| (T90 − T10)/T50 (%) | 19.23 | 34.43 | 13.46 | 26.23 | 18.18 |
the stability of the K116 line is better than that of the others. The national
average bus travel speed is 20 km/h, but the average speed of these four lines
during the peak periods is below this level (about 15–19 km/h according to
Tables 2–5), indicating a poor level of service. The average speed of the K202 line
is higher than that of the others, and during off-peak hours it exceeds the national
level, which is related to the location of the line and its station settings. Route
K202 runs from the West Railway Station area, where the population is small, to
the central city. Moreover, its bus stops are spaced farther apart, so the speed of
the route is higher.
Cv is the ratio of the standard deviation of TT to its mean value, reflecting the
degree of dispersion. Its value is largest in the morning peak, followed by the
evening peak and the off-peak periods. The average bus travel time is close to the
Table 4. Descriptive statistics for within-the-day travel time of K117.
| Descriptive statistics | 5:00–6:30 | 6:30–9:00 | 9:00–16:30 | 16:30–18:30 | 18:30–22:00 |
|---|---|---|---|---|---|
| Count | 357 | 1012 | 2073 | 720 | 558 |
| Avg TT (min) | 36.21 | 51.35 | 45.08 | 50.3 | 40.63 |
| Avg speed (km/h) | 21.21 | 14.96 | 17.04 | 15.27 | 18.9 |
| Standard deviation | 2.1 | 7.87 | 3.82 | 4.41 | 3.07 |
| Cv (%) | 5.81 | 15.32 | 8.48 | 8.78 | 7.55 |
| T50 (min) | 36 | 52 | 44 | 50 | 41 |
| T95 (min) | 40 | 64 | 52 | 58 | 46 |
| (T90 − T10)/T50 (%) | 13.43 | 37.94 | 21.24 | 22.47 | 19.61 |
Table 5. Descriptive statistics for within-the-day travel time of K202.
| Descriptive statistics | 5:30–6:00 | 6:00–8:30 | 8:30–16:00 | 16:00–18:30 | 18:30–22:00 |
|---|---|---|---|---|---|
| Count | 123 | 773 | 1632 | 625 | 485 |
| Avg TT (min) | 73.29 | 95.22 | 83.84 | 92.66 | 73.36 |
| Avg speed (km/h) | 23.74 | 18.27 | 20.75 | 18.78 | 23.72 |
| Standard deviation | 7.8 | 13.83 | 5.65 | 8.78 | 5.28 |
| Cv (%) | 10.65 | 14.52 | 6.73 | 9.47 | 7.19 |
| T50 (min) | 72 | 95 | 84 | 92 | 73 |
| T95 (min) | 88 | 118 | 94 | 106 | 81 |
| (T90 − T10)/T50 (%) | 26.93 | 36.85 | 17.29 | 24.47 | 17.26 |
median, indicating a relatively uniform time distribution. The 95th percentile of
travel time, T95, is introduced to better compare the maximum TT between
different lines. (T90 − T10)/T50 represents the distance between T90 and T10
relative to the median, which eliminates the effect of extreme values, i.e., the
values larger than T90 and smaller than T10. It thus describes the spread of the
central 80% of the data around the median. Its values are about 37% and 14% at
peak and flat periods, respectively, showing how the fluctuation of travel time
differs across periods.
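As a quick check of how these indicators are computed, the sketch below reproduces two entries of Table 2 from its summary values. The function names are ours, and the T90 and T10 values are hypothetical: the tables report only their relative spread, so any pair 10 min apart around the 56-min median gives the same result.

```python
def cv_percent(std, avg):
    """Coefficient of variation in percent: standard deviation over mean."""
    return 100.0 * std / avg

def relative_spread_percent(t90, t10, t50):
    """(T90 - T10) / T50 in percent: width of the central 80% over the median."""
    return 100.0 * (t90 - t10) / t50

# K115, 5:30-6:30 window: standard deviation and mean taken from Table 2
print(round(cv_percent(4.5, 56.18), 2))                # 8.01
# Hypothetical T90 = 61 and T10 = 51 around the reported T50 = 56
print(round(relative_spread_percent(61, 51, 56), 2))   # 17.86
```

Both results agree with the Cv and (T90 − T10)/T50 entries reported for that window.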
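Figure 11 shows the fitted travel time distributions of the four studied routes.
A right-tailed lognormal distribution is fitted to the travel time of the four
routes, with R² = 0.95284, 0.84722, 0.94681 and 0.97618, respectively. The fitted
formulae are as follows:

$$y = 0.02092 + \frac{3.64029}{0.08845\sqrt{2\pi}\,x}\exp\!\left(-\frac{\left[\ln\left(x/58.93787\right)\right]^{2}}{0.015647}\right), \tag{14}$$

$$y = 0.01706 + \frac{1.21543}{0.05221\sqrt{2\pi}\,x}\exp\!\left(-\frac{\left[\ln\left(x/52.46858\right)\right]^{2}}{0.005452}\right), \tag{15}$$

$$y = 0.00397 + \frac{1.80165}{0.12415\sqrt{2\pi}\,x}\exp\!\left(-\frac{\left[\ln\left(x/45.02212\right)\right]^{2}}{0.030826}\right), \tag{16}$$

$$y = 0.00783 + \frac{4.33468}{0.10517\sqrt{2\pi}\,x}\exp\!\left(-\frac{\left[\ln\left(x/83.5007\right)\right]^{2}}{0.022121}\right). \tag{17}$$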
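The fitted curves share the form y = y0 + A/(w√(2π)x) · exp(−[ln(x/xc)]²/(2w²)). The sketch below evaluates Eq. (14) for K115 under that reading. The parameter names y0, A, w and xc are our labels rather than notation from this paper, and we assume the constant in the exponent equals 2w², which holds numerically here (2 × 0.08845² ≈ 0.015647).

```python
import math

def lognormal_fit(x, y0, A, w, xc):
    """Shifted, scaled lognormal curve:
    y0 + A / (w * sqrt(2*pi) * x) * exp(-(ln(x / xc))**2 / (2 * w**2))
    """
    return y0 + A / (w * math.sqrt(2.0 * math.pi) * x) * math.exp(
        -(math.log(x / xc)) ** 2 / (2.0 * w ** 2))

# Coefficients read from Eq. (14), route K115
K115 = dict(y0=0.02092, A=3.64029, w=0.08845, xc=58.93787)

# The curve peaks slightly below xc, at xc * exp(-w**2)
mode = K115["xc"] * math.exp(-K115["w"] ** 2)
for x in (50, mode, 60, 70, 80):
    print(f"x={x:6.2f} min  y={lognormal_fit(x, **K115):.4f}")
```

Far from the peak the exponential term vanishes and the curve settles at the offset y0, which is why the fitted formulae carry a constant term.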
(2) One-way Punctuality Index
The OWPI of the four routes on December 12 is shown in Fig. 12. K116 and
K117 are more stable than the other two routes. Punctuality is poor during peak
hours and is sometimes more than twice as bad. Because the difference is
significant, especially at peak times, managers need to adjust the planned travel
time and the frequency of departures, and it is essential to forecast travel time
and re-schedule accordingly.
5.3. Prediction
Travel time variation affects the cycle time and operating cost of routes from the
operators' perspective, as excessive travel time requires additional or adjusted
bus arrangements and thus raises operating costs. Given the poor bus service
level, predicting travel time not only helps regulators and operators respond to
congestion in a timely manner but also helps passengers plan their routes. We
used the travel time on route K115 in December 2018 as the data sample, with the
first 30 days as the training set and the last day as the test set.
The Kalman filter-LSTM model is used. First, we used the Kalman filter to
smooth the raw time series. Then, we constructed an LSTM model. After
parameter analysis and simulation, both the input and output layers are set to 1,
and the hidden layers to 3. The numbers of hidden units are set to {128, 64, 32}.
The batch size is set to 512 and the number of epochs to 200. The learning rates
are chosen from {0.05, 0.005}. To test its effectiveness, we also used three
ensemble learning models, CatBoost, XGBoost and LightGBM, and compared their
predictions with those of the proposed model.
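The smoothing step can be illustrated with a scalar Kalman filter. The sketch below is not the configuration used in this study, whose state-space model is not specified here. It assumes a random-walk state, and the noise variances `q` and `r`, the initial variance and the sample series are all hypothetical.

```python
def kalman_smooth(series, q=0.5, r=4.0):
    """Forward pass of a scalar Kalman filter with a random-walk state model.

    q: process-noise variance, r: measurement-noise variance
    (both are illustrative assumptions, not values from the paper).
    """
    x, p = float(series[0]), 1.0   # initial state estimate and its variance
    out = [x]
    for z in series[1:]:
        p = p + q                  # predict: state unchanged, variance grows
        k = p / (p + r)            # Kalman gain, always between 0 and 1
        x = x + k * (z - x)        # update with the new observation z
        p = (1.0 - k) * p          # variance after the update
        out.append(x)
    return out

# Made-up travel times (min) containing one outlier
raw = [56, 58, 71, 69, 90, 70, 60, 59, 61, 58]
smooth = kalman_smooth(raw)
print([round(v, 1) for v in smooth])
```

A larger `r` relative to `q` gives a smaller gain and therefore a smoother series. The filtered series would then be cut into input windows for the LSTM.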
We chose four indexes, i.e., mean absolute error (MAE), mean absolute percentage
error (MAPE), root mean square error (RMSE) and coefficient of determination
(R²), to verify the prediction accuracy of these models. MAE is the arithmetic
Fig. 11. (Color online) Travel time fit distributions on K115, K116, K117 and K202.
Fig. 12. (Color online) OWPI on K115, K116, K117 and K202.
average of the absolute errors between the predictions and the actual values, and
MAPE reflects the magnitude of the errors relative to the actual values. The smaller
the values of these two indexes are, the higher the prediction accuracy of the model
is. RMSE reflects the dispersion of the errors: the smaller it is, the less volatile
the model prediction is. R², typically ranging from 0 to 1, indicates how well
the predictions approximate the real data points, and the closer R² is to one, the
better the prediction model fits the data. They are calculated as
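$$\mathrm{MAE} = \frac{1}{t}\sum_{i=1}^{t}\left|y_i - \hat{y}_i\right|, \tag{18}$$

$$\mathrm{MAPE} = \frac{100}{t}\sum_{i=1}^{t}\frac{\left|y_i - \hat{y}_i\right|}{y_i}, \tag{19}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{t}\sum_{i=1}^{t}\left(y_i - \hat{y}_i\right)^{2}}, \tag{20}$$

$$R^{2} = 1 - \frac{\sum_{i=1}^{t}\left(y_i - \hat{y}_i\right)^{2}}{\sum_{i=1}^{t}\left(y_i - \bar{y}\right)^{2}}, \tag{21}$$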
Fig. 13. (Color online) Bus travel time prediction on K115.
Table 6. The results of evaluation indexes of prediction models.
| Model | CatBoost | XGBoost | LightGBM | LSTM | Kalman filter-LSTM |
|---|---|---|---|---|---|
| MAE | 3.35 | 3.47 | 3.55 | 3.30 | 3.10 |
| MAPE | 4.87% | 5.10% | 5.18% | 4.73% | 4.52% |
| RMSE | 4.62 | 4.85 | 4.85 | 4.60 | 4.31 |
| R² | 0.87 | 0.86 | 0.86 | 0.88 | 0.90 |
where $t$ is the number of samples, $\hat{y}_i$ is the predicted value, $y_i$ is the
actual value and $\bar{y}$ is the average of the actual values.
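The four indexes in Eqs. (18)–(21) can be sketched in a few lines. In this illustration `y` holds the actual values and `y_hat` the predicted values. The function names and the sample numbers are ours and are not data from this study.

```python
import math

def mae(y, y_hat):
    """Mean absolute error, Eq. (18)."""
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)

def mape(y, y_hat):
    """Mean absolute percentage error in percent, Eq. (19)."""
    return 100.0 / len(y) * sum(abs(a - p) / a for a, p in zip(y, y_hat))

def rmse(y, y_hat):
    """Root mean square error, Eq. (20)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y))

def r2(y, y_hat):
    """Coefficient of determination, Eq. (21)."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_hat))   # residual sum of squares
    ss_tot = sum((a - y_bar) ** 2 for a in y)               # total sum of squares
    return 1.0 - ss_res / ss_tot

# Made-up actual and predicted travel times (min)
actual = [60.0, 70.0, 80.0, 50.0]
predicted = [62.0, 67.0, 80.0, 55.0]
print(mae(actual, predicted))               # 2.5
print(round(mape(actual, predicted), 2))    # 4.4
print(round(rmse(actual, predicted), 3))    # 3.082
print(round(r2(actual, predicted), 3))      # 0.924
```

Note that MAPE is undefined when an actual value is zero, which does not arise for travel times.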
Figure 13 shows the predicted and actual values, with the light red shading
marking a 5% error band. To compare the performance of the different models, we
present the MAE, MAPE, RMSE and R² results in Table 6. As can be seen, the
Kalman filter-LSTM shows the highest accuracy, with an R² of 0.90 and the lowest
MAE (3.10), MAPE (4.52%) and RMSE (4.31), and is superior to CatBoost,
XGBoost, LightGBM and LSTM.
6. Conclusions
The travel time of a bus is of significance for both operators and passengers. It
fluctuates during the daily operation due to many factors, such as traffic conditions,
signals, and the number of boarding and alighting passengers. The use of visualiza-
tion and big data technologies to analyze the trends and identify spatiotemporal
characteristics of transit operations can help transit agencies improve the overall
operational efficiency of the transit system. This paper proposes a set of indicators
from the perspectives of stops and routes for analyzing bus travel time based on
large AVL data, and establishes a Kalman filter-LSTM model to forecast bus travel
time.
The results show that travel time is longer and more variable in the morning and
evening peaks, exhibiting a bimodal pattern over the day. Larger values and more
significant variations appear in the morning peak, and the overall distribution of
travel time follows a lognormal distribution. Dwell time does not vary significantly
from station to station. The dwell times of different lines at common stops are very
close, and most dwell times are no more than 1 min. The route time between adjacent
stations fluctuates greatly, depending on the actual road conditions. The farther
the vehicle travels, the more unstable the trip service level is. Bus bunching and
large time headways occur easily, and once they occur, they last to the destination
and rarely dissipate without intervention. The Kalman filter-LSTM model has been
tested and found to be a good predictor of bus travel time, which is helpful in
improving bus service quality for both passengers and operators. Operators should
focus on the peak hours, on different lines sharing common stops and on routes with
low punctuality, and rearrange the timetable and departure times according
to the travel time prediction.
This study could provide effective tools for operators to better understand the
whole process of public transport operation and improve the service quality of the
urban bus system. Moreover, the travel time prediction could help to make rational
plans for routes and schedules. Future work will focus on the performance of the
whole bus network.
Acknowledgments
This work is supported by the National Natural Science Foundation of China
(Grant Nos. 42001396, 41901396), the Youth Innovation Science and Technology
Support Project in Colleges and Universities of Shandong Province (2021KJ058),
the Shandong Provincial Natural Science Foundation (ZR2021MG032), the Graduate
Education Quality Improvement Plan Program of Shandong Jianzhu University
(YZKC202115) and the Social Science Planning Foundation of Qingdao
(QDSKL2001005).
... Analysis of the travel time behavior of public transit services is a well-researched topic. Several researchers analyzed the reliability [11] and variability [12] of travel time [7] at various spatial-temporal scales using a variety of data sources [13]. Further, researchers analyzed public transit bus data for estimation and prediction of travel time using statistical [2,14,15], mathematical [14,16], data mining [17,18], machine learning [19,[20][21][22][23][24][25], and hybrid [3,26] approaches and used these predictions to forecast bus arrival time at bus-stops. ...
Article
Full-text available
Public transit service is a sustainable and eco-friendly alternative for commuting, and promoting its usage is the need of the day. An understanding of the variability of travel time can aid service operators to improve the reliability and ridership of public transport. Gaining insights into the variability of travel time is a data-intensive task, and most of the existing studies utilize multiple traffic-related datasets. However, most cities lack the infrastructure to collect multiple data sets, hence in the current study, the location data of public transit buses were used for the analysis. The study was conducted in Tumakuru city, India at two spatial levels, namely route and segment, and further at temporal levels such as the day-of-the-week and departure time window. Wilcoxon signed-rank test was applied to identify similar spatial-temporal aggregations, and a few aggregations demonstrated similarity. Consistent with the existing literature, six statistical distributions were selected to fit the data through the Kolmogorov-Smirnov test. The results emphasized that the Logistic distribution is the best fit at all spatial-temporal aggregation levels, and the lognormal and GEV distributions offered better fit for a few aggregation levels. Logistic distribution is recommended for operations planners and researchers to conduct reliability analysis and travel time forecasting in the future.
... Its applications range from predicting stock prices and diagnosing diseases to forecasting traffic flow [33,34]. Notably, LSTM has gained significant traction among researchers for predicting bus travel times [35][36][37]. LSTM excels at capturing segment-level and long-term information in traffic data due to its intricate structure, as illustrated in Figure 4. ...
Article
Full-text available
Urban transportation systems are increasingly burdened by traffic congestion, a consequence of population growth and heightened reliance on private vehicles. This congestion not only disrupts travel efficiency but also undermines productivity and urban resident’s overall well-being. A critical step in addressing this challenge is the accurate prediction of bus travel times, which is essential for mitigating congestion and improving the experience of public transport users. To tackle this issue, this study introduces the Hybrid Temporal Forecasting Network (HTF-NET) model, a framework that integrates machine learning techniques. The model combines an attention mechanism with Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) layers, enhancing its predictive capabilities. Further refinement is achieved through a Support Vector Regressor (SVR), enabling the generation of precise bus travel time predictions. To evaluate the performance of the HTF-NET model, comparative analyses are conducted with six deep learning models using real-world digital tachograph (DTG) data obtained from intracity buses in Cheonan City, South Korea. These models includes various architectures, including different configurations of LSTM and GRU, such as bidirectional and stacked architectures. The primary focus of the study is on predicting travel times from the Namchang Village bus stop to the Dongnam-gu Public Health Center, a crucial route in the urban transport network. Various experimental scenarios are explored, incorporating overall test data, and weekday and weekend data, with and without weather information, and considering different route lengths. Comparative evaluations against a baseline ARIMA model underscore the performance of the HTF-NET model. Particularly noteworthy is the significant improvement in prediction accuracy achieved through the incorporation of weather data. 
Evaluation metrics, including root mean squared error (RMSE), mean absolute error (MAE), and mean squared error (MSE), consistently highlight the superiority of the HTF-NET model, outperforming the baseline ARIMA model by a margin of 63.27% in terms of the RMSE. These findings provide valuable insights for transit agencies and policymakers, facilitating informed decisions regarding the management and optimization of public transportation systems.
... They built what is known as a collaborative intelligent transportation system (CITS), which is based on nine separate algorithms for the prediction of present and future trip times. Kalman filter-LSTM deep learning was suggested by Liu et al. (2023) as a method for estimating the trip time of buses. According to the results of numerical studies, the journey time of bus routes demonstrates a pattern with a leftskewed centre and a right-tail that is well-fitted by the lognormal distribution. ...
Article
Full-text available
The efficient management of road traffic is crucial for enhancing transportation routing and improving overall traffic flow. However, the conventional methods can not accurately analyze real-time traffic data and do not provide valuable insights for effective transportation routing decisions. So, this work proposes a deep learning-based approach for road traffic condition monitoring networks (RTCM-Net) for various illumination conditions. Initially, the fuzzy block-based histogram equalization (FBHE) method enhances the colour properties of the input image, which improves the low-light conditions, haze removal, illumination condition balancing, and colour balancing. The proposed approach leverages deep learning techniques, specifically convolutional time domain neural networks (CTDNN), to learn and extract meaningful features from road traffic data. By training the CTDNN model on a large-scale dataset comprising historical traffic patterns, the system can effectively capture complex traffic conditions and identify anomalies or congestion in real time. Finally, the RTCM-Net is capable of classifying the high, low, dense traffic, fire attack, and accident classes from the input images. The proposed RTCM-Net achieved high accuracy at 99.51%, sensitivity at 98.55%, specificity at 98.92%, F-measure at 99.98%, precision at 99.42%, Matthews Correlation Coefficient (MCC) at 99.72%, Dice as 98.62%, and Jaccard as 99.48% scores, indicating its effectiveness in classifying and monitoring road traffic conditions, which are higher than traditional approaches.
... Moreover, Dai et al. (2019) highlight that link travel time primarily influences total travel times. This leads to an overall route travel time distribution showing a right-skewed, right-tailed pattern that fits well with a lognormal distribution (Liu et al., 2023), especially within a specific distance threshold (Rahman et al., 2018). To address the minimum value limitation of the lognormal distribution (which is zero), a shifted parameter is introduced, resulting in a more suitable shifted lognormal distribution (Dai et al., 2019). ...
... For the past decades, great efforts have been made to develop numerous travel time functions and estimation or prediction models. In general, these methods can be roughly divided into the following three major categories (Wei et al 2018): (1) parametric methods, such as Kalman Filter (KF) (Deeshma et al. 2015;Raju et al. 2015;Murthy et al 2016Murthy et al , 2017, autoregressive Moving Average (ARMA), and Autoregressive Integrated Moving Average (ARIMA); (2) non-parametric techniques, including k Nearest Neighbor (k-NN), Support Vector Machine (SVM), Deep Learning Methods (Abdollahi et al 2020), and temporal convolutional networks (Tag Elsir et al. 2023); (3) combined models that employ the parametric and non-parametric methods in the forecasting framework simultaneously (Zhang and Liu 2011), such as Kalman filter-LSTM deep learning (Liu et al 2023), attention-based spatial-temporal graph convolutional networks on lowresolution data approach (Li et al 2022). Admittedly, travel time function, that is to explain the mathematical relationship between the time passing through a link and the influencing conditions, like traffic flow on this link, and in essence is a kind of statistical method, is one of the most important research contents. ...
Article
Full-text available
Travel time estimation or prediction is of great significance to intelligent transportation systems. Apparently, many factors have a significant impact on travel time in the traffic trip. Combined estimation or prediction models for travel time can deal with many influence factors and effectively ameliorate the estimation or prediction accuracy, but uncertainty and fuzziness could be generated internally in the calculation process of the weight coefficients. A new weighted travel time function, that can handle more influence factors compared with the traditional travel time functions, is presented. In addition, a novel determination method of the weight coefficients of the presented function based on fuzzy soft set, dealing well with uncertainty and fuzziness, is proposed. Then, a user equilibrium assignment embedding the weighted travel time function is formulated. The validity and superiority of the weighted travel time function, that two typical travel time functions are used as the components, in the actual travel time estimation are validated. The mean absolute percentage error (MAPE), maximum absolute relative error (MARE), mean squared error (MSE) and root mean square error (RMSE) indicate that the proposed travel time function has the superior estimating performance. The corresponding user equilibrium assignment are validated by an example related to a medium-size network. Link flow patterns and average travel time at equilibrium of the network are also investigated in the numerical example.
... Faced with a brand we have not come into contact with, we will know it by virtue of its product function and appearance design or by its loud name, interesting graphics, and vivid colors, all of which are gradually familiarized in the process of brand communication and are the key to our first impression of the brand [2]. Therefore, the design of the brand name, logo, color, and other elements that catch people's eyes at the first time are particularly important [3][4]. Consumers can make associations through the brand logo and have a better feeling for the brand [5]. ...
Article
Full-text available
In order to have a better product display and thus attract consumers’ purchases and increase the economic benefits of the enterprise, in this paper, we propose a deep learning model for brand 3D image design. A feedforward neural network that estimates the error of previous layers based on the error of the output layer assigns the convolutional kernel weight parameters of the network in the interval and stops when the error reaches a preset accuracy or reaches a preset maximum learning count. The locally-aware convolutional neural network acquires local features that are finer than the global features and outputs the feature maps of the convolutional layers after passing the activation function to calculate the sensitivity of the sampled layer units. Given the sensitivity information of the feature map, the gradient of the kernel function weights is obtained, and the updated parameters are trained to achieve feature map recursion and solve the image boundary problem. A 3D recurrent neural network is constructed using data-driven multiple or single images, transformed into a low-dimensional feature matrix, processed with 3D pixel data, extracted perceptual features, and generated high-resolution images. The analysis of the results shows that the CD value of the used model is 0.477 and the EMD value is 0.579, which makes the constructed 3D images with more obvious detail levels and more accurate structural design, while the model of Pixel2Mesh focuses more on surface information, so the generated model is more realistic and closer to the real image.
Article
Accurate forecasting of bus travel time and its uncertainty is critical to service quality and operation of transit systems: it can help passengers make informed decisions on departure time, route choice, and even transport mode choice, and it also support transit operators on tasks such as crew/vehicle scheduling and timetabling. However, most existing approaches in bus travel time forecasting are based on deterministic models that provide only point estimation. To this end, we develop in this paper a Bayesian probabilistic model for forecasting bus travel time and estimated time of arrival (ETA). To characterize the strong dependencies/interactions between consecutive buses, we concatenate the link travel time vectors and the headway vector from a pair of two adjacent buses as a new augmented variable and model it with a mixture of constrained multivariate Gaussian distributions. This approach can naturally capture the interactions between adjacent buses (e.g., correlated speed and smooth variation of headway), handle missing values in data, and depict the multimodality in bus travel time distributions. Next, we assume different periods in a day share the same set of Gaussian components, and we use time-varying mixing coefficients to characterize the systematic temporal variations in bus operation. For model inference, we develop an efficient Markov chain Monte Carlo (MCMC) algorithm to obtain the posterior distributions of model parameters and make probabilistic forecasting. We test the proposed model using the data from two bus lines in Guangzhou, China. Results show that our approach significantly outperforms baseline models that overlook bus-to-bus interactions, in terms of both predictive means and distributions. 
Besides forecasting, the parameters of the proposed model contain rich information for understanding/improving the bus service, for example, analyzing link travel time and headway correlation using covariance matrices and understanding time-varying patterns of bus fleet operation from the mixing coefficients. Funding: This research is supported in part by the Fonds de Recherche du Quebec-Societe et Culture (FRQSC) under the NSFC-FRQSC Research Program on Smart Cities and Big Data, the Canadian Statistical Sciences Institute (CANSSI) Collaborative Research Teams grants, and the Natural Sciences and Engineering Research Council (NSERC) of Canada. X. Chen acknowledges funding support from the China Scholarship Council (CSC). Supplemental Material: The e-companion is available at https://doi.org/10.1287/trsc.2022.0214 .
Article
Full-text available
Today, the use of the GPS global positioning system for intra- and extra-urban transportation is a necessary and undeniable matter. Assessing the reliability and stability of intra-city bus routes has been raised as an important issue that has a great impact on the quality of bus services. In this research, from the GPS global positioning system data, which is related to the Yazd city bus system; It has been used to evaluate the reliability and stability of city buses for intra-city transfers. One of the goals investigated in this research is to investigate the parameters affecting bus bunching and analyze the stability of the system as well as the reliability of the public transportation system. In this research, section travel time, dwell time, headway and bus bunching have been analyzed in terms of time and place. In fact, it is checked that how is in different time periods of the day, the situation of batch movement occurrence and stability . In this research, the prediction methods of Linear Regression, Support Vector Regression, Random Forest and Gradient Boosting Regression have been used to predict the reliability and stability assessment of bus travel routes. The Gradient Boosting Regression Model for predicting the reliability of the bus travel route and also for predicting headway has a lower error and better performance than the rest of the prediction models. The results of this research will help urban planners in understanding the stability of stations, identifying key points that affect the stability of stations, identifying key stations and providing better transportation services in the future, in order to reduce the waiting time of passengers.
Article
Real-time bus travel time prediction has been an active research problem over the past decade, especially in India. Popular methods for travel time prediction include time series analysis, regression methods, the Kalman filter method and the Artificial Neural Network (ANN) method. Reported studies using these methods did not consider the high-variance situations arising from varying traffic and weather conditions, which are very common under heterogeneous and lane-less traffic conditions such as those in India. The aim of the present study is to analyse the variance in bus travel time and predict the travel time accurately under such conditions. The literature shows that the Support Vector Machine (SVM) technique is capable of performing well under such conditions, and hence it is used in this study. In the present study, nu-Support Vector Regression (SVR) with a linear kernel function was selected. Two models were developed, namely spatial SVM and temporal SVM, to predict bus travel time. It was observed that in sections with high mean and variance, the temporal models perform better than the spatial models. An algorithm to dynamically choose between the spatial and temporal SVM models, based on the current travel time, was also developed. The unique features of the present study are that the traffic system under consideration has high variability and that the input variables for prediction are obtained from Global Positioning System (GPS) units alone. The adopted scheme was implemented using data collected from GPS-fitted public transport buses in Chennai (India). The performance of the proposed method was compared with available methods that were reported under similar traffic conditions, and the results showed a clear improvement.
Article
Fine particulate matter with a diameter of less than 2.5 µm (PM2.5) has been used as an important atmospheric environmental parameter mainly because of its impact on human health. PM2.5 is affected by both natural and anthropogenic factors that usually have strong diurnal variations. Such information helps toward understanding the causes of air pollution, as well as our adaptation to it. Most existing PM2.5 products have been derived from polar-orbiting satellites. This study exploits the use of the next-generation geostationary meteorological satellite Himawari-8/AHI (Advanced Himawari Imager) to document the diurnal variation in PM2.5. Given the huge volume of satellite data, based on the idea of gradient boosting, a highly efficient tree-based Light Gradient Boosting Machine (LightGBM) method by involving the spatiotemporal characteristics of air pollution, namely the space-time LightGBM (STLG) model, is developed. An hourly PM2.5 dataset for China (i.e., ChinaHighPM2.5) at a 5 km spatial resolution is derived based on Himawari-8/AHI aerosol products with additional environmental variables. Hourly PM2.5 estimates (number of data samples = 1 415 188) are well correlated with ground measurements in China (cross-validation coefficient of determination, CV-R2 = 0.85), with a root-mean-square error (RMSE) and mean absolute error (MAE) of 13.62 and 8.49 µg m−3, respectively. Our model captures well the PM2.5 diurnal variations showing that pollution increases gradually in the morning, reaching a peak at about 10:00 LT (GMT+8), then decreases steadily until sunset. The proposed approach outperforms most traditional statistical regression and tree-based machine-learning models with a much lower computational burden in terms of speed and memory, making it most suitable for routine pollution monitoring.
Article
To address the problems of bus bunching and large gaps, this study combines bus holding and speed adjusting to alleviate them respectively, taking into account the characteristics of passengers' perceived waiting time. The difference between passengers' perceived waiting time at stops and the actual waiting time is described quantitatively through the expected waiting time of passengers. Bus holding based on a threshold method is implemented at any stop for bunching buses, and speed adjusting based on a Markovian decision model is implemented at limited stops for lagging buses. Simulations based on real data of a bus route show that the integrated control strategy is able to improve the service reliability and to decrease passengers' perceived waiting time at stops. Several insights have been uncovered through performance analysis: (1) Increasing the holding control strength improves headway regularity but leads to a greater perceived waiting time; (2) Compared to traveling freely, suitable speed guidance will not slow down the average cruising speed in the trip; (3) The scale of passenger demand and the number of through passengers are the two key factors influencing whether a stop should be selected as a speed-adjusting control point.
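The threshold-based holding idea mentioned in the abstract above can be illustrated with a minimal Python sketch. The abstract does not give the actual rule, so everything here is an assumption made for illustration: the function name `holding_time`, the trigger defined as a fraction `threshold` of the planned headway, and the cap `max_hold_s` are all hypothetical and are not taken from the cited study.

```python
def holding_time(headway_s, planned_headway_s, threshold=0.75, max_hold_s=60.0):
    """Return how many seconds to hold a bus at a stop (0.0 means no holding).

    Assumed rule (hypothetical, for illustration only): a bus is considered
    to be bunching when its headway to the preceding bus falls below
    `threshold * planned_headway_s`. It is then held just long enough to
    restore that trigger headway, but never longer than `max_hold_s`.
    """
    if planned_headway_s <= 0:
        raise ValueError("planned_headway_s must be positive")
    trigger = threshold * planned_headway_s
    if headway_s >= trigger:
        # Headway is acceptable: release the bus immediately.
        return 0.0
    # Hold for the missing headway, capped to limit passenger delay.
    return min(trigger - headway_s, max_hold_s)


if __name__ == "__main__":
    # Planned headway of 10 minutes, observed headway of 400 s.
    print(holding_time(400.0, 600.0))
```

Raising `threshold` in this sketch corresponds to stronger holding control, which is the kind of trade-off the abstract describes between headway regularity and perceived waiting time.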
Article
This article proposes an optimization model to set the frequencies, vehicle capacities, required fleet and stops served by each route along a transit corridor so as to minimize the total user and operating costs. The optimization problem is solved by applying the "Black Hole" algorithm, which imitates the movement of stars (solutions) towards a black hole (best solution). The main contributions of the model are the incorporation of variable dwell times, which depend on bus stop demand, not only into the passenger perceived journey times but also into the bus cycle times, and the consideration of capacity constraints in both vehicles and bus stops. This leads to more accurate and realistic operating times and user perceived journey times. The application of the model to two case studies and the sensitivity analysis carried out demonstrate that for low levels of demand, constant dwell times can be assumed, provided that these times differ between the stops of the corridor according to their demand. However, with high levels of demand, the differences found in operating costs and travel times strongly support incorporating variable dwell times in the model in order to achieve a more realistic design of transit corridor strategies.
Article
Examining the travel time variability (TTV) of buses, passenger cars and taxis is essential to obtain reliable travel times for daily urban trips. TTV analyses of the three travel modes are conducted using travel time data collected on two urban arterial roads in Xi'an City. Firstly, the TTV is evaluated using statistical indexes. The results reveal that the TTV differs from vehicle to vehicle, period to period and site to site. Secondly, a finite mixture survival model is proposed to address the heterogeneity of travel time data by decomposing the population into several sub-populations. The Wasserstein distance and the Kolmogorov–Smirnov test are used to further compare the sub-populations of different vehicle types during different periods on different roads. Finally, the model analysis shows that the finite mixture survival model is an accurate tool for examining variability by capturing the heterogeneity of travel time data. The differences among the sub-populations suggest different travel behaviours. The study concludes that more diverse travel behaviours result in higher TTV. An accurate investigation of TTV is valuable for travellers' mode choices and for transportation management agencies seeking to obtain reliable travel time information and improve traffic efficiency.
Article
An important aspect of the quality of a public transport service is its reliability, which is defined as the invariability of the service attributes. In order to measure reliability during the service planning phase, a key piece of information is the long-term prediction of the density of the travel time, which conveys the uncertainty of travel times. This work empirically compares probabilistic models for the prediction of the conditional probability density function (PDF) of the travel time and proposes a simulation framework taking as input the latter distributions to approximate the expected secondary delays, a measure of the reliability of public transport services. Two types of probabilistic models, namely similarity-based density estimation models and a smoothed logistic regression for probabilistic classification model, are compared on a dataset of more than 41,000 trips and 50 bus routes of the city of Montréal. A similarity-based density estimation model using a k-nearest neighbors method and a log-logistic distribution predicted the best estimate of the true conditional PDF of the travel time and generated the most accurate approximations of the expected secondary delays on this dataset. This model reduced the mean squared error of the expected secondary delay by approximately 9% compared to the benchmark model, namely a random forest. This result highlights the added value of modeling the conditional PDF of the travel time with probabilistic models.
Article
With the boom of big data, the Internet contains more and more personal behavior information, but this information is difficult to extract effectively. A model with multivariate processing capability must be constructed to deal with these time series with complex characteristics. In this paper, a novel hybrid model embedding the Baidu Search Index is therefore proposed to implement multi-step-ahead subway passenger flow forecasting. Firstly, we collect data from the informative Baidu Search Index, reduce dimensionality, and screen out the powerful predictors by statistical analysis. Secondly, we extract matching common modes at similar time scales between the subway passenger flow and the screened Baidu Search Index via multivariate mode decomposition optimized by a multi-objective algorithm. Furthermore, to eliminate pseudo statistical causality, we select the optimal combination of modal components between subway passenger flow and its corresponding Baidu Search Index at each time scale by an innovative multi-modal analysis strategy. Thirdly, we reconstruct the forecasting values of each selected optimal combination as the final results. The empirical results for Beijing, Shanghai and Guangzhou show that the proposed model can significantly outperform six benchmark models in both level and directional accuracy. Thus, introducing the Baidu Search Index creates a sound opportunity to enhance subway passenger flow forecasting ability.
Article
Unreliable transit services can negatively impact transit ridership and discourage passengers from regularly choosing public transport. As a key component of bus service reliability, accurate bus arrival prediction can improve travel efficiency and enable a reliable and convenient transportation system. Accordingly, this paper proposes a novel deep learning method, i.e. variational mode decomposition long short-term memory (VMD-LSTM), for bus travel speed prediction in urban traffic networks using a forecast of bus arrival information on variable time horizons. The method uses the temporal and spatial patterns of the average bus speed series. The results show that the VMD-LSTM model outperforms other models in forecasting bus link speed series in future time intervals, whereas the artificial neural network model achieves the worst prediction. In conclusion, the VMD-LSTM method can detect irregular peaks of transit samples from a series of temporal or spatial variations and performs better on major and auxiliary corridors.
Article
Bus timetables play an important role in improving the level of service and reducing operations costs of a bus transit system. Without dedicated bus lanes, bus travel times, which are important input data for bus timetabling, are usually time-dependent due to recurrent traffic congestion. However, few studies on bus timetabling have explicitly considered such travel time time-dependency in creating timetables. This paper addresses the problem of how to optimally modify an existing single-line bus timetable by slightly shifting vehicle departure times at the departure terminal and holding vehicles at other stops taking into account time-dependent travel times. The problem is mathematically formulated as a nonlinear programming model. According to the special structure and properties of the model, a derivative-free constrained compass search algorithm with revised step-size updating rule is applied to solve it. A case study of a bus line in Beijing, China is conducted to demonstrate the effectiveness and efficiency of the proposed model and solution algorithm. The case study results show that by utilizing the proposed methodology the optimized bus timetable can significantly reduce the total passenger travel time and improve ridership comfort, while rarely increasing the average vehicle cycle time. This study offers a promising and practical methodology for optimizing single-line bus service taking into account time-dependent travel times.
Article
Providing accurate information about travel time to passengers is important in public transportation. In this respect, the travel time of buses between two consecutive stops can be treated as a time series, and the future travel time can then be predicted using time series forecasting methods. In this study, we propose a novel method with a three-layer architecture to predict bus travel time between two stops. In the first layer of the proposed method, an initial prediction is made by processing the measured data. In the second layer, residuals are predicted to a specified depth. In the third layer, the final prediction is made by integrating the results of the two previous layers using three different approaches. The experiments were performed on data obtained from the public transportation system of Istanbul, using various time series forecasting methods in both the traditional and the proposed architecture. The results show that the proposed method outperforms the traditional approach, with a MAPE of approximately 6.
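The three-layer idea described in the abstract above (initial prediction, residual prediction, integration) can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the abstract does not name the forecasters or the integration rules, so the moving-average forecaster, the single level of residual prediction, the additive combination, and the names `moving_average_forecast` and `three_layer_forecast` are all hypothetical choices, not the cited study's method.

```python
def moving_average_forecast(series, window):
    """One-step-ahead forecast: the mean of the last `window` values."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    return sum(series[-window:]) / window


def three_layer_forecast(travel_times, window=3):
    """Forecast the next travel time with a residual-correction scheme.

    Layer 1: initial prediction from the measured travel times.
    Layer 2: prediction of the next residual of the layer-1 forecaster.
    Layer 3: integration, here assumed to be a simple sum of both.
    """
    n = len(travel_times)
    if n < 2 * window:
        raise ValueError("need at least 2 * window observations")
    # Layer 1, in-sample: forecast each point from the values before it.
    initial = [
        moving_average_forecast(travel_times[:t], window)
        for t in range(window, n)
    ]
    # Residuals: what the layer-1 forecaster missed at each point.
    residuals = [
        actual - predicted
        for actual, predicted in zip(travel_times[window:], initial)
    ]
    next_initial = moving_average_forecast(travel_times, window)  # layer 1
    next_residual = moving_average_forecast(residuals, window)    # layer 2
    return next_initial + next_residual                           # layer 3


if __name__ == "__main__":
    # Steadily increasing travel times (seconds) between two stops.
    print(three_layer_forecast([100, 110, 120, 130, 140, 150]))
```

On a steadily increasing series a plain moving average always lags behind, and the residual layer in this sketch compensates for exactly that lag, which is the motivation for predicting residuals separately.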