Pattern-Based Bus Travel Time Prediction under Heterogeneous Traffic Conditions

B. Anil Kumar
Graduate Student
Department of Civil Engineering
Indian Institute of Technology Madras
Chennai-600036
INDIA
E-mail: raghava547@gmail.com
Lelitha Vanajakshi¹
Associate Professor
Department of Civil Engineering
Indian Institute of Technology Madras
Chennai 600 036
INDIA
Ph: 91 44 2257 4291, Fax: 91 44 2257 4252
E-mail: lelitha@iitm.ac.in
and
Shankar C. Subramanian
Associate Professor
Department of Engineering Design
Indian Institute of Technology Madras
Chennai 600 036
INDIA
Ph: 91 44 2257 4705
E-mail: shankarram@iitm.ac.in
Paper Submitted for presentation and publication in
Transportation Research Record, Transportation Research Board,
National Research Council, Washington, D. C.
Word Count: Body text (6782) + Figures (1500) + Tables (1000) = 7032.
Submitted on: August 1, 2013
¹ Corresponding author
Kumar, Vanajakshi, and Subramanian 1
ABSTRACT
In recent times, congestion levels have been increasing in Indian cities due to rapid
urbanization, leading to several negative impacts such as delays and pollution. There is a need
to explore better traffic operations and management systems to overcome these problems.
Attracting more travelers from personal vehicles to public transport is one of the ways to reduce
congestion levels. In this context, providing next-bus arrival time information to passengers at
bus stops will help improve the situation. This comes under Advanced Public Transportation
Systems (APTS), a major functional area of Intelligent Transportation Systems. The reliability of
such information depends on the prediction method used, which in turn depends on the input
data used in the method; identifying the most significant input data and using them in the
method is therefore important. In the present study, travel time pattern analysis was carried out
to find the most significant inputs by performing statistical tests for each day of the week
separately. A model-based Kalman filtering algorithm was then developed to predict bus travel
time under heterogeneous traffic conditions, using the identified patterns based on temporal
discretization. The proposed algorithm shows a clear improvement in prediction accuracy when
compared with a prediction method using space discretization.
INTRODUCTION
India is a developing country in which the economy, infrastructure and transportation facilities
are changing rapidly. Urban development leads to more traffic, resulting in several undesirable
consequences such as traffic congestion, delay and pollution. As the demand for mobility
increases, the use of Intelligent Transportation Systems (ITS) for better management of traffic
towards meeting these challenges is becoming more important.
ITS will help to provide reliable, effective and useful information to the travelers, by applying
latest developments in communication and information technology to surface transportation
systems. ITS is a general term in use, which encompasses different functional areas, of which
Advanced Public Transportation Systems (APTS) and Advanced Traveler Information Systems
(ATIS) seem to be useful in the Indian scenario. Attracting travelers to use public transportation
systems is a way to reduce traffic congestion. In this context, the prediction of both bus arrival
times and travel times is crucial to make the public transport more attractive. However, for this to
be effective, the information provided to passengers should be reliable. The present study
contributes to the area of bus travel time prediction for the development of accurate passenger
information systems.
The prediction techniques commonly used can be broadly classified into data-driven and
model-based techniques. Data-driven techniques require a large database, whereas model-based
techniques require a relatively limited database. However, irrespective of the amount of data
required, one should use the most significant input for better prediction accuracy.
Schweiger (1) suggested that the accuracy of prediction techniques depends on the travel time
patterns of the data collected. Identifying the most significant input data and using them in
prediction methods is therefore expected to improve their
performance. Traffic patterns can be typically classified as yearly, monthly, weekly, daily and
hourly. Yearly pattern analysis checks whether the travel time data of same-day/same-time trip
of the previous year(s) have a similar pattern as that of the current trip. Similarly, monthly,
weekly and daily patterns are compared with the corresponding month’s, week’s and day’s trips
respectively. Trip-wise pattern analysis checks whether the current trip has a similar pattern as
that of the previous trips on the same day. This may help to capture the traffic conditions on that
particular day such as accidents and route diversions. In the present study, bus travel time
prediction using a model-based approach is attempted. The most significant input data are
identified by carrying out a statistical pattern analysis of the data, and the inputs thus
identified are used in the prediction model.
LITERATURE REVIEW
Many researchers have suggested different techniques to find out travel time patterns. Ohba et al.
(2) obtained travel time patterns from the mean of the smoothened travel times collected from
toll booths of expressways to predict travel time. Wu et al. (3) obtained travel time patterns from
the speeds of vehicles from loop detectors and concluded the weekly pattern to be significant in
that data. Kwon et al. (4) used the data obtained from loop detectors to obtain day-to-day travel
time trends to predict travel time using regression analysis. It was reported that there is a strong
dependence between two successive vehicle travel times within a day. Lee (5) used Global
Positioning Systems (GPS) data to analyze travel time patterns using historical travel time
trajectories similar to the current trip. Kumar (6) used GPS data to obtain travel time patterns by
using parametric statistical tests to predict bus travel times.
Various techniques are being used to predict bus arrival times such as historical and real-
time approaches, statistical techniques, machine-learning techniques and model-based
techniques. Historical methods predict the travel time of a particular time period (trip) by
averaging the travel times of many previous trips in the same time period. These methods
perform well under expected traffic conditions. However, under unexpected traffic conditions,
the prediction accuracy is reduced. In real-time methods, travel time can be predicted for
the next time period by using the present time period’s value, i.e., it assumes that the future
travel time is the same as the present one. This method is reliable, if real-time data are
continuously available and traffic conditions are normal. Any disturbance in receiving data
causes deviation in the expected performance of the method. Statistical techniques are very
popular to predict travel times, which include time-series methods and regression techniques.
Time-series based predictions make the underlying assumption that historical travel patterns will
remain the same in future. This technique needs a large amount of reliable data. Regression
techniques will predict the dependent variable (travel time) by using an equation formed by a set
of independent variables that can affect travel time. These independent variables may include
road conditions, traffic conditions, signals, intersections, driver characteristics, and vehicle
composition. The accuracy of prediction depends on identifying and applying the suitable
independent variables. Machine learning techniques such as Artificial Neural Networks (ANN)
and Support Vector Machines (SVM) are commonly used to predict travel time because of their
ability to capture complex non-linear relationships. These techniques need a large amount of
data to train the system. Model-based techniques develop models that can capture the dynamics
of the system by establishing mathematical relationships between appropriate variables. Many of
the model-based studies use estimation techniques such as the Kalman Filtering Technique
(KFT) for the estimation/prediction of traffic parameters such as density, travel time, etc. The
following table presents a summary of literature related to this study with appropriate remarks
identifying their features and limitations.
TABLE 1 Summary of Literature Review
S. No | Author | Technique used | Traffic characteristic | Remarks
1 | Lin (7) | Empirical analysis | Homogeneous | Used location data, schedule information, waiting time at bus stops.
2 | Bo et al. (8) | Linear Regression | Homogeneous | Used one month weekdays GPS based bus travel time data.
3 | Jeong (9) | Regression | Homogeneous | Evaluated a historical data based model, regression model and ANN model.
4 | Patnaik (10) | Regression | Homogeneous | Developed models using path-based data. Study results were not corroborated with field data.
5 | Bhandari (11) | Auto Regressive (AR) model | Homogeneous | Used seven months’ AVL data.
6 | Chien et al. (12) | ANN | Homogeneous | Developed link-based and path-based ANN models to predict bus arrival time.
7 | Dailey et al. (13) | Kalman filtering | Homogeneous | Compared a historical data based model, regression model and ANN model.
8 | Son et al. (14) | Kalman filtering | Homogeneous | Predicted travel time from bus stop to stop line at signalized intersections.
9 | Shalaby (15) | Kalman filtering | Homogeneous | Used data collected from AVL and APC to predict bus travel time.
10 | Nanthawichit et al. (16) | Kalman filtering | Homogeneous | Used data collected from GPS equipped vehicles and loop detectors to estimate traffic parameters, compared results with historical, regression and historical methods.
11 | Shalaby (17) | Kalman filtering | Homogeneous | Used data collected from AVL and APC to predict bus travel time.
12 | Rama Krishna et al. (18) | ANN model | Heterogeneous | Used GPS based collected data in Chennai, compared ANN with multiple linear regression models.
13 | Vanajakshi et al. (19) | Kalman filtering | Heterogeneous | Used only previous two buses data to predict next bus travel time based on space discretization approach.
14 | Padmanabhan et al. (20) | Kalman filtering | Heterogeneous | Included dwell time explicitly in space discretization approach to predict bus travel time.
Most of the studies discussed above dealt with homogeneous traffic conditions. Only a few
studies have been reported from the heterogeneous traffic conditions that tend to exist in
developing countries. Rama Krishna et al. (18) used 25 trips of GPS data to develop Multiple
Linear Regression and ANN models. Vanajakshi et al. (19) used a space discretization approach
to predict bus travel time. In space discretization, the route was spatially discretized into
smaller subsections, and the travel time of a bus in the upcoming subsections was predicted.
The reason for such an approach was limited data availability: space discretization required
data from only the two previous buses to implement the prediction algorithm. The basic
assumption in that approach is that the trip-wise data are good enough for prediction, and the
model hypothesized a relation in travel time between neighboring subsections. Padmanabhan et
al. (20) extended the above study by analyzing the dwell times explicitly. Kumar (6) used GPS
data to find travel time patterns in the data and reported a strong weekly pattern followed by a
trip-wise pattern.
It can be observed that the input data for the prediction methods were chosen arbitrarily in
most of the above studies, except by Kumar (6). Kumar (6) used the same pattern for all days of
the week and for all traffic conditions to develop a method for predicting travel time. However,
none of the studies analyzed the travel time pattern of each day of the week separately. It is
important to do so, since patterns are not likely to be the same for all days in a week. For
example, the travel time on weekends may follow a different pattern compared to weekdays.
Identifying the most significant trips and incorporating them in the analysis is expected to
improve the accuracy of the prediction method. Also, it can be observed that studies
reported from heterogeneous traffic conditions mainly dealt with spatial discretization rather
than temporal discretization. This is mainly due to the lack of a temporal database. The present
study is one of the first attempts to study bus travel time prediction under heterogeneous
traffic conditions using temporal discretization. Thus, the present study has two objectives:
1. To analyze the travel time pattern for each day of the week separately by statistical analysis, and
2. To develop a bus travel time prediction method that uses the identified patterns based on
temporal discretization under heterogeneous traffic conditions.
DATA COLLECTION AND EXTRACTION
GPS, widely used to collect data for APTS applications, tracks vehicles continuously and
provides their location information. In the present study, data were collected by using
permanently fixed GPS units in Metropolitan Transport Corporation (MTC) buses in the
metropolitan city of Chennai, India. For the purpose of collecting data, an MTC bus route, 5C,
was selected that spans 15 km, connecting the Parry’s bus depot, located in the northern part of
the city, to the Taramani bus depot, in the southern part of the city. There are 25 bus stops and
14 signalized intersections on this route. The selected road stretch is a typical representative of
heterogeneous traffic conditions. The route comprises several types of urban roads with varying
geometric characteristics, volume levels and land use characteristics such as residential,
commercial and institutional areas. The collected GPS data include the ID of the GPS unit, the
time stamp, and the latitude and longitude of the location at which the entry was made. Real-time
communication of these data was made possible through General Packet Radio Service (GPRS).
The collected data were stored in a Structured Query Language (SQL) database encompassing all
trips in a day.
The data from all 7 buses running on the selected route (route number 5C), reporting every
five seconds from 6 AM to 8 PM, were used. The average headway between two consecutive
vehicles on this route is around 45 minutes. Thus, data from a total of 975 trips were collected
during the 45-day data collection period from 1st January 2013 to 14th February 2013.
From the GPS data, the distance between two consecutive entries was calculated by using
the Haversine formula (21), which gives the great-circle distance between two points on a
sphere from their latitudes and longitudes as
$$ d = 2r \arcsin\left(\sqrt{\sin^2\left(\frac{\phi_2 - \phi_1}{2}\right) + \cos\phi_1 \cos\phi_2 \sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right)}\right), \qquad (1) $$
where $d$ is the distance between the two points, $r$ is the radius of the earth (6378.1 km),
$\phi_1, \phi_2$ indicate the latitude of point 1 and point 2, and $\lambda_1, \lambda_2$
indicate the longitude of point 1 and point 2. After this process, the data consist of the
travel times and the corresponding distances between consecutive locations for all the buses.
The entire section was divided into 150 subsections, each of 100 m length, and the time taken to
cover each subsection was calculated by using linear interpolation. To identify the patterns in
the data, fifteen days of data from 29th January 2013 to 12th February 2013 were taken as the
output set. Since the headway between the buses is approximately 45 minutes, one trip for every
hour was used from 6 AM to 8 PM. Thus, a total of 210 output trips (14 trips/day × 15 days) were
generated. After analyzing the collected data, all these trips were divided into 4 zones, based on
the starting time of each trip, as morning off-peak (6 AM to 8 AM), morning peak (8 AM to
10 AM), afternoon off-peak (10 AM to 3 PM) and evening peak (3 PM to 8 PM). Travel time
patterns for peak (morning peak and evening peak together) and off-peak (morning off-peak and
afternoon off-peak together) trips were analyzed separately for each day of the week. Each
peak and off-peak output trip was compared with the corresponding input trips of the 28 previous
days with the same starting time as that of the output trip.
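The distance and interpolation steps described above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used in the study: the function names (`haversine_km`, `crossing_time`, `subsection_times`) and the sample readings are hypothetical, and the interpolation assumes that the cumulative distance never decreases along a trip.

```python
import math

EARTH_RADIUS_KM = 6378.1  # radius quoted in the text


def haversine_km(lat1, lon1, lat2, lon2, r=EARTH_RADIUS_KM):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def crossing_time(times_s, cum_dist_m, target_m):
    """Linearly interpolated time at which the bus first reaches target_m."""
    for i in range(1, len(cum_dist_m)):
        if cum_dist_m[i] >= target_m:
            d0, d1 = cum_dist_m[i - 1], cum_dist_m[i]
            t0, t1 = times_s[i - 1], times_s[i]
            if d1 == d0:  # bus stationary between the two readings
                return t0
            return t0 + (target_m - d0) / (d1 - d0) * (t1 - t0)
    raise ValueError("target lies beyond the recorded distance")


def subsection_times(times_s, cum_dist_m, length_m=100.0):
    """Travel time (s) for each complete subsection of length_m metres."""
    n_sub = int(cum_dist_m[-1] // length_m)
    marks = [crossing_time(times_s, cum_dist_m, k * length_m)
             for k in range(n_sub + 1)]
    return [later - earlier for earlier, later in zip(marks, marks[1:])]


# Hypothetical readings every 5 s: cumulative distance in metres.
times = [0, 5, 10, 15, 20]
cum_dist = [0, 50, 150, 250, 300]
print(subsection_times(times, cum_dist))
```

In practice the cumulative distance would be built by summing `haversine_km` over consecutive GPS entries (converted to metres) before calling `subsection_times`.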
TRAVEL TIME PATTERN ANALYSIS
To analyze the travel time patterns, the Z-test for the mean of a population of differences for
paired samples was conducted at the 5% level of significance (22). The test compared the travel
time of each 100 m subsection of the output trip with that of the input trip to check whether
the mean of the paired differences is zero. The check for a daily pattern analyzes the
significance of trips that happened in the same time period on the previous days relative to the
current trip. A basic assumption of this Z-test is that the differences between the 100 m
subsection travel times of the output trip and the input trip follow a normal distribution.
Whether this assumption holds was tested by using the statistical measure “skewness”, given as
$$ \text{Skewness} = \frac{3(\bar{x} - M)}{s}, \qquad (2) $$
where $\bar{x}$ is the sample mean, $M$ is the sample median and $s$ is the standard deviation
of the sample. According to Rees et al. (23), if the skewness value is greater than +1, the
distribution has positive skew and if the skewness value is less than -1, the distribution has
negative skew. If the skewness value lies between -1 and +1, the distribution is roughly
symmetrical, which is taken here as an indication that it follows a normal distribution. In the
present study, the skewness is calculated for the differences of 100 m subsection travel times
between the output trip and the input trip. The results of skewness calculated for various trips
on a sample day, 29th January 2013, are shown in Figure 2.
FIGURE 2 Skewness calculated for various trips on 29th January 2013.
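The symmetry check can be illustrated with a short Python sketch. It assumes the skewness measure is Pearson's second coefficient, 3 × (mean − median) / standard deviation, which matches the symbols defined in the text; the function names and sample values are hypothetical.

```python
import statistics


def pearson_skewness(values):
    """Pearson's second skewness coefficient: 3 * (mean - median) / s."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    s = statistics.stdev(values)  # sample standard deviation
    return 3 * (mean - median) / s


def roughly_symmetric(values, limit=1.0):
    """True when the skewness lies strictly between -limit and +limit."""
    return -limit < pearson_skewness(values) < limit


# Hypothetical differences (s) between output-trip and input-trip
# travel times over a few 100 m subsections.
diffs = [1.0, -2.0, 0.5, 1.5, -1.0, 0.0]
print(pearson_skewness(diffs), roughly_symmetric(diffs))
```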
From the figure it can be observed that all the calculated skewness values lie within the
range of -1 to +1. It can therefore be concluded that the differences between the 100 m
subsection travel times of the output trip and the input trip follow a normal distribution and
hence the Z-test can be adopted for hypothesis testing. The test statistic is given by
$$ Z = \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad (3) $$
where $\bar{d}$ is the mean of the differences between the 100 m subsection travel times of the
output trip and the input trip, $s_d$ is the standard deviation of the sample and $n$ is the
sample size. In the present study, the test was conducted at the 5% level of significance. So, if
the calculated Z-value lies between -1.96 and +1.96, the null hypothesis is accepted, which
means that the mean of differences is taken to be zero. To analyse the daily pattern, 5880
(15 days × 14 trips/day × 28 preceding days’ same-time trips) Z-values were calculated. Then, the
ratio of the number of times the null hypothesis was accepted to the total number of times the
hypothesis was tested was calculated for each case. If the ratio is high, it can be concluded
that the corresponding input trip is significant in predicting the current trip (output trip).
The results obtained from the statistical analysis for two sample cases are shown in Figure 3A
and Figure 3B, and the detailed results are given in Table 2. Table 2 lists the rankings obtained
for each day for peak and off-peak traffic conditions separately. Overall, it can be seen that
Sunday has a strong weekly pattern without any strong daily pattern, whereas the other days
have a strong daily pattern too.
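The paired test and the acceptance ratio used for ranking can be sketched as below. This is an illustrative reading of the procedure, assuming the standard paired statistic Z = mean(d) / (s_d / √n) and the two-sided critical value 1.96; the function names and the sample travel times are hypothetical.

```python
import math
import statistics

Z_CRITICAL = 1.96  # two-sided test at the 5% level of significance


def paired_z(output_times, input_times):
    """Z statistic for the mean of paired subsection travel time differences."""
    diffs = [o - i for o, i in zip(output_times, input_times)]
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # undefined (zero) if all differences are equal
    return d_bar / (s_d / math.sqrt(len(diffs)))


def acceptance_ratio(pairs):
    """Share of (output, input) trip pairs for which |Z| stays below 1.96."""
    accepted = sum(1 for out, inp in pairs
                   if abs(paired_z(out, inp)) < Z_CRITICAL)
    return accepted / len(pairs)


# Hypothetical subsection travel times (s) for two output/input pairs.
similar = ([10, 12, 11, 13], [9, 13, 10, 14])
different = ([20, 22, 21, 23], [15, 16, 16, 17])
print(acceptance_ratio([similar, different]))
```

Ranking the candidate input days (d-1 to d-28) by this ratio, from highest to lowest, would give an ordering of the kind reported in Table 2.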
FIGURE 3A Travel time patterns observed for Sunday peak period.
FIGURE 3B Travel time patterns observed for Monday peak period.
TABLE 2 Pattern analysis results
Rank | Sunday Off-Peak | Sunday Peak | Monday Off-Peak | Monday Peak | Tuesday Off-Peak | Tuesday Peak | Wednesday Off-Peak | Wednesday Peak | Thursday Off-Peak | Thursday Peak | Friday Off-Peak | Friday Peak | Saturday Off-Peak | Saturday Peak
1 | d-7 | d-28 | d-5 | d-7 | d-3 | d-5 | d-2 | d-9 | d-3 | d-2 | d-1 | d-1 | d-3 | d-4
2 | d-14 | d-7 | d-2 | d-25 | d-5 | d-7 | d-9 | d-14 | d-8 | d-6 | d-4 | d-11 | d-28 | d-8
3 | d-15 | d-21 | d-7 | d-2 | d-7 | d-11 | d-26 | d-26 | d-28 | d-8 | d-9 | d-16 | d-5 | d-15
4 | d-21 | d-14 | d-11 | d-3 | d-1 | d-12 | d-1 | d-28 | d-2 | d-13 | d-20 | d-2 | d-21 | d-21
5 | d-28 | d-1 | d-14 | d-4 | d-4 | d-25 | d-5 | d-2 | d-7 | d-22 | d-2 | d-3 | d-1 | d-22
6 | d-22 | d-15 | d-4 | d-5 | d-6 | d-6 | d-6 | d-7 | d-13 | d-1 | d-3 | d-4 | d-2 | d-28
7 | d-1 | d-22 | d-12 | d-6 | d-8 | d-15 | d-7 | d-13 | d-19 | d-3 | d-7 | d-7 | d-4 | d-17
8 | d-8 | d-5 | d-18 | d-13 | d-24 | d-20 | d-19 | d-1 | d-1 | d-7 | d-10 | d-22 | d-8 | d-24
9 | d-10 | d-8 | d-24 | d-14 | d-10 | d-27 | d-20 | d-6 | d-5 | d-17 | d-11 | d-8 | d-9 | d-25
10 | d-18 | d-9 | d-6 | d-18 | d-11 | d-4 | d-21 | d-15 | d-10 | d-20 | d-16 | d-14 | d-7 | d-5
11 | d-26 | d-16 | d-10 | d-26 | d-12 | d-8 | d-27 | d-19 | d-21 | d-27 | d-22 | d-28 | d-12 | d-7
12 | d-2 | d-18 | d-13 | d-10 | d-14 | d-10 | d-11 | d-27 | d-26 | d-10 | d-6 | d-15 | d-15 | d-16
13 | d-3 | d-27 | d-17 | d-11 | d-15 | d-13 | d-13 | d-5 | d-27 | d-14 | d-14 | d-18 | d-22 | d-1
14 | d-4 | d-10 | d-23 | d-12 | d-13 | d-14 | d-18 | d-16 | d-6 | d-15 | d-15 | d-9 | d-24 | d-6
15 | d-9 | d-17 | d-25 | d-17 | d-17 | d-18 | d-25 | d-20 | d-12 | d-21 | d-27 | d-25 | d-10 | d-10
16 | d-16 | d-19 | d-26 | d-19 | d-18 | d-19 | d-8 | d-21 | d-15 | d-24 | d-28 | d-27 | d-14 | d-11
17 | d-19 | d-3 | d-3 | d-24 | d-19 | d-26 | d-4 | d-22 | d-9 | d-26 | d-8 | d-20 | d-17 | d-14
18 | d-23 | d-24 | d-27 | d-27 | d-21 | d-1 | d-12 | d-8 | d-14 | d-5 | d-13 | d-23 | d-23 | d-23
19 | d-25 | d-25 | d-9 | d-21 | d-25 | d-3 | d-14 | d-12 | d-16 | d-9 | d-18 | d-6 | d-25 | d-26
20 | d-27 | d-26 | d-16 | d-23 | d-26 | d-17 | d-22 | d-18 | d-17 | d-23 | d-12 | d-17 | d-26 | d-2
21 | d-5 | d-2 | d-19 | d-28 | d-27 | d-21 | d-28 | d-23 | d-18 | d-28 | d-21 | d-21 | d-13 | d-3
22 | d-12 | d-4 | d-20 | d-16 | d-16 | d-22 | d-15 | d-25 | d-20 | d-16 | d-24 | d-24 | d-16 | d-9
23 | d-17 | d-11 | d-21 | d-9 | d-28 | d-24 | d-16 | d-4 | d-22 | d-19 | d-19 | d-10 | d-6 | d-12
24 | d-20 | d-12 | d-28 | d-20 | d-2 | d-28 | d-23 | d-10 | d-24 | d-12 | d-23 | d-13 | d-11 | d-13
25 | d-24 | d-13 | d-15 | d-8 | d-20 | d-9 | d-17 | d-11 | d-23 | d-4 | d-25 | d-5 | d-18 | d-19
26 | d-6 | d-20 | d-22 | d-15 | d-23 | d-2 | d-10 | d-3 | d-11 | d-11 | d-17 | d-19 | d-19 | d-18
27 | d-11 | d-23 | d-1 | d-22 | d-22 | d-16 | d-3 | d-17 | d-4 | d-18 | d-5 | d-12 | d-20 | d-20
28 | d-13 | d-6 | d-8 | d-1 | d-9 | d-23 | d-24 | d-24 | d-25 | d-25 | d-26 | d-26 | d-27 | d-27
* (d-n) represents the same-time trip on the nth previous day
The above ranking was used while selecting the input data for the prediction method. In
order to take into account the present-day traffic conditions, data from the previous two buses
(PV1 and PV2) were also taken into consideration. These data reflect the effect of events that
have taken place in that subsection on that day.
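As a small illustration of how the (d-n) labels of Table 2 map to calendar dates, the sketch below resolves the highest-ranked input days for a given output date. The helper names `resolve_lag` and `top_inputs` are hypothetical; the example uses the first three entries of the Tuesday off-peak column of Table 2 and the first output date of the study, 29th January 2013.

```python
from datetime import date, timedelta


def resolve_lag(label, output_date):
    """Turn a label such as 'd-7' into the calendar date of the input trip."""
    n = int(label.split("-")[1])
    return output_date - timedelta(days=n)


def top_inputs(ranking, output_date, k):
    """Dates of the k highest-ranked input days for one output date."""
    return [resolve_lag(label, output_date) for label in ranking[:k]]


# First three ranks of the Tuesday off-peak column in Table 2.
tuesday_off_peak = ["d-3", "d-5", "d-7"]
print(top_inputs(tuesday_off_peak, date(2013, 1, 29), k=3))
```

The lowest-ranked label, d-28, resolves to 1st January 2013 for this output date, which is the first day of the data collection period.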
BUS ARRIVAL TIME PREDICTION METHOD
A model-based approach using Kalman filtering was adopted in this study for bus travel
time/arrival time prediction. The KFT (24) can be used to estimate state variables, which are
used to characterize systems/processes that are described by state space models. The
implementation of the Kalman filter requires information regarding the system’s dynamics and
statistical information on the system disturbances and measurement errors. It uses the model and
system inputs to predict the a priori state estimate and uses the output measurements to obtain
the a posteriori state estimate. It is a recursive algorithm, so new measurements can be
processed as they are obtained. It needs only the current state estimate and the current input
and output measurements to calculate the next instant’s state estimate. The evolution of travel
time between various time intervals in a given subsection is assumed to be
$$ x(t+1) = A(t)\,x(t) + w(t), \qquad (4) $$
where $A(t)$ is a parameter that relates the travel times in a given subsection in successive
time intervals, $x(t)$ is the travel time taken for covering the given subsection at time $t$ and
$w(t)$ is the associated process disturbance. The measurement process was assumed to be
governed by
$$ z(t) = x(t) + v(t), \qquad (5) $$
where $z(t)$ is the measured travel time in a given subsection at time $t$ and $v(t)$ is the
measurement noise. It was further assumed that $w(t)$ and $v(t)$ are zero mean white Gaussian
noise signals with $Q(t)$ and $R(t)$ being their corresponding variances. Thus, two sets of data
are required to implement the above scheme: one set of data for the time update equations to
calculate the parameter $A$ and another data set to be used in the measurement update equations
to generate the a posteriori travel time estimate. So, the pattern analysis results were arranged
as two sets of data in the order of preference from lower to higher for both peak and off-peak
time zones of all days as shown in Table 3. The data in column 1 were used to obtain the value of
$A$ during each time interval and the data from column 2 were used to obtain the a posteriori
travel time estimate.
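TABLE 3 Data set used for KF technique
Sunday Col 1 | Sunday Col 2 | Monday Col 1 | Monday Col 2 | Tuesday Col 1 | Tuesday Col 2 | Wednesday Col 1 | Wednesday Col 2 | Thursday Col 1 | Thursday Col 2 | Friday Col 1 | Friday Col 2 | Saturday Col 1 | Saturday Col 2
d-13 | d-11 | d-8 | d-1 | d-9 | d-22 | d-24 | d-3 | d-25 | d-4 | d-26 | d-5 | d-27 | d-20
d-6 | d-24 | d-22 | d-15 | d-23 | d-20 | d-10 | d-17 | d-11 | d-23 | d-17 | d-25 | d-19 | d-18
d-20 | d-17 | d-28 | d-21 | d-2 | d-28 | d-23 | d-16 | d-24 | d-22 | d-23 | d-19 | d-11 | d-6
d-12 | d-5 | d-20 | d-19 | d-16 | d-17 | d-15 | d-28 | d-20 | d-18 | d-24 | d-21 | d-16 | d-13
d-27 | d-25 | d-16 | d-9 | d-26 | d-25 | d-22 | d-14 | d-17 | d-16 | d-12 | d-18 | d-26 | d-25
d-23 | d-19 | d-27 | d-3 | d-21 | d-19 | d-12 | d-4 | d-14 | d-9 | d-13 | d-8 | d-23 | d-17
d-16 | d-9 | d-26 | d-25 | d-18 | d-17 | d-8 | d-25 | d-15 | d-12 | d-28 | d-27 | d-14 | d-10
d-4 | d-3 | d-23 | d-17 | d-13 | d-15 | d-18 | d-13 | d-6 | d-27 | d-15 | d-14 | d-24 | d-22
d-2 | d-26 | d-13 | d-10 | d-14 | d-12 | d-11 | d-27 | d-26 | d-21 | d-6 | d-22 | d-15 | d-12
d-18 | d-10 | d-6 | d-24 | d-11 | d-10 | d-21 | d-20 | d-10 | d-5 | d-16 | d-11 | d-7 | d-9
d-8 | d-1 | d-18 | d-12 | d-24 | d-8 | d-19 | d-7 | d-1 | d-19 | d-10 | d-7 | d-8 | d-4
d-22 | d-28 | d-4 | d-14 | d-6 | d-4 | d-6 | d-5 | d-13 | d-7 | d-3 | d-2 | d-2 | d-1
d-21 | d-15 | d-11 | d-7 | d-1 | d-7 | d-1 | d-26 | d-2 | d-28 | d-20 | d-9 | d-21 | d-5
d-14 | d-7 | d-2 | d-5 | d-5 | d-3 | d-9 | d-2 | d-8 | d-3 | d-4 | d-1 | d-28 | d-3
PV1 | PV2 | PV1 | PV2 | PV1 | PV2 | PV1 | PV2 | PV1 | PV2 | PV1 | PV2 | PV1 | PV2
TV | | TV | | TV | | TV | | TV | | TV | | TV |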
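Comparing Table 3 with the off-peak columns of Table 2 suggests that each ranking is read from the least to the most significant trip and dealt alternately into column 1 and column 2, with PV1 and PV2 appended last. The sketch below reproduces the Sunday columns under that reading; the interleaving rule is an inference from the two tables rather than a stated procedure, and the function name `split_ranking` is hypothetical.

```python
def split_ranking(ranked):
    """Split a ranking (rank 1 first) into the two Kalman filter input columns.

    The list is reversed so that the least significant trip comes first,
    then dealt alternately into column 1 and column 2. The two previous
    vehicles of the same day are appended as the final entries.
    """
    least_first = ranked[::-1]
    column1 = least_first[0::2] + ["PV1"]
    column2 = least_first[1::2] + ["PV2"]
    return column1, column2


# Sunday off-peak ranking from Table 2 (rank 1 to rank 28).
sunday_off_peak = [
    "d-7", "d-14", "d-15", "d-21", "d-28", "d-22", "d-1",
    "d-8", "d-10", "d-18", "d-26", "d-2", "d-3", "d-4",
    "d-9", "d-16", "d-19", "d-23", "d-25", "d-27", "d-5",
    "d-12", "d-17", "d-20", "d-24", "d-6", "d-11", "d-13",
]
col1, col2 = split_ranking(sunday_off_peak)
print(col1)
print(col2)
```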
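The steps in the algorithm were as follows:
1. The entire section of travel between origin and destination was divided into N
subsections of equal length (100 m).
2. The travel time data from column 1 were used to obtain the value of A through
$$ A(t) = \frac{x_{C1}(t+1)}{x_{C1}(t)}, \qquad (6) $$
where $x_{C1}(t)$ is the column 1 travel time for the given subsection in the tth time interval.
3. Let $x_{TV}(t)$ denote the travel time taken by the test vehicle (TV), which is the vehicle
for which the travel time needs to be predicted, to cover a given subsection. It was assumed that
$$ E[x_{TV}(1)] = \hat{x}(1), \qquad (7) $$
$$ E[(x_{TV}(1) - \hat{x}(1))^2] = P(1), \qquad (8) $$
where $\hat{x}(t)$ is the estimate of the travel time of the TV in the tth time interval.
4. For each subsequent time interval, the following steps were performed:
a. The a priori estimate of the travel time was calculated by using
$\hat{x}^-(t+1) = A(t)\,\hat{x}^+(t)$, where the superscript “-” denotes the a priori estimate
and the superscript “+” denotes the a posteriori estimate.
b. The a priori error variance (denoted by $P^-$) was calculated using
$$ P^-(t+1) = A(t)\,P^+(t)\,A(t) + Q(t). \qquad (9) $$
c. The Kalman gain (denoted by K) was calculated by using
$$ K(t+1) = P^-(t+1)\,[P^-(t+1) + R(t+1)]^{-1}. \qquad (10) $$
d. The a posteriori travel time estimate and error variance were calculated using,
respectively,
$$ \hat{x}^+(t+1) = \hat{x}^-(t+1) + K(t+1)\,[z(t+1) - \hat{x}^-(t+1)], \qquad (11) $$
$$ P^+(t+1) = [1 - K(t+1)]\,P^-(t+1). \qquad (12) $$
Thus, the objective here is to predict the travel time of the TV using the travel times
obtained from all previous vehicles, including PV1 and PV2, in a given subsection.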
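The recursion above is a standard scalar Kalman filter, and a minimal Python sketch of it for a single subsection is given below. It is an illustration under stated assumptions, not the authors' code: the noise variances `q` and `r` are taken as constants, the initial estimate `x0` and variance `p0` are supplied by the caller, and the function name and sample travel times are hypothetical.

```python
def kalman_predict(col1, col2, x0, p0, q, r):
    """Scalar Kalman filter over one subsection.

    col1: travel times used to compute A(t) in the time update
    col2: travel times used as measurements z(t) in the measurement update
    x0, p0: initial a posteriori estimate and error variance (assumed)
    q, r: process and measurement noise variances (assumed constant)
    Returns the final a posteriori estimate and its error variance.
    """
    x_post, p_post = x0, p0
    for t in range(len(col1) - 1):
        a = col1[t + 1] / col1[t]           # ratio of successive travel times
        x_prior = a * x_post                # a priori estimate
        p_prior = a * p_post * a + q        # a priori error variance
        k = p_prior / (p_prior + r)         # Kalman gain
        x_post = x_prior + k * (col2[t + 1] - x_prior)  # a posteriori estimate
        p_post = (1 - k) * p_prior          # a posteriori error variance
    return x_post, p_post


# Hypothetical travel times (s) for one 100 m subsection.
column1 = [12.0, 11.0, 13.0, 12.5]
column2 = [11.5, 12.0, 12.5, 13.0]
estimate, variance = kalman_predict(column1, column2, x0=11.5, p0=1.0, q=1.0, r=1.0)
print(estimate, variance)
```

A zero travel time in `col1` would make the ratio `a` undefined, so such entries would need to be filtered out before use.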
RESULTS AND DISCUSSIONS
The results obtained from the implementation of the algorithm presented in the previous section,
which will be referred to as the time discretization method, are discussed in this section. Since
one of the main contributions of this study is the incorporation of significant historic data as
input to predict bus travel time, a comparison was carried out with a scheme that does not use
pattern-based historic data, namely the space discretization method (19).
The scheme presented in this study was used to predict travel time for each 100 m
subsection. However, it was observed from the data that there is a distance of at least 500 m
between the bus stops. Hence, the final comparison was made between predicted and measured
travel times for every 500 m subsection. The predicted travel time to cover a 500 m subsection
was found by adding the corresponding five 100 m travel time values obtained from the
prediction algorithm. The prediction was carried out for a one-week period (29th January to
4th February 2013). The Mean Absolute Percentage Error (MAPE) was used to quantify the
prediction accuracy, and was calculated by using
$$ \text{MAPE} = \frac{100}{N} \sum_{i=1}^{N} \frac{|x_p(i) - x_m(i)|}{x_m(i)}, \qquad (13) $$
where $x_p(i)$ is the predicted travel time of the TV to cover the ith subsection, $x_m(i)$ is the
corresponding travel time measured from the field and $N$ is the number of subsections.
Figure 4A and Figure 4B show sample comparisons of the predicted travel times and the measured
travel times over 500 m subsections along with the corresponding MAPE. It can be observed that
the predicted values closely match the measured data.
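The aggregation to 500 m subsections and the MAPE calculation can be sketched as follows. The function names and the sample travel times are hypothetical, and any incomplete group at the end of the route is simply dropped in this sketch.

```python
def aggregate(times_100m, group=5):
    """Sum consecutive 100 m travel times into 500 m subsection times."""
    return [sum(times_100m[i:i + group])
            for i in range(0, len(times_100m) - group + 1, group)]


def mape(predicted, measured):
    """Mean Absolute Percentage Error between predicted and measured times."""
    total = sum(abs(p - m) / m for p, m in zip(predicted, measured))
    return 100.0 * total / len(measured)


# Hypothetical 100 m travel times (s) over a 1 km stretch.
predicted_100m = [11, 12, 10, 11, 11, 9, 9, 9, 9, 9]
measured_100m = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
print(mape(aggregate(predicted_100m), aggregate(measured_100m)))
```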
FIGURE 4A Predicted and measured travel times for a peak period trip on 31st January 2013.
FIGURE 4B Predicted and measured travel times for an off-peak trip on 31st January 2013.
Earlier results from this study (25) showed that Sunday has a distinctly different pattern,
while the other days are very similar in their pattern. Hence, an analysis was carried out to
study the effect of these patterns on the final prediction accuracy. The analysis was carried out
for the following combinations of travel time patterns in the time discretization approach.
1. Method 1: Using the same travel time patterns for all days and without separating into
peak and off-peak traffic conditions.
2. Method 2: Using different travel time patterns for each day of the week, but without
separating into peak and off-peak traffic conditions.
3. Method 3: Using the same travel time patterns for all days together, but separating the
trips into peak and off-peak.
4. Method 4: Using different travel time patterns for Sunday separately and all weekdays as
another group, and having peak and off-peak separated.
5. Method 5: Using different travel time patterns for each day of the week, and separating
the trips into peak and off-peak.
FIGURE 5 MAPE comparison with various approaches.
The results obtained for a test period of one week are shown in Figure 5, along with the space
discretization results. It can be observed that Methods 3, 4 and 5 (which analyse peak and
off-peak trips separately) perform better than Methods 1 and 2 (which do not separate peak and
off-peak trips) and space discretization. Figure 5 also shows that separating the patterns
day-wise makes little difference. It can be concluded that analysing the trips without separate
day-wise patterns, but with separate peak and off-peak groups, may be the best solution when
both accuracy and model development effort are taken into account.
The performance of the best time discretization method (Method 5) was compared with that
of space discretization (19). Space discretization uses the data obtained from the previous two
vehicles (PV1 and PV2) to predict the travel time of the next vehicle (TV). The errors obtained
for all trips on a sample day (3rd February 2013) are shown in Figure 6.
FIGURE 6 MAPE values for all trips of a sample day.
From Figure 6, it can be observed that the time discretization method performs
better than space discretization in most cases. Table 4 shows the performance
comparison for all days.
Table 4 MAPE Comparison between Time and Space Discretization
Date                  Time Discretization   Space Discretization
29th January 2013     25.88                 29.43
30th January 2013     29.16                 33.57
31st January 2013     29.95                 33.56
1st February 2013     29.88                 32.53
2nd February 2013     30.32                 31.95
3rd February 2013     20.55                 22.10
4th February 2013     27.55                 29.16
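The comparison in Table 4 can be summarised with a short calculation, sketched below. The values are transcribed from Table 4; the weekly means and the count of days are derived here for illustration and are not figures reported in the study.

```python
# MAPE values from Table 4: (date, time discretization, space discretization).
table4 = [
    ("29th January 2013", 25.88, 29.43),
    ("30th January 2013", 29.16, 33.57),
    ("31st January 2013", 29.95, 33.56),
    ("1st February 2013", 29.88, 32.53),
    ("2nd February 2013", 30.32, 31.95),
    ("3rd February 2013", 20.55, 22.10),
    ("4th February 2013", 27.55, 29.16),
]

# Mean MAPE over the test week for each approach.
time_mean = sum(t for _, t, _ in table4) / len(table4)
space_mean = sum(s for _, _, s in table4) / len(table4)

# Number of days on which time discretization has the lower MAPE.
days_time_better = sum(1 for _, t, s in table4 if t < s)

print(f"time: {time_mean:.2f}, space: {space_mean:.2f}, "
      f"days with lower time-discretization MAPE: {days_time_better}/{len(table4)}")
```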
SUMMARY AND CONCLUSIONS
The accuracy of the bus arrival time information provided to passengers plays a key role in its
acceptance. In order to improve the accuracy of the system, one should develop the prediction
method carefully. One factor that can improve the performance of the prediction method is the
choice of the correct input data. The first part of the present study conducted a pattern analysis
separately for each day, using 45 days’ data to identify the most significant input that can be used
for the prediction method. The analysis was carried out separately for peak and off-peak periods
and used a parametric statistical test (Z-test). It was observed that Sundays followed a strong
weekly pattern whereas the other days followed a weekly and daily pattern. Data from trips on
the same day were also used in the prediction method, in addition to the most significant inputs
identified by pattern analysis, to take into account the same day variations. However, these
patterns may be site specific and may need to be identified for a new location. The methodology
used in the study can be followed to carry out similar analysis in a new location.
The identified patterns were used in the end application of bus travel time prediction. The
model developed was based on time discretization and represented the evolution of travel time in
a subsection over time. This differs from earlier studies, which considered the evolution of
travel time between subsections. Discretization over time is expected to reflect more effectively
the effect of roadway characteristics in a given subsection, such as carriageway width and
signalized intersections. The performance of the method was compared with that of the space
discretization followed in previous studies. The results showed that the proposed algorithm
performed better than the space discretization approach.
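A minimal sketch of a scalar Kalman filter under time discretization is given below for illustration. It is not the implementation used in the study: the state transition factor (taken here as the ratio of successive travel times in the pattern data), the noise variances q and r, and all names and sample values are assumptions. The state is the travel time in one fixed subsection, and the index k runs over successive trips rather than over successive subsections.

```python
def predict_subsection(pattern_tt, same_day_tt, q=1.0, r=1.0, p0=1.0):
    """Predict travel times (s) of successive trips in one subsection.

    pattern_tt  : travel times of successive trips from the pattern data
    same_day_tt : travel times of trips already observed on the current day
    q, r, p0    : assumed process noise, measurement noise, initial error variance
    Returns the a priori prediction for trips 1 .. len(pattern_tt) - 1.
    """
    if len(pattern_tt) < 2 or not same_day_tt:
        raise ValueError("need at least two pattern values and one observation")
    x_post, p_post = same_day_tt[0], p0
    predictions = []
    for k in range(len(pattern_tt) - 1):
        # Assumed transition factor: trip-to-trip change seen in the pattern data.
        a = pattern_tt[k + 1] / pattern_tt[k]
        # Time update (a priori estimate and its error variance).
        x_prior = a * x_post
        p_prior = a * p_post * a + q
        predictions.append(x_prior)
        if k + 1 < len(same_day_tt):
            # Measurement update with the same-day observation of trip k + 1.
            gain = p_prior / (p_prior + r)
            x_post = x_prior + gain * (same_day_tt[k + 1] - x_prior)
            p_post = (1.0 - gain) * p_prior
        else:
            # No observation yet: carry the a priori estimate forward.
            x_post, p_post = x_prior, p_prior
    return predictions


# Hypothetical values: a flat pattern and two same-day observations.
preds = predict_subsection([100.0, 100.0, 100.0], [120.0, 110.0])
```

Under space discretization the same recursion would instead step from one subsection to the next along the route for a single trip.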
The study also analysed the effect of these patterns on the accuracy. A comparison of the
prediction accuracy with and without considering the day-wise patterns and the traffic
condition-wise patterns was carried out. The results showed that the patterns based on traffic
condition had a bigger impact on the prediction accuracy than the day-wise patterns.
The main challenge in using the time discretization approach is the requirement of a
sufficiently large data set (say, one month of data). Further analysis can be carried out to
identify the optimum number of data points required for reasonable accuracy. The proposed method
may be improved further by explicitly incorporating section-specific characteristics such as bus
stops and signals. The predicted travel time obtained from the proposed algorithm can be expressed
as time remaining or as actual clock time, and can be displayed at bus stops, inside buses, or
through web portals or cell phone messages.
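The two display forms mentioned above can be obtained from a predicted travel time as sketched below. The function name, the rounding to whole minutes, and the sample values are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def arrival_display(now, predicted_seconds):
    """Return (minutes remaining, arrival clock time as HH:MM)."""
    remaining_min = round(predicted_seconds / 60)
    arrival_clock = (now + timedelta(seconds=predicted_seconds)).strftime("%H:%M")
    return remaining_min, arrival_clock


# Hypothetical example: a 780 s prediction made at 09:15.
display = arrival_display(datetime(2013, 2, 3, 9, 15), 780)
```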
ACKNOWLEDGEMENT
The authors acknowledge the support for this study as a part of the sub-project CIE/10-
11/168/IITM/LELI under the Centre of Excellence in Urban Transport project funded by the
Ministry of Urban Development, Government of India, through letter No. N-11025/30/2008-
UCD.
REFERENCES
1. Schweiger, C.L. Real-time bus arrival information systems. In Transportation Research
Board, TCRP synthesis 48, 2003.
2. Ohba, Yoshikazu, Hideki, and Masao. Travel time prediction method for expressway
using toll collection system data. 7th World Congress on Intelligent Transport Systems,
Turin, 2000.
3. Wu, C.H., D.C. Su, J. Chang, C.C. Wei, J.M. Ho, K.J. Lin, and D. Lee. An advanced
traveler information system with emerging network technologies. Proceedings of 6th
Asia-Pacific Conference Intelligent Transportation Systems Forum, pp. 230-231.
4. Kwon, J., B. Coifman, and P. Bickel. Day-to-day travel-time trends and travel-time
prediction from loop-detector data. In Transportation research board: Journal of the
Transportation Research Board, No. 1717, Transportation Research Board, National
Research Council, Washington, D.C., 2007, pp. 120-129.
5. Lee, and W. Chien. HTTP: a new framework for bus travel time prediction based on
historical trajectories. Proceedings of the 20th International Conference on Advances in
Geographic Information Systems, 2012, pp. 279-288.
6. Kumar, S.V., and L. Vanajakshi. Pattern identification based bus arrival time prediction.
Proceedings of the Institution of Civil Engineers Transport, 2011,
http://www.icevirtuallibrary.com/content/article/10.1680/tran.12.00001. Accessed Jan. 14,
2013.
7. Lin, W.H., and J. Zeng. An Experimental Study of real-time Bus Arrival Time Prediction
with GPS Data. In Transportation Research Board: Journal of the Transportation
Research Board, No. 1666(1), Transportation Research Board, Washington, D.C., 1999,
pp. 13-20.
8. Bo, Y., L. Jing, Y. Bin, and Zhongjen. An Adaptive Bus Arrival Time Prediction Model.
Eastern Asia Society for Transportation Studies, 2009.
http://www.easts.info/publications/journalproceedings/journal2010/100064.pdf. Accessed Jan. 15, 2013.
9. Jeong, R., and L.R Rilett. Bus Arrival Time Prediction using Artificial Neural Network
Model. IEEE Intelligent Transportation Systems conference, Washington, D.C., 2004, pp.
988-993.
10. Patnaik, J., S. Chien, and A. Bladikas. Estimation of Bus Arrival Times using APC Data.
Journal of Public Transportation, No.1, 2004, pp. 1-20.
11. Bhandari, R.R. Bus Arrival Time Prediction using Stochastic Time Series and Markov
Chains. Ph. D dissertation, Department of Civil Engineering, New Jersey Institute of
Technology, 2005. http://archives.njit.edu/vol01/etd/2000s/2005/njit-etd2005-038/njit-
etd2005-038.pdf. Accessed Jan, 12, 2013.
12. Chien, S.I.J., Y. Ding, and C. Wei. Dynamic Travel time Prediction with Artificial Neural
Networks. Journal of Transportation Engineering 128, No. 5, 2002, pp.429-438.
13. Cathey, F.W., and D.J. Dailey. A Prescription for Transit Arrival/Departure Prediction
using AVL Data. In Transportation Research Part C: Emerging Technologies, No.11,
2003, pp. 241-264.
14. Son, B., H.J. Kim, C.H. Shin, and S.K. Lee. Bus Arrival Time Prediction Method for ITS
Application. Knowledge Based Intelligent Information and Engineering Systems, 2004,
Springer Berlin Heidelberg, pp. 8894.
15. Shalaby, A., and A. Farhan. Prediction models of bus arrival and departure times using
AVL and APC data. Journal of Public Transportation, Vol.7, No.1, 2004, pp. 41-60.
16. Nanthawichit, C., T. Nakatsuji, and H. Suzuki. Application of Probe vehicle data for real
time traffic state estimation and short time travel time prediction on a freeway. In
Transportation Research Board, (CD-ROM), Washington, D.C., 2006.
17. Chu, L., J.S. Oh, and W. Recker. Adaptive Kalman Filter Based Freeway Travel time
Estimation. In Proceedings of Transportation Research Board, Transportation Research
Board, National Research Council, Washington, D.C., 2005.
18. Ramakrishna, Y., V. Lakshmanan, and R. Sivanandan. Bus travel time prediction using
GPS data. http://www.gisdevelopment.net/proceedings/mapindia/2006/student%20oral/mi06stu_html.
Accessed Feb. 12, 2013.
19. Vanajakshi, L., S.C. Subramanian, and R. Sivanandan. Travel Time Prediction under
Heterogeneous Traffic Conditions using GPS Data from Buses. IET Journal on
Intelligent Transportation Systems 3 (1), 2009, pp. 1-9.
20. Padmanabhan, R.P.S., K. Divakar, L. Vanajakshi, and S.C. Subramanian. Development
of a Real-Time Bus Arrival Prediction System for Indian Traffic Conditions. IET Journal
on Intelligent Transportation Systems 4 (3), 2009, pp. 189-200.
21. Chamberlain, R.G. Great Circle Distance between Two Points.
http://www.movabletype.co.uk/scripts/gis-faq-5.1.html. Accessed Feb. 14, 2013.
22. Sprinthall, and C. Richard. Basic statistical analysis: Pearson education group, Inc., New
York, 2003.
23. Rees, D.G. Essential Statistics: Chapman & Hall/CRC Publishing, Inc., London, 2001.
24. Kalman, R.E. A New Approach to Linear Filtering and Prediction Problems. Transaction
of the ASME-Journal of Basic Engineering 82(1), 1960, pp.35-45.
25. Kumar, A. B., L. Vanajakshi, and S.C. Subramanian. Day-Wise Travel Time Pattern Analysis
under Heterogeneous Traffic Conditions. 2nd Conference of Transportation Research
Group of India, 2013, (Accepted).
... Kumar et al. [17] analyzed trip-wise, daily, and weekly patterns of bus travel times and developed a bus arrival prediction model. Kumar et al. [18] studied the travel time patterns for different days of the week using GPS data obtained from public transit buses. Another study by Vlahogianni et al. [19] showed pattern-based prediction to be more accurate than the classical time-series approach for short-term traffic prediction. ...
... In the first part of the analysis, a trajectory analysis was carried out to analyze the patterns in peak and off-peak trajectories and daily trajectories. In the previous studies made under similar traffic conditions [18], it was assumed that the peak and off-peak trip trajectories could be identified manually, and such groups of trips were significantly different from each other. Based on that, for the case with fixed clusters, trips between 8:00 am and 10:59 am and between 3:00 pm and 7:59 pm were considered peak trips and others as off-peak trips. ...
... These overlaps observed in the trip trajectories raise the question of whether manual grouping can account for these highly varying travel time patterns. Previous studies from similar traffic conditions on the prediction of bus travel times [18,31] assumed that travel times followed weekly patterns. It was assumed that peak and off-peak timings remained constant over different weeks. ...
Article
Full-text available
The prediction of bus travel time with accuracy is a significant step toward improving the quality of public transportation. Drawing meaningful inferences from the data and using these to aid in prediction tasks is always an area of interest. Earlier studies predicted bus travel times by identifying significant regressors, which were identified based on chronological factors. However, travel time patterns may vary depending on time and location. A related question is whether the prediction accuracy can be improved with the choice of input variables. The present study analyzes this question systematically by presenting the input data in different ways to the prediction algorithm. The prediction accuracy increased when the dataset was grouped, and separate models were trained on them, the highest accurate case being the one where the data-derived clusters were considered. This demonstrates that understanding patterns and groups within the dataset helps in improving prediction accuracy.
... Developing a model that can take into account all these factors is a challenging task. Existing literature mainly focused on three types of models for travel time prediction: time series (Rajbhandari 2005;Suwardo et al., 2010;Kumar, Vanajakshi 2012), ANN (Jeong, Rilett 2004;Ramakrishna et al. 2006;Kumar et al. 2014b;Fan, Gurmu 2015;Chen et al. 2007;Vanajakshi, Rilett 2004;Mazloumi et al. 2011), and KFTs (Liu et al. 2014;Nanthawichit et al. 2003;Shalaby, Farhan 2004;Vanajakshi et al. 2009;Padmanaban et al. 2010;Chu et al. 2005;Kumar et al. 2014a). There were only a few studies that paid special attention to high variability issue (Mazloumi et al. 2011) using ANN. ...
... Kumar and Vana-jakshi (2014) proposed a statistical methodology to find out patterns in the data and used them as input to predict the next bus travel time using KFT. Kumar et al. (2014aKumar et al. ( , 2014b used GPS data to predict bus travel time using ANN and the obtained results were compared with KFT. Results showed that ANN gave better results when there is a large data set for network training. ...
... In spatial KFT approach (Vanajakshi et al. 2009), the travel time of a bus in an upcoming subsection was predicted using the travel time in the previous subsection. In temporal KFT approach (Kumar et al. 2014a(Kumar et al. , 2014b the travel time of a bus in a subsection was predicted using previous many buses travel times in the same subsection. ...
Article
Full-text available
Real-time bus travel time prediction has been an interesting problem since past decade, especially in India. Popular methods for travel time prediction include time series analysis, regression methods, Kalman filter method and Artificial Neural Network (ANN) method. Reported studies using these methods did not consider the high variance situations arising from the varying traffic and weather conditions, which is very common under heterogeneous and lane-less traffic conditions such as the one in India. The aim of the present study is to analyse the variance in bus travel time and predict the travel time accurately under such conditions. Literature shows that Support Vector Machines (SVM) technique is capable of performing well under such conditions and hence is used in this study. In the present study, nu-Support Vector Regression (SVR) using linear kernel function was selected. Two models were developed, namely spatial SVM and temporal SVM, to predict bus travel time. It was observed that in high mean and variance sections, temporal models are performing better than spatial. An algorithm to dynamically choose between the spatial and temporal SVM models, based on the current travel time, was also developed. The unique features of the present study are the traffic system under consideration having high variability and the variables used as input for prediction being obtained from Global Positioning System (GPS) units alone. The adopted scheme was implemented using data collected from GPS fitted public transport buses in Chennai (India). The performance of the proposed method was compared with available methods that were reported under similar traffic conditions and the results showed a clear improvement.
... Time series models assume that the historical traffic patterns will remain the same in the future, and their precision is highly dependent on the correspondence between real-time and historical traffic patterns [12]. Regression models build transparent relationships between travel times and a set of independent variables that can affect travel times [13,14,15]. Patnaik et al. developed a set of regression models to estimate bus travel times using data collected by automatic passenger counters installed in buses [16]. ...
... Bai et al. applied the Kalman filtering-based dynamic algorithm to adjust bus travel times with the latest bus operation information and estimated baseline travel times [28]. A model based on the Kalman filtering algorithm needs continuous real-time feeds, and data fluctuations might cause difficulties in solving the time lag problem [14]. ...
Article
Full-text available
Transit agencies often provide estimates of bus travel times to downstream stops. This study aims to improve the perceived reliability of bus transit systems and enhance their competitiveness. This study considers the characteristics of low headway and high demand for mid-volume bus lanes. Considering the variation in right-of-way with respect to both time and space, a stop-based bus route is built to divide the road into sections. Available real-time data from a schedule-based mid-volume bus route are used, including bus global positioning system (GPS) data, road condition information, and weather. Based on the accelerated failure time (AFT) model, a dynamic travel time model considering right-of-way variation is established to estimate bus travel times between adjacent stops and explore the specific impact of bus right-of-way variation on the travel time. The AFT model is chosen because it can reveal the significance of different variables to estimate travel times, and simultaneously estimate expected travel times as well as travel time uncertainty. The experimental results indicate that bus right-of-way variation significantly affects travel times. In contrast to the linear model, the parameter estimated by the AFT model conforms better to expectations, especially for long-distance travel.
... TT variability is explained by yearly, monthly, day-to-day and hourly variability as well as vehicle-tovehicle variability (B. Anil et al., 2014;Büchel and Corman, 2018;Kieu et al., 2015). Daily variability is the fluctuation between the TT of a trip at a specific time on different days whereas hourly variation is the variation of a route's TT during the course of the day. ...
Preprint
An important aspect of the quality of a public transport service is its reliability, which is defined as the invariability of the service attributes. Preventive measures taken during planning can reduce risks of unreliability throughout operations. In order to tackle reliability during the service planning phase, a key piece of information is the long-term prediction of the density of the travel time, which conveys the uncertainty of travel times. We introduce a reliable approach to one of the problems of service planning in public transport, namely the Multiple Depot Vehicle Scheduling Problem (MDVSP), which takes as input a set of trips and the probability density function (p.d.f.) of the travel time of each trip in order to output delay-tolerant vehicle schedules. This work empirically compares probabilistic models for the prediction of the conditional p.d.f. of the travel time, as a first step towards reliable MDVSP solutions. Two types of probabilistic models, namely similarity-based density estimation models and a smoothed Logistic Regression for probabilistic classification model, are compared on a dataset of more than 41,000 trips and 50 bus routes of the city of Montr\'eal. The result of a vast majority of probabilistic models outperforms that of a Random Forests model, which is not inherently probabilistic, thus highlighting the added value of modeling the conditional p.d.f. of the travel time with probabilistic models. A similarity-based density estimation model using a $k$ Nearest Neighbors method and a Kernel Density Estimation predicted the best estimate of the true conditional p.d.f. on this dataset.
... Mean absolute percentage error (MAPE) is the performance metric we use in evaluating our model because of its very intuitive interpretation in terms of relative error to our predicted values, with smaller MAPE values indicating higher model accuracy. MAPE for the full tested dataset is 5.32%, which means that on average the predicted values are off by 5.32%, comparable to the results of previous studies by Yu et al. (2011) for multiple routes and by Kumar et al. (2014) for one route. We can conclude that our prediction model was robust and not strongly influenced by time of day or day of week, where we still had a MAPE of less than 6%. ...
Article
Full-text available
Accurate prediction of bus delays improves transit service delivery and can potentially increase passenger use and satisfaction. To date, models developed for predicting bus delays have been restricted to single routes because of their poor performance on a wide network, due to reliance on simplistic model architectures and limited sources of data. This paper proposes a deep learning-based framework for predicting bus delays at the network level. The framework is fueled by large, heterogenous bus transit data (GTFS) and vehicle probe data. We utilize entity embeddings to enable the framework to simultaneously fit functions and learn patterns from both categorical and continuous data streams. The framework results in a single model that is able to characterize the factors influencing delays on multiple routes, for multiple stations at a time, at different times of the day, and during different seasons. A case study was conducted in Saint Louis, Missouri, with data collected over a 1-month period. The results indicate that the developed modeling framework is high performing, predicting delays for multiple stops at a mean absolute percentage error (MAPE) of about 6%. For different routes and trips, the observed prediction errors were stable across days of week, bus stops, holidays, and time of day. Although peak hour factors and the distance to bus stop influenced the model’s prediction errors for some routes, the observed differences were not significant. Compared to previous research, the ability to simultaneously model continuous and categorical data with deep learning and the use of heterogenous data contributed to such high performance on multiple routes.
... Firstly, the main reason why existing estimation approaches could not achieve excellent accuracy is the fact that the travel times are impacted by various factors, such as different weather conditions [28,29], temporal variation of peak and off-peak hours [4,30,31], boarding passenger information [32][33][34], and real-time traffic conditions [35,36]. Some work focus on analyzing the impacts of different factors. ...
Article
Full-text available
Travel time data is an important factor for evaluating the performance of a public transport system. In terms of time and space within the nature of uncertainty, bus travel time is dynamic and flexible. Since the change of traffic status is periodic, contagious or even sudden, the changing mechanism of that is a hidden mode. Therefore, bus travel time prediction is a challenging problem in intelligent transportation system (ITS). Allowing for a large amount of traffic data can be collected at present but lack of precisely-conducting, it is still worth exploring how to extract feature sets that can accurately predict bus travel time from these data. Hence, a feature extraction framework based on the deep learning models were developed to reflect the state of bus travel time. First, the study introduced different historical stages of bus signaling time, taxi speed, the stop identity (ID) of spatial characteristics, and real-time possible arrival time, signified by fourteen spatiotemporal characteristic values. Then, an embedding network is proposed to leverage a wide and deep structure to mate the spatial and temporal data. In order to meet the temporal dependence requirements, an attention mechanism for a Recurrent Neural Network (RNN) was designed in this research in order to capture the temporal information. Finally, a Deep Neural Networks (DNN) was implemented in this research in order to achieve the dynamic bus travel time prediction. Two case studies of Guangzhou and Shenzhen were tested. The results showed that the performance of the algorithm was more efficient than that of the traditional machine-learning model and promoted by 4.82% compared to the deep neural network applied to the initial feature space. Moreover, the study visualized the weighted cost of attention on the bus’s travel time features during a certain running state. 
Therefore, the study demonstrated the proposed model enabled to understand the characteristic data of transit travel time with visualization.
Article
In recent years, deep learning models proved their ability to solve complex problems in the areas such as computer vision and natural language processing, and are receiving a lot of attention within the community of transportation systems as well. Though these are known as data-driven approaches, it is not yet reported whether providing a huge amount of data is sufficient or whether extra domain knowledge added as features will improve their performance. It is reasonable to expect that the performance of deep learning models will be improved by incorporating field-specific knowledge into the problem. This paper tries to address this question by taking Convolutional Neural Networks (CNNs) as a sample deep learning technique and comparing its performance with and without adding extra information about the data as feature input, for the application of bus travel time prediction. To extract extra information, the data are pre-processed using visual and statistical analyses, and the obtained knowledge is incorporated with the deep learning method. For pre-processing heat maps and statistical analysis were conducted using k-means clustering and Davies-Bouldin (DB) score to identify the optimum number of input groups. Further, the accuracy levels were compared with the deep learning method that was built with just data alone as input. The proposed models were evaluated on two selected bus routes, 19B and M1, in the City of Chennai, India. Results show that the provision of domain-related information having a positive impact on the prediction accuracy of up to 3% in selected routes. Performance comparison with existing methods such as historical average, linear regression, ANN, LSTM, and Conv-LSTM was also carried out and it was observed that the proposed method performed better than other existing methods.
Article
Travel time is a variable that varies over both time and space. Hence, an ideal formulation should be able to capture its evolution over time and space. A mathematical representation capturing such variations was formulated from first principles, using the concept of conservation of vehicles. The availability of position and speed data obtained from GPS enabled buses provide motivation to rewrite the conservation equation in terms of speed alone. As the number of vehicles is discrete, the speed-based equation was discretized using Godunov scheme and used in the prediction scheme that was based on the Kalman filter. With a limited fleet size having an average headway of 30 min, availability of travel time data at small interval that satisfy the requirement of stability of numerical solution possess a big challenge. To address this issue, a continuous speed fill matrix spatially and temporally was developed with the help of historic data and used in this study. The performance of the proposed Advanced Time-Space Discterization (AdTSD) method was evaluated with real field data and compared with existing approaches. Results show that AdTSD approach was able to perform better than historical average approach with an advantage up to 11% and 5% compared to Base Time Space Discretization (BTSD) approach. Also, from the results it was observed that the maximum deviation in prediction was in the range of 2–3 min when it is predicted 10 km ahead and the error is close to zero when it is predicted a section ahead i.e. when the bus is close to a bus stop, indicating that the prediction accuracy achieved is suitable for real field implementation.
Article
Full-text available
Traffic parameters in general follow some patterns and identifying them is important since it will lead to more efficient end applications. With the introduction of various automated sensors, huge amounts of data are being collected, which can be used for identifying traffic patterns. In general, traffic patterns can be classified as yearly, monthly, weekly, daily and hourly. Travel tine has been recognized as one of the most anportant parameters to fully facilitate many of the Intelligent Transportation Systems (ITS) applications. However, to predict the travel time, its weekly, daily and hourly patterns need to be analyzed. The present study analyzes the pattern followed by Global Positioning System (GPS) based bus travel time data. The travel time pattern may be different for different days of the week and hence the analysis was carried out separately for each day of the week. A systematic statistical analysis was carried out to rank these patterns in the order of significance for each day of the week separately. The statistical test namely, the Z-test for the mean of a population of differences for 'paired' samples data, was used for hypothesis testing at 5% level of significance using one month's data (with a total sample size of 5700). It was observed that all days have a similar pattern except Sunday. (C) 2013 The Authors. Published by Elsevier Ltd.
Article
Full-text available
This synthesis report will be of interest to transit staff concerned with implementing real-time bus arrival information systems at their agencies. Information on relevant technical capabilities, agency experience, cost, and bus rider reactions to these information systems was documented. The report describes the state of the practice, including both U.S. and international experience. It documents survey information, a review of the relevant literature, as well as interviews with key personnel at agencies that have, or are in the process of, implementing these systems. This report integrates the information obtained from the literature review and survey responses with the follow-up interviews. Case study information details specifics from agencies that have deployed these systems.
Article
Full-text available
The emphasis of this research effort was on using AVL and APC dynamic data to develop a bus travel time model capable of providing real-time information on bus arrival and departure times to passengers (via traveler information services) and to transit controllers for the application of proactive control strategies. The developed model is comprised of two Kalman filter algorithms for the prediction of running times and dwell times alternately in an integrated framework. The AVL and APC data used were obtained for a specific bus route in Downtown Toronto. The performance of the developed prediction model was tested using “hold out ” data and other data from a microsimulation model representing different scenarios of bus operation along the investigated route using the VISSIM microsimulation software package. The Kalman filter-based model outperformed other conventional models in terms of accuracy, demonstrating the dynamic ability to update itself based on new data that reflected the changing characteristics of the transit-operating environment. A user-interactive system was developed to provide continuous information on the expected arrival and departure times of buses at downstream stops, hence the expected deviations from schedule. The system enables the user to assess in real time 41 Journal of Public Transportation, Vol. 7, No. 1, 2004 transit stop-based control actions to avoid such deviations before their occurrence, hence allowing for proactive control, as opposed to the traditional reactive control, which attempts to recover the schedule after deviations occur.
Article
Full-text available
Bus transit operations are influenced by stochastic variations in a number of factors (e.g., traffic congestion, ridership, intersection delays, and weather conditions) that can force buses to deviate from their predetermined schedule and headway, resulting in deterioration of service and the lengthening of passenger waiting times for buses. Providing passengers with accurate bus arrival information through Advanced Trav-eler Information Systems can assist passengers' decision-making (e.g., postpone de-parture time from home) and reduce average waiting time. This article develops a set of regression models that estimate arrival times for buses traveling between two points along a route. The data applied for developing the proposed model were collected by Automatic Passenger Counters installed on buses operated by a transit agency in the northeast region of the United States. The results obtained are promis-ing, and indicate that the developed models could be used to estimate bus arrival times under various conditions.
Article
The provision of accurate bus arrival time information is one of the major components of advanced public transportation systems, which many metropolitan cities in developing countries are trying to implement to increase public transit usage. The effectiveness of such a system largely depends on the reliability of the information provided to the public. For such reliable information to be generated, the prediction technique used should be able to make accurate predictions, which in turn depends on the input data used for prediction. The present study is an attempt to explore these two areas, namely the identification of suitable input data by analysing trip-wise, daily and weekly patterns of bus travel times through valid statistical tests, and the development of an accurate bus arrival prediction model using a popular time series technique called exponential smoothing. The performance evaluation using data from 90 actual bus trips shows that feeding suitable input data into the prediction model yields better results, with a mean absolute percentage error of 12%, and that for 77% of the time the deviation of the predicted arrival time from the actual arrival time is within the user-acceptable range of ±5 min.
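The abstract above names exponential smoothing and reports accuracy as a mean absolute percentage error and as the share of predictions within a tolerance. A minimal sketch of those three calculations follows, assuming simple (single) exponential smoothing initialised with the first observation; the smoothing constant `alpha` and all function names are hypothetical and are not taken from the study.

```python
def exponential_smoothing_forecasts(series, alpha):
    """One-step-ahead forecasts; forecasts[i] is the prediction for series[i].

    The first forecast is set to the first observation (an assumed
    initialisation), so it carries no error by construction.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        return []
    forecasts = [series[0]]
    for actual in series[:-1]:
        # New forecast = weighted mix of the latest observation and the
        # previous forecast.
        forecasts.append(alpha * actual + (1.0 - alpha) * forecasts[-1])
    return forecasts


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def share_within(actual, predicted, tolerance):
    """Fraction of predictions whose absolute error is within the tolerance."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)
```

A larger `alpha` makes the forecast follow recent trips more closely, while a smaller one smooths out trip-to-trip noise.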
Conference Paper
In this paper, we develop a new bus travel time prediction framework, called Historical Trajectory based Travel/Arrival Time Prediction (HTTP), for real-time prediction of travel time over future segments (and thus the arrival time at stops) of an on-going bus journey. The basic idea behind HTTP is to use a collection of historical trajectories "similar" to the current bus trajectory to predict the future segments. Specifically, the HTTP framework (1) samples a set of similar trajectories as the basis for travel time estimation instead of relying on only one historical trajectory best matching the on-going bus journey; and (2) explores different prediction schemes, namely, passed segments, temporal features, and hybrid methods, to identify the sample set of similar trajectories. We conduct comprehensive empirical experiments using real bus trajectory data collected from Taipei City, Taiwan to validate our ideas and to evaluate the proposed schemes. Experimental results show that the proposed prediction schemes significantly outperform the state-of-the-art and baseline techniques.
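The idea of predicting from several similar historical trajectories, rather than from a single best match, can be illustrated with a simple nearest-neighbour sketch. This is not the HTTP framework itself: it assumes each trajectory is a list of per-segment travel times, measures similarity by Euclidean distance over the segments already passed, and averages the next-segment time of the `k` closest trajectories. The function name and data layout are hypothetical.

```python
import math


def predict_next_segment(history, passed, k):
    """Estimate the travel time of the next segment from similar past trips.

    history: list of past trajectories, each a list of segment travel times.
    passed: travel times of the segments the current bus has already covered.
    k: number of most similar trajectories to average over.
    """
    n = len(passed)
    # Only trajectories that actually contain the next segment are usable.
    candidates = [t for t in history if len(t) > n]
    if not candidates or k < 1:
        raise ValueError("need at least one usable trajectory and k >= 1")

    def distance(trajectory):
        # Euclidean distance over the segments passed so far.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(trajectory, passed)))

    nearest = sorted(candidates, key=distance)[:k]
    return sum(t[n] for t in nearest) / len(nearest)
```

Averaging over several neighbours reduces the influence of any one unusual historical trip, at the cost of blurring genuinely distinct traffic patterns when `k` is too large.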
Article
Transit operations are interrupted frequently by stochastic variations in traffic and ridership conditions that deteriorate schedule or headway adherence and thus lengthen passenger wait times. Providing passengers with accurate vehicle arrival information through advanced traveler information systems is vital to reducing wait time. Two artificial neural networks (ANNs), trained by link-based and stop-based data, are applied to predict transit arrival times. To improve prediction accuracy, both are integrated with an adaptive algorithm to adapt to the prediction error in real time. The bus arrival times predicted by the ANNs are assessed with the microscopic simulation model CORSIM, which has been calibrated and validated with real-world data collected from route number 39 of the New Jersey Transit Corporation. Results show that the enhanced ANNs outperform the ones without integration of the adaptive algorithm.
Article
Bus headway in a rural area is usually much larger than that in an urban area. Providing real-time bus arrival information could make the public transit system more user-friendly and thus enhance its competitiveness among various transportation modes. As part of an operational test for rural traveler information systems currently ongoing in Blacksburg, Virginia, an experimental study has been conducted on forecasting the arrival time of the next bus with AVL techniques. This paper discusses the process of developing arrival time estimation algorithms, including route representation, GPS data screening for identifying data quality and delay patterns, algorithm formulation, and the development of measures of performance. Whereas GPS-based bus location data are adopted in all four algorithms presented in the paper, the extent to which other information is used in these algorithms varies. In addition to bus location data, information relevant to the performance of an algorithm also includes scheduled arrival time, delay correlation, and waiting time at time-check stops. The performance of an algorithm using different levels of information is compared against three criteria: overall precision, robustness, and stability. Our results show that at the site where the study is being conducted, the dwell time at time-check stops is most relevant to the performance of an algorithm.