Pattern-Based Bus Travel Time Prediction under Heterogeneous
Traffic Conditions
B. Anil Kumar
Graduate Student
Department of Civil Engineering
Indian Institute of Technology Madras
Chennai-600036
INDIA
E-mail: raghava547@gmail.com
Lelitha Vanajakshi¹
Associate Professor
Department of Civil Engineering
Indian Institute of Technology Madras
Chennai 600 036
INDIA
Ph: 91 44 2257 4291, Fax: 91 44 2257 4252
E-mail: lelitha@iitm.ac.in
and
Shankar C. Subramanian
Associate Professor
Department of Engineering Design
Indian Institute of Technology Madras
Chennai 600 036
INDIA
Ph: 91 44 2257 4705
E-mail: shankarram@iitm.ac.in
Paper Submitted for presentation and publication in
Transportation Research Record, Transportation Research Board,
National Research Council, Washington, D. C.
Word Count: Body text (6782) + Figures (1500) + Tables (1000) = 7032.
Submitted on: August 1, 2013
¹ Corresponding author
Kumar, Vanajakshi, and Subramanian 1
ABSTRACT
In recent times, congestion levels have been increasing in Indian cities due to rapid
urbanization, leading to several negative impacts such as delays and pollution. There
is a need to explore better traffic operations and management systems to overcome these
problems. Attracting more travelers towards public transport from personal vehicles is one of the
ways to reduce congestion levels. In this context, provision of next bus arrival time information
at bus stops to passengers will help improve the situation. This comes under Advanced Public
Transportation Systems (APTS), a major functional area of Intelligent Transportation Systems.
The reliability of such information provided to passengers depends on the prediction method
used, which in turn depends on the input data used in the method. Identifying the most
significant/appropriate input data and using them in the method is therefore important. So, in the
present study, travel time pattern analysis was carried out to find the most significant inputs by
performing valid statistical tests for each day of the week separately. Also, a model-based
Kalman filtering algorithm was developed to predict bus travel time by using the identified
patterns effectively based on temporal discretization under heterogeneous traffic conditions. The
performance of the proposed algorithm shows a clear improvement in prediction accuracy when
compared with a prediction method using space discretization.
INTRODUCTION
India is a developing country, and its economy, infrastructure and facilities are changing
rapidly in all fields, including transportation. Urban development leads to more
traffic resulting in several undesirable consequences such as traffic congestion, delay and
pollution. As demand in mobility increases, the use of Intelligent Transportation Systems (ITS)
for better management of traffic towards meeting these challenges is becoming more important.
ITS will help to provide reliable, effective and useful information to the travelers, by applying
latest developments in communication and information technology to surface transportation
systems. ITS is a general term in use, which encompasses different functional areas, of which
Advanced Public Transportation Systems (APTS) and Advanced Traveler Information Systems
(ATIS) seem to be useful in the Indian scenario. Attracting travelers to use public transportation
systems is a way to reduce traffic congestion. In this context, the prediction of both bus arrival
times and travel times is crucial to make the public transport more attractive. However, for this to
be effective, the information provided to passengers should be reliable. The present study
contributes to the area of bus travel time prediction for the development of accurate passenger
information systems.
The prediction techniques commonly used can be mainly classified into data-driven and
model-based techniques. Data-driven techniques require a good database, whereas model-based
techniques require a relatively limited database. However, irrespective of the amount of data
required, one should use the most significant/appropriate input for better prediction accuracy.
Schweiger (1) suggested that the performance of prediction techniques in terms of their accuracy
depends on the travel time patterns of the data collected. Identifying the most significant and
effective input data and using them in prediction methods is therefore expected to improve their
performance. Traffic patterns can be typically classified as yearly, monthly, weekly, daily and
hourly. Yearly pattern analysis checks whether the travel time data of same-day/same-time trip
of the previous year(s) have a similar pattern as that of the current trip. Similarly, monthly,
weekly and daily patterns are compared with the corresponding month’s, week’s and day’s trips
respectively. Trip-wise pattern analysis checks whether the current trip has a similar pattern as
that of the previous trips on the same day. This may help to capture the traffic conditions on that
particular day such as accidents and route diversions. In the present study, bus travel time
prediction using a model-based approach is attempted. The most significant input data that need
to be used are identified by carrying out a pattern analysis of the data using statistical analysis.
The most significant inputs, thus identified, are used in the prediction model.
LITERATURE REVIEW
Many researchers have suggested different techniques to find out travel time patterns. Ohba et al.
(2) obtained travel time patterns from the mean of the smoothened travel times collected from
toll booths of expressways to predict travel time. Wu et al. (3) obtained travel time patterns from
the speeds of vehicles from loop detectors and concluded the weekly pattern to be significant in
that data. Kwon et al. (4) used the data obtained from loop detectors to obtain day-to-day travel
time trends to predict travel time using regression analysis. It was reported that there is a strong
dependence between two successive vehicle travel times within a day. Lee (5) used Global
Positioning Systems (GPS) data to analyze travel time patterns using historical travel time
trajectories similar to the current trip. Kumar (6) used GPS data to obtain travel time patterns by
using parametric statistical tests to predict bus travel times.
Various techniques are being used to predict bus arrival times such as historical and real-
time approaches, statistical techniques, machine-learning techniques and model-based
techniques. Historical methods predict the travel time of a particular time period (trip) by
averaging many previous observations of the same time period (trip). These methods perform
well under expected traffic conditions. However, under unexpected traffic conditions,
the prediction accuracy will be reduced. In real-time methods, travel time can be predicted for
the next time period by using the present time period’s value, i.e., it assumes that the future
travel time is the same as the present one. This method is reliable, if real-time data are
continuously available and traffic conditions are normal. Any disturbance in receiving data
causes deviation in the expected performance of the method. Statistical techniques are very
popular to predict travel times, which include time-series methods and regression techniques.
Time-series based predictions make the underlying assumption that historical travel patterns will
remain the same in future. This technique needs a large amount of reliable data. Regression
techniques will predict the dependent variable (travel time) by using an equation formed by a set
of independent variables that can affect travel time. These independent variables may include
road conditions, traffic conditions, signals, intersections, driver characteristics, and vehicle
composition. The accuracy of prediction depends on identifying and applying the suitable
independent variables. Machine learning techniques such as Artificial Neural Networks (ANN)
and Support Vector Machines (SVM) are commonly used to predict travel time because of their
ability to model complex non-linear relationships. These types of techniques need a large amount of
data to train the system. Model-based techniques develop models that can capture the dynamics
of the system by establishing mathematical relationships between appropriate variables. Many of
the model-based studies use estimation techniques such as the Kalman Filtering Technique
(KFT) for the estimation/prediction of traffic parameters such as density, travel time, etc. The
following table presents a summary of literature related to this study with appropriate remarks
identifying their features and limitations.
Table 1 Summary of Literature Review

| S. No | Author | Technique used | Traffic characteristic | Remarks |
|---|---|---|---|---|
| 1 | Lin (7) | Empirical analysis | Homogeneous | Used location data, schedule information, waiting time at bus stops. |
| 2 | Bo et al. (8) | Linear Regression | Homogeneous | Used one month weekdays GPS based bus travel time data. |
| 3 | Jeong (9) | Regression | Homogeneous | Evaluated a historical data based model, regression model and ANN model. |
| 4 | Patnaik (10) | Regression | Homogeneous | Developed models using path-based data. Study results were not corroborated with field data. |
| 5 | Bhandari (11) | Auto Regressive (AR) model | Homogeneous | Used seven months’ AVL data. |
| 6 | Chien et al. (12) | ANN | Homogeneous | Developed link-based and path-based ANN models to predict bus arrival time. |
| 7 | Dailey et al. (13) | Kalman filtering | Homogeneous | Compared a historical data based model, regression model and ANN model. |
| 8 | Son et al. (14) | Kalman filtering | Homogeneous | Predicted travel time from bus stop to stop line at signalized intersections. |
| 9 | Shalaby (15) | Kalman filtering | Homogeneous | Used data collected from AVL and APC to predict bus travel time. |
| 10 | Nanthawichit et al. (16) | Kalman filtering | Homogeneous | Used data collected from GPS equipped vehicles and loop detectors to estimate traffic parameters, compared results with historical, regression and historical methods. |
| 11 | Shalaby (17) | Kalman filtering | Homogeneous | Used data collected from AVL and APC to predict bus travel time. |
| 12 | Rama Krishna et al. (18) | ANN model | Heterogeneous | Used GPS based collected data in Chennai, compared ANN with multiple linear regression models. |
| 13 | Vanajakshi et al. (19) | Kalman filtering | Heterogeneous | Used only previous two buses data to predict next bus travel time based on space discretization approach. |
| 14 | Padmanabhan et al. (20) | Kalman filtering | Heterogeneous | Included dwell time explicitly in space discretization approach to predict bus travel time. |
Most of the studies discussed above dealt with homogeneous traffic conditions. A few
studies have been reported from heterogeneous traffic conditions that tend to exist in developing
countries. Rama Krishna et al. (18) used 25 trips of GPS data to develop Multiple Linear
Regression and ANN models. Vanajakshi et al. (19) used a space discretization approach to
predict bus travel time. In space discretization, the route was spatially discretized into smaller
subsections, and the travel time of a bus in the upcoming subsections was predicted. The reason for
such an approach was limited data availability. Under such a scenario, space discretization, which
requires data from only the two previous buses to implement the prediction algorithm, was
advantageous. The basic assumption in that approach is that the trip wise data are good enough
for prediction and the model hypothesized a relation in travel time between neighboring
subsections. Padmanabhan et al. (20) extended the above study by analyzing the dwell times
explicitly. Kumar (6) used GPS data to find out travel time patterns in the data and reported a
strong weekly pattern followed by trip-wise pattern.
It can be observed that the input data for the prediction methods were taken arbitrarily in
most of the above studies, except by Kumar (6). Kumar (6) used the same pattern for all days of
the week and for all traffic conditions to develop a method for predicting travel time. However,
none of the studies analyzed the travel time pattern of all days of the week separately. It is
important to do so, since patterns are not likely to be the same for all days in a week. For
example, the travel time on weekends may follow a different pattern compared to weekdays.
Identifying the most significant trips and incorporating them during the analysis is expected to
help in improving the accuracy of the prediction method. Also, it can be observed that studies
reported from heterogeneous traffic conditions mainly dealt with spatial discretization rather than time
discretization. This is mainly due to the lack of a temporal database. The present
study is one of the first attempts to study bus travel time prediction under heterogeneous
traffic conditions using temporal discretization. Thus, the present study has two objectives:
1. To analyze the travel time pattern for each day of the week separately by statistical analysis, and
2. To develop a bus travel time prediction method that uses the identified patterns based on
temporal discretization under heterogeneous traffic conditions.
DATA COLLECTION AND EXTRACTION
GPS, widely used to collect data for APTS applications, tracks vehicles continuously and
provides their location information. In the present study, data were collected by using
permanently fixed GPS units in Metropolitan Transport Corporation (MTC) buses in the
metropolitan city of Chennai, India. For the purpose of collecting data, MTC bus route 5C was
selected. It spans 15 km, connecting the Parry’s bus depot, located in the northern part of the
city, to the Taramani bus depot, in the southern part of the city. There are 25 bus stops and 14
signalized intersections on this route. The selected road stretch is typical of
heterogeneous traffic conditions. The route covers several types of urban roads with varying
geometric characteristics, volume levels and land use characteristics such as residential,
commercial and institutional areas. The collected GPS data include the ID of the GPS unit, time
stamp, and latitude and longitude of the location at which the entry was made. Real-time
communication of these data was made possible through General Packet Radio Service (GPRS).
The collected data were stored in a Structured Query Language (SQL) database encompassing all
trips in a day.
The data from all 7 buses running on the selected route (route number 5C), reporting every
five seconds from 6 AM to 8 PM, were used. The average headway between two consecutive
vehicles on this route is around 45 minutes. Thus, data for a total of 975 trips were collected during
the 45-day data collection period from 1st January 2013 to 14th February 2013.
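From the GPS data, the distance between two consecutive entries was calculated by using
the Haversine formula (21), which gives the great-circle distance between two points on a
sphere from their latitudes and longitudes as
$d = 2r \arcsin\left(\sqrt{\sin^2\left(\frac{\phi_2 - \phi_1}{2}\right) + \cos\phi_1 \cos\phi_2 \sin^2\left(\frac{\lambda_2 - \lambda_1}{2}\right)}\right)$, (1)
where $d$ is the distance between the two points, $r$ is the radius of the earth (6378.1 km), $\phi_1$ and $\phi_2$ indicate the latitude of point 1 and point 2, and
$\lambda_1$ and $\lambda_2$ indicate the longitude of point 1 and point 2. After this process, the data consist of the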
travel times and the corresponding distance between consecutive locations for all the buses. The
entire section was divided into 150 subsections, each of 100 m length, and the time taken to cover
each subsection was calculated by using linear interpolation. To identify the
patterns in the data, fifteen days’ data from 29th January 2013 to 12th February 2013 were taken
as the output set. Since the headway between the buses is approximately 45 minutes, one trip for
every hour was used from 6 AM to 8 PM. Thus, a total of 210 output trips (14 trips/day × 15 days) were
generated. After analyzing the collected data, all these trips were divided into 4 zones, based on
the starting time of each trip: morning off-peak (6 AM – 8 AM), morning peak (8 AM – 10
AM), afternoon off-peak (10 AM – 3 PM) and evening peak (3 PM – 8 PM). Travel time
patterns for peak (morning peak and evening peak together) and off-peak (morning
off-peak and afternoon off-peak together) trips were analyzed separately for each day of the week. Each
peak and off-peak output trip was compared with the 28 previous days’ corresponding input trips
with the same starting time as that of the output trip.
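The distance and subsection travel time computations described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation used in the study: the function names `haversine_km` and `subsection_times`, and the representation of each GPS entry as a `(time_s, lat_deg, lon_deg)` tuple, are hypothetical choices made for this example.

```python
import math

EARTH_RADIUS_KM = 6378.1  # radius of the earth quoted in the text


def haversine_km(lat1, lon1, lat2, lon2, r=EARTH_RADIUS_KM):
    """Great-circle distance (km) between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def subsection_times(fixes, step_m=100.0):
    """Travel time (s) for each complete subsection of length step_m.

    fixes: list of (time_s, lat_deg, lon_deg) tuples in time order
    (assumed format). The time at which each subsection boundary is
    crossed is found by linear interpolation between the two fixes
    that bracket the boundary.
    """
    # Cumulative distance (m) travelled at each fix.
    cum = [0.0]
    for (_, la1, lo1), (_, la2, lo2) in zip(fixes, fixes[1:]):
        cum.append(cum[-1] + 1000.0 * haversine_km(la1, lo1, la2, lo2))

    times = [f[0] for f in fixes]
    crossings = [times[0]]  # the trip starts at distance 0
    boundary = step_m
    i = 1
    while i < len(cum):
        if cum[i] >= boundary:
            # Boundary lies between fix i-1 and fix i: interpolate its time.
            frac = (boundary - cum[i - 1]) / (cum[i] - cum[i - 1])
            crossings.append(times[i - 1] + frac * (times[i] - times[i - 1]))
            boundary += step_m
        else:
            i += 1
    return [t1 - t0 for t0, t1 in zip(crossings, crossings[1:])]
```

With fixes reported every five seconds, a bus moving at a steady 10 m/s would yield a travel time of about 10 s for every 100 m subsection.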
TRAVEL TIME PATTERN ANALYSIS
To analyze the travel time patterns, the Z-test for the mean of a population of differences for
paired samples was conducted for the hypothesis testing at 5% level of significance (22). The test
compared each 100 m subsection’s travel time of the output trip with that of the input trip to check whether
the mean of the paired differences is zero. The check for daily pattern analyzes the
significance of trips that happened in the same time period on previous days relative to the
current trip. A basic assumption of the Z-test for the mean of a population of differences for
paired samples is that the differences between the 100 m subsection travel times of the output trip and
the input trip follow a normal distribution. Whether this assumption holds was checked
by using a statistical measure, “skewness”, given as
$\text{Skewness} = \frac{3(\bar{x} - M)}{s}$, (2)
where $\bar{x}$ is the sample mean, $M$ is the sample median and $s$ is the standard deviation of the
sample. According to Rees et al. (23), if the skewness value is greater than +1, the distribution
has positive skew and if the skewness value is less than -1, the distribution has negative skew. If
the skewness value lies between -1 and +1, the distribution is roughly symmetrical and is treated here as
following a normal distribution. In the present study, the skewness is calculated for the differences
of 100 m subsection travel times between the output trip and the input trip. The results of
skewness calculated for various trips on a sample day, 29th January 2013, are shown in Figure 2.
FIGURE 2 Skewness calculated for various trips on 29th January 2013.
From the figure, it can be observed that all the calculated skewness values lie within the
range of -1 to +1. It can therefore be
concluded that the differences between the 100 m subsection travel times of the output trip and the input trip
follow a normal distribution and hence the Z-test can be adopted for hypothesis testing. The
test statistic is given by
$Z = \frac{\bar{d}}{s_d / \sqrt{n}}$, (3)
where $\bar{d}$ is the mean of the differences of 100 m subsection travel times of the output trip and the input
trip, $s_d$ is the standard deviation of the sample and $n$ is the sample size. In the present study, the
test was conducted at the 5% level of significance. So, if the calculated Z-value lies between -1.96
and +1.96, the null hypothesis that the mean of the differences is zero cannot be rejected
(referred to here as the hypothesis being accepted). To analyse the daily pattern, 5880 (15 days × 14 trips/day × 28 preceding
days’ same time trips) ‘Z’ values were calculated. Then, the ratio of the
number of times the null hypothesis was accepted to the total number of times the hypothesis
was tested was calculated for each case. If the ratio is high, we can conclude that the corresponding input trip is significant in
predicting the current trip (output trip). The results obtained from the statistical analysis for two
sample cases are shown in Figure 3A and Figure 3B, and the detailed results are given in Table
2. Table 2 illustrates the trend in the rankings followed on each day for peak and off-peak traffic
conditions separately. Overall, it can be seen that Sunday has a strong weekly pattern without
any strong daily pattern, whereas the other days also have a strong daily pattern.
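The skewness check, the paired Z statistic and the acceptance ratio used for ranking can be illustrated with a short Python sketch. It assumes that each trip is given as a list of 100 m subsection travel times and that the sample standard deviation is used; the function names are hypothetical and the code is only an illustration of the procedure described above.

```python
import math
import statistics


def pearson_median_skewness(values):
    """Skewness measure 3 * (mean - median) / s, with s the sample std. dev."""
    mean = statistics.fmean(values)
    median = statistics.median(values)
    s = statistics.stdev(values)
    return 3.0 * (mean - median) / s


def paired_z(output_trip, input_trip):
    """Z statistic for the mean of paired subsection travel time differences."""
    diffs = [o - i for o, i in zip(output_trip, input_trip)]
    n = len(diffs)
    d_bar = statistics.fmean(diffs)
    s_d = statistics.stdev(diffs)  # zero if all differences are identical
    return d_bar / (s_d / math.sqrt(n))


def acceptance_ratio(output_trips, input_trips, z_crit=1.96):
    """Share of (output, input) trip pairs with |Z| <= z_crit.

    A higher ratio suggests that the corresponding input day is a more
    significant predictor of the output trips.
    """
    accepted = sum(
        1 for o, i in zip(output_trips, input_trips)
        if abs(paired_z(o, i)) <= z_crit
    )
    return accepted / len(output_trips)
```

Computing `acceptance_ratio` once for each of the 28 preceding days, and sorting the days by that ratio, would give a ranking of the kind reported in Table 2.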
FIGURE 3A Travel time patterns observed for Sunday peak period.
FIGURE 3B Travel time patterns observed for Monday peak period.
TABLE 2 Pattern analysis results

| Rank | Sunday Off-Peak | Sunday Peak | Monday Off-Peak | Monday Peak | Tuesday Off-Peak | Tuesday Peak | Wednesday Off-Peak | Wednesday Peak | Thursday Off-Peak | Thursday Peak | Friday Off-Peak | Friday Peak | Saturday Off-Peak | Saturday Peak |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | d-7 | d-28 | d-5 | d-7 | d-3 | d-5 | d-2 | d-9 | d-3 | d-2 | d-1 | d-1 | d-3 | d-4 |
| 2 | d-14 | d-7 | d-2 | d-25 | d-5 | d-7 | d-9 | d-14 | d-8 | d-6 | d-4 | d-11 | d-28 | d-8 |
| 3 | d-15 | d-21 | d-7 | d-2 | d-7 | d-11 | d-26 | d-26 | d-28 | d-8 | d-9 | d-16 | d-5 | d-15 |
| 4 | d-21 | d-14 | d-11 | d-3 | d-1 | d-12 | d-1 | d-28 | d-2 | d-13 | d-20 | d-2 | d-21 | d-21 |
| 5 | d-28 | d-1 | d-14 | d-4 | d-4 | d-25 | d-5 | d-2 | d-7 | d-22 | d-2 | d-3 | d-1 | d-22 |
| 6 | d-22 | d-15 | d-4 | d-5 | d-6 | d-6 | d-6 | d-7 | d-13 | d-1 | d-3 | d-4 | d-2 | d-28 |
| 7 | d-1 | d-22 | d-12 | d-6 | d-8 | d-15 | d-7 | d-13 | d-19 | d-3 | d-7 | d-7 | d-4 | d-17 |
| 8 | d-8 | d-5 | d-18 | d-13 | d-24 | d-20 | d-19 | d-1 | d-1 | d-7 | d-10 | d-22 | d-8 | d-24 |
| 9 | d-10 | d-8 | d-24 | d-14 | d-10 | d-27 | d-20 | d-6 | d-5 | d-17 | d-11 | d-8 | d-9 | d-25 |
| 10 | d-18 | d-9 | d-6 | d-18 | d-11 | d-4 | d-21 | d-15 | d-10 | d-20 | d-16 | d-14 | d-7 | d-5 |
| 11 | d-26 | d-16 | d-10 | d-26 | d-12 | d-8 | d-27 | d-19 | d-21 | d-27 | d-22 | d-28 | d-12 | d-7 |
| 12 | d-2 | d-18 | d-13 | d-10 | d-14 | d-10 | d-11 | d-27 | d-26 | d-10 | d-6 | d-15 | d-15 | d-16 |
| 13 | d-3 | d-27 | d-17 | d-11 | d-15 | d-13 | d-13 | d-5 | d-27 | d-14 | d-14 | d-18 | d-22 | d-1 |
| 14 | d-4 | d-10 | d-23 | d-12 | d-13 | d-14 | d-18 | d-16 | d-6 | d-15 | d-15 | d-9 | d-24 | d-6 |
| 15 | d-9 | d-17 | d-25 | d-17 | d-17 | d-18 | d-25 | d-20 | d-12 | d-21 | d-27 | d-25 | d-10 | d-10 |
| 16 | d-16 | d-19 | d-26 | d-19 | d-18 | d-19 | d-8 | d-21 | d-15 | d-24 | d-28 | d-27 | d-14 | d-11 |
| 17 | d-19 | d-3 | d-3 | d-24 | d-19 | d-26 | d-4 | d-22 | d-9 | d-26 | d-8 | d-20 | d-17 | d-14 |
| 18 | d-23 | d-24 | d-27 | d-27 | d-21 | d-1 | d-12 | d-8 | d-14 | d-5 | d-13 | d-23 | d-23 | d-23 |
| 19 | d-25 | d-25 | d-9 | d-21 | d-25 | d-3 | d-14 | d-12 | d-16 | d-9 | d-18 | d-6 | d-25 | d-26 |
| 20 | d-27 | d-26 | d-16 | d-23 | d-26 | d-17 | d-22 | d-18 | d-17 | d-23 | d-12 | d-17 | d-26 | d-2 |
| 21 | d-5 | d-2 | d-19 | d-28 | d-27 | d-21 | d-28 | d-23 | d-18 | d-28 | d-21 | d-21 | d-13 | d-3 |
| 22 | d-12 | d-4 | d-20 | d-16 | d-16 | d-22 | d-15 | d-25 | d-20 | d-16 | d-24 | d-24 | d-16 | d-9 |
| 23 | d-17 | d-11 | d-21 | d-9 | d-28 | d-24 | d-16 | d-4 | d-22 | d-19 | d-19 | d-10 | d-6 | d-12 |
| 24 | d-20 | d-12 | d-28 | d-20 | d-2 | d-28 | d-23 | d-10 | d-24 | d-12 | d-23 | d-13 | d-11 | d-13 |
| 25 | d-24 | d-13 | d-15 | d-8 | d-20 | d-9 | d-17 | d-11 | d-23 | d-4 | d-25 | d-5 | d-18 | d-19 |
| 26 | d-6 | d-20 | d-22 | d-15 | d-23 | d-2 | d-10 | d-3 | d-11 | d-11 | d-17 | d-19 | d-19 | d-18 |
| 27 | d-11 | d-23 | d-1 | d-22 | d-22 | d-16 | d-3 | d-17 | d-4 | d-18 | d-5 | d-12 | d-20 | d-20 |
| 28 | d-13 | d-6 | d-8 | d-1 | d-9 | d-23 | d-24 | d-24 | d-25 | d-25 | d-26 | d-26 | d-27 | d-27 |

* (d-n) represents the same-time trip on the nth previous day.
The above ranking was used while selecting the input data for the prediction method. In
order to take into account the present-day traffic conditions, data from the previous two buses (PV1
and PV2) were also taken into consideration. These data reflect the effect of events that have
taken place in that subsection on that day.
BUS ARRIVAL TIME PREDICTION METHOD
A model-based approach using Kalman filtering was adopted in this study for the bus travel
time/arrival time prediction. The KFT (24) can be used to estimate state variables, which are
used to characterize systems/processes that are described by state space models. The
implementation of the Kalman filter requires information regarding the system’s dynamics and
statistical information on the system disturbances and measurement errors. It uses the model and
system inputs to predict the a priori state estimate and uses the output measurements to obtain
the a posteriori state estimate. Overall, it is a recursive algorithm, so that new measurements can
be processed when they are obtained. It needs only the current state estimate and the current
input and output measurements to calculate the next instant’s state estimate. The evolution of travel
time between various time intervals in a given subsection is assumed to be
$x(t+1) = A(t)\,x(t) + w(t)$, (4)
where $A(t)$ is a parameter which relates the travel times in a given subsection at successive time intervals, $x(t)$ is the
travel time taken for covering the given subsection at time t and $w(t)$ is the associated process
disturbance. The measurement process was assumed to be governed by
$z(t) = x(t) + v(t)$, (5)
where $z(t)$ is the measured travel time in a given subsection at time t and $v(t)$ is the
measurement noise. It was further assumed that $w(t)$ and $v(t)$ are zero mean white Gaussian
noise signals with $Q(t)$ and $R(t)$ being their corresponding variances. Thus, two sets of data are
required to implement the above scheme - one set of data for the time update equations to
calculate the parameter ‘A’ and another data set to be used in the measurement update equations
to generate the a posteriori travel time estimate. So, the pattern analysis results were arranged as
two sets of data in the order of preference from lower to higher for both peak and off-peak time
zones of all days as shown in Table 3. The data in column 1 were used to obtain the value of A
during each time interval and the data from column 2 were used to obtain the a posteriori travel
time estimate.
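Table 3 Data set used for KF technique

| Sunday Col. 1 | Sunday Col. 2 | Monday Col. 1 | Monday Col. 2 | Tuesday Col. 1 | Tuesday Col. 2 | Wednesday Col. 1 | Wednesday Col. 2 | Thursday Col. 1 | Thursday Col. 2 | Friday Col. 1 | Friday Col. 2 | Saturday Col. 1 | Saturday Col. 2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| d-13 | d-11 | d-8 | d-1 | d-9 | d-22 | d-24 | d-3 | d-25 | d-4 | d-26 | d-5 | d-27 | d-20 |
| d-6 | d-24 | d-22 | d-15 | d-23 | d-20 | d-10 | d-17 | d-11 | d-23 | d-17 | d-25 | d-19 | d-18 |
| d-20 | d-17 | d-28 | d-21 | d-2 | d-28 | d-23 | d-16 | d-24 | d-22 | d-23 | d-19 | d-11 | d-6 |
| d-12 | d-5 | d-20 | d-19 | d-16 | d-17 | d-15 | d-28 | d-20 | d-18 | d-24 | d-21 | d-16 | d-13 |
| d-27 | d-25 | d-16 | d-9 | d-26 | d-25 | d-22 | d-14 | d-17 | d-16 | d-12 | d-18 | d-26 | d-25 |
| d-23 | d-19 | d-27 | d-3 | d-21 | d-19 | d-12 | d-4 | d-14 | d-9 | d-13 | d-8 | d-23 | d-17 |
| d-16 | d-9 | d-26 | d-25 | d-18 | d-17 | d-8 | d-25 | d-15 | d-12 | d-28 | d-27 | d-14 | d-10 |
| d-4 | d-3 | d-23 | d-17 | d-13 | d-15 | d-18 | d-13 | d-6 | d-27 | d-15 | d-14 | d-24 | d-22 |
| d-2 | d-26 | d-13 | d-10 | d-14 | d-12 | d-11 | d-27 | d-26 | d-21 | d-6 | d-22 | d-15 | d-12 |
| d-18 | d-10 | d-6 | d-24 | d-11 | d-10 | d-21 | d-20 | d-10 | d-5 | d-16 | d-11 | d-7 | d-9 |
| d-8 | d-1 | d-18 | d-12 | d-24 | d-8 | d-19 | d-7 | d-1 | d-19 | d-10 | d-7 | d-8 | d-4 |
| d-22 | d-28 | d-4 | d-14 | d-6 | d-4 | d-6 | d-5 | d-13 | d-7 | d-3 | d-2 | d-2 | d-1 |
| d-21 | d-15 | d-11 | d-7 | d-1 | d-7 | d-1 | d-26 | d-2 | d-28 | d-20 | d-9 | d-21 | d-5 |
| d-14 | d-7 | d-2 | d-5 | d-5 | d-3 | d-9 | d-2 | d-8 | d-3 | d-4 | d-1 | d-28 | d-3 |
| PV1 | PV2 | PV1 | PV2 | PV1 | PV2 | PV1 | PV2 | PV1 | PV2 | PV1 | PV2 | PV1 | PV2 |
| TV | | TV | | TV | | TV | | TV | | TV | | TV | |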
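Comparing Table 3 with the off-peak columns of Table 2 suggests that the ranked days were dealt alternately into the two columns, starting from the least significant day. The text does not state this rule explicitly, so the following Python sketch should be read as an inference from the two tables; the function name `split_ranking` is hypothetical.

```python
def split_ranking(ranked_days):
    """Arrange a ranking into the two input columns of Table 3.

    ranked_days: labels ordered from rank 1 (most significant) to the
    last rank. Returns (column_1, column_2), each ordered from lower to
    higher preference and ending with the same-day buses PV1 and PV2.
    The alternating split is an assumption inferred from the tables.
    """
    ascending = ranked_days[::-1]             # least significant day first
    column_1 = ascending[0::2] + ["PV1"]      # ranks 28, 26, ..., 2
    column_2 = ascending[1::2] + ["PV2"]      # ranks 27, 25, ..., 1
    return column_1, column_2


# Sunday off-peak ranking taken from Table 2 (rank 1 first).
sunday_off_peak = [
    "d-7", "d-14", "d-15", "d-21", "d-28", "d-22", "d-1",
    "d-8", "d-10", "d-18", "d-26", "d-2", "d-3", "d-4",
    "d-9", "d-16", "d-19", "d-23", "d-25", "d-27", "d-5",
    "d-12", "d-17", "d-20", "d-24", "d-6", "d-11", "d-13",
]
column_1, column_2 = split_ranking(sunday_off_peak)
```

For this ranking, the two lists reproduce the Sunday entries of Table 3 row by row.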
The steps in the algorithm were as follows:
1. The entire section of travel between origin and destination was divided into N
subsections of equal length (100 m).
2. The travel time data from column 1 were used to obtain the value of A through
$A(t) = \frac{x_1(t+1)}{x_1(t)}$, (6)
where $x_1(t)$ denotes the column 1 travel time in the given subsection in the tth time interval.
3. Let $x_{TV}(t)$ denote the travel time taken by the test vehicle (which is the vehicle for which
the travel time needs to be predicted) to cover a given subsection. It was assumed that
$E[x_{TV}(1)] = \hat{x}(1)$, (7)
$E[(x_{TV}(1) - \hat{x}(1))^2] = P(1)$, (8)
where $\hat{x}(t)$ is the estimate of travel time of the TV in the tth time interval.
4. For each time interval $t$, the following steps were performed:
a. The a priori estimate of the travel time was calculated by using
$\hat{x}^-(t+1) = A(t)\,\hat{x}^+(t)$, where the superscript $-$ denotes the a priori estimate and
the superscript $+$ denotes the a posteriori estimate.
b. The a priori error variance (denoted by $P^-$) was calculated using
$P^-(t+1) = A(t)\,P^+(t)\,A(t) + Q(t)$. (9)
c. The Kalman gain (denoted by K) was calculated by using
$K(t+1) = P^-(t+1)\,[P^-(t+1) + R(t+1)]^{-1}$. (10)
d. The a posteriori travel time estimate and error variance were calculated using,
respectively,
$\hat{x}^+(t+1) = \hat{x}^-(t+1) + K(t+1)\,[z(t+1) - \hat{x}^-(t+1)]$, (11)
$P^+(t+1) = [1 - K(t+1)]\,P^-(t+1)$. (12)
Thus, the objective here is to predict the travel time of the TV using the travel times
obtained from all previous vehicles, including PV1 and PV2, in a given subsection.
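The recursion in the steps above is a scalar Kalman filter and can be sketched in a few lines of Python. This is an illustrative sketch only: the noise variances `q` and `r`, the initial error variance `p0`, the choice of the first column 2 value as the initial estimate, and the reading of the final a posteriori estimate as the TV prediction are all assumptions made for this example, as the text does not specify them.

```python
def kalman_subsection(col1, col2, q=1.0, r=1.0, p0=1.0):
    """Scalar Kalman filter over time intervals for one 100 m subsection.

    col1: travel times (s) used to compute A(t) = col1[t+1] / col1[t].
    col2: travel times (s) used as the measurements z(t).
    q, r: process and measurement noise variances (assumed constants).
    p0:   initial error variance (assumed).
    Returns the list of a posteriori estimates; the last element is
    taken here as the predicted travel time of the test vehicle.
    """
    if len(col1) != len(col2) or len(col1) < 2:
        raise ValueError("col1 and col2 need equal length of at least 2")

    x_post = col2[0]   # assumed initial estimate
    p_post = p0
    estimates = [x_post]
    for t in range(len(col1) - 1):
        a = col1[t + 1] / col1[t]              # parameter A(t)
        x_prior = a * x_post                   # a priori estimate
        p_prior = a * p_post * a + q           # a priori error variance
        k = p_prior / (p_prior + r)            # Kalman gain
        x_post = x_prior + k * (col2[t + 1] - x_prior)  # a posteriori estimate
        p_post = (1.0 - k) * p_prior           # a posteriori error variance
        estimates.append(x_post)
    return estimates
```

In use, `col1` and `col2` would hold the travel times of one subsection for the trips listed in the two columns of Table 3, in the same order. A small `r` makes the estimate follow the column 2 measurements closely, whereas a large `r` makes it follow the trend implied by column 1.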
RESULTS AND DISCUSSIONS
The results obtained from the implementation of the algorithm presented in the previous section,
which will be referred to as the time discretization method, are discussed in this section. Since
one of the main contributions of this study is the incorporation of significant historic data as
input to predict bus travel time, a comparison was carried out with a scheme that does not use the
pattern-based historic data, namely the space discretization method (19).
The scheme presented in this study was used to predict travel time for each 100 m
subsection. However, it was observed from the data that there is a distance of at least 500 m
between the bus stops. Hence, the final comparison was made between predicted and measured
travel times for every 500 m subsection. The predicted travel time to cover a 500 m subsection
was found by adding the corresponding five 100 m travel time
values obtained from the prediction algorithm. The prediction was carried out for a one-week
period (29th January – 4th February 2013). The Mean Absolute Percentage Error (MAPE) was
used to quantify the prediction accuracy, and was calculated by using
$\text{MAPE} = \frac{100}{N} \sum_{i=1}^{N} \frac{|x_p(i) - x_m(i)|}{x_m(i)}$, (13)
where $x_p(i)$ is the predicted travel time of the TV to cover the ith subsection, $x_m(i)$ is the
corresponding travel time measured from the field and $N$ is the number of subsections compared. Figure 4A and Figure 4B show sample
comparisons of the predicted travel times and the measured travel times over 500 m subsections
along with the corresponding MAPE. It can be observed that the predicted values
match the measured data closely.
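The aggregation to 500 m subsections and the MAPE calculation can be written compactly in Python. The sketch below is only an illustration; the function names are hypothetical, and dropping an incomplete group at the end of the route is an assumption.

```python
def aggregate(times_100m, group=5):
    """Sum consecutive 100 m travel times into 500 m travel times.

    An incomplete group at the end of the route is dropped (assumption).
    """
    n_groups = len(times_100m) // group
    return [sum(times_100m[g * group:(g + 1) * group]) for g in range(n_groups)]


def mape(predicted, measured):
    """Mean Absolute Percentage Error (%) over paired subsection values."""
    errors = [abs(p - m) / m for p, m in zip(predicted, measured)]
    return 100.0 * sum(errors) / len(errors)
```

For a trip, the error would then be obtained as `mape(aggregate(predicted_100m), aggregate(measured_100m))`, where both inputs are lists of 100 m subsection travel times.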
FIGURE 4A Predicted and measured travel times for a peak period trip on 31st January 2013.
FIGURE 4B Predicted and measured travel times for an off-peak trip on 31st January 2013.
Earlier results from this study (25) showed that Sunday has a distinctly different pattern,
while the other days are very similar in their patterns. Hence, analysis was carried out to study
the effect of these patterns on the final application accuracy. The analysis was carried out for
various combinations of travel time patterns in the time discretization approach as follows.
1. Method 1: Using the same travel time patterns for all days and without separating into
peak and off-peak traffic conditions.
2. Method 2: Using different travel time patterns for each day of the week, but without
separating into peak and off-peak traffic conditions.
3. Method 3: Using the same travel time patterns for all days together, but separating the
trips into peak and off-peak.
4. Method 4: Using separate travel time patterns for Sunday and for all weekdays as
another group, with peak and off-peak separated.
5. Method 5: Using different travel time patterns for each day of the week, and separating
the trips into peak and off-peak.
FIGURE 5 MAPE comparison with various approaches.
The results obtained for a test period of one week are shown in Figure 5 along with the space
discretization results. It can be observed that Methods 3, 4 and 5 (analysing peak and off-peak
separately) perform better than Methods 1 and 2 (without separating into peak and
off-peak) and space discretization. From Figure 5, it can also be observed that separating day-wise
does not make much difference. It can be concluded that analysing the trips without
separate day-wise patterns, but separating them into peak and off-peak segments, may be the best
solution taking into account accuracy and model development effort.
A comparison of the performance of the best method of time discretization (Method 5) was
carried out with space discretization (19). The space discretization method uses the data obtained
from the previous two vehicles (PV1 and PV2) to predict the travel time for the next vehicle (TV). The
errors obtained for all trips on a sample day (3rd February 2013) are shown in Figure 6.
FIGURE 6 MAPE values for all trips of a sample day.
From Figure 6, it can be observed that the time discretization method performs
better than space discretization in most of the cases. Table 4 shows the performance
comparison for all days.
Table 4 MAPE comparison between time and space discretization

| Date | Time discretization MAPE (%) | Space discretization MAPE (%) |
|---|---|---|
| 29th January 2013 | 25.88 | 29.43 |
| 30th January 2013 | 29.16 | 33.57 |
| 31st January 2013 | 29.95 | 33.56 |
| 1st February 2013 | 29.88 | 32.53 |
| 2nd February 2013 | 30.32 | 31.95 |
| 3rd February 2013 | 20.55 | 22.10 |
| 4th February 2013 | 27.55 | 29.16 |
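The day-wise values in Table 4 can be summarized with simple arithmetic. The short Python sketch below uses only the numbers reported in the table; the weekly means and differences it computes are derived here for illustration and are not figures reported in the study.

```python
# Daily MAPE values (%) from Table 4, 29th January to 4th February 2013.
mape_time = [25.88, 29.16, 29.95, 29.88, 30.32, 20.55, 27.55]
mape_space = [29.43, 33.57, 33.56, 32.53, 31.95, 22.10, 29.16]

# Reduction in MAPE (percentage points) of time over space discretization.
reduction = [s - t for t, s in zip(mape_time, mape_space)]

mean_time = sum(mape_time) / len(mape_time)
mean_space = sum(mape_space) / len(mape_space)

print(f"Mean MAPE, time discretization:  {mean_time:.2f} %")
print(f"Mean MAPE, space discretization: {mean_space:.2f} %")
print(f"Daily reduction ranges from {min(reduction):.2f} to {max(reduction):.2f} points")
```

On every day of the test week the time discretization error is the lower of the two, which is consistent with the comparison discussed above.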
SUMMARY AND CONCLUSIONS
The accuracy of the bus arrival time information provided to passengers plays a key role in its
acceptance. In order to improve the accuracy of the system, one should develop the prediction
method carefully. One factor that can improve the performance of the prediction method is the
choice of the correct input data. The first part of the present study conducted a pattern analysis
separately for each day, using 45 days’ data to identify the most significant input that can be used
for the prediction method. The analysis was carried out separately for peak and off-peak periods
and used a parametric statistical test (Z-test). It was observed that Sundays followed a strong
weekly pattern whereas the other days followed a weekly and daily pattern. Data from trips on
the same day were also used in the prediction method, in addition to the most significant inputs
identified by pattern analysis, to take into account the same day variations. However, these
patterns may be site specific and may need to be identified for a new location. The methodology
used in the study can be followed to carry out similar analysis in a new location.
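To repeat this kind of pattern analysis at a new location, a comparison of mean travel times could be sketched as below. This is only a generic two-sample Z-test under stated assumptions, not necessarily the exact form of the test used in the study: the function names, the 1.96 critical value (two-sided, 5% level), and the sample data are all hypothetical, and a Z-test in practice assumes reasonably large samples.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_z(sample_a, sample_b):
    """Z statistic for the difference between two sample means."""
    standard_error = sqrt(variance(sample_a) / len(sample_a)
                          + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


def same_pattern(sample_a, sample_b, critical_z=1.96):
    """True if the two means do not differ at the chosen significance level."""
    return abs(two_sample_z(sample_a, sample_b)) <= critical_z


# Hypothetical travel times in minutes for one subsection (not study data).
# The samples are kept tiny so the arithmetic is easy to follow.
monday_week1 = [10, 12, 14]
monday_week2 = [11, 12, 13]
sunday_week1 = [20, 22, 24]

print(same_pattern(monday_week1, monday_week2))  # equal means, so z = 0
print(same_pattern(monday_week1, sunday_week1))  # means differ by 10 minutes
```

Pairs of days for which the test finds no significant difference would be candidates for use as inputs to each other's prediction.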
The identified patterns were used in the end application of bus travel time prediction. The
model developed was based on time discretization and represented the evolution of travel time in
a subsection over time. This was different from the earlier studies where evolution of travel time
between subsections was considered. The discretization over time is expected to reflect the effect
of roadway characteristics such as carriageway width, signalized intersections, etc., in a given
subsection more effectively. The performance of the method was compared with the space
discretization followed in previous studies. The results showed that the proposed algorithm
performs better than the space discretization approach.
The study also analysed the effect of these patterns on the accuracy. A comparison of the
prediction accuracy with and without considering the day-wise patterns and the traffic-condition
(peak and off-peak) patterns was carried out. The results showed that the patterns based on
traffic condition have a bigger impact on the prediction accuracy than the day-wise patterns.
The main challenge in using the time discretization approach is the requirement of a
sufficiently large data set (say for a month). Further analysis can be carried out to identify the
optimum number of data points required for a reasonable accuracy. The proposed method may
be improved further by explicitly incorporating section specific characteristics such as bus stops
and signals. The predicted travel time obtained from the proposed algorithm can be expressed in
terms of time remaining or actual clock time and can be displayed at bus stops, inside buses, or
through web portals or cell phone messages.
ACKNOWLEDGEMENT
The authors acknowledge the support for this study as a part of the sub-project CIE/10-
11/168/IITM/LELI under the Centre of Excellence in Urban Transport project funded by the
Ministry of Urban Development, Government of India, through letter No. N-11025/30/2008-
UCD.
REFERENCES
1. Schweiger, C.L. Real-time bus arrival information systems. In Transportation Research
Board, TCRP synthesis 48, 2003.
2. Ohba, Yoshikazu, Hideki, and Masao. Travel time prediction method for expressway
using toll collection system data. 7th World Congress on Intelligent Transport Systems,
Turin, 2000.
3. Wu, C.H., D.C. Su, J. Chang, C.C. Wei, J.M. Ho, K.J. Lin, and D. Lee. An advanced
traveler information system with emerging network technologies. Proceedings of 6th
Asia-Pacific Conference Intelligent Transportation Systems Forum, pp. 230-231.
4. Kwon, J., B. Coifman, and P. Bickel. Day-to-day travel-time trends and travel-time
prediction from loop-detector data. In Transportation Research Record: Journal of the
Transportation Research Board, No. 1717, Transportation Research Board, National
Research Council, Washington, D.C., 2007, pp. 120-129.
5. Lee, and W. Chien. HTTP: a new framework for bus travel time prediction based on
historical trajectories. Proceedings of the 20th International Conference on Advances in
Geographic Information Systems, 2012, pp. 279-288.
6. Kumar, S.V., and L. Vanajakshi. Pattern identification based bus arrival time prediction.
Proceedings of the Institution of Civil Engineers – Transport, 2011,
http://www.icevirtuallibrary.com/content/article/10.1680/tran.12.00001. Accessed Jan. 14,
2013.
7. Lin, W.H., and J. Zeng. An Experimental Study of real-time Bus Arrival Time Prediction
with GPS Data. In Transportation Research Record: Journal of the Transportation
Research Board, No. 1666(1), Transportation Research Board, Washington, D.C., 1999,
pp. 13-20.
8. Bo, Y., L. Jing, Y. Bin, and Zhongjen. An Adaptive Bus Arrival Time Prediction Model.
Eastern Asia Society for Transportation Studies, 2009.
http://www.easts.info/publications/journalproceedings/journal2010/100064.pdf. Accessed Jan. 15, 2013.
9. Jeong, R., and L.R. Rilett. Bus Arrival Time Prediction using Artificial Neural Network
Model. IEEE Intelligent Transportation Systems conference, Washington, D.C., 2004, pp.
988-993.
10. Patnaik, J., S. Chien, and A. Bladikas. Estimation of Bus Arrival Times using APC Data.
Journal of Public Transportation, No.1, 2004, pp. 1-20.
11. Bhandari, R.R. Bus Arrival Time Prediction using Stochastic Time Series and Markov
Chains. Ph.D. dissertation, Department of Civil Engineering, New Jersey Institute of
Technology, 2005. http://archives.njit.edu/vol01/etd/2000s/2005/njit-etd2005-038/njit-
etd2005-038.pdf. Accessed Jan. 12, 2013.
12. Chien, S.I.J., Y. Ding, and C. Wei. Dynamic Travel Time Prediction with Artificial Neural
Networks. Journal of Transportation Engineering, Vol. 128, No. 5, 2002, pp. 429-438.
13. Cathey, F.W., and D.J. Dailey. A Prescription for Transit Arrival/Departure Prediction
using AVL Data. In Transportation Research Part C: Emerging Technologies, No.11,
2003, pp. 241-264.
14. Son, B., H.J. Kim, C.H. Shin, and S.K. Lee. Bus Arrival Time Prediction Method for ITS
Application. Knowledge Based Intelligent Information and Engineering Systems, 2004,
Springer Berlin Heidelberg, pp. 88–94.
15. Shalaby, A., and A. Farhan. Prediction models of bus arrival and departure times using
AVL and APC data. Journal of Public Transportation, Vol.7, No.1, 2004, pp. 41-60.
16. Nanthawichit, C., T. Nakatsuji, and H. Suzuki. Application of Probe vehicle data for real
time traffic state estimation and short time travel time prediction on a freeway. In
Transportation Research Board, (CD-ROM), Washington, D.C., 2006.
17. Chu, L., J.S. Oh, and W. Recker. Adaptive Kalman Filter Based Freeway Travel time
Estimation. In Proceedings of Transportation Research Board, Transportation Research
Board, National Research Council, Washington, D.C., 2005.
18. Ramakrishna, Y., V. Lakshmanan, and R. Sivanandan. Bus travel time prediction using
GPS data. http://www.gisdevelopment.net/proceedings/mapindia/2006/student%20oral/mi06stu_html.
Accessed Feb. 12, 2013.
19. Vanajakshi, L., S.C. Subramanian, and R. Sivanandan. Travel Time Prediction under
Heterogeneous Traffic Conditions using GPS Data from Buses. IET Journal on
Intelligent Transportation Systems 3 (1), 2009, pp. 1-9.
20. Padmanabhan, R.P.S., K. Divakar, L. Vanajakshi, and S.C. Subramanian. Development
of a Real-Time Bus Arrival Prediction System for Indian Traffic Conditions. IET Journal
on Intelligent Transportation Systems 4 (3), 2009, pp. 189-200.
21. Chamberlain, R.G. Great Circle Distance between Two Points.
http://www.movabletype.co.uk/scripts/gis-faq-5.1.html. Accessed Feb. 14, 2013.
22. Sprinthall, R.C. Basic Statistical Analysis. Pearson Education Group, Inc., New
York, 2003.
23. Rees, D.G. Essential Statistics. Chapman & Hall/CRC Publishing, Inc., London, 2001.
24. Kalman, R.E. A New Approach to Linear Filtering and Prediction Problems. Transactions
of the ASME, Journal of Basic Engineering, Vol. 82, No. 1, 1960, pp. 35-45.
25. Kumar, A.B., L. Vanajakshi, and S.C. Subramanian. Day-Wise Travel Time Pattern Analysis
under Heterogeneous Traffic Conditions. 2nd Conference of Transportation Research
Group of India, 2013 (Accepted).