
Improving Transportation Mode Identification with Limited GPS Trajectories

Yuanshao Zhu¹,², Christos Markos¹,³, James J.Q. Yu¹,²,*
¹Department of Computer Science and Engineering, Southern University of Science and Technology
²Guangdong Provincial Key Laboratory of Brain-inspired Intelligent Computation
³Faculty of Engineering and Information Technology, University of Technology Sydney
yasozhu@gmail.com, christos.k.markos@gmail.com, yujq3@sustech.edu.cn

This work is supported by the Stable Support Plan Program of Shenzhen Natural Science Fund No. 20200925155105002 and by the General Program of Guangdong Basic and Applied Basic Research Foundation No. 2019A1515011032. James J.Q. Yu is the corresponding author.
Abstract—The deployment of Global Positioning System (GPS)
sensors in modern smartphones and wearable devices has enabled
the acquisition of high-coverage urban trajectories. Extracting
knowledge from such diverse spatiotemporal data is essential
for optimizing intelligent transportation system operations. Yet
a deeper understanding of users’ mobility patterns also requires
identifying their associated transportation modes. Combined
with growing privacy concerns, the considerable effort involved
in manual data annotation means that GPS trajectories are
in reality rarely labeled by transportation mode. This poses a
significant challenge for machine learning classifiers, which often
perform best when trained on large amounts of labeled data.
As such, this paper investigates a wide range of time series
augmentation methods aiming to improve the real-world appli-
cability of transportation mode identification. In our extensive
experiments on Microsoft's Geolife dataset, both discrete wavelet
transform and flip augmentations pushed the transportation
mode identification accuracy of a convolutional neural network
from 85.1% to 87.2% and 87.3%, respectively.
I. INTRODUCTION
The ability to associate users’ mobility patterns with their
corresponding transportation modes is crucial for urban plan-
ning and transportation management [1]–[3]. Knowledge of
the travel mode distribution along urban transportation net-
works can help develop more effective strategies towards opti-
mizing infrastructure utilization, thereby alleviating significant
issues such as traffic congestion [4]–[6]. It can also provide
individuals with better route recommendations, conditioned on
their desired travel mode and destination [7]. With Global
Positioning System (GPS) sensors being installed in modern
smartphones and other wearable devices, acquiring rich GPS
trajectories for transportation mode identification has become
easier than ever.
Most GPS-based transportation mode identification approaches
have been developed in supervised learning settings. Because
raw GPS trajectories are ill-suited for direct processing by
machine learning models, the seminal work of [8] first com-
puted pointwise motion features such as speed and acceleration
from consecutive pairs of GPS points, before feeding them
to a decision tree classifier. This motion feature extraction
step has since become standard practice in the transportation
mode identification literature. [2] combined the predictions of
a random forest classifier with a rule-based method. [9] first
trained a sparse autoencoder to extract latent representations
of handcrafted motion features such as speed and accelera-
tion, before feeding them to a Convolutional Neural Network
(CNN) for the final classification. Inspired by computer vision
applications, [10] treated GPS trajectories as image pixels
by mapping GPS points to grid cells and adjusting pixel
intensity according to location stay time. The authors then
trained a CNN to extract high-level representations which were
ultimately fed to a logistic regression classifier. [11] proposed
a deep ensemble of CNNs, while [12], [13] leveraged a single
CNN equipped with the attention mechanism. Others success-
fully used recurrent neural networks based on the Long Short-
Term Memory (LSTM) module, due to their demonstrated
effectiveness in modeling long-term temporal dependencies
[14]–[16].
Despite the aforementioned advances in supervised GPS-
based transportation mode identification methods, the relative
lack of trajectories labeled by travel mode remains a limiting
factor. In reality, GPS trajectories are typically unlabeled,
since GPS sensors do not automatically capture travel mode
information. Another reason is that trajectory annotation is
both time-consuming and labor-intensive [17], with users often
citing privacy concerns [12]. Consequently, how to improve
the performance of transportation mode classifiers when few
labeled trajectories are available is an open problem.
In this direction, some researchers have combined labeled
and unlabeled data in semi-supervised learning [17]–[19],
while others have strictly used unlabeled data in unsupervised
learning [20]. Among the semi-supervised approaches, [17]
jointly trained a convolutional autoencoder and a CNN by
first balancing their losses and then gradually assigning more
weight to the latter’s supervised loss. [18] instead leveraged
a semi-supervised LSTM ensemble trained on multiple views
of the data, including frequency-domain and latent represen-
tations thereof that were learned end-to-end. [19] used the
mixup augmentation technique [21] to train a convolutional
autoencoder on mixed batches of labeled, unlabeled, and syn-
thetic samples by simultaneously minimizing their associated
objective functions. On the other hand, [20] proposed a fully
unsupervised approach whereby a convolutional autoencoder
was equipped with a custom clustering layer and trained
Fig. 1. Overview of the preprocessing framework for identifying transportation modes. Raw GPS trajectories are first segmented by transportation mode using
the available labels. Then, pointwise motion features are computed for each segment and converted to a 4-channel tensor.
by jointly optimizing a weighted sum of reconstruction and
clustering losses, thus encouraging clustering-friendly repre-
sentations at the autoencoder’s low-dimensional embedding
layer.
To address the limitations caused by the scarce availability
of labeled trajectories, we instead follow a different approach.
Specifically, we explore a collection of time series augmen-
tation methods¹ to assess their impact on the performance
of supervised transportation mode classifiers. We provide an
analysis of the underlying principles, effects on classification
performance, as well as hyperparameter selection guidelines
for each method. We conduct a series of comprehensive
experiments on Microsoft’s Geolife [8], [22] dataset, a real-
world dataset of GPS trajectories, showing that both discrete
wavelet transform and flip augmentations are effective meth-
ods towards improving transportation mode identification with
limited data.
The remainder of this paper is organized as follows. Section
II presents our preprocessing steps and formulates the problem
of data-augmented supervised transportation mode identifi-
cation. Section III introduces the time series augmentation
techniques that are investigated towards enhancing GPS-based
transportation mode identification with limited data. Section
IV analyzes our experimental results and provides guidelines
into hyperparameter selection for the above augmentation
methods, while Section V concludes this paper.
II. PRELIMINARIES
This section first presents how we preprocess GPS tra-
jectories into multivariate time series of motion features,
including relative distance, speed, acceleration, and jerk. It
then formulates the problem of data-augmented, supervised
transportation mode identification.
¹While image augmentation techniques such as random rotations and
horizontal/vertical shifts have been shown to boost classification accuracy in
computer vision applications, they are not directly applicable to either raw
GPS data or the multivariate time series of motion features that trajectories
are typically converted to.
A. GPS Trajectory Preprocessing
We represent GPS trajectory $T_i$ as a sequence $\{p_1, p_2, \ldots, p_{L_{T_i}}\}$ of length $L_{T_i}$. Within $T_i$, GPS points are denoted by $p_i = (lat_i, lng_i, t_i)$, where $lat_i$, $lng_i$ indicate the device's latitude and longitude in decimal degrees at time $t_i$. The relative distance $RD_i$ between $p_i$ and its successor $p_{i+1}$ can be estimated in meters using the Vincenty formula [23], denoted as:

$$RD_i = \mathrm{Vincenty}(lat_i, lng_i, lat_{i+1}, lng_{i+1}). \quad (1)$$
Based on $RD_i$ and its associated time interval $\Delta t_i = t_{i+1} - t_i$, we follow established literature [16], [17], [22] in calculating pointwise motion features of speed $S_i$, acceleration $A_i$, and jerk $J_i$ according to the following equations:

$$S_i = \frac{RD_i}{\Delta t_i}, \quad 1 \le i < n, \quad S_n = S_{n-1}, \quad (2)$$

$$A_i = \frac{S_{i+1} - S_i}{\Delta t_i}, \quad 1 \le i < n, \quad A_n = 0, \quad (3)$$

$$J_i = \frac{A_{i+1} - A_i}{\Delta t_i}, \quad 1 \le i < n, \quad J_n = 0. \quad (4)$$
After the above feature extraction steps, we eliminate any
timesteps with velocity or acceleration outliers based on
upper thresholds defined for each transportation mode in
[17]. We finally apply min-max normalization to each of the
four features and stack them into a 4-channel tensor $X_i = \{x_1, x_2, \ldots, x_{L_{T_i}}\}$, where $x_i = \{RD_i, S_i, A_i, J_i\}$.
Since our experiments leverage both recurrent and non-
recurrent neural network architectures, the latter requiring
fixed-size input, we finally split each motion feature tensor $X_i$ calculated for $T_i$ into $L_{T_i}/N$ segments of length $N$. The last segment is padded with zeros if it has fewer than $N$ timesteps; in this work, we empirically set $N = 240$. Please note that
all data augmentation methods discussed and evaluated in this
paper are based on motion feature tensors rather than raw GPS
trajectories.
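For concreteness, the Python sketch below (our own illustration, not the exact preprocessing code used in the experiments) converts one GPS segment into such a (4, 240) motion-feature tensor; a haversine distance stands in for the Vincenty formula of Eq. (1), and the per-mode outlier filtering is omitted for brevity.

```python
import numpy as np

def preprocess_segment(lat, lng, t, n=240):
    """Turn one GPS segment into a (4, n) tensor of {RD, S, A, J} features.

    lat, lng are in decimal degrees; t holds timestamps in seconds.
    A haversine distance replaces the Vincenty formula of Eq. (1) here,
    and the per-mode outlier removal of Section II-A is omitted.
    """
    lat, lng, t = map(np.asarray, (lat, lng, t))
    # Relative distance between consecutive points (metres), Eq. (1) analogue.
    r = 6_371_000.0
    phi1, phi2 = np.radians(lat[:-1]), np.radians(lat[1:])
    dlmb = np.radians(lng[1:] - lng[:-1])
    h = np.sin((phi2 - phi1) / 2) ** 2 + np.cos(phi1) * np.cos(phi2) * np.sin(dlmb / 2) ** 2
    rd = 2 * r * np.arcsin(np.sqrt(h))
    dt = np.diff(t).astype(float)
    dt = np.where(dt == 0, 1e-6, dt)            # guard against duplicate timestamps

    s = rd / dt                                 # speed, Eq. (2)
    acc = np.diff(s, append=s[-1]) / dt         # acceleration, Eq. (3); last entry is 0
    jerk = np.diff(acc, append=acc[-1]) / dt    # jerk, Eq. (4); last entry is 0

    feats = np.stack([rd, s, acc, jerk])        # shape (4, len(lat) - 1)
    mins = feats.min(axis=1, keepdims=True)
    maxs = feats.max(axis=1, keepdims=True)
    feats = (feats - mins) / np.maximum(maxs - mins, 1e-9)  # min-max normalization

    out = np.zeros((4, n))                      # zero-pad (or truncate) to length N = 240
    length = min(n, feats.shape[1])
    out[:, :length] = feats[:, :length]
    return out
```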
B. Problem Formulation
Given labeled dataset $\mathcal{D} = \{(X_i, y_i)\}_{i=1}^{n}$ preprocessed as per Section II-A and classifier $f_\omega(\cdot)$ parameterized by trainable parameters $\omega$, we formalize transportation mode identification as a standard supervised classification problem, i.e., the problem of obtaining the optimal set of parameters $\omega$ such that the following loss is minimized:

$$\arg\min_{\omega} \mathcal{L}(\omega) = \frac{1}{n} \sum_{i=1}^{n} \ell_i\bigl(y_i, f_\omega(X_i)\bigr), \quad (5)$$

$$\ell_i = -\bigl[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\bigr], \quad (6)$$

where $\hat{y}_i = f_\omega(X_i)$ and $y_i$ are the $i$-th predicted and ground-truth transportation modes, and $\ell(\cdot)$ is the categorical cross-entropy loss function.
Next, we define the general data augmentation function $\mathrm{Aug}(\cdot)$ that produces synthesized sample $X'$ when applied to $X_i$, denoted by $X' = \mathrm{Aug}(X_i)$. Assuming that each sample is augmented exactly once, the above loss function can then be rewritten as:

$$\arg\min_{\omega} \mathcal{L}(\omega) = \frac{1}{2n} \sum_{i=1}^{n} \Bigl[\ell_i\bigl(y_i, f_\omega(X_i)\bigr) + \ell_i\bigl(y_i, f_\omega(\mathrm{Aug}(X_i))\bigr)\Bigr]. \quad (7)$$
In this paper, we study a wide range of data augmentation tech-
niques (see Section III) in place of Aug(·)with the purpose of
evaluating their contribution towards improving transportation
mode identification with limited GPS trajectories.
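As a minimal PyTorch sketch of how Eq. (7) translates into a training step (our own illustration; the function and variable names are ours, and `augment` may be any of the Aug(·) operators of Section III):

```python
import torch
import torch.nn.functional as F

def augmented_training_step(model, optimizer, batch_x, batch_y, augment):
    """One optimization step of the data-augmented objective in Eq. (7).

    batch_x: (B, 4, 240) motion-feature tensors; batch_y: integer mode labels;
    augment: a callable implementing one of the Aug(.) operators of Section III.
    """
    model.train()
    optimizer.zero_grad()
    loss_orig = F.cross_entropy(model(batch_x), batch_y)
    loss_aug = F.cross_entropy(model(augment(batch_x)), batch_y)
    # Each sample is augmented exactly once, so the two terms are averaged,
    # matching the 1/(2n) normalization in Eq. (7).
    loss = 0.5 * (loss_orig + loss_aug)
    loss.backward()
    optimizer.step()
    return loss.item()
```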
III. METHODOLOGY
This section details the time series augmentation techniques
that we adopt towards improving the accuracy of transportation
mode classifiers. These include data perturbation, flipping,
mixup [21], mixing, and discrete wavelet transform.
A. Data Perturbation
Data perturbation refers to injecting each input motion feature tensor $X_i$ with random noise. In practice, this is achieved via addition with a noise tensor $Z$ of the same dimensionality. For simplicity, $Z$ is sampled from a Gaussian distribution; specifically, each $z \in Z$ is sampled according to:

$$p(z; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\sigma} \exp\!\left(-\frac{(z-\mu)^2}{2\sigma^2}\right), \quad (8)$$

where $\mu$, $\sigma$ denote the mean and standard deviation of $z$, respectively. We determine the values for $\mu$ and $\sigma$ from the mean and standard deviation of $X_i$, controlled by hyperparameter $k$ as follows:

$$\mu = k \cdot \mathrm{mean}(X_i), \quad \sigma = k \cdot \mathrm{stddev}(X_i). \quad (9)$$

For original sample $X_i$, the synthesized sample can thus be written as:

$$X' = X_i + Z, \quad y' = y_i. \quad (10)$$
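A possible implementation of this perturbation is sketched below (our own, following the definitions above, with the hyperparameter $k$ exposed):

```python
import torch

def perturb(x, k=0.02):
    """Additive Gaussian noise augmentation, Eqs. (8)-(10).

    The noise mean and standard deviation are scaled versions of the sample's
    own statistics, controlled by k (k = 0.02 in the experiments of Section IV).
    """
    mu = k * x.mean()
    sigma = k * x.std()
    z = mu + sigma * torch.randn_like(x)   # z ~ N(mu, sigma^2), elementwise
    return x + z                           # X' = X_i + Z, label unchanged
```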
B. Data Flip
In computer vision applications, data augmentation is usu-
ally performed by randomly rotating, cropping, or flipping im-
ages. However, most of the above methods would destroy the
motion features’ temporal correlations and interdependencies.
Considering that each input channel represents a different motion feature, we simply flip $X_i$ along the temporal dimension for each channel. The flip operation can be expressed as:

$$X' = \{x_n, x_{n-1}, \ldots, x_1\}, \quad y' = y_i. \quad (11)$$
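In PyTorch this amounts to reversing the last (temporal) axis of the feature tensor, as in the short sketch below:

```python
import torch

def flip(x):
    """Temporal flip augmentation, Eq. (11): reverse each channel in time.

    x is a (..., C, N) motion-feature tensor; the label y_i is kept unchanged.
    """
    return torch.flip(x, dims=[-1])
```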
C. Mixup
Originally proposed for computer vision applications,
Mixup [21] expands the training data by mixing pairs of
images and their corresponding labels. The mixup method is
a form of data augmentation which encourages the classifier $f_\omega(\cdot)$ to learn linear interpolations between pairs of training samples, generated as follows:

$$X' = \lambda X_i + (1-\lambda)X_j, \quad y' = \lambda y_i + (1-\lambda)y_j, \quad (12)$$

where $\lambda$ is sampled from a beta distribution $\mathrm{Beta}(\alpha, \alpha)$ parameterized by $\alpha \in (0, \infty)$. In Eq. (12), $(X_i, y_i)$ and $(X_j, y_j)$ are two randomly-selected samples from the original training data with one-hot encoded labels $y_i$ and $y_j$. The mixing hyperparameter $\alpha$ controls the mixing strength between feature-target pairs; when $\alpha \to 0$, $X'$ is identical to $X_i$, i.e., no mixing is performed.
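A sketch of this operation for a single pair of samples is given below (our own; $\alpha = 0.5$ mirrors the setting later used in Section IV, and the labels are assumed to be integer class indices stored as tensors):

```python
import numpy as np
import torch
import torch.nn.functional as F

def mixup_pair(x_i, y_i, x_j, y_j, alpha=0.5, num_classes=5):
    """Mixup augmentation, Eq. (12), with lambda ~ Beta(alpha, alpha).

    x_i, x_j: motion-feature tensors; y_i, y_j: integer labels (torch tensors).
    Labels are one-hot encoded before blending.
    """
    lam = float(np.random.beta(alpha, alpha))
    x = lam * x_i + (1.0 - lam) * x_j
    y = lam * F.one_hot(y_i, num_classes).float() \
        + (1.0 - lam) * F.one_hot(y_j, num_classes).float()
    return x, y
```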
D. Data Mixing
The intuition behind data mixing comes from the fact that
GPS trajectories with the same transportation mode would
have similar trends in terms of motion features. To this end, we
perform a weighted mix of kmotion feature tensors having the
same transportation mode and assign the resulting synthetic
sample with the same label as the original ones. In theory,
such a scheme allows for synthesizing an infinite number of
motion feature tensors. In this paper, we adopt two data
mixing schemes, namely double-trajectory mixing and multi-
trajectory weight decay mixing. For double-trajectory mixing,
we randomly select two samples with identical transportation modes and mix them as follows:

$$X' = w_1 X_1 + w_2 X_2, \quad w_1 + w_2 = 1, \quad y' = y_i. \quad (13)$$

For multi-trajectory weight decay mixing, we randomly select $k$ trajectories with the same transportation mode and mix them using gradually smaller weights:

$$X' = w_1 X_1 + w_2 X_2 + \ldots + w_k X_k, \quad y' = y_i, \quad (14)$$

where $\sum_{i=1}^{k} w_i = 1$ and $w_1 \ge w_2 \ge \ldots \ge w_k$. Please note that double-trajectory mixing is simply a special case of multi-trajectory weight decay mixing where $k = 2$.
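The following sketch covers both schemes, since double-trajectory mixing is the $k = 2$ special case (a minimal illustration of Eqs. (13)-(14); function and variable names are ours):

```python
import torch

def mix_same_mode(samples, weights):
    """Weighted mixing of k same-mode trajectories, Eqs. (13)-(14).

    samples: list of k motion-feature tensors sharing one transportation mode;
    weights: non-increasing weights summing to 1, e.g. [0.9, 0.04, 0.02, 0.02, 0.02].
    The synthesized sample keeps the shared label.
    """
    assert abs(sum(weights) - 1.0) < 1e-6, "weights must sum to 1"
    return sum(w * x for w, x in zip(weights, samples))
```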
E. Discrete Wavelet Transform
Given that motion feature variations can also be distin-
guished in the frequency domain [16], [18], we examine the
effect of augmentation by Discrete Wavelet Transform (DWT)
on the performance of transportation mode identification.
Given time series x(t), DWT results in a multi-resolution
decomposition of the input signal [24] as follows:
$$x(t) = \sum_{b} A_{M,b}\, 2^{-M/2}\, \phi\!\left(\frac{t}{2^{M}} - b\right) + \sum_{a=1}^{M} \sum_{b} d_{a,b}\, 2^{-a/2}\, \psi\!\left(\frac{t}{2^{a}} - b\right) = A_M(t) + \sum_{a=1}^{M} D_a(t), \quad (15)$$

where $A_{M,b} = \langle x(t), \phi_{M,b}(t) \rangle$ is the approximation coefficient at decomposition level $M$, $d_{a,b}$ is the corresponding detail coefficient, and $\phi(t)$ is an auxiliary scaling function. In other words, $x(t)$ is decomposed into an approximation signal $A_M(t)$ and $M$ detailed signals $D_a(t)$. When augmenting $X_i$, the synthesized sample $X'$ is again associated with the same transportation mode label, i.e., $y' = y_i$.
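A possible realization with PyWavelets is sketched below. The paper does not specify the exact reconstruction recipe, so synthesizing the augmented sample from either the approximation part $A_M(t)$ or the detail parts $D_a(t)$ is our assumption, matching the two variants later compared in Table IV.

```python
import numpy as np
import pywt

def dwt_augment(x, wavelet="db4", level=2, keep="approx"):
    """DWT-based augmentation sketch for a (C, N) motion-feature tensor.

    Each channel is decomposed as in Eq. (15); the augmented sample is the
    reconstruction from the approximation A_M(t) ("approx") or from the
    detail signals D_a(t) ("detail"). The label is kept unchanged.
    """
    out = np.zeros_like(x, dtype=float)
    for c in range(x.shape[0]):
        coeffs = pywt.wavedec(x[c], wavelet, level=level)
        if keep == "approx":
            # Keep cA_M, zero out all detail coefficients.
            coeffs = [coeffs[0]] + [np.zeros_like(d) for d in coeffs[1:]]
        else:
            # Keep the detail coefficients, zero out the approximation.
            coeffs = [np.zeros_like(coeffs[0])] + list(coeffs[1:])
        rec = pywt.waverec(coeffs, wavelet)[: x.shape[1]]
        out[c, : rec.shape[0]] = rec
    return out
```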
IV. EXPERIMENTS
This section first introduces the real-world dataset of GPS
trajectories that we used for our experiments and describes
our simulation setup. It finally presents our experimental
results and provides hyperparameter tuning guidelines for the
evaluated time series augmentation methods.
A. Dataset Description and Simulation Setup
1) Dataset: All data augmentation methods in Section III
are evaluated on Microsoft’s Geolife dataset [8], [22], which
has been widely used in the transportation mode identification
literature [10], [17], [20]. It contains GPS trajectories collected
by 182 users over five years. Out of these users, 69 have labeled parts of their trajectories by transportation mode. Following the dataset authors' recommendation, we select five main transportation modes for identification, namely walking, biking, bus, driving and railway. After preprocessing all GPS trajectories as per Section II-A, we obtain a total of 24,741 labeled samples of length 240 (walking: 7315, biking: 3848, bus: 5964, driving: 4338, railway: 3278). Following a stratified data split to maintain
the transportation mode distribution, 85% of the above samples
are used for training and validation, while the remaining 15%
are used for testing. Please note that all data augmentation
methods are only applied to the training set.
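Such a stratified split can be obtained, for example, with scikit-learn, as in the sketch below (placeholder data and variable names are ours):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: samples of shape (4, 240) with five travel-mode classes.
X = np.random.rand(1000, 4, 240)
y = np.random.randint(0, 5, size=1000)

# Stratified 85/15 split that preserves the transportation mode distribution;
# augmentation is subsequently applied to the training portion only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0)
```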
2) Simulation Setup: We first present our hyperparameter
settings for the time series augmentation methods described
in Section III. Perturbation is applied with $k = 0.02$, while Mixup [21] uses $\alpha = 0.5$. Data mixing expands each data class by 2000 samples,² where mixing-1 and mixing-2 denote the double-trajectory mixing and multi-trajectory weight decay mixing methods, respectively. For the former, we set $w_1, w_2 \sim \mathrm{Beta}(0.5, 0.5)$, while the latter uses $k = 5$ (i.e., we mix five trajectories of the same transportation mode) with $w_1 = 0.9$, $w_2 = 0.04$, $w_3 = 0.02$, $w_4 = 0.02$, $w_5 = 0.02$.

²Even though we could generate as many samples per class as required to eliminate the training set class imbalance, this would lead to a different class distribution compared to the test set.

TABLE I
ACCURACY PERCENTAGE (MEAN ± STANDARD DEVIATION) FOR DIFFERENT DATA AUGMENTATION METHODS AND CLASSIFIERS

Augmentation    MLP            CNN            LSTM
Baseline        70.8 ± 1.32    85.1 ± 0.31    76.3 ± 0.28
Perturbation    71.4 ± 0.51    85.9 ± 0.18    76.8 ± 0.23
Flip            71.8 ± 0.92    87.3 ± 0.23    77.4 ± 0.27
Mixup [21]      69.8 ± 0.88    84.2 ± 0.22    75.1 ± 0.31
Mixing-1        72.6 ± 0.72    86.0 ± 0.21    76.9 ± 0.20
Mixing-2        71.3 ± 0.56    85.5 ± 0.19    76.7 ± 0.23
DWT             80.1 ± 0.92    87.2 ± 0.13    78.5 ± 0.18
We evaluate the above time series augmentation methods on
a MultiLayer Perceptron (MLP), a CNN, and an LSTM. (1) The MLP has three fully connected layers with {512, 128, 5} neurons. (2) The CNN consists of three one-dimensional (1D) convolution layers with a kernel size of 3 and {32, 64, 128} channels, respectively. Each convolution layer is followed by a max pooling operation with a pool size of 2. The convolution layers are followed by a flattening operation resulting in 3840 features, followed by a fully connected layer with 960 neurons. (3) The LSTM has three LSTM layers with {64, 64, 64} units, respectively. The output of the last LSTM layer is flattened and fed to two fully connected layers with {256, 5} neurons. For
all three neural networks, all hidden layers are activated using
the Rectified Linear Unit (ReLU) function, while the softmax
activation function is used to predict the transportation mode at
the output layer. Please note that we do not use regularization
methods such as dropout or batch normalization; instead, we
prevent our networks from overfitting by reducing their size
(i.e., number of layers and hidden units) and therefore the
number of trainable parameters. All models are trained for
200 epochs using the Adam optimizer with a learning rate of
0.001. We report the mean classification accuracy calculated
over the last 20 training epochs.
Our experiments were developed using Python 3.7. All
neural networks were built using PyTorch 1.7 and trained on
a server with an Intel Xeon Silver 4210 CPU and an NVIDIA
GeForce RTX 2080 Ti GPU with 11GB of GDDR6 memory.
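For reference, a PyTorch sketch of the CNN described above is given below; the padding of 1 per convolution is our assumption, chosen so that three pool-2 stages on a length-240 input yield the stated 128 × 30 = 3840 flattened features.

```python
import torch.nn as nn

class TrajectoryCNN(nn.Module):
    """Sketch of the 1D CNN of Section IV-A: input is a (batch, 4, 240) tensor."""

    def __init__(self, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(4, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(64, 128, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool1d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                     # 128 channels x 30 timesteps = 3840
            nn.Linear(3840, 960), nn.ReLU(),
            nn.Linear(960, num_classes),      # softmax is applied inside the loss
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```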
B. Results
Our experimental results are shown in Table I. With the ex-
ception of Mixup [21], which performed worse than just using
the original samples, all evaluated augmentation methods con-
tributed to improving classification performance. Among them,
discrete wavelet transform and flip augmentations achieved the
best results for our CNN and LSTM models, pushing the for-
mer’s accuracy from 85.1% to 87.2% and 87.3%, respectively.
DWT was also by far the most effective augmentation method
for our MLP, increasing its baseline accuracy of 70.8% to
80.1%. Both mixing-1 and mixing-2 resulted in modest improvement,
with the former outperforming the latter. Moreover, perturbation attained nearly identical results to mixing-1. The above experimental results confirm the potential of time series augmentation in improving GPS-based transportation mode identification with limited data.

Fig. 2. Changes in the speed signal of a randomly-selected sample after adding noise to all training samples with $k \in \{0.02, 0.1, 0.2, 1\}$. The CNN's mean accuracy declines beyond a certain noise magnitude, indicating that the classifier fails to identify meaningful information within the augmented samples.
1) Data Perturbation: As described in Section III-A, the
intuition behind data perturbation is that learning from noisy
counterparts of the original data may help the classifier learn
more general features. However, adding too much noise may
result in unrealistic samples that are hard to learn meaningful
representations from. Fig. 2 shows that perturbation indeed
boosted classification accuracy from 85.1% to 85.9% for
$k = 0.02$, which is the hyperparameter value used throughout
our experiments. Values of $k > 0.1$, however, resulted in
significant accuracy degradation.
2) Flip: By simply flipping $X_i$ along the temporal di-
mension for each motion feature, our expectation is that the
generated sample would still realistically correspond to the
same transportation mode. Each feature would demonstrate
the same minimum and maximum values, despite having
different temporal dynamics. According to our results in Table I, flipping resulted in the highest classification accuracy for our CNN and the second-highest for our LSTM, but only modestly benefited our MLP.
This is likely due to the latter not accounting for temporal
dependencies.
3) Mixup: As per Section III-C, mixup [21] generates
synthetic data via a linear combination of paired samples and
their corresponding ground-truth labels. This practice aims
to encourage the classifier to interpolate smoothly between
samples and reduce the effect of adversarial ones. Our hyper-
parameter sensitivity tests, shown in Table II, demonstrate that
mixup did not outperform the non-augmented transportation
mode identification baseline.

TABLE II
HYPERPARAMETER SENSITIVITY OF CNN ACCURACY (MEAN ± STANDARD DEVIATION) TO MIXUP

Augmentation        Accuracy
Baseline            85.1 ± 0.31
Mixup (α = 0.2)     83.9 ± 0.21
Mixup (α = 0.5)     84.2 ± 0.22
Mixup (α = 1)       83.6 ± 0.27
Mixup (α = 10)      82.8 ± 0.46

TABLE III
HYPERPARAMETER SENSITIVITY OF CNN ACCURACY TO DATA MIXING

Augmentation   Parameter Settings                     Mean (%)
mixing-1       w1 = 0.5, w2 = 0.5                     84.3
mixing-1       w1 = 0.8, w2 = 0.2                     85.6
mixing-1       w1 = 0.95, w2 = 0.05                   85.8
mixing-1       w1, w2 ~ Beta(0.5, 0.5)                86.0
mixing-2       {0.5, 0.3, 0.1, 0.05, 0.05}            82.6
mixing-2       {0.7, 0.1, 0.1, 0.05, 0.05}            83.1
mixing-2       {0.8, 0.05, 0.05, 0.05, 0.05}          85.3
mixing-2       {0.9, 0.04, 0.02, 0.02, 0.02}          85.5

However, note that mixup assigns
labels to the synthesized samples by simply blending their
original ones. As such, we expect that it could perform better
in semi-supervised training, where the effect of the generated
labels on the learned representations would be attenuated. This
is out of the scope of this paper and is left for future work.
4) Data Mixing: Although data mixing did not dramati-
cally boost classification accuracy, we found that it resulted in
higher training stability during our experiments. This may be
due to how data mixing is performed, which is via timestep-
wise addition of two or more samples of the same transporta-
tion mode. We hypothesize that this may help the classifier
learn the main motion feature trends of each transportation
mode while simultaneously becoming more robust to trajectory
variations not observed in the original data.
We also analyzed the impact of different data mixing hyper-
parameter settings on classification accuracy; our experimental
results are summarized in Table III. Although mixing the
motion features of either two or five trajectories did increase
model accuracy compared to the baseline, mixing-2 did not
result in significant improvement. This is not surprising, as
mixing more sets of motion features will also incur an increase
in uncertainty.
5) DWT: Here, we explore the effect of extracting features
via different wavelet decomposition functions on classification
accuracy. As shown in Table IV, using different wavelet de-
composition functions did not significantly affect classification
accuracy, with Daubechies wavelets achieving the best results.
This suggests that DWT has the desirable property of not being
particularly sensitive to the choice of wavelet function.
Recall that, according to Eq. (15), $x(t)$ can be decomposed into an approximation signal $A_M(t)$ and detailed signals $D_a(t)$.
Having compared the influence of DWT on classification
accuracy when using either $A_M(t)$ or $D_a(t)$, our experimental results showed no significant prevalence of one over the other. This is consistent with recent work indicating that capturing motion feature trends rather than details may be more important when distinguishing among transportation modes [18].

TABLE IV
SENSITIVITY OF CNN ACCURACY TO DIFFERENT WAVELET DECOMPOSITION FUNCTIONS IN DWT

Wavelet      Mean (%) w/ A_M(t)   Mean (%) w/ D_a(t)
Daubechies   87.2                 87.0
Symlets      87.0                 87.1
Coiflets     86.8                 86.9
Haar         87.1                 87.0
V. CONCLUSION
In this paper, we investigated a range of data augmentation
techniques to improve GPS-based transportation mode identi-
fication performance when limited labeled data are available.
Since the literature typically performs transportation mode
identification on time series of motion features extracted from
GPS trajectories rather than the raw trajectories themselves, we
followed the same procedure and investigated the impact of
several time series augmentation techniques on classification
accuracy. We also provided guidelines into tuning their hy-
perparameters to encourage their use in future transportation
mode identification research. Through a set of comprehensive
experiments on Microsoft’s Geolife, an openly available real-
world dataset of GPS trajectories, we demonstrated that the
simple operation of flipping resulted in the highest accuracy
of 87.3% for a convolutional neural network. In addition,
extracting features in the frequency domain via DWT pushed
classification accuracy from the baseline of 85.1% to 87.2%.
In future work, we will investigate the influence of time
series augmentation methods on the transportation mode
identification accuracy of more sophisticated neural network
architectures, such as generative adversarial networks and
Transformers.
REFERENCES
[1] F.-Y. Wang, “Parallel control and management for intelligent trans-
portation systems: Concepts, architectures, and applications,” IEEE
Transactions on Intelligent Transportation Systems, vol. 11, no. 3, pp.
630–638, 2010.
[2] B. Wang, L. Gao, and Z. Juan, “Travel mode detection using GPS data
and socioeconomic attributes based on a random forest classifier," IEEE
Transactions on Intelligent Transportation Systems, vol. 19, no. 5, pp.
1547–1558, 2017.
[3] M. Ashifuddin Mondal and Z. Rehena, “Intelligent traffic congestion
classification system using artificial neural network,” in Companion
Proceedings of The 2019 World Wide Web Conference, ser. WWW ’19.
New York, NY, USA: Association for Computing Machinery, 2019, p.
110–116.
[4] J. Zhang, F.-Y. Wang, K. Wang, W.-H. Lin, X. Xu, and C. Chen, “Data-
driven intelligent transportation systems: A survey,” IEEE Transactions
on Intelligent Transportation Systems, vol. 12, no. 4, pp. 1624–1639,
2011.
[5] G. Li, C.-J. Chen, S.-Y. Huang, A.-J. Chou, X. Gou, W.-C. Peng, and
C.-W. Yi, “Public transportation mode detection from cellular data,”
in Proceedings of the 2017 ACM on Conference on Information and
Knowledge Management, 2017, pp. 2499–2502.
[6] E. Anagnostopoulou, B. Magoutas, E. Bothos, and G. Mentzas, “Per-
suasive technologies for sustainable smart cities: The case of urban
mobility," in Companion Proceedings of The 2019 World Wide Web
Conference, ser. WWW’19. New York, NY, USA: Association for
Computing Machinery, 2019, p. 73–82.
[7] A. C. Prelipcean, G. Gidofalvi, and Y. O. Susilo, “Transportation mode
detection an in-depth review of applicability and reliability,” Transport
Reviews, vol. 37, no. 4, pp. 442–464, 2017.
[8] Y. Zheng, L. Liu, L. Wang, and X. Xie, “Learning transportation
mode from raw GPS data for geographic applications on the web,” in
Proceedings of the 17th International Conference on World Wide Web.
New York, NY, USA: Association for Computing Machinery, 2008, pp.
247–256.
[9] H. Wang, G. Liu, J. Duan, and L. Zhang, “Detecting transportation
modes using deep neural network,” IEICE TRANSACTIONS on Infor-
mation and Systems, vol. 100, no. 5, pp. 1132–1135, 2017.
[10] Y. Endo, H. Toda, K. Nishida, and A. Kawanobe, “Deep feature extrac-
tion from trajectories for transportation mode estimation,” in Pacific-
Asia Conference on Knowledge Discovery and Data Mining. Cham,
Switzerland: Springer International Publishing, 2016, pp. 54–66.
[11] S. Dabiri and K. Heaslip, “Inferring transportation modes from GPS tra-
jectories using a convolutional neural network, Transportation Research
Part C: Emerging Technologies, vol. 86, pp. 360–371, 2018.
[12] Y. Zhu, S. Zhang, Y. Liu, D. Niyato, and J. J. Q. Yu, “Robust federated
learning approach for travel mode identification from non-iid GPS
trajectories,” in 2020 IEEE 26th International Conference on Parallel
and Distributed Systems (ICPADS), 2020, pp. 585–592.
[13] Y. Zhu, Y. Liu, J. J. Q. Yu, and X. Yuan, “Semi-supervised federated
learning for travel mode identification from GPS trajectories," IEEE
Transactions on Intelligent Transportation Systems, pp. 1–12, 2021.
[14] H. Liu and I. Lee, “End-to-end trajectory transportation mode classifica-
tion using bi-lstm recurrent neural network,” in 2017 12th International
Conference on Intelligent Systems and Knowledge Engineering (ISKE),
Nanjing, China, 2017, pp. 1–5.
[15] J. V. Jeyakumar, E. S. Lee, Z. Xia, S. S. Sandha, N. Tausik, and
M. Srivastava, “Deep convolutional bidirectional lstm based transporta-
tion mode recognition,” in Proceedings of the 2018 ACM International
Joint Conference and 2018 International Symposium on Pervasive and
Ubiquitous Computing and Wearable Computers. Association for
Computing Machinery, 2018, pp. 1606–1615.
[16] J. J. Q. Yu, “Travel mode identification with GPS trajectories using
wavelet transform and deep learning," IEEE Transactions on Intelligent
Transportation Systems, vol. 22, no. 2, pp. 1–11, 2021.
[17] S. Dabiri, C. Lu, K. Heaslip, and C. K. Reddy, “Semi-supervised deep
learning approach for transportation mode identification using GPS tra-
jectory data,” IEEE Transactions on Knowledge and Data Engineering,
vol. 32, no. 5, pp. 1010–1023, 2020.
[18] J. J. Q. Yu, “Semi-supervised deep ensemble learning for travel mode
identification,” Transportation Research Part C: Emerging Technologies,
vol. 112, pp. 120–135, 2020.
[19] X. Song, C. Markos, and J. J. Q. Yu, “Multimix: A multi-task deep
learning approach for travel mode identification with few GPS data," in
2020 IEEE 23rd International Conference on Intelligent Transportation
Systems (ITSC). IEEE, 2020, pp. 1–6.
[20] C. Markos and J. J. Q. Yu, “Unsupervised deep learning for GPS-based
transportation mode identification,” in 2020 IEEE 23rd International
Conference on Intelligent Transportation Systems (ITSC). IEEE, 2020,
pp. 1–6.
[21] H. Zhang, M. Cisse, Y. N. Dauphin, and D. Lopez-Paz, “mixup: Beyond
empirical risk minimization,” in International Conference on Learning
Representations, 2018.
[22] Y. Zheng, Q. Li, Y. Chen, X. Xie, and W.-Y. Ma, Understanding Mobility
Based on GPS Data. New York, NY, USA: Association for Computing
Machinery, 2008, p. 312–321.
[23] T. Vincenty, “Direct and inverse solutions of geodesics on the ellipsoid
with application of nested equations,” Survey Review, vol. 23, no. 176,
pp. 88–93, 1975.
[24] S. G. Mallat, “A theory for multiresolution signal decomposition: the
wavelet representation," IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 11, no. 7, pp. 674–693, 1989.
... On the other hand, alternative approaches have leveraged deep learning models to automatically learn deep features from GPS trajectories. In order to employ these extracted features from the GPS trajectory data to feed a deep learning model, they need to be transformed into images [6,9,24]. However, several approaches have directly used the GPS data itself to represent the trajectories as 2D images of data structures, employing these trajectory images as input for deep learning [10,17,25]. ...
... In [20] the author also explored LSTM neural networks, and introduced a mechanism based on discrete wavelet transform to extract time-frequency domain features of the trajectories to improve classification accuracy. Meanwhile, Zhu et al. [24] investigated various time series augmentation methods and found that discrete wavelet transform and flip augmentations yielded the best results for CNN and LSTM models. ...
Article
Full-text available
Global positioning system data play a crucial role in comprehending an individual’s life due to its ability to provide geographic positions and timestamps. However, it is a challenge to identify the transportation mode used during a trajectory due to the large amount of spatiotemporal data generated, and the distinct spatial characteristics exhibited. This paper introduces a novel approach for transportation mode identification by transforming trajectory data features into image representations and employing these images to train a neural network based on vision transformers architectures. Existing approaches require predefined temporal intervals or trajectory sizes, limiting their adaptability to real-world scenarios characterized by several trajectory lengths and inconsistent data intervals. The proposed approach avoids segmenting or changing trajectories and directly extracts features from the data. By mapping the trajectory features into pixel location generated using a dimensionality reduction technique, images are created to train a deep learning model to predict five transport modes. Experimental results demonstrate a state-of-the-art accuracy of 92.96% on the Microsoft GeoLife dataset. Additionally, a comparative analysis was performed using a traditional machine learning approach and neural network architectures. The proposed method offers accurate and reliable transport mode identification applicable in real-world scenarios, facilitating the understanding of individual’s mobility.
... On the other hand, alternative approaches have leveraged deep learning models to automatically learn deep features from GPS trajectories. In order to employ these extracted features from the GPS trajectory data to feed a deep learning model, they need to be transformed into images [6,21,24]. However, several approaches have directly used the GPS data itself to represent the trajectories as 2D images of data structures, employing these trajectory images as input for deep learning [8,15,25]. ...
... In [18] the author also explored LSTM neural networks, and introduced a mechanism based on discrete wavelet transform to extract time-frequency domain features of the trajectories to improve classification accuracy. Meanwhile, Zhu et al. [24] investigated various time series augmentation methods and found that discrete wavelet transform and flip augmentations yielded the best results for CNN and LSTM models. ...
Preprint
Full-text available
Global Positioning System data plays a crucial role in comprehending an individual's life due to its ability to provide geographic positions and timestamps. However, the large amount of spatio-temporal data generated, and the distinct spatial characteristics exhibited by different modes, poses challenges for learning transportation modes from the Global Positioning System trajectories. This paper introduces a novel approach for transportation mode identification by transforming Global Positioning System trajectory data into image representations and employing these images to train a neural network based on Vision Transformers architectures. The proposed method avoids segmenting or changing trajectories and directly extracts a set of features from the Global Positioning System trajectories. By mapping the trajectory features into pixel location generated using a dimensionality reduction technique, images are created for training a deep learning model to predict five transport modes. Experimental results demonstrate the approach effectiveness, achieving a state-of-the-art accuracy of 92.96\% on the Microsoft GeoLife dataset. Moreover, it highlights the differences in the experiments conducted in each study. Additionally, a comparative analysis was performed regarding our proposal, contrasting a machine learning approach and other neural network architectures with this approach. The proposed method offers accurate and reliable transport mode detection applicable in real-world scenarios, facilitating a comprehensive understanding of individual's mobility.
... We employ Microsoft's Geolife dataset [24], [25] for our experiment, which is extensively utilized for traffic mode identification [26]. The dataset comprises both labeled and unlabeled data, collected from a total of 182 users, including 69 users with labeled data and 113 users with unlabeled data. ...
Preprint
Full-text available
Intelligent transportation systems play a pivotal role in addressing the growing challenges of urban mobility. This paper introduces the Spatial-Temporal Traffic Predictor (STTP), a novel deep learning approach designed for accurate traffic speed prediction within transportation networks. Motivated by the need for advanced models to optimize traffic flow and mitigate congestion, STTP strategically integrates spatial and temporal dependencies, capturing intricate patterns crucial for precise predictions. Through extensive experiments on real-world datasets, including diverse urban contexts, STTP demonstrates superior performance compared to state-of-the-art methods. The model excels in capturing both spatial and temporal dynamics, identifies critical nodes within transportation networks, and showcases robust generalization across different cities. The findings affirm STTP's potential as a versatile tool for enhancing traffic management and contributing to the development of smarter transportation infrastructures.
Preprint
Full-text available
Real-time road anomaly detection is a critical aspect of modern urban transportation systems, aiming to enhance road safety and traffic management. In this paper, we propose an innovative methodology that integrates advanced sensor networks and machine learning algorithms to achieve accurate and efficient anomaly detection. Our approach involves the fusion of visual data, LiDAR, and vehicular communication signals, providing a comprehensive understanding of the road environment. Key components include an adaptive weighting mechanism, spatial-temporal fusion through convolutional long short-term memory networks (ConvLSTM), and dynamic thresholding. Through a series of experiments, we benchmark our method against established approaches, demonstrating superior performance in terms of precision, recall, and F1-score metrics, coupled with significantly lower execution times. The adaptability, efficiency, and accuracy of our proposed method position it as a promising solution for real-time road anomaly detection in dynamic urban environments.
Preprint
Full-text available
The efficient management of traffic flow on highways is crucial for ensuring safety, reducing congestion, and optimizing transportation infrastructure. However, real-world scenarios often involve missing or incomplete traffic data, posing challenges for decision-making in Intelligent Transportation Systems (ITS). In this paper, we present a novel methodology, the GAN-Transformer Imputer (GTI), for spatial-temporal traffic imputation. Our approach integrates Generative Adversarial Networks (GANs) and Transformers to address the limitations of existing methods. The GAN component generates realistic traffic data, while the Transformer captures long-range dependencies and temporal patterns. Through a series of experiments on synthetic and real-world datasets, we demonstrate the superior performance of GTI compared to state-of-the-art methods. GTI consistently outperforms traditional statistical methods, machine learning approaches, and deep learning methodologies in terms of Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and F1 Score. The key contributions of this work lie in the seamless integration of GANs and Transformers, providing a robust solution for spatial-temporal traffic imputation. Our methodology, with its adversarial training and uncertainty quantification, holds promise for enhancing decision-making processes in ITS, contributing to the advancement of transportation systems and urban planning.
Preprint
Full-text available
Accurate traffic volume prediction is essential for effective traffic management and urban planning. This paper introduces a novel approach, the Mix-of-Expert Transformer model, for traffic volume prediction on highway networks. Integrating the Mix-of-Expert concept within the Transformer architecture, our model leverages expert specialization and adaptive attention mechanisms to capture complex spatiotemporal dependencies in traffic data. Through comprehensive experiments, including comparisons with traditional time-series models, machine learning approaches, and state-of-the-art Transformer-based methods, we demonstrate the superior performance and computational efficiency of the proposed model. The Mix-of-Expert Transformer consistently outperforms baseline models, showcasing its potential as a valuable tool for optimizing traffic flow and urban transportation systems. The model's versatility and efficiency make it a promising solution for real-world applications in traffic prediction, paving the way for further advancements in intelligent transportation systems.
Preprint
Full-text available
With the rise of ride-sharing services, accurately predicting origin-destination demand has become a critical task for ensuring efficient allocation of resources and enhancing the overall user experience. Traditional demand prediction methods often rely on deterministic models that fail to capture the inherent uncertainty and variability in travel patterns. In this paper, we propose a novel approach for origin-destination demand prediction in ride-sharing using deep Bayesian learning techniques. By leveraging the power of deep neural networks and the flexibility of Bayesian modeling, our method not only provides accurate demand estimates but also quantifies the associated uncertainties. We demonstrate the effectiveness of our approach through extensive experiments on a real-world ride-sharing dataset, comparing it with several baseline methods commonly used in the field. The experimental results showcase the superior performance of our proposed method in accurately predicting the origin-destination demand in ride-sharing. Our approach achieves significantly lower mean absolute error (MAE) and root mean square error (RMSE) compared to the baseline methods, indicating its improved accuracy. Moreover, our method demonstrates a higher coverage probability, highlighting its better reliability and uncertainty estimation capabilities. The key contributions of our paper include the development of a deep Bayesian neural network for demand prediction, the introduction of probabilistic layers to capture uncertainties, and the utilization of dropout regularization for improved uncertainty estimation. Our proposed approach offers a robust and scalable solution for accurate origin-destination demand prediction in ride-sharing, providing valuable insights for decision-making and resource allocation.
Preprint
Full-text available
Traffic flow distribution forecasting plays a crucial role in intelligent transportation systems for effective traffic management and congestion mitigation. In this paper, we propose a novel approach for traffic flow distribution forecasting using federated learning and Bayesian deep learning. The proposed method leverages the collaborative training capability of federated learning to learn from decentralized data sources while preserving data privacy and ownership. Bayesian deep learning techniques are integrated to estimate uncertainties and provide probabilistic predictions, enabling the capture of the inherent variability in traffic flow distributions. We conduct comprehensive experiments on real-world traffic flow datasets and compare our method with baseline approaches. The results demonstrate that our proposed method outperforms the baselines in terms of mean absolute error, root mean squared error, negative log-likelihood, and calibration error. The integration of federated learning and Bayesian deep learning enables accurate traffic flow distribution forecasts while providing reliable uncertainty estimates. This work contributes to the advancement of intelligent transportation systems and provides valuable insights for decision-making in traffic management and congestion reduction.
Preprint
Full-text available
Highway ramp speed control plays a critical role in optimizing traffic flow and improving transportation system efficiency. In this paper, we propose an optimal highway ramp speed control method using deep reinforcement learning. By leveraging the power of deep neural networks and reinforcement learning, our method, named DeepRampControl, learns to dynamically adjust vehicle speeds at ramps to minimize disruptions to the main traffic flow. We compare DeepRampControl with traditional rule-based approaches and other machine learning-based methods through comprehensive experiments in a simulated environment. The results demonstrate that DeepRampControl achieves higher traffic flow efficiency, lower average merging delay, and reduced energy consumption. These findings highlight the potential of deep reinforcement learning in optimizing highway ramp speed control and its ability to adapt to dynamic traffic conditions. The proposed method contributes to the development of intelligent and adaptive traffic control systems, paving the way for more efficient and sustainable transportation networks.
Article
Full-text available
GPS trajectories serve as a significant data source for travel mode identification along with the development of various GPS-enabled smart devices. However, such data directly integrate user private information, thus hindering users from sharing data with third parties. On the other hand, existing identification methods heavily depend on the respective manual travel mode annotations, whose production is economically inefficient and error-prone. In this paper, we propose a Semi-supervised Federated Learning (SSFL) framework that can accurately identify travel modes without using users' raw trajectories data or relying on notable data labels. Specifically, we propose a new identification model named the convolutional neural network-gated recurrent unit model in SSFL to accurately infer travel modes from GPS trajectories. Second, we design a pseudo-labeling method for the clients to set pseudo-labels on their local unlabeled dataset by using a small public dataset at the server. Furthermore, we adopt a grouping-based aggregation scheme and a data flipping augmentation scheme, which can boost the convergence and performance of the proposed framework. Comprehensive evaluations on a real-world dataset show that SSFL outperforms centralized semi-supervised baselines and is robust to the non-independent and identically distributed data commonly seen in practice.
Conference Paper
Full-text available
GPS trajectory is one of the most significant data sources in intelligent transportation systems (ITS). A simple application is to use these data sources to help companies or organizations identify users’ travel behavior. However, since GPS trajectory is directly related to private data (e.g., location) of users, citizens are unwilling to share their private information with the third-party. How to identify travel modes while protecting the privacy of users is a significant issue. Fortunately, Federated Learning (FL) framework can achieve privacy-preserving deep learning by allowing users to keep GPS data locally instead of sharing data. In this paper, we propose a Roust Federated Learning-based Travel Mode Identification System to identify travel mode without compromising privacy. Specifically, we design an attention augmented model architectures and leverage robust FL to achieve privacy-preserving travel mode identification without accessing raw GPS data from the users. Compared to existing models, we are able to achieve more accurate identification results than the centralized model. Furthermore, considering the problem of non-Independent and Identically Distributed (non-IID) GPS data in the realworld, we develop a secure data sharing strategy to adjust the distribution of local data for each user, thereby the proposed model with non-IID data can achieve accuracy close to the distribution of IID data. Extensive experimental studies on a real-world dataset demonstrate that the proposed model can achieve accurate identification without compromising privacy and being robust to real-world non-IID data.
Article
Full-text available
Travel mode identification is among the key problems in transportation research. With the gradual and rapid adoption of GPS-enabled smart devices in modern society, this task greatly benefits from the massive volume of GPS trajectories generated. However, existing identification approaches heavily rely on manual annotation of these trajectories with their accurate travel mode information, which is both economically inefficient and error-prone. In this work, we propose a novel semi-supervised deep ensemble learning approach for travel mode identification to use a minimal number of annotated data for the task. The proposed approach accepts GPS trajectories of arbitrary lengths and extracts their latent information with a tailor-made feature engineering process. We devise a new deep neural network architecture to establish the mapping from this latent information domain to the final travel mode domain. An ensemble is accordingly constructed to develop proxy labels for unannotated data based on the rare annotated ones so that both types of data contribute to the learning process. Comprehensive case studies are conducted to assess the performance of the proposed approach, which notably outperforms existing ones with partially-labeled training data. Furthermore, we investigate its robustness to noisy data and the effectiveness of its constituting components.
Article
Full-text available
Accurate identification in public travel modes is an essential task in intelligent transportation systems. In recent years, GPS-based identification is gradually replacing the conventional survey-based information-gathering process due to the more detailed and precise data on individual's travel patterns. Nonetheless, existing research suffers from deficient feature selection, high data dimensionality, and data under-utilization issues. In this work, we propose a novel travel mode identification mechanism based on discrete wavelet transform and recent developments of deep learning techniques. The proposed mechanism aims to take GPS trajectories of arbitrary lengths to develop accurate travel mode results in both global and online identification scenarios. In this mechanism, raw GPS data is first pre-processed to compute preliminary motion and displacement attributes, which are input into a tailor-made deep neural network. Discrete wavelet transform is also adopted to further extract time-frequency domain characteristics of the trajectories to assist the neural network in the classification task. To evaluate the performance of the proposed mechanism, a series of comprehensive case studies are conducted. The results indicate that the mechanism can notably outperform existing travel mode identifications on a same data set with minuscule computation time. Furthermore, an architecture test is performed to determine the best-performing structure for the proposed mechanism. Lastly, we demonstrate the capability of the mechanism in handling online identifications, and the performance sensitivity of the selected attributes is evaluated.
Conference Paper
Full-text available
Managing the ever increasing road traffic congestion due to enormous vehicular growth is a big concern all over the world. Tremendous air pollution, loss of valuable time and money are the common consequences of traffic congestion in urban areas. IoT based Intelligent Transportation System (ITS) can help in managing the road traffic congestion in an efficient way. Estimation and classification of the traffic congestion state of different road segments is one of the important aspects of intelligent traffic management. Traffic congestion state recognition of different road segments helps the traffic management authority to optimize the traffic regulation of a transportation system. The commuters can also decide their best possible route to the destination based on traffic congestion state of different road segments. This paper aims to estimate and classify the traffic congestion state of different road segments within a city by analyzing the road traffic data captured by in-road stationary sensors. The Artificial Neural Network (ANN) based system is used to classify traffic congestion states. Based on traffic congestion status, ITS will automatically update the traffic regulations like, changing the queue length in traffic signal, suggesting alternate routes. It also helps the government to device policies regarding construction of flyover/alternate route for better traffic management.
Conference Paper
Full-text available
Traditional machine learning approaches for recognizing modes of transportation rely heavily on hand-crafted feature extraction methods which require domain knowledge. So, we propose a hybrid deep learning model: Deep Convolutional Bidirectional-LSTM (DCBL) which combines convolutional and bidirectional LSTM layers and is trained directly on raw sensor data to predict the transportation modes. We compare our model to the traditional machine learning approaches of training Support Vector Machines and Multilayer Perceptron models on extracted features. In our experiments, DCBL performs better than the feature selection methods in terms of accuracy and simplifies the data processing pipeline. The models are trained on the Sussex-Huawei Locomotion-Transportation (SHL) dataset. The submission of our team, Vahan, to SHL recognition challenge uses an ensemble of DCBL models trained on raw data using the different combination of sensors and window sizes and achieved an F1-score of 0.96 on our test data.
Conference Paper
In this paper, we study the effectiveness of personalized persuasive interventions to change urban travelers’ mobility behavior and nudge them towards more sustainable transport choices. More specifically, we embed a set of persuasive design elements in a route planning application and investigate how they affect users’ travel choices. The design elements take into consideration the style, the intensity, the target of persuasive interventions as well as users’ characteristics and the trip purpose. Our results show evidence that our proposed approach motivates users on a personal level to change their mobility behavior and make more sustainable choices. Furthermore, by personalizing the persuasive interventions while considering combinations of interventions styles (in our case messages and visualizations) as well as adjusting the intensity of persuasive interventions according to the trip purpose and the transport modes of the routes which the user is nudged to follow, the effects of the persuasive interventions can be increased.