Manuscript submitted to Transportation Research part C Chen et al.
H-ConvLSTM-based bagging learning approach for ride-hailing demand
prediction considering imbalance problems and sparse uncertainty
Zhiju Chen 1, Kai Liu 1*, Jiangbo Wang 1, Toshiyuki Yamamoto 2
1 School of Transportation and Logistics, Dalian University of Technology, Dalian, 116024, China.
E-mail: chenzhiju@mail.dlut.edu.cn; liukai@dlut.edu.cn; Jiangbo_Wang@dlut.edu.cn
2 Institute of Materials and Systems for Sustainability, Nagoya University, Nagoya, 464-8603, Japan.
E-mail: yamamoto@civil.nagoya-u.ac.jp
Abstract
The problem of learning from imbalanced ride-hailing demand data with
spatiotemporal heterogeneity and highly skewed demand distributions is a relatively
new challenge. Current prediction methods usually filter out some spatiotemporal
partitions with sparse demands by setting a minimum ride-hailing demand threshold,
where the dataset is always assumed to be well balanced in terms of its spatiotemporal
partitions, with equal misprediction costs. However, this widely used assumption
results in large prediction biases. To achieve better prediction performance, we propose
a bagging learning approach based on hexagonal convolutional long short-term
memory (H-ConvLSTM), which combines three components. 1) By setting multiple
minimum ride-hailing demand thresholds, several subdatasets with different majority
ride-hailing demand prediction ranges are obtained. The H-ConvLSTM regression
model is applied to each undersampled dataset to train multiple submodels with their
respective biased ride-hailing demand prediction ranges. 2) The H-ConvLSTM
classification model is trained on the total ride-hailing demand dataset to predict the
potential demand range for a certain partition at a future time. 3) The submodel with
the best performance with respect to the potential demand range is selected to predict
the future demand for this partition. Experiments are conducted on order data obtained from
Didi Chuxing in Chengdu, China. The results show that the proposed
approach achieves significantly improved prediction performance relative to that of
other models.
Keywords: ride-hailing demand prediction, sparse uncertainty, hexagonal convolutional
long short-term memory (H-ConvLSTM), bagging learning
1. Introduction
Internet-based ride-hailing services, which connect drivers and passengers in
real time, have attracted much interest as travel options for residents in recent years
(Vazifeh et al., 2018, Xu et al., 2020). Compared to a traditional taxi service, with a
ride-hailing service, passengers can book orders online in advance through a mobile
app instead of standing on the side of the road and spending time waiting for a taxi to
arrive; this improves the mobility of vehicles and the service level of travel (Alonso-
Mora et al., 2017). With the collection and analysis of large amounts of user order data
and vehicle trajectory data, ride-hailing services are constantly updating and evolving
(Alisoltani et al., 2021), thereby becoming a disruptive force to the traditional
transportation industry (Wang and Yang, 2019). Accurate short-term passenger demand
prediction is the basis for improving the operating efficiency of internet-based ride-
hailing platforms, which plays a crucial role in formulating regulation strategies and
improving the balance between supply and demand.
Human travel behavior has a high degree of temporal and spatial regularity
(González et al., 2008), and many contributions have focused on capturing the
temporal and spatial correlations of ride-hailing demand (Ke et al., 2019). Such
methods usually grid an urban space and predict the demands of future time intervals
through historical spatiotemporal demands (Wu et al., 2018; H. Yu et al., 2017). In
general, ride-hailing demand also exhibits complex spatiotemporal heterogeneity (Shen
et al., 2020). In the temporal aspect, ride-hailing demand during rush hour is higher
than that during off-peak periods, and demand during the day is higher than that
at night. Spatially, with increasing distance from the city center, ride-hailing demand
gradually becomes sparse. When prediction focuses on the high demand of a central urban
area, sparse demand has little influence on the training of the utilized model, and it is
therefore difficult to achieve good prediction performance for sparse partitions. As the improvement of
transport infrastructure often lags behind urban development and expansion, the sparse
demand for suburban long-distance commuter travel also deserves fair attention.
Therefore, a data imbalance problem is present in the observed ride-hailing demand.
Different spatiotemporal scale divisions determined based on experience further
aggravate the uncertainties of such a highly skewed ride-hailing demand distribution
that are aggregated in various spatiotemporal granularities. To address the supply-
demand imbalance issue, ride-sourcing platforms attempt to provide relocation
guidance for idling drivers (Chen et al., 2020; Zhu et al., 2021). However, predicting
where and when supply gaps will arise in the near future remains an open question (Daganzo
et al., 2020; Guo et al., 2021).
A gap remains in terms of predicting the highly uncertain demands of sparse
areas, where the supply-demand imbalance is severe and requires proactive
dispatching. Previous studies usually deleted spatiotemporal partitions with
sparse demand from all recorded data by setting a minimum ride-hailing demand
threshold, which helped to alleviate the problem of data imbalance. An increase in the
level of the minimum ride-hailing demand threshold significantly reduces the spatial
coverage of research and changes the sparsity of data, as shown in Fig. 1. However, the
imbalance of the reduced data leads to worse demand prediction, as shown in Fig. 2.
With the increase in demand per partition, the corresponding data size decreases.
For a given small threshold, the demand ranges immediately above the threshold tend to
be predicted best because they contain the most data. The challenge is how
to improve the prediction results for demand-sparse partitions with ranges smaller than
the minimum threshold.
Fig. 1. Spatial coverages under different minimum ride-hailing demand thresholds.
(a) Data size distribution under different demand levels
(b) Prediction errors induced with different demand levels
Fig. 2. The influence of the minimum threshold on ride-hailing demand prediction.
This paper takes a step toward closing this gap. Due to the use of different
minimum ride-hailing demand threshold settings, the corresponding datasets have
different right-adjacent optimal prediction ranges. A hexagonal convolutional long
short-term memory (H-ConvLSTM)-based bagging learning approach is proposed to
integrate the bias preferences of H-ConvLSTM models at different data sparsity levels.
The results are helpful for providing suggestions regarding the optimal deployment of
ride-hailing services, reducing driver operating costs, and improving the travel quality
of residents. The main contributions of this paper are summarized as follows.
1) We propose an H-ConvLSTM regression model to compare and analyze the ride-hailing
demand prediction performances achieved under different minimum ride-hailing
demand threshold settings.
2) An H-ConvLSTM-based bagging learning approach is further proposed to
integrate the bias prediction preferences of each H-ConvLSTM regression model
trained at different data sparsity levels.
3) An experimental analysis conducted on the order data obtained from Didi Chuxing
in Chengdu city over one month shows that the proposed approach can achieve
improved prediction performance on the total dataset.
The rest of this paper is organized as follows. Section 2 is a literature review of
ride-hailing demand prediction and the data imbalance problem. Section 3 describes the
main structural framework of the developed prediction models. Section 4 presents the
experimental results, followed by the conclusions in Section 5.
2. Literature review
The use of historical travel records to predict future ride-hailing demand is
helpful for assisting online ride-hailing platforms in carrying out dynamic operation
strategies and optimizing the balance between supply and demand. In this section, we
discuss traditional and existing travel prediction approaches, the advantages of
hexagonal partitioning, and the related work that deals with sparse demand data.
2.1 Travel demand prediction approaches
The most common travel prediction method is a time series model, such as an
autoregressive integrated moving average (ARIMA) model and its various improved
versions (Kaltenbrunner et al., 2010; Min and Wynter, 2011). Machine learning models
and statistical models such as neural network models (Zheng et al., 2006), Bayesian
network models, Kalman filtering models, and least absolute shrinkage and selection
operator (LASSO) models have also been proposed to solve various prediction
problems related to travel demand. Jiang et al. (2014) integrated ensemble empirical
mode decomposition (EEMD) and a gray support vector machine (GSVM) into a
mixed-demand prediction model for high-speed railways. Ma et al. (2014) proposed an
interactive multiple model-based pattern hybrid (IMMPH) approach to predict short-
term passenger demand, and this approach maximizes the effective information by
assembling the knowledge obtained from pattern models. Davis et al. (2016) proposed
a multilayer clustering technique that utilizes the correlation between adjacent
geographic hashes to reduce prediction errors. Zhu et al. (2019) integrated the joint
probability distribution of traffic flows at nearby locations into a time series traffic
speed prediction model. Although these models have achieved improved prediction
performance through continuous improvement, they still struggle to capture complex
temporal and spatial correlations.
The great advantages of deep learning in terms of computing power and
characterizing big data enable its wide application to travel prediction (Jo et al., 2019;
Yuan et al., 2019). By approximating the grid of an urban space into image pixels, a
convolutional neural network (CNN) can effectively identify the spatial correlations
among the grid data. Zhang et al. (2016) applied a CNN to a deep spatiotemporal
prediction model to predict travel flows in real time. Both LSTM and gated recurrent
units (GRUs) have good performance with respect to capturing complex time-
sequential interactions. Therefore, combinations of these models seem to have better
performance in dealing with complex temporal and spatial correlations. H. Yu et al.
(2017) combined a CNN and LSTM to obtain spatial and temporal features for the
prediction of traffic speed. Shi et al. (2015) applied ConvLSTM to address precipitation
nowcasting. As an improved form of an LSTM model, ConvLSTM employs
convolutional structures in both the input-to-state and state-to-state transitions to reduce
the loss of spatiotemporal topology data. In the field of transportation, ConvLSTM has
also been applied to solve prediction problems such as travel speed and ride-hailing
demand and has achieved good prediction performance (Ke et al., 2017; Wang et al.,
2018; Yang et al., 2018). However, these models, which are based on square partitions,
are often difficult to directly apply to hexagonal networks.
2.2 The advantages of hexagonal partitioning
Compared with a square, a hexagon is closer to a circle, and its distribution is
symmetric and equivalent (Birch et al., 2000). Therefore, travel demands with similar
spatiotemporal characteristics are more easily aggregated, and the flows of vehicles
between partitions are more accurately characterized. In addition, in a square partition
space, the partition distance transformed from the same actual distance is much larger
in the oblique direction than in the vertical and horizontal directions. The better isotropy
of a hexagon partition enables it to better express the spatial proximity between
partitions during the calculation process. Based on these advantages, hexagonal
partitioning has been widely used in regional and urban science research. Shoman et al.
(2019) performed a comparative analysis between hexagonal partitions, triangles, and
squares and found that hexagonal partitions can better reduce the area errors of urban
fabric. Csiszár et al. (2019) applied the hexagonal partition method to an evaluation of
charging station configurations in urban areas to further optimize the distribution of
charging stations. To the best of our knowledge, Ke et al. (2019) were the first to
propose a successful hexagon-based deep learning model for travel demand prediction;
they also discussed the advantages of the hexagonal partition approach mentioned
above in detail. However, hexagonal data must be mapped to a matrix before executing
feedforward propagation calculations, which destroys the spatial position relationships
between the hexagonal partitions. The HexagDLy framework proposed by Steppa and
Holch (2019) subtly solved this problem; however, it has difficulty grasping the
complex time correlations in time series data.
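The isotropy argument made at the start of this subsection can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: it assumes unit-sized cells and compares the center-to-center distances from a cell to its adjacent cells in a square grid (8 neighbors) and in a hexagonal grid (6 neighbors).

```python
import math

# Square grid with unit side: the 8 surrounding cells lie at two different distances.
square_offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
square_dists = sorted({round(math.hypot(dx, dy), 6) for dx, dy in square_offsets})

# Hexagonal grid with side length s: the 6 neighbor centers lie on a circle of
# radius sqrt(3) * s, spaced 60 degrees apart (the grid orientation does not matter).
s = 1.0
hex_centers = [
    (math.sqrt(3) * s * math.cos(math.radians(30 + 60 * k)),
     math.sqrt(3) * s * math.sin(math.radians(30 + 60 * k)))
    for k in range(6)
]
hex_dists = sorted({round(math.hypot(x, y), 6) for x, y in hex_centers})

print(square_dists)  # two distinct distances: 1 and sqrt(2)
print(hex_dists)     # one distance: sqrt(3) * s
```

In the square grid the diagonal neighbors are about 41% farther away than the side neighbors, whereas all hexagonal neighbors are equidistant.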
2.3 Addressing sparse demand data
The highly skewed spatial and temporal distributions of ride-hailing demand
lead to severe demand imbalances among spatiotemporal partitions. As a result, the
demand information in minority spatiotemporal partitions is overwhelmed by that in
majority spatiotemporal partitions. The different settings of a minimum ride-hailing
demand threshold make the corresponding datasets have certain sparse distribution
characteristics. The levels for these sparse demands are often difficult to accurately
predict. The most common approaches for solving this problem include data-level
methods, algorithm-level methods, and hybrid methods that combine the advantages of
the other two types of techniques (Krawczyk, 2016). Data-level methods aim to change
the input training set to fit a standard learning algorithm. To achieve a balanced data
distribution, previous studies usually increased the number of minority ranges (the
number of classes in a classification task or the target values in a regression task that
have the lowest data sizes in the dataset) by oversampling (Chawla et al., 2002;
Vluymans, 2019) or decreased the number of majority ranges (the number of classes in
a classification task or the target values in a regression task that have the highest data
sizes in the dataset) by undersampling (Ha and Lee, 2016; Lin et al., 2017). Moniz et
al. (2017) combined resampling methods with standard regression models (such as
SVMs) to achieve improved prediction accuracy for imbalanced time series. Zhang et
al. (2021) proposed a clustering decision tree-based multimodel prediction method to
solve the data imbalance problem in building energy load prediction. Cheng et al. (2020)
developed a dynamic spatiotemporal k-nearest neighbor (D-STKNN) model to identify
heterogeneous travel patterns in different temporal and spatial units, which were further
considered for conducting short-term travel speed prediction to improve the prediction
accuracy of the model. However, little effort has been directed toward solving the data
imbalance problem while capturing the complex spatiotemporal correlations of ride-
hailing demands with sparse uncertainties.
Although the mathematical structures of prediction models differ significantly,
the training objective of both statistical models and machine learning models
is always the same: minimizing their total/mean prediction errors (loss function) on the
observed or training dataset. The utilized evaluation indices (such as the symmetric
mean absolute percentage error (SMAPE) and root mean square error (RMSE)), guided
by global prediction performance, are often biased toward the majority ranges of ride-
hailing demand (Japkowicz and Stephen, 2002). The minority ranges of partitioned
ride-hailing demand induce high costs when the demand is not well-predicted. Previous
studies related to ride-hailing demand prediction usually filtered out large amounts of
spatiotemporal units with sparse demand by setting a minimum ride-hailing demand
threshold (Ke et al., 2017). Then, the dataset was always assumed to have well-balanced
spatiotemporal partitions with equal misprediction costs. However, this
assumption results in great bias in the prediction results due to the spatiotemporal data
imbalance problem. Therefore, more attention should be given to designing appropriate
prediction algorithms for imbalanced ride-hailing demand data and to ensuring good
prediction performance in different spatial and temporal locations.
In this paper, we integrate the bias preferences of a standard prediction model
with multiple majority ranges of ride-hailing demand to improve the total prediction
accuracy. A hexagon is chosen as the basic spatiotemporal partition to facilitate the
aggregation of ride-hailing demands with similar characteristics. Previous studies (Ke
et al., 2019; Huang et al., 2019) usually focused their research area on limited ranges
by setting minimum ride-hailing demand thresholds, as this is a common data
processing method. Different minimum ride-hailing demand threshold settings cause
the corresponding datasets to have their own majority ride-hailing demand ranges,
leading to an imbalanced data problem with uncertain sparsities in ride-hailing demand
prediction. Therefore, H-ConvLSTM is proposed as a submodel to compare the
prediction performances achieved with different threshold settings, in which hexagonal
convolution kernels are applied to directly conduct convolution calculations on
hexagonal partitions. In addition, an H-ConvLSTM-based bagging learning approach
is further proposed to integrate the optimal prediction ranges of the submodel at
different data sampling degrees.
3. Methodology
Fig. 3 shows the architecture of the proposed H-ConvLSTM-based bagging
learning approach for ride-hailing demand prediction. The architecture is composed of
three parts. First, several undersampled datasets are derived from the full set of ride-hailing
order data by setting multiple minimum ride-hailing demand thresholds. Second, an
H-ConvLSTM regression model is established, and the corresponding predictive
submodels are trained on each subtraining dataset. Finally, a bagging strategy is
developed to integrate the bias preferences of each submodel.
Fig. 3. The architecture of the H-ConvLSTM-based bagging learning approach.
3.1 Preliminary
In this section, a city is divided into uniform hexagonal partitions, and a day is
divided into uniform time intervals to aggregate the ride-hailing orders of different areas.
Therefore, the ride-hailing demand can be defined as the number of ride-hailing
orders issued in a given hexagonal partition during a given time interval.
Due to the presence of significant spatiotemporal correlations, the historical
ride-hailing demand features of the two-layer local adjacent map
centered on the target hexagon, as shown in Fig. 4, are selected to jointly predict the
ride-hailing demand of the target partition for future time intervals.
Fig. 4. The two-layer local adjacent map of the target hexagon.
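A two-layer local adjacent map can be enumerated as follows. This is a minimal sketch, assuming that "two-layer" means the two rings of hexagons surrounding the target and that partitions are indexed with axial coordinates (q, r); neither the coordinate system nor the function names come from the paper.

```python
def hex_distance(q, r):
    """Grid distance from the origin in axial coordinates (q, r)."""
    return max(abs(q), abs(r), abs(q + r))

def local_adjacent_map(center, layers=2):
    """Return the axial coordinates of all hexagons within `layers` rings of `center`."""
    cq, cr = center
    return [
        (cq + dq, cr + dr)
        for dq in range(-layers, layers + 1)
        for dr in range(-layers, layers + 1)
        if hex_distance(dq, dr) <= layers
    ]

cells = local_adjacent_map((10, 7))
print(len(cells))  # 1 center + 6 first-ring + 12 second-ring hexagons = 19
```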
3.2 H-ConvLSTM regression model
As an improved form of the LSTM model, ConvLSTM has convolutional
structures in both the input-to-state and state-to-state transitions and has good
performance in terms of simultaneously capturing temporal and spatial features. The
key to ConvLSTM is the cell state, which memorizes and cycles information
through gate structures that consist of forget gates, input gates, and output gates.
To capture spatial dependencies, the historical cell states, input states, hidden
states, and gates of ConvLSTM
are 3D tensors whose last two dimensions are the rows and columns of the spatial information.
The forget gate layer determines what information is discarded from the previous cell state.
The input gate layer determines what information is input and, through a tanh layer, updates
the old cell state to the new cell state. Then, the parts of the cell state determined by
the output gate layer are exported as the memorized hidden state.
To incorporate the advantages of hexagonal partitioning, we propose an H-
ConvLSTM regression model to capture the spatiotemporal characteristics of ride-
hailing demand, as shown in Fig. 5. H-ConvLSTM directly adopts hexagonal
convolution calculations during feedforward propagation. Following previous research
(Steppa and Holch, 2019), we apply a hexagonal convolution kernel to extract the
spatial and temporal features of the two-layer local adjacency map, as shown in Fig. 6.
The specific functional relationships of H-ConvLSTM are as follows:
     (1)
    (2)
    (3)

 (4)
     (5)
  (6)
where * denotes the hexagonal convolution operator and the element-wise products are
Hadamard products. The weights and biases are the trainable
parameters, and the activation functions are the sigmoid and hyperbolic tangent
functions. Following a series of fully connected layers, the ride-hailing
demand for the target location and time interval can be predicted.
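For reference, the gate structure described in this subsection can be written out under the standard ConvLSTM formulation. The block below is a sketch under stated assumptions, not a restatement of Eqs. (1)-(6): the symbol names (X_t for the input state, H_t for the hidden state, C_t for the cell state, f_t, i_t, o_t for the gates, W and b for the weights and biases) are assumed here, and the peephole terms used by Shi et al. (2015) are omitted for brevity. The operator * stands for the hexagonal convolution and the circle for the Hadamard product.

```latex
\begin{aligned}
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + b_f\right) \\
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + b_i\right) \\
\tilde{C}_t &= \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tilde{C}_t \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + b_o\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```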
Fig. 5. The architecture of the H-ConvLSTM regression model.
Fig. 6. Hexagonal convolution operation with a kernel size of 1.
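The hexagonal convolution with a kernel size of 1 can be illustrated in plain Python. The sketch below is not the HexagDLy implementation; it assumes axial coordinates, stores the demand map in a dictionary, and treats missing neighbors as virtual hexagons with zero demand so that the output covers the same cells as the input. All names are hypothetical.

```python
# Axial offsets of a hexagon and its six direct neighbors (kernel size 1).
HEX_OFFSETS = [(0, 0), (1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

def hex_conv(grid, weights, bias=0.0):
    """Apply one hexagonal kernel to `grid`.

    grid    : dict mapping axial coordinates (q, r) to a demand value
    weights : dict mapping each offset in HEX_OFFSETS to a kernel weight
    Neighbors that are not in `grid` contribute zero (zero padding).
    """
    out = {}
    for (q, r) in grid:
        total = bias
        for (dq, dr) in HEX_OFFSETS:
            total += weights[(dq, dr)] * grid.get((q + dq, r + dr), 0.0)
        out[(q, r)] = total
    return out

# Usage: an averaging kernel on a 7-cell patch with demand 1 everywhere.
patch = {offset: 1.0 for offset in HEX_OFFSETS}
mean_kernel = {offset: 1.0 / 7 for offset in HEX_OFFSETS}
smoothed = hex_conv(patch, mean_kernel)
print(smoothed[(0, 0)])  # the center sees all 7 cells
print(smoothed[(1, 0)])  # a border cell sees only 4 real cells; the rest are padding
```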
3.3 Bagging strategy
Bootstrap aggregation, known as bagging, is one of the earliest ensemble
algorithms (Breiman, 1996). The bagging structure is shown in Fig. 7. The original
dataset is sampled n times according to a certain sampling strategy, and n subdatasets
are obtained. Then, n weak classification models are trained on these subdatasets, and the
final classification result is obtained by voting on the prediction results of each model.
This algorithm effectively improves the classification performance of weak classifiers,
especially when dealing with data imbalance problems.
Fig. 7. Bagging structure.
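The classical bagging procedure described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the weak learner `train_stump` is a hypothetical one-dimensional threshold classifier introduced for the example, not a model used in this paper.

```python
import random
from collections import Counter

def bagging_fit(data, labels, train_fn, n_models, seed=0):
    """Train `n_models` weak classifiers, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Sample indices with replacement to build one subdataset.
        idx = [rng.randrange(len(data)) for _ in range(len(data))]
        models.append(train_fn([data[i] for i in idx], [labels[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Return the majority vote of the trained models for input `x`."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

def train_stump(xs, ys):
    """Hypothetical weak learner: cut at the midpoint between the two class means."""
    zeros = [x for x, y in zip(xs, ys) if y == 0] or [0.0]
    ones = [x for x, y in zip(xs, ys) if y == 1] or [0.0]
    cut = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: int(x > cut)

xs = [0.1, 0.4, 0.5, 0.9, 2.1, 2.5, 3.0, 3.2]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
models = bagging_fit(xs, ys, train_stump, n_models=5)
print(bagging_predict(models, 0.2), bagging_predict(models, 2.8))
```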
The bagging strategy of the H-ConvLSTM-based bagging learning approach is
shown in Fig. 8, and it contains three parts. First, the trained submodels are used to
make predictions on the total training set, and the prediction error distribution of each trained
submodel is computed. Then, the optimal prediction range of each submodel in terms of
the demand value distribution is identified and labeled as a category. Finally, instead of
utilizing the traditional voting method, the H-ConvLSTM classification model is
trained on the total training set to predict the potential range of the demand level
for a certain location at a future time. The submodel with the best performance
regarding the range of potential demand levels is selected to predict the future demand
at this location.
Fig. 8. Bagging strategy of the H-ConvLSTM-based bagging learning approach.
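The difference from voting is that exactly one submodel produces the final prediction. A minimal sketch of this routing step is given below; the classifier and submodels are stand-in callables invented for illustration, whereas in the proposed approach they are the trained H-ConvLSTM classification model and regression submodels.

```python
def predict_with_selection(features, classifier, submodels):
    """Route the input to the submodel whose optimal range the classifier predicts.

    classifier : callable returning the index of the predicted demand range
    submodels  : list of callables; submodels[k] is the regression submodel
                 assigned to demand range k
    """
    range_index = classifier(features)
    return submodels[range_index](features)

# Stand-in models (hypothetical): a mean-based range classifier and two simple regressors.
toy_classifier = lambda feats: 0 if sum(feats) / len(feats) < 4 else 1
toy_submodels = [
    lambda feats: feats[-1],                 # stand-in for the sparse-range submodel
    lambda feats: sum(feats) / len(feats),   # stand-in for the dense-range submodel
]

print(predict_with_selection([1, 0, 2], toy_classifier, toy_submodels))
print(predict_with_selection([30, 34, 32], toy_classifier, toy_submodels))
```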
The SMAPE and RMSE are selected as the prediction error evaluation indices,
and they are formulated as follows:
 


 (7)
 
 
 (8)
where the two quantities compared are the predicted and true ride-hailing
demands, and a very small constant is added to the SMAPE denominator to prevent it from
being 0.
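Both indices are straightforward to compute. The sketch below uses one common definition of the SMAPE (absolute error divided by the mean of the absolute predicted and true values); this choice and the name `eps` for the small constant are assumptions, and the exact variant in Eq. (7) may differ.

```python
import math

def smape(pred, true, eps=1e-8):
    """Symmetric mean absolute percentage error; eps keeps the denominator nonzero."""
    terms = [abs(p - t) / ((abs(p) + abs(t)) / 2 + eps) for p, t in zip(pred, true)]
    return sum(terms) / len(terms)

def rmse(pred, true):
    """Root mean square error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

print(smape([3, 5], [1, 5]), rmse([3, 5], [1, 5]))
print(smape([1], [0]), rmse([1], [0]))
```

The second call illustrates the sensitivity to sparse demand: an absolute error of one order on an empty partition gives an RMSE of 1 but a SMAPE close to its maximum of 2 under this definition.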
The undersampled datasets have different ride-hailing demand
distribution structures. The different majority ranges of ride-hailing demand give the
corresponding H-ConvLSTM regression submodels their own prediction bias
preferences on the total dataset, as shown in Fig. 9. Therefore, we divide the demand values into
continuous sections according to size, in which the first
sections represent the optimal prediction ranges of the submodels. Because the SMAPE and RMSE
give slightly different performance profiles over the demand distribution, we can
generate two sets of boundary points,
corresponding to the SMAPE and RMSE, respectively. The final optimal prediction
range boundary points can be obtained by taking the average values of
the two sets.
Fig. 9. Prediction error distributions of the submodels trained on the undersampled datasets.
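The averaging of the two boundary sets and the resulting range lookup can be sketched as follows. The boundary values are invented for illustration and the function names are hypothetical; only the averaging step itself is taken from the text.

```python
from bisect import bisect_right

def average_boundaries(smape_bounds, rmse_bounds):
    """Average two equally long lists of boundary points element by element."""
    return [(a + b) / 2 for a, b in zip(smape_bounds, rmse_bounds)]

def range_category(demand, boundaries):
    """Return the index of the continuous section that `demand` falls into."""
    return bisect_right(boundaries, demand)

# Hypothetical boundary points from the SMAPE and RMSE curves.
bounds = average_boundaries([2, 4, 9], [2, 6, 11])
print(bounds)                                               # [2.0, 5.0, 10.0]
print([range_category(d, bounds) for d in (0, 3, 5, 40)])   # [0, 1, 2, 3]
```

With k boundary points the demand axis is split into k + 1 sections, which then serve as the class labels for the classification model.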
Different from the corresponding regression model, the H-ConvLSTM
classification model identifies the potential range category of the ride-hailing
demand for a future time interval based on the historical spatiotemporal ride-hailing
demand features, as shown in Fig. 10. One-hot encoding is used
to convert the categories to binary vectors whose length equals the number of categories. The
H-ConvLSTM submodel corresponding to the predicted range category is
selected to obtain the predicted ride-hailing demand at a future time.
(a) Regression model
(b) Classification model
Fig. 10. Structures of the fully connected layers in the prediction models.
4. Experimental results
4.1 Dataset and model setup
The dataset, including all the online ride-hailing order data for Chengdu in
November 2016, is provided by the Didi Gaia Plan platform. To achieve better
prediction performance, the selection of the spatiotemporal granularities in this case
follows the research of Liu et al. (2022). Each day is discretized into 30-minute
time intervals, and a time partition label is added to each order data point based
on its starting time. Then, hexagonal partitions are overlaid on the urban space using
the Quantum Geographic Information System (QGIS), and an intersection operation is
performed with the order data and their added time partition labels. The city is divided
into 35×46 hexagonal partitions with a side length of 800 meters, and each order data
point is further labeled with a hexagonal partition ID. Based on the time interval labels
and the hexagonal partition IDs, we can easily aggregate the ride-hailing demand into
different spatiotemporal partitions. Two-layer local adjacent maps centered on the
target partition in the previous 8 time intervals are used to predict the ride-hailing
demand in the next time interval. Therefore, during the training and testing processes
of the proposed deep learning model, each travel demand sample needs to be expanded
into a corresponding sample group consisting of the model input
and the corresponding label, as
shown in Fig. 11.
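The expansion into sample groups amounts to a sliding window over time. The following sketch assumes that the local adjacent maps and the target demands are available as time-ordered lists; the data layout and names are hypothetical.

```python
def make_sample_groups(maps, target_demand, history=8):
    """Pair the maps of the previous `history` intervals with the next interval's demand.

    maps          : list where maps[t] is the local adjacent map at interval t
    target_demand : list where target_demand[t] is the target partition's demand at t
    """
    groups = []
    for t in range(history, len(maps)):
        groups.append((maps[t - history:t], target_demand[t]))
    return groups

# Toy data: 10 intervals, each map holding 19 identical values.
maps = [[t] * 19 for t in range(10)]
demand = list(range(10))
groups = make_sample_groups(maps, demand)
print(len(groups))                        # 10 - 8 = 2 sample groups
print(len(groups[0][0]), groups[0][1])    # 8 input maps, label taken from interval 8
```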
Minimum ride-hailing demand thresholds are set for all spatiotemporal
partitions (from 1 to 256, doubling each time) to create multiple datasets with different
ride-hailing demand coverages. The ride-hailing demands that are less than the
corresponding threshold in each dataset are excluded. In other words, if the travel
demand sample at the center of a sample group is less than the threshold, that sample group
is removed from the corresponding dataset.
Fig. 11. The contents of a sample group.
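The threshold-based undersampling can be sketched in a few lines. The thresholds follow the doubling scheme stated above; how the center demand is read from a sample group is left to a caller-supplied accessor, since that detail is an assumption of this example.

```python
THRESHOLDS = [2 ** k for k in range(9)]  # 1, 2, 4, ..., 256

def undersample(sample_groups, threshold, center_demand):
    """Keep only the sample groups whose center demand is at least `threshold`."""
    return [g for g in sample_groups if center_demand(g) >= threshold]

# Toy sample groups of the form (inputs, demand); the inputs are omitted here.
toy_groups = [(None, d) for d in (0, 1, 3, 8, 300)]
subdatasets = {
    th: undersample(toy_groups, th, center_demand=lambda g: g[1])
    for th in THRESHOLDS
}
print({th: len(groups) for th, groups in subdatasets.items()})
```

Raising the threshold never adds samples, so each subdataset is a subset of the previous one and its majority demand range shifts upward.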
The ride-hailing demands are arranged in order from small to large and divided
into 20 equal parts according to their proportions of the total ride-hailing demand. The
data size distributions obtained under different minimum ride-hailing demand
thresholds and the average demand values over the total ride-hailing demand are shown
in Fig. 12. The left axis represents the data size of the subdataset corresponding to the
minimum ride-hailing demand threshold (1 to 256) in each demand range, and the right
axis represents the average demand value of the total dataset in each demand range. As the
minimum ride-hailing demand threshold increases, the corresponding majority ride-
hailing demand ranges continuously increase.
Fig. 12. The data size distributions obtained under different minimum ride-hailing demand
thresholds and the average demand values over the total ride-hailing demand.
The training process of the proposed H-ConvLSTM-based bagging learning
approach is shown in Fig. 13(a). It consists of multiple regression submodels (H-
ConvLSTM regression models) and a classifier (an H-ConvLSTM classification model),
which are trained as shown in Fig. 13(b) and Fig. 13(c), respectively. Each submodel
is trained on the corresponding subdataset, which is an undersampling of the
total dataset. By evaluating the prediction performance of each submodel
on the total dataset, the individual optimal prediction ranges can be identified and labeled as
separate classes. Then, a classification dataset is obtained from the total dataset by replacing
the labels of the sample data with the range categories to which they belong. The
classifier is trained on this classification dataset to identify the potential range of the predicted travel
demand.
Fig. 13. Training process of the H-ConvLSTM-based bagging learning approach.
The division of the training dataset and testing dataset is shown in Fig. 14. For each
dataset, the data from the first 21 days are used for training, and the data from
the last 9 days are used for testing. The testing process of the H-ConvLSTM-based
bagging learning approach is similar to that shown in Fig. 13(a). First, the trained
classifier is used to select an appropriate submodel for each input sample, and then
this submodel is used to predict the corresponding future demand.
Fig. 14. Division of the training dataset and testing dataset.
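The chronological split follows directly from the 30-minute interval length and the 30 days of November. The sketch below assumes that each sample carries the index of its time interval counted from the start of the month; this indexing is an assumption of the example.

```python
INTERVALS_PER_DAY = 24 * 60 // 30  # 48 half-hour intervals per day

def split_by_day(samples, train_days=21):
    """Split (interval_index, sample) pairs chronologically at the end of day `train_days`."""
    cutoff = train_days * INTERVALS_PER_DAY
    train = [s for t, s in samples if t < cutoff]
    test = [s for t, s in samples if t >= cutoff]
    return train, test

samples = [(t, f"sample_{t}") for t in range(30 * INTERVALS_PER_DAY)]
train, test = split_by_day(samples)
print(len(train), len(test))  # 1008 432
```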
The experimental platform is a server with an Intel(R) Xeon(R) Gold-5218 CPU
@ 2.30 GHz, 128 GB of RAM, and one GPU (NVIDIA Quadro RTX 5000). The
proposed model is implemented in Python 3.6.6 with PyTorch, TensorFlow and Keras.
The proposed H-ConvLSTM regression and classification models both consist of 4
ConvLSTM layers, which have 8, 16, 32, and 32 hidden states. The hexagonal kernel
size of each layer is 1. To ensure that the input and output of the hexagonal convolution
operation have the same dimensionality, similar to the same padding approach used in
traditional CNN models, virtual hexagons with zero demand values are padded as
neighbors of the hexagons on the border. Batch normalization and dropout are used for
training the model. The number of training epochs is set to 50 with a batch size of 128.
Adam is used for optimization with a learning rate of 0.0001. The weighted sum of the
SMAPE and RMSE is used as the loss function of the regression model, while the
classification cross entropy is used as the loss function of the classification model. The
SMAPE and RMSE are used to evaluate the prediction performance of the demand
value distribution yielded by the regression model.
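The regression loss described above, a weighted sum of the SMAPE and RMSE, can be sketched in NumPy as follows. The mixing weight `alpha`, the constant `eps`, and the SMAPE form without a factor of 2 in the denominator are assumptions for illustration; the text does not state the weight values used.

```python
import numpy as np

def smape(y_true, y_pred, eps=1e-8):
    """Symmetric mean absolute percentage error (assumed form, no factor of 2)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    denom = np.abs(y_pred) + np.abs(y_true) + eps  # eps avoids division by zero
    return float(np.mean(np.abs(y_pred - y_true) / denom))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def combined_loss(y_true, y_pred, alpha=0.5):
    """Weighted sum of SMAPE and RMSE; alpha is a hypothetical mixing weight."""
    return alpha * smape(y_true, y_pred) + (1.0 - alpha) * rmse(y_true, y_pred)

loss_value = combined_loss([1, 3], [3, 1])
print(loss_value)
```

Combining the two terms lets the relative error (SMAPE) keep sparse demands from being ignored while the absolute error (RMSE) penalizes large misses on high demands.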
4.2 Optimal prediction range division results
An H-ConvLSTM regression submodel is trained on each undersampled dataset,
and the corresponding prediction performance is calculated. Since only threshold
settings between 1 and 32 produce a relatively obvious optimal prediction range, only
the submodels with this characteristic are selected for analysis, and the prediction
results are shown in Fig. 15. Each prediction distribution curve first exhibits a
decreasing trend and then increases near the threshold point. As a percentage error
that is sensitive to sparse demand, the SMAPE is mainly used to reflect the influence
of different thresholds on the resulting prediction performance. The submodel
corresponding to each threshold has an obvious optimal prediction range, and the
ride-hailing demand values can be divided into 7 segments according to size. The RMSE
is an absolute error and is sensitive to large outliers. Although the RMSEs of the
submodels are also lowest when the demand values are slightly larger than the
threshold, this optimum is less pronounced than that observed with the SMAPE because
these demand values lie in a smaller demand range. Therefore, the RMSE distribution
mainly serves as a supplementary validation of the SMAPE results.
(a) SMAPE
(b) RMSE
Fig. 15. The distributions of the demand value prediction errors obtained under different minimum
ride-hailing demand thresholds.
Table 1 Statistical results of the boundary points.
Boundary point demand values
1      3      7      17      32      67
2      4      7      22      34      59
1.5    3.5    7      19.5    33      63
The boundary points between adjacent segments are determined as shown in Table
1. The intersection points of adjacent optimal ranges are selected as the first five
boundary points. The last boundary point is the closest intersection between the
prediction distribution curve of threshold 32 and the other distribution curves on the
right. The demand values distributed in the final 7 segments are assigned the class
numbers 1, 2, 3, 4, 5, 6, and 7, which are further transformed into corresponding
binary vectors through one-hot encoding. The classification dataset is then obtained
from the total dataset by replacing the label of each sample group with the
classification number to which it belongs.
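The labeling step can be illustrated with a short sketch that maps demand values to the 7 segment classes and one-hot encodes them. It assumes that the third row of Table 1 (1.5, 3.5, 7, 19.5, 33, 63) provides the boundaries actually used and that a demand equal to a boundary falls into the lower segment; neither choice is confirmed by the text, and the function names are hypothetical.

```python
import numpy as np

# Assumed boundaries: third row of Table 1.
BOUNDARIES = np.array([1.5, 3.5, 7.0, 19.5, 33.0, 63.0])

def segment_class(demand, boundaries=BOUNDARIES):
    """Map demand values to class numbers 1..7.

    Assumption: a value equal to a boundary belongs to the lower segment.
    """
    demand = np.asarray(demand, dtype=float)
    return np.digitize(demand, boundaries, right=True) + 1

def one_hot(classes, n_classes=7):
    """Convert class numbers 1..n_classes into binary indicator vectors."""
    classes = np.asarray(classes, dtype=int)
    return np.eye(n_classes, dtype=int)[classes - 1]

demand = [0, 1, 2, 7, 8, 20, 40, 100]
classes = segment_class(demand)
labels = one_hot(classes)
print(classes.tolist())
print(labels[3].tolist())
```

The resulting binary vectors replace the original demand labels, which turns the regression samples into classification samples for the segment classifier.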
4.3 Results of the H-ConvLSTM-based bagging learning approach
The H-ConvLSTM classification model is trained on the resulting classification
dataset. Similar to the H-ConvLSTM regression submodel, 8 historical ride-hailing
demand features of the two-layer local adjacent maps centered at the target hexagon
are selected to jointly predict the segment category of the ride-hailing demand of the
target partition for the future time interval. An accuracy of 85.76% is achieved on
the testing dataset (88.94% on the training dataset). The boundaries of the segment
categories depend on the prediction distribution of each submodel on the training set,
and it is assumed that the optimal prediction range of each submodel is roughly
similar in the training set and testing set. For the historical data in the testing
set whose predicted categories fall in the first 6 segments, the corresponding
regression submodel trained on the training set is used to predict the ride-hailing
demand at a future time. The data size of the ride-hailing demand distributed in the
last segment is relatively small, as shown in Fig. 12, and no submodel shows
significantly better prediction performance in this segment. Therefore, the average
value of the prediction results of the 6 submodels is used as the predicted value of
the ride-hailing demand in the last segment at a future time. The prediction error
distribution of the H-ConvLSTM-based bagging learning approach is shown in Fig. 16.
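The selection rule described above can be sketched as a small routing function. It assumes that segment k (for k = 1 to 6) is served by the k-th submodel when the submodels are ordered by threshold, which the text implies but does not state explicitly; the function name and array layout are hypothetical. Samples classified into the last segment receive the average of all six submodel predictions.

```python
import numpy as np

def bagging_predict(segment_classes, submodel_preds):
    """Combine submodel outputs according to the predicted segment class.

    segment_classes: shape (n,), integers in 1..7 produced by the classifier.
    submodel_preds:  shape (n, 6), column k-1 holds the prediction of the
                     submodel assumed to be optimal for segment k.
    """
    segment_classes = np.asarray(segment_classes, dtype=int)
    submodel_preds = np.asarray(submodel_preds, dtype=float)
    n_submodels = submodel_preds.shape[1]

    # Default: average of all submodels (used for the last segment).
    combined = submodel_preds.mean(axis=1)

    # Segments 1..6: take the prediction of the matching submodel.
    rows = np.nonzero(segment_classes <= n_submodels)[0]
    combined[rows] = submodel_preds[rows, segment_classes[rows] - 1]
    return combined

preds = [[1, 2, 3, 4, 5, 6],
         [10, 20, 30, 40, 50, 60],
         [1, 2, 3, 4, 5, 6]]
result = bagging_predict([1, 4, 7], preds)
print(result.tolist())
```

In this toy call the first sample uses submodel 1, the second uses submodel 4, and the third, classified into segment 7, falls back to the mean of the six predictions.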
(a) SMAPE
(b) RMSE
Fig. 16. Prediction errors of the bagging learning approach based on H-ConvLSTM.
Compared with the H-ConvLSTM regression submodels trained under different
minimum ride-hailing demand threshold settings, the H-ConvLSTM-based bagging learning
approach improves the prediction performance to varying degrees and comes closer to
the optimal performance limit that can be achieved by this method (i.e., an oracle
submodel classifier that always selects the best-performing submodel, as shown by the
dotted black line).
To verify the validity of the proposed model, several basic models are selected
for comparison, as follows.
1) ARIMA: The autoregressive integrated moving average model is widely used
for time series prediction. The differencing order is set to 1, and the
autoregressive and moving average orders are selected from 1 to 8 previous time
intervals.
2) Hexagonal artificial neural network (H-ANN): The spatial feature and
historical temporal feature of the demands of a hexagonal partition are spliced together
as the input for a fully connected neural network, and the predicted demand value of a
future time is output. The model includes 5 fully connected layers, which have 128, 64,
32, 16, and 8 hidden neurons.
3) H-CNN: The previous 8 time intervals are represented by the numbers of
channels in the input image. A hexagonal convolution operation is applied between each
pair of layers. The H-CNN model includes 4 convolution layers, which have 8, 16, 32,
and 32 hidden states. The hexagonal kernel size of each layer is 1. Batch normalization
and dropout are used to train the model.
4) H-CNN-LSTM: An H-CNN model with one channel is selected to extract the
spatial ride-hailing demand characteristics of the previous 8 time intervals. The settings
for the convolution layer and the convolution kernel remain the same. The outputs of
the H-CNN for the previous 8 time intervals are expanded into vectors and used as the
inputs for the LSTM to extract the temporal characteristics of ride-hailing demand. The
hidden state of the LSTM is set to 128.
5) H-CNN-GRU: The output of the H-CNN is taken as the input of a GRU, and
the other settings are consistent with those of H-CNN-LSTM.
The data sizes of different ride-hailing demand levels differ greatly, so the
overall prediction error is dominated by the prediction results for sparse
ride-hailing demands, which have large data sizes. The sparse ride-hailing demands, which
account for approximately 70% of the total data size (i.e., 70% of the spatiotemporal
partitions are sparse demands), contain less than 10% of the total ride-hailing demand
quantity. To better evaluate the prediction performance of each model, we propose to
utilize the weighted SMAPE (wSMAPE) and weighted RMSE (wRMSE) to
comprehensively consider the prediction results corresponding to different ride-hailing
demand size distributions as follows:
$$\mathrm{wSMAPE} = \sum_{i=1}^{20} w_i \cdot \frac{1}{n_i} \sum_{j=1}^{n_i} \frac{\left| \hat{y}_{i,j} - y_{i,j} \right|}{\hat{y}_{i,j} + y_{i,j} + \varepsilon} \tag{9}$$
$$\mathrm{wRMSE} = \sum_{i=1}^{20} w_i \sqrt{\frac{1}{n_i} \sum_{j=1}^{n_i} \left( \hat{y}_{i,j} - y_{i,j} \right)^2} \tag{10}$$
The ride-hailing demands are arranged in ascending order and divided
into 20 equal parts according to their proportions of the total demand value. The subdata
size of each 5% ride-hailing demand segment is denoted as $n_i$. $w_i$
denotes the weight of segment $i$. $\hat{y}_{i,j}$ denotes the $j$th predicted value of
segment $i$, and $y_{i,j}$ is the corresponding true value. $\varepsilon$ is a very small
value that prevents the denominator from being 0.
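A rough sketch of how such demand-share-weighted errors could be computed is given below. It follows the verbal description (sort by demand, cut into segments of equal total-demand share, average the per-segment errors with weights), but several details are assumptions made for illustration: uniform segment weights, renormalization over non-empty segments, the SMAPE form without a factor of 2, and all function names.

```python
import numpy as np

def demand_share_segments(y_true, n_segments=20):
    """Label each sample with the 0-based index of its demand-share segment.

    Samples are sorted by true demand and cut so that each segment holds an
    equal share of the total demand. Assumes the total demand is positive.
    """
    y_true = np.asarray(y_true, dtype=float)
    order = np.argsort(y_true, kind="stable")
    cum_share = np.cumsum(y_true[order]) / y_true.sum()
    seg_sorted = np.ceil(cum_share * n_segments - 1e-9).astype(int) - 1
    seg_sorted = np.clip(seg_sorted, 0, n_segments - 1)
    segments = np.empty(len(y_true), dtype=int)
    segments[order] = seg_sorted
    return segments

def weighted_errors(y_true, y_pred, n_segments=20, weights=None, eps=1e-8):
    """Return (wSMAPE, wRMSE) as weighted sums of per-segment errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    segments = demand_share_segments(y_true, n_segments)
    if weights is None:
        # Assumption: every segment carries the same weight.
        weights = np.full(n_segments, 1.0 / n_segments)
    weights = np.asarray(weights, dtype=float)
    present = np.unique(segments)  # skip segments that received no sample
    w = weights[present] / weights[present].sum()
    smape_i, rmse_i = [], []
    for i in present:
        mask = segments == i
        err = y_pred[mask] - y_true[mask]
        denom = np.abs(y_pred[mask]) + np.abs(y_true[mask]) + eps
        smape_i.append(np.mean(np.abs(err) / denom))
        rmse_i.append(np.sqrt(np.mean(err ** 2)))
    return float(np.dot(w, smape_i)), float(np.dot(w, rmse_i))

w_smape, w_rmse = weighted_errors([1, 1, 1, 1], [1, 1, 3, 3], n_segments=2)
print(w_smape, w_rmse)
```

Because each segment covers the same share of total demand, the many sparse samples end up in a few segments and no longer dominate the aggregate error.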
Table 2 Model performance comparison.
Model                         wSMAPE (×10^-2)   wRMSE   Training time (h)   Testing time (min)
ARIMA                         14.53             23.51   0.01                0.01
H-ANN                         14.11             22.75   0.31                0.04
H-CNN                         13.21             22.36   4.43                0.54
H-CNN-LSTM                    12.35             21.12   6.05                0.96
H-CNN-GRU                     12.29             21.53   5.71                0.88
H-ConvLSTM + Threshold 1      11.61             20.97   7.16                1.02
H-ConvLSTM + Threshold 2      10.02             19.58   5.36                0.83
H-ConvLSTM + Threshold 4      10.54             20.04   3.59                0.57
H-ConvLSTM + Threshold 8      11.81             20.21   2.33                0.36
H-ConvLSTM + Threshold 16     14.62             20.76   1.02                0.15
H-ConvLSTM + Threshold 32     18.76             24.18   0.75                0.11
H-ConvLSTM + bagging          9.42              18.63   25.84               1.62
The overall prediction performance achieved by each model on the testing
dataset is shown in Table 2. As the ability of a model to capture the temporal and
spatial characteristics of ride-hailing demand improves, its errors decrease, and the
H-ConvLSTM regression models generally achieve lower wSMAPE and wRMSE values than the
baseline models. By integrating the biased prediction preferences of each submodel in
different segments, the proposed bagging learning approach based on H-ConvLSTM
improves on the best single-threshold setting (threshold 2) by 5.99% and 4.85% in
terms of the wSMAPE and wRMSE, respectively. Due to the inclusion of multiple
regression submodels and an additional classification model, the proposed
H-ConvLSTM-based bagging learning approach requires more training time (25.84 h).
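The reported improvements can be checked directly against the Table 2 values for the best single-threshold submodel (threshold 2) and the bagging approach. The short calculation below is only a verification aid; the variable names are illustrative.

```python
# Values taken from Table 2.
baseline = {"wSMAPE": 10.02, "wRMSE": 19.58}  # H-ConvLSTM + Threshold 2
bagging = {"wSMAPE": 9.42, "wRMSE": 18.63}    # H-ConvLSTM + bagging

# Relative improvement in percent, rounded to two decimals.
improvement = {
    name: round(100.0 * (baseline[name] - bagging[name]) / baseline[name], 2)
    for name in baseline
}
print(improvement)
```

The relative reductions of 5.99% (wSMAPE) and 4.85% (wRMSE) match the figures quoted in the text.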
Fig. 17. Spatial distribution of the wSMAPE difference between the H-ConvLSTM-based bagging
learning approach and H-ConvLSTM-Oracle.
Fig. 18. Spatial distributions of the wSMAPE differences between the submodels and
H-ConvLSTM-Oracle.
H-ConvLSTM-Oracle is assumed to have an oracle submodel classifier that
always selects the best-performing submodel. This model represents the upper bound on
the performance of our H-ConvLSTM-based bagging learning approach. The spatial
distributions of the wSMAPE differences of the H-ConvLSTM-based bagging learning
approach and of each of the 6 submodels relative to H-ConvLSTM-Oracle are shown in
Fig. 17 and Fig. 18, respectively. A smaller difference means that the wSMAPE value is
closer to that of H-ConvLSTM-Oracle and that the corresponding prediction results are
better. The proposed H-ConvLSTM-based bagging learning approach effectively selects
the optimal submodel across the whole spatial distribution, and its prediction results
differ less from those of H-ConvLSTM-Oracle. When compared with the results obtained
under a minimum ride-hailing demand threshold of 1, the prediction results of
H-ConvLSTM-Oracle are mainly improved in the central urban area, especially in the
transition areas between urban and suburban areas. For other minimum ride-hailing
demand threshold settings, the prediction results of H-ConvLSTM-Oracle are more
significantly improved in the outer suburbs. As the threshold value increases, the
improvement and coverage area of the corresponding H-ConvLSTM-Oracle prediction
results continuously increase.
5 Conclusion
In this paper, we propose an H-ConvLSTM regression model to compare and
analyze the ride-hailing demand prediction performances achieved under different data
distribution characteristics. Minimum ride-hailing demand thresholds are set for all
spatiotemporal partitions to create multiple datasets with different data sparsities. The
H-ConvLSTM regression models trained using different datasets have their own
optimal prediction ranges on the testing set, and each prediction distribution curve first
exhibits a decreasing trend and then increases near the threshold point.
An H-ConvLSTM-based bagging learning approach is further proposed to
integrate the bias prediction preferences of each H-ConvLSTM regression model
trained on datasets with different data sparsities. An experimental analysis conducted
on the order data obtained from Didi Chuxing in Chengdu city over one month shows
that the proposed H-ConvLSTM-based bagging learning approach can achieve
significantly improved prediction performance.
In future work, we are committed to performing more in-depth qualitative and
quantitative analyses of the spatiotemporal scale of internet-based ride-hailing demand.
The influence of an imbalanced ride-hailing demand distribution (caused by the
division of different spatial and temporal scales) on the prediction performance will be
discussed. Policy recommendations will also be made to improve the operational
efficiency and quality of ride-hailing.
Acknowledgments
This research was funded by the National Natural Science Foundation of China
(grant nos. 51378091 and 71871043). The authors would like to acknowledge the GAIA
open data from DiDi Chuxing.
Appendix A. Comparison of predicted results between H-ConvLSTM and
ConvLSTM
The traditional ConvLSTM, which is based on matrix convolution operations, is
compared with our H-ConvLSTM model to verify the advantages of the hexagonal
convolution operation. A detailed explanation of the model parameter configuration is
given in Liu et al. (2022). The predicted results are shown in Table A1. Compared with
a square, a hexagon is closer to a circle, and its neighbors are distributed
symmetrically and equivalently. Therefore, travel demands under a hexagonal partition
are predicted more accurately. Since the hexagonal convolution operation avoids the
topological loss of spatial relations caused by the matrix transformation in the
traditional ConvLSTM, the proposed H-ConvLSTM model shows the best and most stable
prediction performance.
Table A1: Comparison of different demand prediction models
Model         Partition   RMSE                           MAPE (×10^-2)
              shape       Testing set   Avg.    Sd.      Testing set   Avg.    Sd.
ConvLSTM      Square      9.37          9.38    0.12     17.18         17.55   0.46
ConvLSTM      Hexagon     9.03          9.12    0.07     17.02         17.27   0.55
H-ConvLSTM    Hexagon     8.82          8.80    0.05     16.71         16.76   0.36
References
Alisoltani, N., Leclercq, L., Zargayouna, M., 2021. Can dynamic ride-sharing reduce
traffic congestion? Transp. Res. Part B Methodol. 145, 212-246.
Alonso-Mora, J., Samaranayake, S., Wallar, A., Frazzoli, E., Rus, D., 2017. On-
demand high-capacity ride-sharing via dynamic trip-vehicle assignment. Proc.
Natl. Acad. Sci. U. S. A. 114, 462-467.
Birch, C.P.D., Vuichard, N., Werkman, B.R., 2000. Modelling the effects of patch
size on vegetation dynamics: Bracken [Pteridium aquilinum (L.) Kuhn] under
grazing. Ann. Bot. 85, 63-76.
Breiman, L., 1996. Bagging predictors. Mach. Learn. 24, 123-140.
Chawla, N. V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P., 2002. SMOTE:
Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. 16, 321-357.
Chen, X., Zheng, H., Ke, J., Yang, H., 2020. Dynamic optimization strategies for on-
demand ride services platform: Surge pricing, commission rate, and incentives.
Transp. Res. Part B Methodol. 138, 23-45.
Cheng, S., Lu, F., Peng, P., 2020. Short-Term Traffic Forecasting by Mining the Non-
Stationarity of Spatiotemporal Patterns. IEEE Trans. Intell. Transp. Syst. 22(10),
6365-6383.
Csiszár, C., Csonka, B., Földes, D., Wirth, E., Lovas, T., 2019. Urban public charging
station locating method for electric vehicles based on land use approach. J.
Transp. Geogr. 74, 173-180.
Daganzo, C.F., Ouyang, Y., Yang, H., 2020. Analysis of ride-sharing with service
time and detour guarantees. Transp. Res. Part B Methodol. 140, 130-150.
Davis, N., Raina, G., Jagannathan, K., 2016. A multi-level clustering approach for
forecasting taxi ride-hailing demand. IEEE Conf. Intell. Transp. Syst.
Proceedings, ITSC 223-228.
González, M.C., Hidalgo, C.A., Barabási, A.L., 2008. Understanding individual
human mobility patterns. Nature 453, 779-782.
Guo, X.T., Caros, N.S., Zhao, J.H., 2021. Robust matching-integrated vehicle
rebalancing in ride-hailing system with uncertain demand. Transp. Res. Part B
Methodol. 150, 161-189.
Huang, Z., Huang, G., Chen, Z., Wu, C., Ma, X., Wang, H., 2019. Multi-regional
online car-hailing order quantity forecasting based on the convolutional neural
network. Inf. 10(6),193.
Japkowicz, N., Stephen, S., 2002. The class imbalance problem: A systematic study.
Intell. Data Anal. 6, 429-449.
Jo, D., Yu, B., Jeon, H., Sohn, K., 2019. Image-to-image learning to predict traffic
speeds by considering area-wide spatio-temporal dependencies. IEEE Trans.
Veh. Technol. 68, 1188-1197.
Jiang, X., Zhang, L., Chen, M.X., 2014. Short-term forecasting of high-speed rail
demand: A hybrid approach combining ensemble empirical mode decomposition
and gray support vector machine with real-world applications in China. Transp.
Res. Part C Emerg. Technol. 44, 110-127.
Kaltenbrunner, A., Meza, R., Grivolla, J., Codina, J., Banchs, R., 2010. Urban cycles
and mobility patterns: Exploring and predicting trends in a bicycle-based public
transport system. Pervasive Mob. Comput. 6, 455-466.
Ke, J., Yang, H., Zheng, H., Chen, X., Jia, Y., Gong, P., Ye, J., 2019. Hexagon-Based
Convolutional Neural Network for Supply-Demand Forecasting of Ride-
Sourcing Services. IEEE Trans. Intell. Transp. Syst. 20, 4160-4173.
Ke, J., Zheng, H., Yang, H., Chen, X. (Michael), 2017. Short-term forecasting of
passenger demand under on-demand ride services: A spatio-temporal deep
learning approach. Transp. Res. Part C Emerg. Technol. 85, 591-608.
Krawczyk, B., 2016. Learning from imbalanced data: open challenges and future
directions. Prog. Artif. Intell. 5, 221-232.
Li, X., Pan, G., Wu, Z., Qi, G., Li, S., Zhang, D., Zhang, W., Wang, Z., 2012.
Prediction of urban human mobility using large-scale taxi traces and its
applications. Front. Comput. Sci. China 6, 111-121.
Lin, W.C., Tsai, C.F., Hu, Y.H., Jhang, J.S., 2017. Clustering-based undersampling in
class-imbalanced data. Inf. Sci. (Ny). 409-410, 17-26.
Liu, K., Chen, Z., Yamamoto, T., Tuo, L., 2022. Exploring the impact of
spatiotemporal granularity on the demand prediction of dynamic ride-hailing.
preprint arXiv:2203.10301.
Ma, Z., Xing, J., Mesbah, M., Ferreira, L., 2014. Predicting short-term bus passenger
demand using a pattern hybrid approach. Transp. Res. Part C Emerg. Technol.
39, 148-163.
Min, W., Wynter, L., 2011. Real-time road traffic prediction with spatio-temporal
correlations. Transp. Res. Part C Emerg. Technol. 19, 606-616.
Moniz, N., Branco, P., Torgo, L., 2017. Resampling strategies for imbalanced time
series forecasting. Int. J. Data Sci. Anal. 3, 161-181.
Shen, X., Zhou, Y., Jin, S., Wang, D., 2020. Spatiotemporal influence of land use and
household properties on automobile ride-hailing demand. Transp. Res. Part D
Transp. Environ. 84, 102359.
Shi, X., Chen, Z., Wang, H., Yeung, D.-Y., Wong, W., Woo, W., 2015. Convolutional
LSTM Network: A Machine Learning Approach for Precipitation Nowcasting.
Adv. Neural Inf. Process. Syst. 2015-Janua, 68-80.
Shoman, W., Alganci, U., Demirel, H., 2019. A comparative analysis of gridding
systems for point-based land cover/use analysis. Geocarto Int. 34, 867-886.
Steppa, C., Holch, T.L., 2019. HexagDLy-Processing hexagonally sampled data with
CNNs in PyTorch. SoftwareX 9, 193-198.
Vazifeh, M.M., Santi, P., Resta, G., Strogatz, S.H., Ratti, C., 2018. Addressing the
minimum fleet problem in on-demand urban mobility. Nature 557, 534-538.
Vluymans, S., 2019. Learning from imbalanced data. Stud. Comput. Intell. 807, 81-
110.
Wang, D., Yang, Y., Ning, S., 2018. DeepSTCL: A Deep Spatio-temporal
ConvLSTM for Ride-hailing demand Prediction, in: 2018 International Joint
Conference on Neural Networks (IJCNN). IEEE, pp. 1-8.
Wang, H., Yang, H., 2019. Ridesourcing systems: A framework and review. Transp.
Res. Part B Methodol. 129, 122-155.
Wu, X., Guo, J., Xian, K., Zhou, X., 2018. Hierarchical ride-hailing demand
estimation using multiple data sources: A forward and backward propagation
algorithmic framework on a layered computational graph. Transp. Res. Part C
Emerg. Technol. 96, 321-346.
Xu, Z., Yin, Y., Ye, J., 2020. On the supply curve of ride-hailing systems. Transp.
Res. Part B Methodol. 132, 29-43.
Yang, G., Wang, Y., Yu, H., Ren, Y., Xie, J., 2018. Short-Term Traffic State
Prediction Based on the Spatiotemporal Features of Critical Road Sections.
Sensors 18, 2287.
Yu, H., Wu, Z., Wang, S., Wang, Y., Ma, X., 2017. Spatiotemporal recurrent
convolutional networks for traffic prediction in transportation networks. Sensors
(Switzerland) 17, 1-16.
Yuan, C., Yu, X., Li, D., Xi, Y., 2019. Overall Traffic Mode Prediction by VOMM
Approach and AR Mining Algorithm with Large-Scale Data. IEEE Trans. Intell.
Transp. Syst. 20, 1508-1516.
Zhang, C., Li, J., Zhao, Y., Li, T., Chen, Q., Zhang, X., Qiu, W., 2021. Problem of
data imbalance in building energy load prediction: Concept, influence, and
solution. Appl. Energy 297, 117139.
Zhang, J., Zheng, Y., Qi, D., Li, R., Yi, X., 2016. DNN-based prediction model for
spatio-temporal data, in: Proceedings of the 24th ACM SIGSPATIAL
International Conference on Advances in Geographic Information Systems -
GIS ’16. ACM Press, New York, New York, USA, pp. 1-4.
Zheng, W., Lee, D.H., Shi, Q., 2006. Short-term freeway traffic flow prediction:
Bayesian combined neural network approach. J. Transp. Eng. 132, 114-121.
Zhu, Z., Tang, L., Xiong, C., Chen, X., Zhang, L., 2019. The conditional probability
of travel speed and its application to short-term prediction. Transp. B, 7(1), 684-
706.
Zhu, Z., Ke, J., Wang, H., 2021. A mean-field Markov decision process model for
spatial-temporal subsidies in ride-sourcing markets. Transp. Res. Part B
Methodol. 150, 540-565.
Manuscript submitted to Transportation Research part C Chen et al.
23 / 23
... To solve the problem of repositioning, the input can be current spatial distributions of vehicles and riders. Additional inputs could include predictions about future supplies and demands [75], [76], [77], [78], [79], [80]. The output decisions are the positions to relocate to for all idle vehicles. ...
... Some other collective repositioning solutions leverage various ML techniques to predict future information of a ride-hailing system, which plays an important role in guiding the platforms to make better repositioning decisions [80]. Riley et al. [141] leverage Vector autoregression to forecast the future demand from region to region. ...
Preprint
Full-text available
Ride-hailing is a sustainable transportation paradigm where riders access door-to-door traveling services through a mobile phone application, which has attracted a colossal amount of usage. There are two major planning tasks in a ride-hailing system: (1) matching, i.e., assigning available vehicles to pick up the riders, and (2) repositioning, i.e., proactively relocating vehicles to certain locations to balance the supply and demand of ride-hailing services. Recently, many studies of ride-hailing planning that leverage machine learning techniques have emerged. In this article, we present a comprehensive overview on latest developments of machine learning-based ride-hailing planning. To offer a clear and structured review, we introduce a taxonomy into which we carefully fit the different categories of related works according to the types of their planning tasks and solution schemes, which include collective matching, distributed matching, collective repositioning, distributed repositioning, and joint matching and repositioning. We further shed light on many real-world datasets and simulators that are indispensable for empirical studies on machine learning-based ride-hailing planning strategies. At last, we propose several promising research directions for this rapidly growing research and practical field.
... Firstly, the demand-supply prediction method adopted is less complex, and the accuracy can be further improved. In the future, more accurate demand-supply prediction methods will be applied, such as prediction methods based on ConvLSTM and T-GCN [54,55]. Secondly, the current regional division is based on indicators, and further research will be carried out on the division mechanism in the future. ...
Article
As an emerging internet service, online car-hailing allows users to do the car-hailing via their mobile phone instead of calling cars on the road. Though the convenient service provided by online car-hailing platforms, the following problems encountered in the process of real-time order-matching still cannot be well addressed: (1) the imbalance of the accessible cars in different areas, and (2) the conflict of interests among the drivers, platforms, and passengers. To address those problems, this paper proposes a set of order allocation strategy that fully consider the changing regional characteristics and the interests of multiple parties. Firstly, a dynamic region division method is proposed to identify the corresponding area types for real-time car-hailing. Then, a novel order allocation strategy with the regional multi-party subsidy mechanism is developed for different types of areas. Lastly, to help obtain a better performance for multiple parties, a parameter optimization method for the multi-party subsidy mechanism is introduced. The proposed order allocation strategy is evaluated on the real-world dataset sourced from an online car-hailing company. Our experiments verify the feasibility and effectiveness of the proposed method, without reducing drivers’ benefits, the order response time and order response rate are greatly improved, and the cost in all is acceptable for passengers and the platform. It is illustrated that a suitable regional multi-party subsidy mechanism could help online car-hailing platforms obtain higher passenger satisfaction than that without a regional multi-party subsidy mechanism.
... As a consequence, missing values are filled by interpolating data according to the time and sequence. Other approaches could be used to handle missing data, such as the bagging learning approach proposed in [44]. For example, if values are missing on Monday 08.00-11.00, ...
Article
Full-text available
This study aims to address the challenge of developing accurate and efficient parking occupancy forecasting models at the city level for autonomous vehicles. Although deep learning techniques have been successfully employed to develop such models for individual parking lots, it is a resource-intensive process that requires significant amounts of time and data for each parking lot. To overcome this challenge, we propose a novel two-step clustering technique that groups parking lots based on their spatiotemporal patterns. By identifying the relevant spatial and temporal characteristics of each parking lot (parking profile) and grouping them accordingly, our approach allows for the development of accurate occupancy forecasting models for a set of parking lots, thereby reducing computational costs and improving model transferability. Our models were built and evaluated using real-time parking data. The obtained correlation rates of 86% for the spatial dimension, 96% for the temporal one, and 92% for both demonstrate the effectiveness of the proposed strategy in reducing model deployment costs while improving model applicability and transfer learning across parking lots.
... Su and Wang (2019) analyzed the impact of ride-sourcing services on the morning commute considering the parking supply constraints. Many studies have adopted data-driven approaches to predict the ride-hailing demand (Ke et al., 2017;Chen et al., 2022). From a network equilibrium perspective, He and Shen (2015) modeled the taxi services with both the emerging e-hailing applications and traditional street hailing by constructing the spatial equilibrium model. ...
Article
Cruising of electric ride-sourcing vehicles (ERVs) when waiting for trip orders can create additional vehicle miles, which increase congestion and waste electricity. Reducing cruising is an important issue. This study investigates the strategy of allocating a portion of road space as parking for ERVs. Considering ERVs cruising for parking/charging, we analytically examine the trade-off between road capacity reduction due to reserving road space as parking and less cruising. We evaluate the effects of parking provision on reducing congestion and charging demand. We also investigate the optimal fare and fleet size of ERV services to achieve profit or social welfare maximization. Numerical studies indicate that vehicles cruising for charging might be reduced significantly with a mild increase of charging pile supply, where cruising can increase sharply after charging pile occupancy rate is at critical levels. By providing parking to ERVs, ride-sourcing demand increases, charging demand reduces, profit and social welfare increase.
Article
Ride-hailing is a sustainable transportation paradigm where riders access door-to-door traveling services through a mobile phone application, which has attracted a colossal amount of usage. There are two major planning tasks in a ride-hailing system: 1) matching, i.e., assigning available vehicles to pick up the riders; and 2) repositioning, i.e., proactively relocating vehicles to certain locations to balance the supply and demand of ride-hailing services. Recently, many studies of ride-hailing planning that leverage machine learning techniques have emerged. In this article, we present a comprehensive overview on latest developments of machine learning-based ride-hailing planning. To offer a clear and structured review, we introduce a taxonomy into which we carefully fit the different categories of related works according to the types of their planning tasks and solution schemes, which include collective matching, distributed matching, collective repositioning, distributed repositioning, and joint matching and repositioning. We further shed light on many real-world data sets and simulators that are indispensable for empirical studies on machine learning-based ride-hailing planning strategies. At last, we propose several promising research directions for this rapidly growing research and practical field.
Article
Dynamic speed guidance for vehicles in on-ramp merging zones is instrumental in alleviating traffic congestion on urban expressways. To enhance compliance with recommended speeds, the development of a dynamic speed-guidance mechanism that accounts for heterogeneity in human driving styles is pivotal. Utilizing intelligent connected technologies that provide real-time vehicular data in these merging locales, this study proposes such a guidance system. Initially, we integrate a multi-agent consensus algorithm into a multi-vehicle framework operating on both the mainline and the ramp, thereby facilitating harmonized speed and spacing strategies. Subsequently, we conduct an analysis of the behavioral traits inherent to drivers of varied styles to refine speed planning in a more efficient and reliable manner. Lastly, we investigate a closed-loop feedback approach for speed guidance that incorporates the driver’s execution rate, thereby enabling dynamic recalibration of advised speeds and ensuring fluid vehicular integration into the mainline. Empirical results substantiate that a dynamic speed guidance system incorporating driving styles offers effective support for human drivers in seamless mainline merging.
Article
Full-text available
Dynamic demand prediction is a key issue in ride-hailing dispatching. Many methods have been developed to improve demand prediction accuracy for the growing number of demand-responsive ride-hailing transport services. However, the uncertainties in predicting ride-hailing demands due to multiscale spatiotemporal granularity, as well as the resulting statistical errors, are seldom explored. This paper attempts to fill this gap and to examine the effects of spatiotemporal granularity on ride-hailing demand prediction accuracy by using empirical data for Chengdu, China. A convolutional long short-term memory model combined with a hexagonal convolution operation (H-ConvLSTM) is proposed to explore the complex spatial and temporal relations. Experimental results show that the proposed approach outperforms conventional methods in terms of prediction accuracy. A comparison of 36 spatiotemporal granularities with both departure demands and arrival demands shows that the combination of a hexagonal spatial partition with an 800 m side length and a 30 min time interval achieves the best comprehensive prediction accuracy. However, the departure demands and arrival demands reveal different variation trends in the prediction errors for various spatiotemporal granularities.
Accurate short-term prediction of ride-hailing travel demand can promote the optimal dispatching of vehicles in space and time, which is crucial to the sustainable development of such dynamic demand-responsive services. Sparse demands are always ignored in previous models, and the uncertainties in the spatiotemporal distribution of the predictions induced by setting subjective thresholds are rarely explored. This paper attempts to fill this gap and examine the effect of spatiotemporal sparsity on ride-hailing travel demand prediction by using Didi Chuxing order data recorded in Chengdu, China. To obtain the spatiotemporal characteristics of the travel demand, three hexagon-based deep learning models (H-CNN-LSTM, H-CNN-GRU, and H-ConvLSTM) are compared by setting various threshold values. The results show that the H-ConvLSTM model has better prediction performance than the others due to its ability to simultaneously capture spatiotemporal features, especially in areas with a high proportion of sparse demands. We found that increasing the minimum demand threshold to delete more sparse data improves the overall prediction accuracy to a certain extent, but the spatiotemporal coverage of the data is also significantly reduced. The results of this study could guide traffic operations in providing better travel services for different regions.
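The threshold/coverage trade-off described in this abstract can be illustrated with a minimal sketch. The demand matrix, the function names, and the threshold values below are all hypothetical and chosen only for illustration; they are not taken from the paper or from the Didi Chuxing data.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: DEMAND[t][p] is the number of orders observed
# in time slot t and spatial partition p (not real Didi Chuxing data).
DEMAND: List[List[int]] = [
    [0, 1, 12, 30],
    [0, 0, 9, 25],
    [1, 2, 15, 40],
    [0, 3, 7, 35],
]


def filter_by_threshold(
    demand: List[List[int]], min_demand: int
) -> Dict[Tuple[int, int], int]:
    """Keep only the (time slot, partition) cells with demand >= min_demand."""
    return {
        (t, p): value
        for t, row in enumerate(demand)
        for p, value in enumerate(row)
        if value >= min_demand
    }


def coverage(demand: List[List[int]], min_demand: int) -> float:
    """Fraction of all spatiotemporal cells that survive the threshold."""
    total_cells = sum(len(row) for row in demand)
    return len(filter_by_threshold(demand, min_demand)) / total_cells


if __name__ == "__main__":
    # Raising the minimum demand threshold removes more sparse cells,
    # so the spatiotemporal coverage can only stay equal or shrink.
    for threshold in (0, 1, 5, 10):
        print(f"threshold={threshold:2d}  coverage={coverage(DEMAND, threshold):.3f}")
```

On this toy matrix the coverage falls from 1.0 to 0.375 as the threshold rises from 0 to 10, which mirrors the qualitative effect the abstract reports without reproducing any of its numbers.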
With the rapid growth of the mobility-on-demand (MoD) market in recent years, ride-hailing companies have become an important element of the urban mobility system. There are two critical components in the operations of ride-hailing companies: driver-customer matching and vehicle rebalancing. In most previous literature, each component is considered separately, and the performance of vehicle rebalancing models relies on the accuracy of future demand predictions. To better immunize rebalancing decisions against demand uncertainty, a novel approach, the matching-integrated vehicle rebalancing (MIVR) model, is proposed in this paper to incorporate driver-customer matching into vehicle rebalancing problems to produce better rebalancing strategies. The MIVR model treats the driver-customer matching component at an aggregate level and minimizes a generalized cost including the total vehicle miles traveled (VMT) and the number of unsatisfied requests. For further protection against uncertainty, robust optimization (RO) techniques are introduced to construct a robust version of the MIVR model. Problem-specific uncertainty sets are designed for the robust MIVR model. The proposed MIVR model is tested against two benchmark vehicle rebalancing models using real ride-hailing demand and travel time data from New York City (NYC). The MIVR model is shown to perform better than the benchmark models under most scenarios by reducing customer wait times. In addition, the robust MIVR model produces better solutions than the non-robust (nominal) MIVR model by planning for demand uncertainty.
Can dynamic ride-sharing reduce traffic congestion? In this paper we show that the answer is yes if the trip density is high, which is usually the case in large-scale networks but not in medium-scale networks, where opportunities for sharing in time and space become rather limited. When the demand density is high, the dynamic ride-sharing system can significantly improve traffic conditions, especially during peak hours. Sharing can compensate for the extra travel distances related to operating a mobility service. The situation is entirely different in small and medium-scale cities, where trip shareability is small, even if the ride-sharing system is fully optimized based on perfect demand prediction in the near future. The reason is simple: mobility services significantly increase the total travel distance, and sharing is simply a means of combating this trend without eliminating it when the trip density is not high enough. This paper proposes a complete framework to represent the functioning of the ride-sharing system and multiple steps to tackle the curse of dimensionality when solving the problem. We address the problem for two city scales in order to compare different trip densities: a medium-scale network of 25 km² with a total market of 11,235 shareable trips, and a large-scale network of 80 km² with a demand of 205,308 for service vehicles, each over a 4-hour period with a rolling horizon of 20 minutes. The solutions are assessed using a dynamic trip-based macroscopic simulation to account for the congestion effect and dynamic travel times that may influence the optimal solution obtained with predicted travel times. This outperforms most previous studies on optimal fleet management, which usually consider constant and fully deterministic travel time functions.
Ride-sourcing services are increasingly popular because of their ability to accommodate on-demand travel needs. A critical issue faced by ride-sourcing platforms is the supply-demand imbalance, as a result of which drivers may spend substantial time on idle cruising and picking up remote passengers. Some platforms attempt to mitigate the imbalance by providing relocation guidance for idle drivers who may have their own self-relocation strategies and decline to follow the suggestions. Platforms then seek to induce drivers to system-desirable locations by offering them subsidies. This paper proposes a mean-field Markov decision process (MF-MDP) model to depict the dynamics in ride-sourcing markets with mixed agents, whereby the platform aims to optimize some objectives from a system perspective using spatial-temporal subsidies with predefined subsidy rates, and a number of drivers aim to maximize their individual income by following certain self-relocation strategies. To solve the model more efficiently, we further develop a representative-agent reinforcement learning algorithm that uses a representative driver to model the decision-making process of multiple drivers. This approach is shown to achieve significant computational advantages, faster convergence, and better performance. Using case studies, we demonstrate that by providing some spatial-temporal subsidies, the platform is able to well balance a short-term objective of maximizing immediate revenue and a long-term objective of maximizing service rate, while drivers can earn higher income.
Building energy systems work under a wide range of operation conditions. The data available for some conditions may be far scarcer than the data for others. This is the so-called data imbalance problem, that is, the volumes of data differ across conditions. This problem is always ignored in the field of building energy load prediction. Three questions remain unclear: how to identify various building operation conditions, how this problem affects the prediction accuracy, and how to overcome this problem. To address these three questions, this study first proposes a clustering decision tree algorithm to identify the building operation conditions. Then, the effects of data imbalance are investigated by changing the proportions of model training samples from various operation conditions. Finally, a clustering decision tree-based multi-model prediction method is proposed to solve the data imbalance problem. One year of historical operational data from a public building is used to validate the multi-model method. The results show that the proposed method has better prediction performance than the conventional single model-based method. It decreases the mean absolute errors of energy load prediction using artificial neural networks, gradient boosting trees, random forests, and support vector regression by 9.83%, 6.71%, 1.32%, and 12.22% on average, respectively. In addition, it increases the coefficients of determination of energy load prediction using the four algorithms by 8.47%, 4.59%, 0.26%, and 13.99% on average, respectively.
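The two metrics quoted in this abstract, the mean absolute error and the coefficient of determination, follow standard definitions. The sketch below shows how they are commonly computed; the function names and the sample load values are hypothetical and are not taken from the study.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observed values are constant")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical energy-load observations and predictions (arbitrary units).
    observed = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 33.0, 39.0]
    print("MAE:", mean_absolute_error(observed, predicted))
    print("R^2:", r_squared(observed, predicted))
```

A lower mean absolute error and a higher coefficient of determination both indicate a better fit, which is the direction of improvement the abstract reports for the multi-model method.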
This paper explores whether upper bound guarantees to detour distances can be introduced in ride sharing services. By ride sharing we mean taxi ride aggregation services such as Uber-Pool. The paper develops an analytical model that for a given demand relates the guarantee levels to (i) the percent of rides that can be matched, (ii) the expected vehicle distance traveled; (iii) the expected passenger distance traveled; (iv) the fleet size required, and (v) the average passenger trip time including waiting and riding. The formulas developed reveal that for the full range of feasible fleet sizes, ridesharing with detour distance guarantees outperforms both ordinary ride-sharing and ordinary taxi. This suggests that there is a business opportunity that is not currently being exploited.
On-demand ride services reshape urban transportation systems, human mobility, and travelers' mode choice behavior. Compared to the traditional street-hailing taxi, an on-demand ride services platform analyzes ride requests of passengers and coordinates real-time supply and demand with dynamic operational strategies in the ride-sourcing market. To test the impact of dynamic optimization strategies on the ride-sourcing market, this paper proposes a dynamic vacant car-passenger meeting model. In this model, the accumulative arrival rate and departure rate of passengers and vacant cars determine the waiting number of passengers and vacant cars, while the waiting number of passengers and vacant cars in turn influences the meeting rate (which equals the departure rate of both passengers and vacant cars). The departure rate is the rate at which passengers and vacant cars match up and start a paid trip. Compared with classic equilibrium models, this model can be used to characterize the influence of short-term variances and disturbances of current demand and supply (i.e., arrival rates of passengers and vacant cars) on the waiting numbers of passengers and vacant cars. Using the proposed meeting model, we optimize dynamic strategies under two objective functions, i.e., platform revenue maximization and social welfare maximization, while the driver's profit is guaranteed above a certain level. We also propose an algorithm based on approximate dynamic programming (ADP) to solve the sequential dynamic optimization problem. The results show that our algorithm can effectively improve the objective function of the multi-period problem, compared with the myopic algorithm. A broader range of surge pricing and commission rates and the introduction of incentives help to achieve better optimization results. The dynamic optimization strategies help the on-demand ride services platform efficiently adjust supply and demand resources and achieve specific optimization goals.
Understanding the patterns of automobile travel demand can help formulate policies to alleviate congestion and pollution. This study focuses on the influence of land use and household properties on automobile travel demand. Car license plate recognition (CLPR) data, point-of-interest (POI) data, and housing information data were utilized to obtain automobile travel demand along with the land use and household properties. A geographically and temporally weighted regression (GTWR) model was adopted to deal with both the spatial and temporal heterogeneity of travel demand. The spatial-temporal patterns of GTWR coefficients were analyzed. Also, comparative analyses were carried out between automobile and total person travel demand, and among travel demand of taxis, heavily-used private cars, and total automobiles. The results show that: (I) The GTWR model has significantly higher accuracy compared with the Ordinary Least Squares (OLS) model and the Geographically Weighted Regression (GWR) model, which means the GTWR model can measure both the spatial and temporal heterogeneity with high precision; (II) The influence of built environment and household properties on automobile travel demand varies with space and time. In particular, the temporal distribution of regression coefficients shows a significant peak phenomenon; and (III) Comparative analyses indicate that residents’ preference for automobiles over other travel modes varies with their travel purpose and destination. The above findings indicate that the proposed method can not only model spatial-temporal heterogeneous travel demand, but also provide a way to analyze the patterns of automobile travel demand.
Short-term traffic forecasting is important for the development of an intelligent traffic management system. Critical to the performance of the traffic prediction model utilized in such a system is accurate representation of the spatiotemporal traffic characteristics. This can be achieved by integrating spatiotemporal traffic information or the dynamic traffic characteristics in the modeling process. The currently employed spatiotemporal k-nearest neighbor (STKNN) model is based on the spatial heterogeneity and adaptive spatiotemporal parameters of the traffic to improve the prediction accuracy. However, the non-stationary characteristics of the traffic cannot be fully represented by simply modeling the entire time range or all the time partitions based on experience. We therefore developed a dynamic STKNN model (D-STKNN) for short-term traffic forecasting based on the non-stationary spatiotemporal pattern of the road traffic. The different traffic patterns along the road are first automatically determined using an affinity propagation clustering algorithm. The Warped K-Means algorithm is then used to automatically partition the time periods for each traffic pattern. Finally, the D-STKNN model is developed based on the three-dimensional spatiotemporal tensor data models for the different road segments with different traffic patterns during different time periods. The D-STKNN model was verified through extensive experiments performed using actual vehicular speed datasets collected from city roads in Beijing, China, and expressways in California, U.S.A. The proposed model outperforms seven existing baselines in different time periods under different traffic patterns. The results confirm the importance of considering the non-stationary spatiotemporal traffic pattern in developing a model for short-term traffic prediction.
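For readers unfamiliar with the k-nearest neighbor idea underlying the STKNN family, the sketch below shows a plain, single-series k-NN forecaster. It is not the STKNN or D-STKNN model: it has no spatial component, no clustering, and no adaptive parameters. The function name `knn_forecast`, the window length, the value of k, and the speed series are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def knn_forecast(history: Sequence[float], window: int, k: int) -> float:
    """Predict the next value of a series with plain k-nearest neighbors.

    The latest `window` observations form the query pattern. Every earlier
    window with a known successor is a candidate; the forecast is the mean
    successor of the k candidates closest to the query (Euclidean distance).
    """
    if window < 1 or k < 1:
        raise ValueError("window and k must be positive")
    if len(history) < window + 1:
        raise ValueError("history is too short for the chosen window")

    query = history[-window:]
    candidates: List[Tuple[float, float]] = []
    for start in range(len(history) - window):
        past = history[start:start + window]
        distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(past, query)))
        successor = history[start + window]
        candidates.append((distance, successor))

    # Stable sort: ties keep chronological order.
    candidates.sort(key=lambda item: item[0])
    nearest = candidates[:k]
    return sum(successor for _, successor in nearest) / len(nearest)


if __name__ == "__main__":
    # Hypothetical speed readings (km/h) with a repeating pattern.
    speeds = [60, 58, 40, 35, 59, 57, 41, 36, 60, 58]
    print(knn_forecast(speeds, window=2, k=2))
```

Matching on a single fixed window over the whole history is the kind of simplification the abstract argues is insufficient for non-stationary traffic, which is what motivates partitioning by traffic pattern and time period.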