H-ConvLSTM-based bagging learning approach for ride-hailing demand prediction considering imbalance problems and sparse uncertainty

July 2022
Transportation Research Part C Emerging Technologies 140(3):103709

DOI:10.1016/j.trc.2022.103709

Authors:

Zhiju Chen

Dalian University of Technology

Kai Liu

Dalian University of Technology

Jiangbo Wang

Dalian University of Technology

Toshiyuki Yamamoto

Nagoya University

Manuscript submitted to Transportation Research part C Chen et al.

H-ConvLSTM-based bagging learning approach for ride-hailing demand

prediction considering imbalance problems and sparse uncertainty

Zhiju Chen 1, Kai Liu 1*, Jiangbo Wang 1, Toshiyuki Yamamoto 2

1 School of Transportation and Logistics, Dalian University of Technology, Dalian, 116024, China.

E-mail: chenzhiju@mail.dlut.edu.cn; liukai@dlut.edu.cn; Jiangbo_Wang@dlut.edu.cn

2 Institute of Materials and Systems for Sustainability, Nagoya University, Nagoya, 464-8603, Japan.

E-mail: yamamoto@civil.nagoya-u.ac.jp

Abstract

The problem of learning from imbalanced ride-hailing demand data with

spatiotemporal heterogeneity and highly skewed demand distributions is a relatively

new challenge. Current prediction methods usually filter out some spatiotemporal

partitions with sparse demands by setting a minimum ride-hailing demand threshold,

where the dataset is always assumed to be well balanced in terms of its spatiotemporal

partitions, with equal misprediction costs. However, this widely used assumption

results in large prediction biases. To achieve better prediction performance, we propose

a bagging learning approach based on hexagonal convolutional long short-term

memory (H-ConvLSTM), which combines three components. 1) By setting multiple

minimum ride-hailing demand thresholds, several subdatasets with different majority

ride-hailing demand prediction ranges are obtained. The H-ConvLSTM regression

model is applied to each undersampled dataset to train multiple submodels with their

respective biased ride-hailing demand prediction ranges. 2) The H-ConvLSTM

classification model is trained on the total ride-hailing demand dataset to predict the

potential demand range for a certain partition at a future time. 3) The submodel with

the best performance with respect to the potential demand range is selected to predict

the future demand for this partition. Experiments are conducted on order data obtained from

Didi Chuxing in Chengdu, China. The results show that the proposed

approach achieves significantly improved prediction performance relative to that of

other models.

Keywords: ride-hailing demand prediction, sparse uncertainty, hexagonal convolutional

long short-term memory (H-ConvLSTM), bagging learning

1. Introduction

Internet-based ride-hailing services, which connect drivers and passengers in

real time, have attracted much interest as travel options for residents in recent years

(Vazifeh et al., 2018, Xu et al., 2020). Compared to a traditional taxi service, with a

ride-hailing service, passengers can book orders online in advance through a mobile

app instead of standing on the side of the road and spending time waiting for a taxi to

arrive; this improves the mobility of vehicles and the service level of travel (Alonso-

Mora et al., 2017). With the collection and analysis of large amounts of user order data

and vehicle trajectory data, ride-hailing services are constantly updating and evolving

(Alisoltani et al., 2021), thereby becoming a disruptive force to the traditional

transportation industry (Wang and Yang, 2019). Accurate short-term passenger demand

prediction is the basis for improving the operating efficiency of internet-based ride-

hailing platforms, which plays a crucial role in formulating regulation strategies and

improving the balance between supply and demand.

Human travel behavior has a high degree of temporal and spatial regularity

(González et al., 2008), and many contributions have focused on capturing the

temporal and spatial correlations of ride-hailing demand (Ke et al., 2019). Such

methods usually grid an urban space and predict the demands of future time intervals

through historical spatiotemporal demands (Wu et al., 2018; H. Yu et al., 2017). In

general, ride-hailing demand also exhibits complex spatiotemporal heterogeneity (Shen

et al., 2020). In the temporal aspect, ride-hailing demand during rush hour is higher

than that during off-peak periods, and demand during the day is higher than that

at night. Spatially, with increasing distance from the city center, ride-hailing demand

gradually becomes sparse. When prediction focuses on high demand in a central urban

area, sparse demand has little influence on the training of the utilized model, and it is

therefore difficult to achieve good prediction performance for sparse-demand partitions. As the improvement of

transport infrastructure often lags behind urban development and expansion, the sparse

demand for suburban long-distance commuter travel also deserves fair attention.

Therefore, a data imbalance problem is present in the observed ride-hailing demand.

Different spatiotemporal scale divisions determined based on experience further

aggravate the uncertainties of such a highly skewed ride-hailing demand distribution,

which is aggregated at various spatiotemporal granularities. To address the supply-

demand imbalance issue, ride-sourcing platforms attempt to provide relocation

guidance for idling drivers (Chen et al., 2020; Zhu et al., 2021). However, near-future

spatiotemporal supply gap area prediction remains an unanswered question (Daganzo

et al., 2020; Guo et al., 2021).

A gap remains in terms of predicting the highly uncertain demands of sparse

areas, where the supply-demand imbalance is severe and requires dispatching

in advance. Previous studies usually deleted spatiotemporal partitions with

sparse demand from all recorded data by setting a minimum ride-hailing demand

threshold, which helped to alleviate the problem of data imbalance. An increase in the

level of the minimum ride-hailing demand threshold significantly reduces the spatial

coverage of research and changes the sparsity of data, as shown in Fig. 1. However, the

imbalance of the reduced data leads to worse demand prediction, as shown in Fig. 2.

As the demand per partition increases, the corresponding data size decreases.

For a given small threshold, adjacent ranges that are larger than the threshold tend to

yield better prediction due to their higher data size distributions. The challenge is how

to improve the prediction results for demand-sparse partitions with ranges smaller than

the minimum threshold.

Fig. 1 Spatial coverages under different minimum ride-hailing demand thresholds.

(a) Data size distribution under different demand levels

(b) Prediction errors induced with different demand levels

Fig. 2 The influence of the minimum threshold on ride-hailing demand prediction.

This paper takes a step toward closing this gap. Due to the use of different

minimum ride-hailing demand threshold settings, the corresponding datasets have

different right-adjacent optimal prediction ranges. A hexagonal convolutional long

short-term memory (H-ConvLSTM)-based bagging learning approach is proposed to

integrate the bias preferences of H-ConvLSTM models at different data sparsity levels.

The results are helpful for providing suggestions regarding the optimal deployment of

ride-hailing services, reducing driver operating costs, and improving the travel quality

of residents. The main contributions of this paper are summarized as follows.

⚫ We propose an H-ConvLSTM regression model to compare and analyze the ride-

hailing demand prediction performances achieved under different minimum ride-

hailing demand threshold settings.

⚫ An H-ConvLSTM-based bagging learning approach is further proposed to

integrate the bias prediction preferences of each H-ConvLSTM regression model

trained at different data sparsity levels.

⚫ An experimental analysis conducted on the order data obtained from Didi Chuxing

in Chengdu city over one month shows that the proposed approach can achieve

improved prediction performance on the total dataset.

The rest of this paper is organized as follows. Section 2 is a literature review of

ride-hailing demand prediction and the data imbalance problem. Section 3 describes the

main structural framework of the developed prediction models. Section 4 presents the

experimental results, followed by the conclusions in Section 5.

2. Literature review

The use of historical travel records to predict future ride-hailing demand is

helpful for assisting online ride-hailing platforms in carrying out dynamic operation

strategies and optimizing the balance between supply and demand. In this section, we

discuss traditional and existing travel prediction approaches, the advantages of

hexagonal partitioning, and the related work that deals with sparse demand data.

2.1 Travel demand prediction approaches

The most common travel prediction method is a time series model, such as an

autoregressive integrated moving average (ARIMA) model and its various improved

versions (Kaltenbrunner et al., 2010; Min and Wynter, 2011). Machine learning models

and statistical models such as neural network models (Zheng et al., 2006), Bayesian

network models, Kalman filtering models, and least absolute shrinkage and selection

operator (LASSO) models have also been proposed to solve various prediction

problems related to travel demand. Jiang et al. (2014) integrated ensemble empirical

mode decomposition (EEMD) and a gray support vector machine (GSVM) into a

mixed-demand prediction model for high-speed railways. Ma et al. (2014) proposed an

interactive multiple model-based pattern hybrid (IMMPH) approach to predict short-

term passenger demand, and this approach maximizes the effective information by

assembling the knowledge obtained from pattern models. Davis et al. (2016) proposed

a multilayer clustering technique that utilizes the correlation between adjacent

geographic hashes to reduce prediction errors. Zhu et al. (2019) integrated the joint

probability distribution of traffic flows at nearby locations into a time series traffic

speed prediction model. Although these models have achieved improved prediction

performance through continuous improvement, they still struggle to capture complex

temporal and spatial correlations.

The great advantages of deep learning in terms of computing power and

characterizing big data enable its wide application to travel prediction (Jo et al., 2019;

Yuan et al., 2019). By approximating the grid of an urban space into image pixels, a

convolutional neural network (CNN) can effectively identify the spatial correlations

among the grid data. Zhang et al. (2016) applied a CNN to a deep spatiotemporal

prediction model to predict travel flows in real time. Both LSTM and gated recurrent

units (GRUs) have good performance with respect to capturing complex time-

sequential interactions. Therefore, combinations of these models seem to have better

performance in dealing with complex temporal and spatial correlations. H. Yu et al.

(2017) combined a CNN and LSTM to obtain spatial and temporal features for the

prediction of traffic speed. Shi et al. (2015) applied ConvLSTM to address precipitation

nowcasting. As an improved form of an LSTM model, ConvLSTM employs

convolutional structures in both the input-to-state and state-to-state transitions to reduce

the loss of spatiotemporal topology data. In the field of transportation, ConvLSTM has

also been applied to solve prediction problems such as travel speed and ride-hailing

demand and has achieved good prediction performance (Ke et al., 2017; Wang et al.,

2018; Yang et al., 2018). However, these models, which are based on square partitions,

are often difficult to directly apply to hexagonal networks.

2.2 The advantages of hexagonal partitioning

Compared with a square, a hexagon is closer to a circle, and its distribution is

symmetric and equivalent (Birch et al., 2000). Therefore, travel demands with similar

spatiotemporal characteristics are more easily aggregated, and the flows of vehicles

between partitions are more accurately characterized. In addition, in a square partition

space, the partition distance transformed from the same actual distance is much larger

in the oblique direction than in the vertical and horizontal directions. The better isotropy

of a hexagon partition enables it to better express the spatial proximity between

partitions during the calculation process. Based on these advantages, hexagonal

partitioning has been widely used in regional and urban science research. Shoman et al.

(2019) performed a comparative analysis between hexagonal partitions, triangles, and

squares and found that hexagonal partitions can better reduce the area errors of urban

fabric. Csiszár et al. (2019) applied the hexagonal partition method to an evaluation of

charging station configurations in urban areas to further optimize the distribution of

charging stations. To the best of our knowledge, Ke et al. (2019) were the first to

propose a successful hexagon-based deep learning model for travel demand prediction;

they also discussed the advantages of the hexagonal partition approach mentioned

above in detail. However, hexagonal data must be mapped to a matrix before executing

feedforward propagation calculations, which destroys the spatial position relationships

between the hexagonal partitions. The HexagDLy framework proposed by Steppa and

Holch (2019) subtly solved this problem; however, it has difficulty grasping the

complex time correlations in time series data.

2.3 Addressing sparse demand data

The highly skewed spatial and temporal distributions of ride-hailing demand

lead to severe demand imbalances among spatiotemporal partitions. As a result, the

demand information in minority spatiotemporal partitions is overwhelmed by that in

majority spatiotemporal partitions. The different settings of a minimum ride-hailing

demand threshold make the corresponding datasets have certain sparse distribution

characteristics. The levels for these sparse demands are often difficult to accurately

predict. The most common approaches for solving this problem include data-level

methods, algorithm-level methods, and hybrid methods that combine the advantages of

the other two types of techniques (Krawczyk, 2016). Data-level methods aim to change

the input training set to fit a standard learning algorithm. To achieve a balanced data

distribution, previous studies usually increased the number of samples in minority ranges (the

classes in a classification task or the target values in a regression task that

have the lowest data sizes in the dataset) by oversampling (Chawla et al., 2002;

Vluymans, 2019) or decreased the number of samples in majority ranges (the classes in

a classification task or the target values in a regression task that have the highest data

sizes in the dataset) by undersampling (Ha and Lee, 2016; Lin et al., 2017). Moniz et

al. (2017) combined resampling methods with standard regression models (such as

SVMs) to achieve improved prediction accuracy for imbalanced time series. Zhang et

al. (2021) proposed a clustering decision tree-based multimodel prediction method to

solve the data imbalance problem in building energy load prediction. Cheng et al. (2020)

developed a dynamic spatiotemporal k-nearest neighbor (D-STKNN) model to identify

heterogeneous travel patterns in different temporal and spatial units, which were further

considered for conducting short-term travel speed prediction to improve the prediction

accuracy of the model. However, little effort has been directed toward solving the data

imbalance problem while capturing the complex spatiotemporal correlations of ride-

hailing demands with sparse uncertainties.

Although the mathematical structures of prediction models differ

significantly, the training objective of both statistical models and machine learning models

is always the same: minimizing their total/mean prediction errors (loss function) on the

observed or training dataset. The utilized evaluation indices (such as the symmetric

mean absolute percentage error (SMAPE) and root mean square error (RMSE)), guided

by global prediction performance, are often biased toward the majority ranges of ride-

hailing demand (Japkowicz and Stephen, 2002). The minority ranges of partitioned

ride-hailing demand induce high costs when the demand is not well-predicted. Previous

studies related to ride-hailing demand prediction usually filtered out large amounts of

spatiotemporal units with sparse demand by setting a minimum ride-hailing demand

threshold (Ke et al., 2017). Then, the dataset was always assumed to have well-balanced

spatiotemporal partitions with equal misprediction costs. However, this

assumption results in great bias in the prediction results due to the spatiotemporal data

imbalance problem. Therefore, more attention should be given to designing appropriate

prediction algorithms for imbalanced ride-hailing demand data and to ensuring good

prediction performance in different spatial and temporal locations.

In this paper, we integrate the bias preferences of a standard prediction model

with multiple majority ranges of ride-hailing demand to improve the total prediction

accuracy. A hexagon is chosen as the basic spatiotemporal partition to facilitate the

aggregation of ride-hailing demands with similar characteristics. Previous studies (Ke

et al., 2019; Huang et al., 2019) usually focused their research area on limited ranges

by setting minimum ride-hailing demand thresholds, as this is a common data

processing method. Different minimum ride-hailing demand threshold settings cause

the corresponding datasets to have their own majority ride-hailing demand ranges,

leading to an imbalanced data problem with uncertain sparsities in ride-hailing demand

prediction. Therefore, H-ConvLSTM is proposed as a submodel to compare the

prediction performances achieved with different threshold settings, in which hexagonal

convolution kernels are applied to directly conduct convolution calculations on

hexagonal partitions. In addition, an H-ConvLSTM-based bagging learning approach

is further proposed to integrate the optimal prediction ranges of the submodel at

different data sampling degrees.

3. Methodology

Fig. 3 shows the architecture of the proposed H-ConvLSTM-based bagging

learning approach for ride-hailing demand prediction. The architecture is composed of

three parts. First, several undersampled datasets are established from all ride-

hailing order data by setting multiple minimum ride-hailing demand thresholds. Second, an H-

ConvLSTM regression model is established, and the corresponding predictive

submodels are trained on each subtraining dataset. Finally, a bagging strategy is

developed to integrate the bias preferences of each submodel.

Fig. 3. The architecture of the H-ConvLSTM-based bagging learning approach.

3.1 Preliminary

In this section, a city is divided into uniform hexagonal partitions, and a day is

divided into uniform time intervals to aggregate the ride-hailing orders of different areas.

Therefore, the ride-hailing demand $y_i^t$ can be defined as the number of ride-hailing

orders issued in hexagon partition $i$ during time interval $t$.
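As a minimal sketch of this definition, the following Python snippet counts orders per partition and time interval. The record layout, the `partition_id` values, and the helper names are illustrative assumptions; the 30-minute interval matches the experimental setup, and the assignment of an order to a hexagon is assumed to have been done upstream.

```python
from collections import Counter
from datetime import datetime

INTERVAL_MINUTES = 30  # interval length used for illustration


def interval_index(ts: datetime, interval_minutes: int = INTERVAL_MINUTES) -> int:
    """Index of the time interval within the day that contains `ts`."""
    return (ts.hour * 60 + ts.minute) // interval_minutes


def aggregate_demand(orders):
    """Count orders per (partition_id, date, interval index) key.

    `orders` is an iterable of (partition_id, start_time) pairs; the
    partition_id is assumed to come from a spatial join done beforehand.
    """
    demand = Counter()
    for partition_id, start_time in orders:
        key = (partition_id, start_time.date(), interval_index(start_time))
        demand[key] += 1
    return demand


# Toy order records (hypothetical values).
orders = [
    (7, datetime(2016, 11, 1, 8, 5)),
    (7, datetime(2016, 11, 1, 8, 29)),
    (7, datetime(2016, 11, 1, 8, 31)),
    (9, datetime(2016, 11, 1, 8, 10)),
]
demand = aggregate_demand(orders)
```

Partitions and intervals that never appear in the order records have an implicit demand of zero, which is where the sparsity discussed in this paper originates.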

Due to the presence of significant spatiotemporal correlations, the historical

ride-hailing demand features of the two-layer local adjacent maps

centralized at hexagon $i$ over several previous time intervals, as shown in Fig. 4, are selected to jointly predict the ride-

hailing demand of target partition $i$ for future time intervals.

Fig. 4. The two-layer local adjacent map of $y_i^t$.
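One way to enumerate the partitions of a two-layer local adjacent map is to index hexagons with axial coordinates. This indexing is an assumption of the sketch, not necessarily the scheme used in the paper. Under that convention, the map contains the center, its 6 first-layer neighbors, and 12 second-layer neighbors.

```python
def hex_distance(q: int, r: int) -> int:
    """Hexagonal grid distance from the origin in axial coordinates."""
    return (abs(q) + abs(r) + abs(q + r)) // 2


def local_adjacent_map(center, layers: int = 2):
    """Return the axial coordinates of all hexagons within `layers` rings
    of `center`, grouped by ring index (0 is the center itself)."""
    cq, cr = center
    rings = {k: [] for k in range(layers + 1)}
    for dq in range(-layers, layers + 1):
        for dr in range(-layers, layers + 1):
            d = hex_distance(dq, dr)
            if d <= layers:
                rings[d].append((cq + dq, cr + dr))
    return rings


# Hypothetical center partition at axial coordinates (10, 12).
rings = local_adjacent_map((10, 12))
```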

3.2 H-ConvLSTM regression model

As an improved form of the LSTM model, ConvLSTM has convolutional

structures in both the input-to-state and state-to-state transitions and has good

performance in terms of simultaneously capturing temporal and spatial features. The

key to ConvLSTM involves the cell states $C_t$, which memorize and cycle information

through gate structures that consist of forget gates $f_t$, input gates $i_t$ and output gates

$o_t$. To capture spatial dependencies, the historical cell states $C_1, \ldots, C_{t-1}$, input

states $X_1, \ldots, X_t$, hidden states $H_1, \ldots, H_{t-1}$ and other gates of ConvLSTM

are 3D tensors whose last two dimensions are rows and columns of spatial information.

The forget gate layer $f_t$ determines what information we discard from cell state $C_{t-1}$.

The input gate layer $i_t$ determines what information to input and updates the old cell

state $C_{t-1}$ to $C_t$ through a tanh layer. Then, parts of the cell state $C_t$ determined by

the output gate layer $o_t$ are exported as the memorized hidden state $H_t$.

To incorporate the advantages of hexagonal partitioning, we propose an H-

ConvLSTM regression model to capture the spatiotemporal characteristics of ride-

hailing demand, as shown in Fig. 5. H-ConvLSTM directly adopts hexagonal

convolution calculations during feedforward propagation. Following previous research

(Steppa and Holch, 2019), we apply a hexagonal convolution kernel to extract the

spatial and temporal features of the two-layer local adjacency map, as shown in Fig. 6.

The specific functional relationships of H-ConvLSTM are as follows:

$$f_t = \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + b_f\right) \qquad (1)$$

$$i_t = \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + b_i\right) \qquad (2)$$

$$\tilde{C}_t = \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \qquad (3)$$

$$C_t = f_t \circ C_{t-1} + i_t \circ \tilde{C}_t \qquad (4)$$

$$o_t = \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + b_o\right) \qquad (5)$$

$$H_t = o_t \circ \tanh\left(C_t\right) \qquad (6)$$

where * denotes the hexagonal convolution operator and $\circ$ denotes the Hadamard

operator. $W_{xf}, W_{hf}, W_{xi}, W_{hi}, W_{xc}, W_{hc}, W_{xo}, W_{ho}, b_f, b_i, b_c, b_o$ denote the trainable

parameters, and $\tilde{C}_t$ is the candidate cell state produced by the tanh layer. $\sigma$ and $\tanh$ denote the sigmoid and hyperbolic tangent activation

functions, respectively. Following a series of fully connected layers, the ride-hailing

demand for location $i$ at the future time interval can be predicted.
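The gate arithmetic can be illustrated for a single partition with plain numbers. In this sketch the results of the hexagonal convolutions plus biases are assumed to be precomputed and passed in as pre-activations (hypothetical argument names). Only the elementwise part of a standard LSTM-style update is shown, and whether the paper's variant contains additional terms is not confirmed here.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_cell_step(pre_f, pre_i, pre_c, pre_o, c_prev):
    """One elementwise LSTM-style update for a single partition.

    pre_f, pre_i, pre_c, pre_o are the pre-activations of the forget gate,
    input gate, candidate state and output gate; c_prev is the old cell state.
    Returns the new cell state and the new hidden state.
    """
    f = sigmoid(pre_f)            # how much of the old cell state to keep
    i = sigmoid(pre_i)            # how much of the candidate to write
    c_tilde = math.tanh(pre_c)    # candidate cell state
    c = f * c_prev + i * c_tilde  # updated cell state
    o = sigmoid(pre_o)            # how much of the cell state to expose
    h = o * math.tanh(c)          # hidden state
    return c, h


# With all pre-activations at zero, every gate equals 0.5 and the candidate is 0.
c, h = lstm_cell_step(pre_f=0.0, pre_i=0.0, pre_c=0.0, pre_o=0.0, c_prev=2.0)
```

In the full model the same arithmetic is applied to every partition of the feature maps at once, with the pre-activations produced by hexagonal convolutions.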

Fig. 5. The architecture of the H-ConvLSTM regression model.

Fig. 6. Hexagonal convolution operation with a kernel size of 1.
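A kernel of size 1 covers a hexagon and its six immediate neighbors. The following sketch computes such a weighted sum on a dictionary keyed by axial coordinates and reads neighbors outside the grid as zero demand. The axial indexing, the dictionary layout, and the weight arguments are assumptions made for illustration and do not reproduce the HexagDLy implementation.

```python
# Offsets of the six neighbors of a hexagon in axial coordinates (q, r).
NEIGHBOR_OFFSETS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def hex_conv_size1(grid, center_weight, neighbor_weights, bias=0.0):
    """Apply a size-1 hexagonal kernel to every cell of `grid`.

    `grid` maps (q, r) -> value. Cells missing from the grid are read as 0.0,
    which mimics zero padding at the border, so the output has the same
    cells as the input.
    """
    out = {}
    for (q, r), value in grid.items():
        total = bias + center_weight * value
        for w, (dq, dr) in zip(neighbor_weights, NEIGHBOR_OFFSETS):
            total += w * grid.get((q + dq, r + dr), 0.0)
        out[(q, r)] = total
    return out


# A center cell with demand 4 surrounded by six cells with demand 1.
grid = {(0, 0): 4.0}
grid.update({offset: 1.0 for offset in NEIGHBOR_OFFSETS})
summed = hex_conv_size1(grid, center_weight=1.0, neighbor_weights=[1.0] * 6)
```

With unit weights the output at the center is the total demand of its seven-cell neighborhood, while a border cell only sums the neighbors that exist in the grid.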

3.3 Bagging strategy

Bootstrap aggregation, known as bagging, is one of the earliest ensemble

algorithms (Breiman, 1996). The bagging structure is shown in Fig. 7. The original

dataset is sampled n times according to a certain sampling strategy, and n subdatasets

are obtained. Then, n weak classification models are trained on these subdatasets, and the

final classification result is obtained by voting on the prediction results of each model.

This algorithm effectively improves the classification performance of weak classifiers,

especially when dealing with data imbalance problems.
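The generic procedure described above can be sketched in a few lines of Python. The weak learner is a one-feature threshold rule (a hypothetical stand-in chosen for brevity), bootstrap sampling with replacement is assumed as the sampling strategy, and the final label is a majority vote.

```python
import random
from collections import Counter


def train_stump(samples):
    """Weak classifier: threshold at the midpoint between the class means.

    Assumes class 1 tends to have larger feature values than class 0 and
    falls back to a constant prediction if one class is missing.
    """
    zeros = [x for x, y in samples if y == 0]
    ones = [x for x, y in samples if y == 1]
    if not zeros or not ones:
        label = 1 if ones else 0
        return lambda x: label
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x > threshold else 0


def bagging_fit(samples, n_models, rng):
    """Train `n_models` weak classifiers, each on its own bootstrap sample."""
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_stump(bootstrap))
    return models


def bagging_predict(models, x):
    """Majority vote over the predictions of all weak classifiers."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


rng = random.Random(0)
data = [(0.1, 0), (0.4, 0), (0.5, 0), (1.6, 1), (1.9, 1), (2.2, 1)]
models = bagging_fit(data, n_models=11, rng=rng)
```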

Fig. 7. Bagging structure.

The bagging strategy of the H-ConvLSTM-based bagging learning approach is

shown in Fig. 8, and it contains three parts. First, the trained submodels are used to

make predictions on the total training set, and the prediction error distribution of each trained

submodel is computed. Then, the optimal prediction range of each submodel in terms of

the demand value distribution is identified and labeled as a category. Finally, instead of

utilizing the traditional voting method, the H-ConvLSTM classification model is

trained on the total training set to predict the potential range of the demand level

for a certain location at a future time. The submodel with the best performance

regarding the range of potential demand levels is selected to predict the future demand

at this location.

Fig. 8. Bagging strategy of the H-ConvLSTM-based bagging learning approach.

The SMAPE and RMSE are selected as the prediction error evaluation indices,

and they are formulated as follows:

$$\mathrm{SMAPE} = \frac{1}{N}\sum_{i,t}\frac{\left|\hat{y}_i^t - y_i^t\right|}{\hat{y}_i^t + y_i^t + \varepsilon} \qquad (7)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i,t}\left(\hat{y}_i^t - y_i^t\right)^2} \qquad (8)$$

where $\hat{y}_i^t$ and $y_i^t$ are the predicted ride-hailing demands and true ride-hailing

demands, respectively, $N$ is the number of samples, and $\varepsilon$ is a very small value that prevents the denominator from

being 0.
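Both indices are straightforward to compute. The sketch below uses one common SMAPE form, in which the absolute error is divided by the sum of the predicted and true values plus a small constant. The exact normalization and the value of `eps` are assumptions, since several SMAPE variants exist.

```python
import math


def smape(y_pred, y_true, eps=1e-8):
    """Symmetric mean absolute percentage error with a small constant
    in the denominator to avoid division by zero."""
    terms = [abs(p - t) / (p + t + eps) for p, t in zip(y_pred, y_true)]
    return sum(terms) / len(terms)


def rmse(y_pred, y_true):
    """Root mean square error."""
    squared = [(p - t) ** 2 for p, t in zip(y_pred, y_true)]
    return math.sqrt(sum(squared) / len(squared))


# Toy values: only the second sample is mispredicted.
y_true = [0.0, 2.0, 10.0]
y_pred = [0.0, 4.0, 10.0]
smape_value = smape(y_pred, y_true)
rmse_value = rmse(y_pred, y_true)
```

The same absolute error weighs more heavily in the SMAPE when the true demand is small, whereas the RMSE is dominated by large absolute errors, which typically occur in high-demand partitions.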

The undersampled datasets have different ride-hailing demand

distribution structures. Different majority ranges of ride-hailing demand make the

corresponding H-ConvLSTM regression submodels have their own prediction bias

preferences on the total dataset, as shown in Fig. 9. Therefore, we divide the demand values into

consecutive sections according to size, in which the leading

sections, one per submodel, represent the optimal prediction ranges of the submodels. Due to

slight prediction performance differences regarding the demand distribution, we can

generate two sets of boundary points,

corresponding to the SMAPE and RMSE, respectively. The final optimal prediction

range boundary points can be obtained by taking the average values of

the two sets of data.
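The averaging step and the subsequent lookup of the range to which a demand value belongs can be sketched as follows. The boundary values are made-up numbers for illustration, and treating each boundary as the inclusive lower edge of the next range is an assumption.

```python
from bisect import bisect_right


def average_boundaries(smape_bounds, rmse_bounds):
    """Average two equally long lists of boundary points element by element."""
    if len(smape_bounds) != len(rmse_bounds):
        raise ValueError("both boundary sets must have the same length")
    return [(a + b) / 2 for a, b in zip(smape_bounds, rmse_bounds)]


def range_category(demand, boundaries):
    """Index of the consecutive section that contains `demand`.

    With k sorted boundaries there are k + 1 sections, numbered 0..k.
    """
    return bisect_right(boundaries, demand)


smape_bounds = [2.0, 6.0, 14.0]   # hypothetical boundaries from the SMAPE
rmse_bounds = [4.0, 10.0, 18.0]   # hypothetical boundaries from the RMSE
boundaries = average_boundaries(smape_bounds, rmse_bounds)
```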

Fig. 9. Prediction error distributions of the submodels trained on the undersampled datasets.

Different from the corresponding regression model, the H-ConvLSTM

classification model identifies the potential range category of the ride-hailing

demand for a future time interval based on the historical spatiotemporal ride-

hailing demand features, as shown in Fig. 10. One-hot encoding is used

to convert the categories to binary vectors whose length equals the number of categories. The H-

ConvLSTM submodel corresponding to the predicted range category is

selected to obtain the predicted ride-hailing demand at a future time.
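The selection step can be sketched as a simple routing function. The classifier and submodels below are hypothetical stand-ins (plain Python functions) for the trained H-ConvLSTM networks. Only the one-hot encoding and the category-based dispatch are illustrated.

```python
def one_hot(category, num_categories):
    """Binary vector with a single 1 at position `category`."""
    vector = [0] * num_categories
    vector[category] = 1
    return vector


def predict_with_routing(features, classifier, submodels):
    """Predict the range category first, then let the matching submodel
    produce the demand estimate."""
    category = classifier(features)
    return submodels[category](features), category


# Hypothetical stand-ins for the trained models.
def toy_classifier(history):
    """Category 0 for low average demand, category 1 otherwise."""
    return 0 if sum(history) / len(history) < 5 else 1


def low_range_model(history):
    """Persistence rule standing in for the low-demand submodel."""
    return history[-1]


def high_range_model(history):
    """Mean rule standing in for the high-demand submodel."""
    return sum(history) / len(history)


submodels = [low_range_model, high_range_model]
low_pred, low_cat = predict_with_routing([1, 0, 2], toy_classifier, submodels)
high_pred, high_cat = predict_with_routing([8, 10, 12], toy_classifier, submodels)
```

Unlike voting, this dispatch uses exactly one submodel per sample, so a misclassified range sends the sample to a submodel that was not tuned for it.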

(a) Regression model

(b) Classification model

Fig. 10. Structures of the fully connected layers in the prediction models.

4. Experimental results

4.1 Dataset and model setup

The dataset, including all the online ride-hailing order data for Chengdu in

November 2016, is provided by the Didi Gaia Plan platform. To achieve better

prediction performance, the selection of the spatiotemporal granularities in this case

follows the research of Liu et al. (2022). Each day is discretized by setting 30 minutes

as the time interval, and a time partition label is added for each order data point based

on its starting time. Then, hexagonal partitions are added to the urban space based on

the Quantum Geographic Information System (QGIS), and the intersection operation is

performed with the order data and their added time partition labels. The city is divided

into 35×46 hexagonal partitions with a side length of 800 meters, and each order data

point is further labeled with a hexagonal partition ID. Based on the time interval labels

and the hexagonal partition IDs, we can easily aggregate the ride-hailing demand into

different spatiotemporal partitions. Two-layer local adjacent maps centralized at the

target partition in the previous 8 time intervals are used to predict the ride-hailing

demand in the next time interval. Therefore, during the training and testing processes

of the proposed deep learning model, a travel demand sample needs to be expanded

into the corresponding sample group, in which the two-layer local adjacent maps of the previous 8 time intervals

represent the input of the model and the demand in the next time interval represents the corresponding label, as

shown in Fig. 11.
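A sliding window is one way to build such sample groups. In this sketch each time step is represented by a flat list of demand values for the local map, with the target partition stored at index 0 (a hypothetical layout). The window length of 8 follows the setup described above, and the label is the demand of the target partition in the next interval.

```python
WINDOW = 8  # number of previous time intervals used as input


def build_sample_groups(local_maps, window=WINDOW):
    """Turn a time-ordered list of local maps into (inputs, label) pairs.

    local_maps[t] is the list of demands on the local map at interval t,
    with the target partition at index 0 (an assumed layout).
    """
    groups = []
    for t in range(window, len(local_maps)):
        inputs = local_maps[t - window:t]   # the previous `window` maps
        label = local_maps[t][0]            # target demand at interval t
        groups.append((inputs, label))
    return groups


# Ten intervals of a toy 3-cell map; the target demand equals the interval index.
series = [[t, 0, 1] for t in range(10)]
groups = build_sample_groups(series)
```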

Minimum ride-hailing demand thresholds are set for all spatiotemporal

partitions (from 1 to 256, doubling each time) to create multiple datasets with different

ride-hailing demand coverages. The ride-hailing demands that are less than the

corresponding threshold in each dataset are excluded. In other words, if the travel

demand sample at the center of a sample group is less than the threshold, that sample group

is removed from the corresponding dataset.
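The threshold schedule and the filtering rule can be sketched as follows. The sample groups are represented as (inputs, label) pairs with made-up values, and the demand at the center of a group is taken to be its label. Both choices are assumptions for illustration.

```python
def doubling_thresholds(start=1, stop=256):
    """Thresholds from `start` to `stop`, doubling each time."""
    thresholds = []
    value = start
    while value <= stop:
        thresholds.append(value)
        value *= 2
    return thresholds


def undersample(sample_groups, threshold):
    """Keep only the sample groups whose center demand reaches the threshold."""
    return [(x, y) for x, y in sample_groups if y >= threshold]


thresholds = doubling_thresholds()
# Inputs are omitted (None) because only the label matters for filtering.
sample_groups = [(None, 0), (None, 1), (None, 3), (None, 40), (None, 300)]
subdatasets = {th: undersample(sample_groups, th) for th in thresholds}
```

Each higher threshold yields a subset of the previous subdataset, which is why the majority demand range shifts upward as the threshold grows.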

Fig. 11. The contents of a sample group.

The ride-hailing demands are arranged in order from small to large and divided

into 20 equal parts according to their proportions of the total ride-hailing demand. The

data size distributions obtained under different minimum ride-hailing demand

thresholds and the average demand values over the total ride-hailing demand are shown

in Fig. 12. The left axis represents the data size of the subdataset corresponding to the

minimum ride-hailing demand threshold (1 to 256) in each demand range, and the right

axis represents the average demand value of the total dataset in each demand range. As the

minimum ride-hailing demand threshold increases, the corresponding majority ride-

hailing demand ranges continuously increase.
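One possible reading of this division is that the sorted demand values are cut into consecutive groups that each account for the same share of the total demand. The sketch below implements that reading. The function name and the toy data are hypothetical, and the paper's exact binning rule may differ.

```python
def equal_demand_bins(demands, num_bins=20):
    """Sort demands ascending and cut them into `num_bins` consecutive groups,
    each holding roughly the same share of the total demand."""
    ordered = sorted(demands)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total demand must be positive")
    bins = [[] for _ in range(num_bins)]
    cumulative = 0
    for value in ordered:
        # The bin is chosen from the cumulative share before adding `value`.
        index = min(int(cumulative * num_bins // total), num_bins - 1)
        bins[index].append(value)
        cumulative += value
    return bins


# Toy data: many small demands and one large demand, split into 3 parts.
demands = [1] * 10 + [2] * 5 + [10]
bins = equal_demand_bins(demands, num_bins=3)
```

In the toy data each part carries the same total demand, yet the number of samples per part drops from 10 to 1, which mirrors the shrinking data size at high demand levels.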

Fig. 12. The data size distributions obtained under different minimum ride-hailing demand

thresholds and the average demand values over the total ride-hailing demand.

The training process of the proposed H-ConvLSTM-based bagging learning

approach is shown in Fig. 13(a). It consists of multiple regression submodels (H-

ConvLSTM regression models) and a classifier (an H-ConvLSTM classification model),

which are trained as shown in Fig. 13(b) and Fig. 13(c), respectively. Each submodel

 is trained on the corresponding subdataset  which is an undersampling of the

total dataset . By evaluating the prediction performance of each submodel 

on , the individual optimal prediction ranges  can be identified and labeled as

separate classes. Then, the dataset 

 is obtained on the basis of  by replacing

the labels of the sample data with the range categories to which they belong. The

classifier is trained on 

 to identify the potential range of the predicted travel

demand.
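The identification of the optimal prediction ranges can be illustrated with a small sketch. It assumes that the error of every submodel has already been computed per demand bin and that the best submodel of a bin is simply the one with the lowest error; the function name `optimal_ranges` and all numbers are hypothetical.

```python
import numpy as np


def optimal_ranges(errors, thresholds):
    """Map each demand bin to the threshold whose submodel has the lowest error.

    errors: array of shape (n_submodels, n_bins), e.g. per-bin SMAPE.
    thresholds: list of length n_submodels naming each submodel.
    """
    errors = np.asarray(errors, dtype=float)
    best = errors.argmin(axis=0)  # index of the best submodel per bin
    return [thresholds[i] for i in best]


# Hypothetical per-bin errors for three submodels over five demand bins.
errors = [
    [0.10, 0.20, 0.40, 0.60, 0.80],  # threshold 1: best on the sparsest bin
    [0.30, 0.15, 0.20, 0.50, 0.70],  # threshold 2
    [0.90, 0.50, 0.18, 0.25, 0.30],  # threshold 4: best on the densest bins
]
best_per_bin = optimal_ranges(errors, thresholds=[1, 2, 4])
```

Consecutive bins that share the same best submodel form one range, and each range then becomes one class label for the classifier.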

Fig. 13. Training process of the H-ConvLSTM-based bagging learning approach.

The division of the training dataset and testing dataset is shown in Fig. 14. The

data of ,  and 

 in the first 21 days are used for training, and the data from

the last 9 days are used for testing. The testing process of the H-ConvLSTM-based

bagging learning approach is similar to that shown in Fig. 13(a). First, the trained

classifier is used to select an appropriate submodel for 's input demand, and then


this submodel is used to predict the corresponding future demand.

Fig. 14. Division of the training dataset and testing dataset.

The experimental platform is a server with an Intel(R) Xeon(R) Gold-5218 CPU

@ 2.30 GHz, 128 GB of RAM, and one GPU (NVIDIA Quadro RTX 5000). The

proposed model is implemented in Python 3.6.6 with PyTorch, TensorFlow and Keras.

The proposed H-ConvLSTM regression and classification models both consist of 4

ConvLSTM layers, which have 8, 16, 32, and 32 hidden states. The hexagonal kernel

size of each layer is 1. To ensure that the input and output of the hexagonal convolution

operation have the same dimensionality, similar to the same padding approach used in

traditional CNN models, virtual hexagons with zero demand values are padded as

neighbors of the hexagons on the border. Batch normalization and dropout are used for

training the model. The number of training epochs is set to 50 with a batch size of 128.

Adam is used for optimization with a learning rate of 0.0001. The weighted sum of the

SMAPE and RMSE is used as the loss function of the regression model, while the

classification cross entropy is used as the loss function of the classification model. The

SMAPE and RMSE are used to evaluate the prediction performance of the demand

value distribution yielded by the regression model.
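The zero padding of border hexagons can be sketched in plain Python. The sketch assumes that a size-1 hexagonal kernel covers a hexagon and its six immediate neighbors and that hexagons are indexed with axial coordinates (q, r); neither the indexing scheme nor the function name `hex_conv_size1` comes from the paper.

```python
# Axial-coordinate offsets of the six neighbors of a hexagon (an assumed
# coordinate convention; the paper does not specify its indexing scheme).
NEIGHBOR_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def hex_conv_size1(demand, center_weight, neighbor_weights):
    """Apply one size-1 hexagonal kernel to a demand map.

    demand: dict mapping (q, r) -> demand value for real hexagons.
    Neighbors missing from the map are treated as virtual hexagons with
    zero demand, so the output has one value per input hexagon.
    """
    out = {}
    for (q, r), value in demand.items():
        total = center_weight * value
        for (dq, dr), w in zip(NEIGHBOR_OFFSETS, neighbor_weights):
            total += w * demand.get((q + dq, r + dr), 0.0)  # zero padding
        out[(q, r)] = total
    return out


# Toy map of three mutually adjacent hexagons; all kernel weights are 1,
# so each output is the sum of a hexagon and its existing neighbors.
demand = {(0, 0): 5.0, (1, 0): 2.0, (0, 1): 1.0}
result = hex_conv_size1(demand, 1.0, [1.0] * 6)
```

Because missing neighbors contribute zero, the output map has exactly the same set of hexagons as the input, which is the property the padding is meant to guarantee.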

4.2 Optimal prediction range division results

The H-ConvLSTM regression submodel is trained on each undersampled

dataset from  to  and the corresponding prediction performance is calculated.

Since only thresholds between 1 and 32 produce a clearly identifiable optimal prediction range, only the corresponding submodels are analyzed further, and their prediction results are shown in Fig. 15. Each prediction distribution curve first exhibits a decreasing trend and then

increases near the threshold point. As a percentage error that is sensitive to sparse

demand, the SMAPE is mainly used to reflect the influence of different thresholds on

the resulting prediction performances. The submodel corresponding to each threshold

has an obvious optimal prediction range, and the ride-hailing demand values can be

divided into 7 segments according to size. The RMSE is an absolute error and is

sensitive to large outliers. Although the RMSEs of the submodels are also lowest when the demand values are slightly larger than the threshold, this optimum is less pronounced than that observed with the SMAPE because these demand values fall within a smaller demand range. Therefore, the RMSE distribution mainly serves as a supplementary validation of the SMAPE results.


Fig. 15. The distributions of the demand value prediction errors obtained under different minimum ride-hailing demand thresholds: (a) SMAPE; (b) RMSE.

Table 1 Statistical results of the boundary points.

Type

Boundary point demand values













SMAPE

RMSE

Average value

1.5

3.5

19.5

The boundary points between adjacent segments are determined as shown in Table 1. The intersection points of adjacent optimal ranges are selected as the first five boundary points. The last boundary point is the closest intersection between the

prediction distribution curve of threshold 32 and the other distribution curves on the

right. The classification numbers of the demand values distributed in the final 7

segments are set to 1, 2, 3, 4, 5, 6, and 7 and further transformed into corresponding

binary vectors through one-hot encoding. Then, the dataset 

 is obtained on the

basis of  by replacing the label  of the sample group  with the

classification number to which it belongs.
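The relabeling step can be sketched as follows. The six boundary values below are placeholders chosen for illustration only and should not be read as the values of Table 1; the function name `to_one_hot_classes` is likewise an assumption.

```python
import numpy as np

# Six placeholder boundary points (hypothetical); six boundaries give 7 segments.
BOUNDARIES = np.array([1.5, 3.5, 6.5, 12.5, 19.5, 40.5])


def to_one_hot_classes(demands, boundaries=BOUNDARIES):
    """Return (class numbers in 1..7, one-hot matrix of shape (n, 7))."""
    demands = np.asarray(demands, dtype=float)
    classes = np.digitize(demands, boundaries) + 1  # segment index, 1-based
    one_hot = np.eye(len(boundaries) + 1, dtype=int)[classes - 1]
    return classes, one_hot


classes, one_hot = to_one_hot_classes([0, 2, 5, 10, 15, 30, 100])
```

Each row of `one_hot` then replaces the original demand label of the corresponding sample group when the classification dataset is built.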

4.3 Results of the H-ConvLSTM-based bagging learning approach

The H-ConvLSTM classification model is trained on 

. Similar to the H-

ConvLSTM regression submodel, 8 historical ride-hailing demand features of two-

layer local adjacent maps  centralized at hexagon , are selected to jointly

predict the segment category of ride-hailing demands  of target partition  for

future time interval  . An accuracy of 85.76% is achieved on the testing dataset

(88.94% on the training dataset). The boundaries of segment categories depend on the

prediction distribution of each submodel in the training set of , and it is assumed

that the optimal prediction range of each submodel in the training set and testing set is

roughly similar. For the historical data in the testing set of  whose prediction

categories are the first 6 segments , the corresponding regression submodel

trained on the training set of  is used to predict the ride-hailing demand at a future

time. The data size of the ride-hailing demand distributed in the last segment  is

relatively small as shown in Fig. 12, and no submodel shows significantly better

prediction performance in this segment. Therefore, the average value of the prediction

results of the 6 submodels is used as the predicted value of ride-hailing demand of

segment  at a future time. The prediction error distribution of the H-ConvLSTM-

based bagging learning approach is shown in Fig. 16.
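The prediction rule described in this paragraph can be summarized in a short sketch. The callables used as the classifier and submodels are hypothetical stand-ins for the trained H-ConvLSTM models, and the function name `bagging_predict` is an assumption.

```python
import numpy as np


def bagging_predict(x, classifier, submodels):
    """Predict demand for one input using classifier-selected submodels.

    classifier(x) returns a segment number in 1..len(submodels) + 1.
    Segments 1..len(submodels) each map to one regression submodel; the
    last segment has no dedicated submodel, so all predictions are averaged.
    """
    segment = classifier(x)
    if segment <= len(submodels):
        return submodels[segment - 1](x)
    return float(np.mean([model(x) for model in submodels]))


# Hypothetical stand-ins: six "submodels" that scale the input differently
# and a "classifier" that buckets the input by magnitude.
submodels = [lambda x, k=k: x * (1 + 0.1 * k) for k in range(6)]
classifier = lambda x: min(int(x // 10) + 1, 7)

low = bagging_predict(5.0, classifier, submodels)    # segment 1 -> submodel 1
high = bagging_predict(80.0, classifier, submodels)  # segment 7 -> average
```

Averaging in the last segment trades a dedicated submodel for a more stable estimate where training data are scarce.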


Fig. 16. Prediction errors of the bagging learning approach based on H-ConvLSTM: (a) SMAPE; (b) RMSE.

Compared with the H-ConvLSTM regression submodels trained under different minimum ride-hailing demand threshold settings, the H-ConvLSTM-based bagging learning approach improves the prediction performance to varying degrees and comes closer to the performance limit achievable by this method (i.e., an oracle submodel classifier that always selects the best-performing submodel, as shown by the dotted black line).
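The oracle limit can be made concrete with a small sketch. It assumes that "performs best" is judged per sample by the absolute error, which the paper does not state explicitly; the function name `oracle_prediction` and the numbers are hypothetical.

```python
import numpy as np


def oracle_prediction(predictions, truth):
    """Pick, for every sample, the submodel prediction closest to the truth.

    predictions: array of shape (n_submodels, n_samples).
    The oracle needs the true values, so it is only an evaluation bound,
    not a usable predictor.
    """
    predictions = np.asarray(predictions, dtype=float)
    truth = np.asarray(truth, dtype=float)
    best = np.abs(predictions - truth).argmin(axis=0)
    return predictions[best, np.arange(predictions.shape[1])]


truth = [2.0, 10.0, 50.0]
predictions = [[2.5, 14.0, 30.0],   # hypothetical submodel A
               [4.0, 11.0, 42.0],   # hypothetical submodel B
               [9.0, 16.0, 49.0]]   # hypothetical submodel C
oracle = oracle_prediction(predictions, truth)
```

Any classifier-based selection can at best match this choice, which is why the oracle curve bounds the bagging approach from below in terms of error.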

To verify the validity of the proposed model, several baseline models are selected for comparison, as follows.

1) ARIMA: This is the autoregressive integrated moving average model that is widely used for time series prediction. The differencing order $d$ is set to 1, and the autoregressive order $p$ and the moving average order $q$ are iterated over the previous 1 to 8 time intervals.

2) Hexagonal artificial neural network (H-ANN): The spatial features and historical temporal features of the demands of a hexagonal partition are concatenated as the input for a fully connected neural network, which outputs the predicted demand value at a future time. The model includes 5 fully connected layers, which have 128, 64, 32, 16, and 8 hidden neurons.

3) H-CNN: The previous 8 time intervals are represented as the channels of the input image. A hexagonal convolution operation is applied between each

pair of layers. The H-CNN model includes 4 convolution layers, which have 8, 16, 32,

and 32 hidden states. The hexagonal kernel size of each layer is 1. Batch normalization

and dropout are used to train the model.

4) H-CNN-LSTM: An H-CNN model with one channel is selected to extract the

spatial ride-hailing demand characteristics of the previous 8 time intervals. The settings

for the convolution layer and the convolution kernel remain the same. The outputs of

the H-CNN for the previous 8 time intervals are expanded into vectors and used as the

inputs for the LSTM to extract the temporal characteristics of ride-hailing demand. The

hidden state of the LSTM is set to 128.

5) H-CNN-GRU: The output of the H-CNN is taken as the input of a GRU, and

the other settings are consistent with those of H-CNN-LSTM.

The data sizes of different ride-hailing demand levels differ greatly, so the overall prediction error is dominated by the prediction results for sparse ride-hailing demands, which make up most of the samples. The sparse ride-hailing demands, which

account for approximately 70% of the total data size (i.e., 70% of the spatiotemporal

partitions are sparse demands), contain less than 10% of the total ride-hailing demand

quantity. To better evaluate the prediction performance of each model, we propose to

utilize the weighted SMAPE (wSMAPE) and weighted RMSE (wRMSE) to


comprehensively consider the prediction results corresponding to different ride-hailing

demand size distributions as follows:

$$\mathrm{wSMAPE} = \sum_{i=1}^{20} w_i \cdot \frac{1}{N_i} \sum_{j=1}^{N_i} \frac{\left|\hat{y}_{ij} - y_{ij}\right|}{\left(\left|\hat{y}_{ij}\right| + \left|y_{ij}\right|\right)/2 + \varepsilon} \tag{9}$$

$$\mathrm{wRMSE} = \sum_{i=1}^{20} w_i \sqrt{\frac{1}{N_i} \sum_{j=1}^{N_i} \left(\hat{y}_{ij} - y_{ij}\right)^2} \tag{10}$$

The ride-hailing demands are arranged in ascending order and divided into 20 equal parts according to their proportions of the total demand value. The subdata size of each 5% ride-hailing demand segment $i$ is denoted as $N_i$ ($i = 1, \dots, 20$). $w_i$ denotes the weight of segment $i$. $\hat{y}_{ij}$ denotes the $j$th predicted value of segment $i$ and $y_{ij}$ is the corresponding true value. $\varepsilon$ is a very small value that prevents the denominator from being 0.
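The weighted metrics can be sketched as follows, assuming that each one is a weighted sum of per-segment errors. The symmetric SMAPE denominator with the factor 1/2, the equal weights in the example, and the function name `weighted_errors` are assumptions of this sketch rather than details confirmed by the text.

```python
import numpy as np


def weighted_errors(y_true, y_pred, segments, weights, eps=1e-8):
    """Return (wSMAPE, wRMSE) as weighted sums of per-segment errors.

    segments[j] is the segment index of sample j; weights[i] is the weight
    of segment i. Segments with no samples are skipped.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    segments = np.asarray(segments)
    w_smape, w_rmse = 0.0, 0.0
    for i, w in enumerate(weights):
        mask = segments == i
        if not mask.any():
            continue
        diff = y_pred[mask] - y_true[mask]
        # eps keeps the denominator away from zero when both values are 0.
        denom = (np.abs(y_pred[mask]) + np.abs(y_true[mask])) / 2 + eps
        w_smape += w * np.mean(np.abs(diff) / denom)
        w_rmse += w * np.sqrt(np.mean(diff ** 2))
    return w_smape, w_rmse


# Two segments with equal weights (hypothetical values).
y_true = [1.0, 1.0, 10.0]
y_pred = [1.0, 3.0, 6.0]
w_smape, w_rmse = weighted_errors(y_true, y_pred, [0, 0, 1], [0.5, 0.5])
```

Because each segment contributes through its own mean, a segment with few samples but large demand is not swamped by the many sparse samples, which is the stated purpose of the weighting.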

Table 2 Model performance comparison.

Model                        wSMAPE (×10⁻²)   wRMSE   Training time (h) / testing time (min)
ARIMA                        14.53            23.51   0.01
H-ANN                        14.11            22.75   0.31 / 0.04
H-CNN                        13.21            22.36   4.43 / 0.54
H-CNN-LSTM                   12.35            21.12   6.05 / 0.96
H-CNN-GRU                    12.29            21.53   5.71 / 0.88
H-ConvLSTM + Threshold 1     11.61            20.97   7.16 / 1.02
H-ConvLSTM + Threshold 2     10.02            19.58   5.36 / 0.83
H-ConvLSTM + Threshold 4     10.54            20.04   3.59 / 0.57
H-ConvLSTM + Threshold 8     11.81            20.21   2.33 / 0.36
H-ConvLSTM + Threshold 16    14.62            20.76   1.02 / 0.15
H-ConvLSTM + Threshold 32    18.76            24.18   0.75 / 0.11
H-ConvLSTM + bagging         9.42             18.63   25.84 / 1.62

The overall prediction performance achieved by each model on the testing dataset is shown in Table 2. With its enhanced ability to capture the temporal and spatial characteristics of ride-hailing demand, the H-ConvLSTM regression model achieves lower wSMAPE and wRMSE values. By integrating the biased prediction preferences of each submodel in different segments, our proposed bagging learning approach based on H-ConvLSTM improves on the best single threshold setting (a threshold of 2) by 5.99% in terms of the wSMAPE and 4.85% in terms of the wRMSE. Because it includes multiple regression submodels and an additional classification model, the proposed H-ConvLSTM-based bagging learning approach requires more training time.
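The reported improvements can be checked directly against Table 2. The short calculation below assumes that the "optimal threshold setting" is the threshold-2 submodel, which has the lowest wSMAPE and wRMSE among the single submodels; the function name `relative_improvement` is introduced here for illustration.

```python
def relative_improvement(baseline, proposed):
    """Percentage reduction of an error metric relative to a baseline."""
    return (baseline - proposed) / baseline * 100.0


# Table 2 values: best single submodel (threshold 2) vs. bagging approach.
wsmape_gain = relative_improvement(10.02, 9.42)
wrmse_gain = relative_improvement(19.58, 18.63)
print(f"wSMAPE: {wsmape_gain:.2f}%, wRMSE: {wrmse_gain:.2f}%")
```

Rounded to two decimals, the two gains are 5.99% and 4.85%, matching the figures quoted in the text.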


Fig. 17. Spatial distribution of the wSMAPE difference between the H-ConvLSTM-based bagging

learning approach and H-ConvLSTM-Oracle.

Fig. 18. Spatial distributions of the wSMAPE differences between the submodels and

H-ConvLSTM-Oracle.

Assume that H-ConvLSTM-Oracle has an oracle submodel classifier that always selects the submodel that performs best. This model represents the upper bound on the performance of our H-ConvLSTM-based bagging learning approach. The spatial distributions of the wSMAPE differences relative to H-ConvLSTM-Oracle are shown in Fig. 17 for the H-ConvLSTM-based bagging learning approach and in Fig. 18 for each of the 6 submodels. A smaller difference means that the wSMAPE value is closer to that of H-ConvLSTM-Oracle and that the corresponding prediction results are better. The proposed H-ConvLSTM-based bagging learning approach effectively selects the optimal submodel across the whole spatial distribution, and its prediction results differ little from those of H-ConvLSTM-Oracle. When

compared with the results obtained under a minimum ride-hailing demand threshold of

1, the prediction results of H-ConvLSTM-Oracle are mainly improved in the central


urban area, especially in the transition areas between urban and suburban areas. For

other minimum ride-hailing demand threshold settings, the prediction results of H-

ConvLSTM-Oracle are more significantly improved in the outer suburbs. With the

continuous increase in the threshold value, the improvement effect and coverage area

of the corresponding H-ConvLSTM-Oracle prediction results continuously increase.

5 Conclusion

In this paper, we propose an H-ConvLSTM regression model to compare and

analyze the ride-hailing demand prediction performances achieved under different data

distribution characteristics. Minimum ride-hailing demand thresholds are set for all

spatiotemporal partitions to create multiple datasets with different data sparsities. The

H-ConvLSTM regression models trained using different datasets have their own

optimal prediction ranges on the testing set, and each prediction distribution curve first

exhibits a decreasing trend and then increases near the threshold point.

An H-ConvLSTM-based bagging learning approach is further proposed to

integrate the bias prediction preferences of each H-ConvLSTM regression model

trained on datasets with different data sparsities. An experimental analysis conducted

on the order data obtained from Didi Chuxing in Chengdu city over one month shows

that the proposed H-ConvLSTM-based bagging learning approach can achieve

significantly improved prediction performance.

In future work, we will perform more in-depth qualitative and quantitative analyses of the spatiotemporal scale of internet-based ride-hailing demand.

The influence of an imbalanced ride-hailing demand distribution (caused by the

division of different spatial and temporal scales) on the prediction performance will be

discussed. Policy recommendations will also be made to improve the operational

efficiency and quality of ride-hailing.

Acknowledgments

This research was funded by the National Natural Science Foundation of China

(grant nos. 51378091 and 71871043). The authors would like to acknowledge the GAIA

open data from DiDi Chuxing.

Appendix A. Comparison of predicted results between H-ConvLSTM and

ConvLSTM

The traditional ConvLSTM, which is based on the matrix convolution operation, is compared with our H-ConvLSTM model to verify the advantages of the hexagonal convolution operation. For a detailed explanation of the model parameter configuration, see Liu et al. (2022). The prediction results are shown in Table A1. Compared with a square, a hexagon is closer to a circle, and its distribution is symmetric and equivalent. Therefore, travel demands with hexagonal partitions are predicted more accurately. Since the hexagonal convolution operation avoids the topological loss of spatial relations caused by the matrix transformation in the traditional ConvLSTM, the proposed H-ConvLSTM model shows the best and most stable prediction performance.


Table A1: Comparison of different demand prediction models

                                RMSE                          MAPE (×10⁻²)
Model         Partition shape   Testing set   Avg.   Sd.      Testing set   Avg.    Sd.
ConvLSTM      Square            9.37          9.38   0.12     17.18         17.55   0.46
ConvLSTM      Hexagon           9.03          9.12   0.07     17.02         17.27   0.55
H-ConvLSTM    Hexagon           8.82          8.80   0.05     16.71         16.76   0.36

References

Alisoltani, N., Leclercq, L., Zargayouna, M., 2021. Can dynamic ride-sharing reduce traffic congestion? Transp. Res. Part B Methodol. 145, 212-246.

Alonso-Mora, J., Samaranayake, S., Wallar, A., Frazzoli, E., Rus, D., 2017. On-

demand high-capacity ride-sharing via dynamic trip-vehicle assignment. Proc.

Natl. Acad. Sci. U. S. A. 114, 462-467.

Birch, C.P.D., Vuichard, N., Werkman, B.R., 2000. Modelling the effects of patch size on vegetation dynamics: Bracken [Pteridium aquilinum (L.) Kuhn] under grazing. Ann. Bot. 85, 63-76.

Breiman, L., 1996. Bagging predictors. Mach. Learn. 24, 123–140.

Chawla, N. V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P., 2002. SMOTE:

Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. 16, 321-357.

Chen, X., Zheng, H., Ke, J., Yang, H., 2020. Dynamic optimization strategies for on-demand ride services platform: Surge pricing, commission rate, and incentives. Transp. Res. Part B Methodol. 138, 23-45.

Cheng, S., Lu, F., Peng, P., 2020. Short-Term Traffic Forecasting by Mining the Non-

Stationarity of Spatiotemporal Patterns. IEEE Trans. Intell. Transp. Syst. 22(10),

6365-6383.

Csiszár, C., Csonka, B., Földes, D., Wirth, E., Lovas, T., 2019. Urban public charging

station locating method for electric vehicles based on land use approach. J.

Transp. Geogr. 74, 173–180.

Daganzo, C.F., Ouyang, Y., Yang, H., 2020. Analysis of ride-sharing with service time and detour guarantees. Transp. Res. Part B Methodol. 140, 130-150.

Davis, N., Raina, G., Jagannathan, K., 2016. A multi-level clustering approach for

forecasting taxi ride-hailing demand. IEEE Conf. Intell. Transp. Syst.

Proceedings, ITSC 223-228.

González, M.C., Hidalgo, C.A., Barabási, A.L., 2008. Understanding individual

human mobility patterns. Nature 453, 779-782.

Guo, X.T., Caros, N.S., Zhao, J.H., 2021. Robust matching-integrated vehicle rebalancing in ride-hailing system with uncertain demand. Transp. Res. Part B Methodol. 150, 161-189.

Huang, Z., Huang, G., Chen, Z., Wu, C., Ma, X., Wang, H., 2019. Multi-regional

online car-hailing order quantity forecasting based on the convolutional neural

network. Inf. 10(6), 193.


Japkowicz, N., Stephen, S., 2002. The class imbalance problem: A systematic study.

Intell. Data Anal. 6, 429-449.

Jo, D., Yu, B., Jeon, H., Sohn, K., 2019. Image-to-image learning to predict traffic

speeds by considering area-wide spatio-temporal dependencies. IEEE Trans.

Veh. Technol. 68, 1188-1197.

Jiang, X., Zhang, L., Chen, M.X., 2014. Short-term forecasting of high-speed rail

demand: A hybrid approach combining ensemble empirical mode decomposition

and gray support vector machine with real-world applications in China. Transp.

Res. Part C Emerg. Technol. 44, 110-127.

Kaltenbrunner, A., Meza, R., Grivolla, J., Codina, J., Banchs, R., 2010. Urban cycles

and mobility patterns: Exploring and predicting trends in a bicycle-based public

transport system. Pervasive Mob. Comput. 6, 455-466.

Ke, J., Yang, H., Zheng, H., Chen, X., Jia, Y., Gong, P., Ye, J., 2019. Hexagon-Based

Convolutional Neural Network for Supply-Demand Forecasting of Ride-

Sourcing Services. IEEE Trans. Intell. Transp. Syst. 20, 4160-4173.

Ke, J., Zheng, H., Yang, H., Chen, X. (Michael), 2017. Short-term forecasting of

passenger demand under on-demand ride services: A spatio-temporal deep

learning approach. Transp. Res. Part C Emerg. Technol. 85, 591-608.

Krawczyk, B., 2016. Learning from imbalanced data: open challenges and future

directions. Prog. Artif. Intell. 5, 221-232.

Li, X., Pan, G., Wu, Z., Qi, G., Li, S., Zhang, D., Zhang, W., Wang, Z., 2012.

Prediction of urban human mobility using large-scale taxi traces and its

applications. Front. Comput. Sci. China 6, 111-121.

Lin, W.C., Tsai, C.F., Hu, Y.H., Jhang, J.S., 2017. Clustering-based undersampling in

class-imbalanced data. Inf. Sci. (Ny). 409-410, 17-26.

Liu, K., Chen, Z., Yamamoto, T., Tuo, L., 2022. Exploring the impact of

spatiotemporal granularity on the demand prediction of dynamic ride-hailing.

preprint arXiv:2203.10301.

Ma, Z., Xing, J., Mesbah, M., Ferreira, L., 2014. Predicting short-term bus passenger

demand using a pattern hybrid approach. Transp. Res. Part C Emerg. Technol.

39, 148-163.

Min, W., Wynter, L., 2011. Real-time road traffic prediction with spatio-temporal

correlations. Transp. Res. Part C Emerg. Technol. 19, 606-616.

Moniz, N., Branco, P., Torgo, L., 2017. Resampling strategies for imbalanced time

series forecasting. Int. J. Data Sci. Anal. 3, 161-181.

Shen, X., Zhou, Y., Jin, S., Wang, D., 2020. Spatiotemporal influence of land use and

household properties on automobile ride-hailing demand. Transp. Res. Part D

Transp. Environ. 84, 102359.

Shi, X., Chen, Z., Wang, H., Yeung, D.-Y., Wong, W., Woo, W., 2015. Convolutional

LSTM Network: A Machine Learning Approach for Precipitation Nowcasting.

Adv. Neural Inf. Process. Syst. 2015-January, 68-80.

Shoman, W., Alganci, U., Demirel, H., 2019. A comparative analysis of gridding


systems for point-based land cover/use analysis. Geocarto Int. 34, 867–886.

Steppa, C., Holch, T.L., 2019. HexagDLy-Processing hexagonally sampled data with

CNNs in PyTorch. SoftwareX 9, 193-198.

Vazifeh, M.M., Santi, P., Resta, G., Strogatz, S.H., Ratti, C., 2018. Addressing the

minimum fleet problem in on-demand urban mobility. Nature 557, 534-538.

Vluymans, S., 2019. Learning from imbalanced data. Stud. Comput. Intell. 807, 81-

110.

Wang, D., Yang, Y., Ning, S., 2018. DeepSTCL: A Deep Spatio-temporal

ConvLSTM for Ride-hailing demand Prediction, in: 2018 International Joint

Conference on Neural Networks (IJCNN). IEEE, pp. 1-8.

Wang, H., Yang, H., 2019. Ridesourcing systems: A framework and review. Transp.

Res. Part B Methodol. 129, 122–155.

Wu, X., Guo, J., Xian, K., Zhou, X., 2018. Hierarchical ride-hailing demand

estimation using multiple data sources: A forward and backward propagation

algorithmic framework on a layered computational graph. Transp. Res. Part C

Emerg. Technol. 96, 321-346.

Xu, Z., Yin, Y., Ye, J., 2020. On the supply curve of ride-hailing systems. Transp. Res. Part B Methodol. 132, 29-43.

Yang, G., Wang, Y., Yu, H., Ren, Y., Xie, J., 2018. Short-Term Traffic State

Prediction Based on the Spatiotemporal Features of Critical Road Sections.

Sensors 18, 2287.

Yu, H., Wu, Z., Wang, S., Wang, Y., Ma, X., 2017. Spatiotemporal recurrent

convolutional networks for traffic prediction in transportation networks. Sensors

(Switzerland) 17, 1-16.

Yuan, C., Yu, X., Li, D., Xi, Y., 2019. Overall Traffic Mode Prediction by VOMM

Approach and AR Mining Algorithm with Large-Scale Data. IEEE Trans. Intell.

Transp. Syst. 20, 1508-1516.

Zhang, C., Li, J., Zhao, Y., Li, T., Chen, Q., Zhang, X., Qiu, W., 2021. Problem of

data imbalance in building energy load prediction: Concept, influence, and

solution. Appl. Energy 297, 117139.

Zhang, J., Zheng, Y., Qi, D., Li, R., Yi, X., 2016. DNN-based prediction model for

spatio-temporal data, in: Proceedings of the 24th ACM SIGSPATIAL

International Conference on Advances in Geographic Information Systems -

GIS ’16. ACM Press, New York, New York, USA, pp. 1-4.

Zheng, W., Lee, D.H., Shi, Q., 2006. Short-term freeway traffic flow prediction:

Bayesian combined neural network approach. J. Transp. Eng. 132, 114-121.

Zhu, Z., Tang, L., Xiong, C., Chen, X., Zhang, L., 2019. The conditional probability

of travel speed and its application to short-term prediction. Transp. B, 7(1), 684-

706.

Zhu, Z., Ke, J., Wang, H., 2021. A mean-field Markov decision process model for spatial-temporal subsidies in ride-sourcing markets. Transp. Res. Part B Methodol. 150, 540-565.


A Survey of Machine Learning-Based Ride-Hailing Planning

Preprint

Full-text available

Mar 2023

Ride-hailing is a sustainable transportation paradigm where riders access door-to-door traveling services through a mobile phone application, which has attracted a colossal amount of usage. There are two major planning tasks in a ride-hailing system: (1) matching, i.e., assigning available vehicles to pick up the riders, and (2) repositioning, i.e., proactively relocating vehicles to certain locations to balance the supply and demand of ride-hailing services. Recently, many studies of ride-hailing planning that leverage machine learning techniques have emerged. In this article, we present a comprehensive overview on latest developments of machine learning-based ride-hailing planning. To offer a clear and structured review, we introduce a taxonomy into which we carefully fit the different categories of related works according to the types of their planning tasks and solution schemes, which include collective matching, distributed matching, collective repositioning, distributed repositioning, and joint matching and repositioning. We further shed light on many real-world datasets and simulators that are indispensable for empirical studies on machine learning-based ride-hailing planning strategies. At last, we propose several promising research directions for this rapidly growing research and practical field.

Order allocation strategy for online car-hailing platform in the context of multi-party interests

Article

Aug 2023
ADV ENG INFORM

As an emerging internet service, online car-hailing allows users to do the car-hailing via their mobile phone instead of calling cars on the road. Though the convenient service provided by online car-hailing platforms, the following problems encountered in the process of real-time order-matching still cannot be well addressed: (1) the imbalance of the accessible cars in different areas, and (2) the conflict of interests among the drivers, platforms, and passengers. To address those problems, this paper proposes a set of order allocation strategy that fully consider the changing regional characteristics and the interests of multiple parties. Firstly, a dynamic region division method is proposed to identify the corresponding area types for real-time car-hailing. Then, a novel order allocation strategy with the regional multi-party subsidy mechanism is developed for different types of areas. Lastly, to help obtain a better performance for multiple parties, a parameter optimization method for the multi-party subsidy mechanism is introduced. The proposed order allocation strategy is evaluated on the real-world dataset sourced from an online car-hailing company. Our experiments verify the feasibility and effectiveness of the proposed method, without reducing drivers’ benefits, the order response time and order response rate are greatly improved, and the cost in all is acceptable for passengers and the platform. It is illustrated that a suitable regional multi-party subsidy mechanism could help online car-hailing platforms obtain higher passenger satisfaction than that without a regional multi-party subsidy mechanism.

Spatiotemporal Clustering of Parking Lots at the City Level for Efficiently Sharing Occupancy Forecasting Models

Article

Full-text available

May 2023
SENSORS-BASEL

This study aims to address the challenge of developing accurate and efficient parking occupancy forecasting models at the city level for autonomous vehicles. Although deep learning techniques have been successfully employed to develop such models for individual parking lots, it is a resource-intensive process that requires significant amounts of time and data for each parking lot. To overcome this challenge, we propose a novel two-step clustering technique that groups parking lots based on their spatiotemporal patterns. By identifying the relevant spatial and temporal characteristics of each parking lot (parking profile) and grouping them accordingly, our approach allows for the development of accurate occupancy forecasting models for a set of parking lots, thereby reducing computational costs and improving model transferability. Our models were built and evaluated using real-time parking data. The obtained correlation rates of 86% for the spatial dimension, 96% for the temporal one, and 92% for both demonstrate the effectiveness of the proposed strategy in reducing model deployment costs while improving model applicability and transfer learning across parking lots.

On ride-sourcing services of electric vehicles considering cruising for charging and parking

Article

Mar 2023
TRANSPORT RES D-TR E

Cruising of electric ride-sourcing vehicles (ERVs) when waiting for trip orders can create additional vehicle miles, which increase congestion and waste electricity. Reducing cruising is an important issue. This study investigates the strategy of allocating a portion of road space as parking for ERVs. Considering ERVs cruising for parking/charging, we analytically examine the trade-off between road capacity reduction due to reserving road space as parking and less cruising. We evaluate the effects of parking provision on reducing congestion and charging demand. We also investigate the optimal fare and fleet size of ERV services to achieve profit or social welfare maximization. Numerical studies indicate that vehicles cruising for charging might be reduced significantly with a mild increase of charging pile supply, where cruising can increase sharply after charging pile occupancy rate is at critical levels. By providing parking to ERVs, ride-sourcing demand increases, charging demand reduces, profit and social welfare increase.

A Survey of Machine Learning-Based Ride-Hailing Planning

Article

Jun 2024

Ride-hailing is a sustainable transportation paradigm where riders access door-to-door traveling services through a mobile phone application, which has attracted a colossal amount of usage. There are two major planning tasks in a ride-hailing system: 1) matching, i.e., assigning available vehicles to pick up the riders; and 2) repositioning, i.e., proactively relocating vehicles to certain locations to balance the supply and demand of ride-hailing services. Recently, many studies of ride-hailing planning that leverage machine learning techniques have emerged. In this article, we present a comprehensive overview on latest developments of machine learning-based ride-hailing planning. To offer a clear and structured review, we introduce a taxonomy into which we carefully fit the different categories of related works according to the types of their planning tasks and solution schemes, which include collective matching, distributed matching, collective repositioning, distributed repositioning, and joint matching and repositioning. We further shed light on many real-world data sets and simulators that are indispensable for empirical studies on machine learning-based ride-hailing planning strategies. At last, we propose several promising research directions for this rapidly growing research and practical field.

A data-driven framework for natural feature profile of public transport ridership: Insights from Suzhou and Lianyungang, China

Article

May 2024

A dynamic speed guidance method at on-ramp merging areas of urban expressway considering driving styles

Article

Feb 2024

Dynamic speed guidance for vehicles in on-ramp merging zones is instrumental in alleviating traffic congestion on urban expressways. To enhance compliance with recommended speeds, the development of a dynamic speed-guidance mechanism that accounts for heterogeneity in human driving styles is pivotal. Utilizing intelligent connected technologies that provide real-time vehicular data in these merging locales, this study proposes such a guidance system. Initially, we integrate a multi-agent consensus algorithm into a multi-vehicle framework operating on both the mainline and the ramp, thereby facilitating harmonized speed and spacing strategies. Subsequently, we conduct an analysis of the behavioral traits inherent to drivers of varied styles to refine speed planning in a more efficient and reliable manner. Lastly, we investigate a closed-loop feedback approach for speed guidance that incorporates the driver’s execution rate, thereby enabling dynamic recalibration of advised speeds and ensuring fluid vehicular integration into the mainline. Empirical results substantiate that a dynamic speed guidance system incorporating driving styles offers effective support for human drivers in seamless mainline merging.

Network-wide short-term inflow prediction of the multi-traffic modes system: An adaptive multi-graph convolution and attention mechanism based multitask-learning model

Article

Jan 2024

Book-ahead ride-hailing trip and its determinants: Findings from large-scale trip records in China

Article

Dec 2023

Multi-view dynamic graph convolution neural network for traffic flow prediction

Article

Mar 2023
EXPERT SYST APPL

Exploring the Impact of Spatiotemporal Granularity on the Demand Prediction of Dynamic Ride-Hailing

Article

Oct 2022

Dynamic demand prediction is a key issue in ride-hailing dispatching. Many methods have been developed to improve demand prediction accuracy for the growing number of demand-responsive ride-hailing transport services. However, the uncertainties in predicting ride-hailing demands due to multiscale spatiotemporal granularity, as well as the resulting statistical errors, are seldom explored. This paper attempts to fill this gap and to examine the effects of spatiotemporal granularity on ride-hailing demand prediction accuracy by using empirical data for Chengdu, China. A convolutional long short-term memory model combined with a hexagonal convolution operation (H-ConvLSTM) is proposed to explore the complex spatial and temporal relations. Experimental results show that the proposed approach outperforms conventional methods in terms of prediction accuracy. A comparison of 36 spatiotemporal granularities with both departure demands and arrival demands shows that the combination of a hexagonal spatial partition with an 800 m side length and a 30 min time interval achieves the best comprehensive prediction accuracy. However, the departure demands and arrival demands reveal different variation trends in the prediction errors for various spatiotemporal granularities.
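
To give a sense of scale for the best-performing granularity reported above, the following minimal Python sketch computes the area of a regular hexagon from its side length and the number of time slots per day for a given interval. The function names are hypothetical and chosen for illustration; only the 800 m side length and the 30 min interval come from the abstract.

```python
import math


def hexagon_area_km2(side_m: float) -> float:
    """Area of a regular hexagon with the given side length (metres), in km^2.

    Uses the standard formula A = (3 * sqrt(3) / 2) * s^2.
    """
    side_km = side_m / 1000.0
    return 3.0 * math.sqrt(3.0) / 2.0 * side_km ** 2


def slots_per_day(interval_min: int) -> int:
    """Number of whole time slots of the given length that fit in 24 hours."""
    return (24 * 60) // interval_min


if __name__ == "__main__":
    # Granularity reported as best in the abstract: 800 m side, 30 min interval.
    print(f"cell area: {hexagon_area_km2(800):.3f} km^2")
    print(f"time slots per day: {slots_per_day(30)}")
```

Under these values, each hexagonal cell covers roughly 1.66 km² and each day is divided into 48 time slots.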

Examine the Prediction Error of Ride-Hailing Travel Demands with Various Ignored Sparse Demand Effects

Article

Apr 2022
J ADV TRANSPORT

Accurate short-term predictions of ride-hailing travel demand can promote the optimal dispatching of vehicles in space and time, which is crucial to the sustainable development of such dynamic demand-responsive services. Sparse demands are usually ignored in previous models, and the uncertainties in the spatiotemporal distribution of the predictions induced by setting subjective thresholds are rarely explored. This paper attempts to fill this gap and examine the spatiotemporal sparsity effect on ride-hailing travel demand prediction by using Didi Chuxing order data recorded in Chengdu, China. To obtain the spatiotemporal characteristics of the travel demand, three hexagon-based deep learning models (H-CNN-LSTM, H-CNN-GRU, and H-ConvLSTM) are compared by setting various threshold values. The results show that the H-ConvLSTM model has better prediction performance than the others due to its ability to simultaneously capture spatiotemporal features, especially in areas with a high proportion of sparse demands. We found that increasing the minimum demand threshold to delete more sparse data improves the overall prediction accuracy to a certain extent, but the spatiotemporal coverage of the data is also significantly reduced. The results of this study could guide traffic operations in providing better travel services for different regions.
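
The coverage side of the trade-off described above can be illustrated with a minimal Python sketch. The record layout `(zone, time_slot, demand)`, the toy demand values, and all function names are hypothetical and are not taken from the paper; the sketch only shows how raising a minimum demand threshold shrinks the share of spatiotemporal cells and zones that remain in the data.

```python
def filter_by_min_demand(records, threshold):
    """Keep only (zone, time_slot, demand) records with demand >= threshold."""
    return [rec for rec in records if rec[2] >= threshold]


def cell_coverage(records, threshold):
    """Share of spatiotemporal cells that survive the threshold filter."""
    if not records:
        return 0.0
    return len(filter_by_min_demand(records, threshold)) / len(records)


def zone_coverage(records, threshold):
    """Share of zones that keep at least one record after filtering."""
    all_zones = {zone for zone, _, _ in records}
    if not all_zones:
        return 0.0
    kept_zones = {zone for zone, _, _ in filter_by_min_demand(records, threshold)}
    return len(kept_zones) / len(all_zones)


# Hypothetical toy data: three zones, two time slots each.
TOY_RECORDS = [
    ("A", 0, 12), ("A", 1, 9),   # busy zone
    ("B", 0, 1),  ("B", 1, 0),   # sparse zone
    ("C", 0, 3),  ("C", 1, 5),   # intermediate zone
]

if __name__ == "__main__":
    for threshold in (0, 3, 5, 10):
        print(
            f"threshold={threshold:>2}  "
            f"cells kept={cell_coverage(TOY_RECORDS, threshold):.2f}  "
            f"zones kept={zone_coverage(TOY_RECORDS, threshold):.2f}"
        )
```

In this toy example the sparse zone B disappears entirely once the threshold reaches 3, which is the kind of coverage loss the abstract warns about.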

Robust Matching-Integrated Vehicle Rebalancing in Ride-Hailing System with Uncertain Demand

Article

Jun 2021
TRANSPORT RES B-METH

With the rapid growth of the mobility-on-demand (MoD) market in recent years, ride-hailing companies have become an important element of the urban mobility system. There are two critical components in the operations of ride-hailing companies: driver-customer matching and vehicle rebalancing. In most previous literature, each component is considered separately, and performances of vehicle rebalancing models rely on the accuracy of future demand predictions. To better immunize rebalancing decisions against demand uncertainty, a novel approach, the matching-integrated vehicle rebalancing (MIVR) model, is proposed in this paper to incorporate driver-customer matching into vehicle rebalancing problems to produce better rebalancing strategies. The MIVR model treats the driver-customer matching component at an aggregate level and minimizes a generalized cost including the total vehicle miles traveled (VMT) and the number of unsatisfied requests. For further protection against uncertainty, robust optimization (RO) techniques are introduced to construct a robust version of the MIVR model. Problem-specific uncertainty sets are designed for the robust MIVR model. The proposed MIVR model is tested against two benchmark vehicle rebalancing models using real ride-hailing demand and travel time data from New York City (NYC). The MIVR model is shown to have better performances by reducing customer wait times compared to benchmark models under most scenarios. In addition, the robust MIVR model produces better solutions by planning for demand uncertainty compared to the non-robust (nominal) MIVR model.

Can dynamic ride-sharing reduce traffic congestion?

Article

Feb 2021
TRANSPORT RES B-METH

Can dynamic ride-sharing reduce traffic congestion? In this paper we show that the answer is yes if the trip density is high, which is usually the case in large-scale networks but not in medium-scale networks where opportunities for sharing in time and space become rather limited. When the demand density is high, the dynamic ride-sharing system can significantly improve traffic conditions, especially during peak hours. Sharing can compensate for the extra travel distances related to operating a mobility service. The situation is entirely different in small and medium-scale cities where trip shareability is small, even if the ride-sharing system is fully optimized based on perfect demand prediction in the near future. The reason is simple: mobility services significantly increase the total travel distance, and sharing is simply a means of combating this trend without eliminating it when the trip density is not high enough. This paper proposes a complete framework to represent the functioning of the ride-sharing system and multiple steps to tackle the curse of dimensionality when solving the problem. We address the problem for two city scales in order to compare different trip densities: a city scale of 25 km² with a total market of 11,235 shareable trips for the medium-scale network, and a city scale of 80 km² with 205,308 demands for service vehicles for the large-scale network, over a 4-hour period with a rolling horizon of 20 minutes. The solutions are assessed using a dynamic trip-based macroscopic simulation to account for the congestion effect and dynamic travel times that may influence the optimal solution obtained with predicted travel times. This outperforms most previous studies on optimal fleet management, which usually consider constant and fully deterministic travel time functions.

A Mean-Field Markov Decision Process Model for Spatial-Temporal Subsidies in Ride-Sourcing Markets

Article

Aug 2021
TRANSPORT RES B-METH

Ride-sourcing services are increasingly popular because of their ability to accommodate on-demand travel needs. A critical issue faced by ride-sourcing platforms is the supply-demand imbalance, as a result of which drivers may spend substantial time on idle cruising and picking up remote passengers. Some platforms attempt to mitigate the imbalance by providing relocation guidance for idle drivers, who may have their own self-relocation strategies and decline to follow the suggestions. Platforms then seek to induce drivers to system-desirable locations by offering them subsidies. This paper proposes a mean-field Markov decision process (MF-MDP) model to depict the dynamics in ride-sourcing markets with mixed agents, whereby the platform aims to optimize some objectives from a system perspective using spatial-temporal subsidies with predefined subsidy rates, and a number of drivers aim to maximize their individual income by following certain self-relocation strategies. To solve the model more efficiently, we further develop a representative-agent reinforcement learning algorithm that uses a representative driver to model the decision-making process of multiple drivers. This approach is shown to achieve significant computational advantages, faster convergence, and better performance. Using case studies, we demonstrate that by providing some spatial-temporal subsidies, the platform is able to balance well a short-term objective of maximizing immediate revenue and a long-term objective of maximizing service rate, while drivers can earn higher income.

Problem of data imbalance in building energy load prediction: Concept, influence, and solution

Article

Sep 2021
APPL ENERG

Building energy systems work under a wide range of operation conditions. The data available from some conditions may be far scarcer than the data from others. This is the so-called data imbalance problem, that is, the volumes of data differ across conditions. This problem is always ignored in the field of building energy load prediction. Three questions remain unclear: how to identify various building operation conditions, how this problem affects the prediction accuracy, and how to overcome this problem. To address these three questions, this study first proposes a clustering decision tree algorithm to identify the building operation conditions. Then, the effects of data imbalance are investigated by changing the proportions of model training samples from various operation conditions. Finally, a clustering decision tree-based multi-model prediction method is proposed to solve the data imbalance problem. The one-year historical operational data from a public building are utilized to validate the multi-model method. The results show that the proposed method has better prediction performance than the conventional single model-based method. It decreases the mean absolute errors of energy load prediction using artificial neural networks, gradient boosting trees, random forests, and support vector regression by 9.83%, 6.71%, 1.32%, and 12.22% on average, respectively. In addition, it increases the coefficients of determination of energy load prediction using the four algorithms by 8.47%, 4.59%, 0.26%, and 13.99% on average, respectively.
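
The intuition behind a multi-model approach to data imbalance can be shown with a deliberately simplified Python sketch. It is not the clustering decision tree method of the paper: operation conditions are assumed to be known labels rather than learned clusters, each "model" is just a mean predictor, and the toy loads and all names are hypothetical. The point is only that a single model fitted to imbalanced data is dominated by the majority condition, while one model per condition is not.

```python
from collections import defaultdict
from statistics import mean


def fit_global(samples):
    """Single model: predict the overall mean load for every condition."""
    overall = mean(load for _, load in samples)
    return lambda condition: overall


def fit_per_condition(samples):
    """Multi-model: predict the mean load observed under each condition."""
    groups = defaultdict(list)
    for condition, load in samples:
        groups[condition].append(load)
    means = {condition: mean(loads) for condition, loads in groups.items()}
    return lambda condition: means[condition]


def mae(samples, predict):
    """Mean absolute error of a predictor over (condition, load) samples."""
    return mean(abs(load - predict(condition)) for condition, load in samples)


# Hypothetical imbalanced data: 8 weekday samples, 2 holiday samples.
TOY_SAMPLES = [
    ("weekday", 98.0), ("weekday", 102.0), ("weekday", 100.0), ("weekday", 100.0),
    ("weekday", 99.0), ("weekday", 101.0), ("weekday", 100.0), ("weekday", 100.0),
    ("holiday", 18.0), ("holiday", 22.0),
]

if __name__ == "__main__":
    holiday = [s for s in TOY_SAMPLES if s[0] == "holiday"]
    print("holiday MAE, single model:", mae(holiday, fit_global(TOY_SAMPLES)))
    print("holiday MAE, per-condition models:", mae(holiday, fit_per_condition(TOY_SAMPLES)))
```

In this toy setting the single model's prediction is pulled toward the weekday loads, so its error on the minority holiday condition is much larger than that of the per-condition models.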

Analysis of ride-sharing with service time and detour guarantees

Article

Oct 2020
TRANSPORT RES B-METH

This paper explores whether upper bound guarantees to detour distances can be introduced in ride-sharing services. By ride-sharing we mean taxi ride aggregation services such as Uber-Pool. The paper develops an analytical model that, for a given demand, relates the guarantee levels to (i) the percent of rides that can be matched, (ii) the expected vehicle distance traveled, (iii) the expected passenger distance traveled, (iv) the fleet size required, and (v) the average passenger trip time including waiting and riding. The formulas developed reveal that for the full range of feasible fleet sizes, ride-sharing with detour distance guarantees outperforms both ordinary ride-sharing and ordinary taxi. This suggests that there is a business opportunity that is not currently being exploited.

Dynamic optimization strategies for on-demand ride services platform: Surge pricing, commission rate, and incentives

Article

Aug 2020
TRANSPORT RES B-METH

On-demand ride services reshape urban transportation systems, human mobility, and travelers' mode choice behavior. Compared to the traditional street-hailing taxi, an on-demand ride services platform analyzes ride requests of passengers and coordinates real-time supply and demand with dynamic operational strategies in the ride-sourcing market. To test the impact of dynamic optimization strategies on the ride-sourcing market, this paper proposes a dynamic vacant car-passenger meeting model. In this model, the accumulative arrival rate and departure rate of passengers and vacant cars determine the waiting number of passengers and vacant cars, while the waiting number of passengers and vacant cars in turn influences the meeting rate (which equals the departure rate of both passengers and vacant cars). The departure rate means the rate at which passengers and vacant cars match up and start a paid trip. Compared with classic equilibrium models, this model can be utilized to characterize the influence of short-term variances and disturbances of current demand and supply (i.e., arrival rates of passengers and vacant cars) on the waiting numbers of passengers and vacant cars. Using the proposed meeting model, we optimize dynamic strategies under two objective functions, i.e., platform revenue maximization and social welfare maximization, while the driver's profit is guaranteed above a certain level. We also propose an algorithm based on approximate dynamic programming (ADP) to solve the sequential dynamic optimization problem. The results show that our algorithm can effectively improve the objective function of the multi-period problem, compared with the myopic algorithm. A broader range of surge pricing and commission rate and the introduction of incentives are helpful to achieve better optimization results. The dynamic optimization strategies help the on-demand ride services platform efficiently adjust supply and demand resources and achieve specific optimization goals.
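
The feedback loop in the meeting model described above (arrivals and departures set the waiting numbers, and the waiting numbers set the meeting rate) can be sketched as a discrete-time simulation in Python. The abstract does not give the functional form of the meeting rate, so the Cobb-Douglas form `k * P**alpha * V**beta` used here, its parameter values, and all function names are assumptions made purely for illustration.

```python
def meeting_rate(passengers, cars, k=0.1, alpha=0.5, beta=0.5):
    """Assumed Cobb-Douglas meeting function, capped by what is actually waiting.

    The functional form and parameters are illustrative assumptions,
    not the specification used in the paper.
    """
    raw = k * (passengers ** alpha) * (cars ** beta)
    return min(raw, passengers, cars)


def simulate(passenger_arrivals, car_arrivals, k=0.1, alpha=0.5, beta=0.5):
    """Track waiting passengers and vacant cars over unit-length time steps.

    Each step: new arrivals join the waiting pools, then matched pairs
    depart together (the meeting rate equals the departure rate of both).
    Returns a list of (waiting_passengers, waiting_cars, matched) tuples.
    """
    waiting_passengers, waiting_cars = 0.0, 0.0
    history = []
    for new_passengers, new_cars in zip(passenger_arrivals, car_arrivals):
        waiting_passengers += new_passengers
        waiting_cars += new_cars
        matched = meeting_rate(waiting_passengers, waiting_cars, k, alpha, beta)
        waiting_passengers -= matched
        waiting_cars -= matched
        history.append((waiting_passengers, waiting_cars, matched))
    return history


if __name__ == "__main__":
    # Hypothetical arrivals: demand outpaces supply in every period.
    for step, state in enumerate(simulate([5, 5, 5], [2, 2, 2], k=1.0)):
        print(step, state)
```

Because matched passengers and cars leave in pairs, the total number of arrivals on each side always equals the number still waiting plus the number matched, which is a quick sanity check for any variant of this sketch.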

Spatiotemporal influence of land use and household properties on automobile travel demand

Article

Jul 2020
TRANSPORT RES D-TR E

Understanding the patterns of automobile travel demand can help formulate policies to alleviate congestion and pollution. This study focuses on the influence of land use and household properties on automobile travel demand. Car license plate recognition (CLPR) data, point-of-interest (POI) data, and housing information data were utilized to obtain automobile travel demand along with the land use and household properties. A geographically and temporally weighted regression (GTWR) model was adopted to deal with both the spatial and temporal heterogeneity of travel demand. The spatial-temporal patterns of GTWR coefficients were analyzed. Also, comparative analyses were carried out between automobile and total person travel demand, and among travel demand of taxis, heavily-used private cars, and total automobiles. The results show that: (I) The GTWR model has significantly higher accuracy compared with the Ordinary Least Square (OLS) model and the Geographically Weighted Regression (GWR) model, which means the GTWR model can measure both the spatial and temporal heterogeneity with high precision; (II) The influence of built environment and household properties on automobile travel demand varies with space and time. In particular, the temporal distribution of regression coefficients shows significant peak phenomenon; and (III) Comparative analyses indicate that residents’ preference for automobiles over other travel modes varies with their travel purpose and destination. The above findings indicate that the proposed method can not only model spatial-temporal heterogeneous travel demand, but also provide a way to analyze the patterns of automobile travel demand.

Short-Term Traffic Forecasting by Mining the Non-Stationarity of Spatiotemporal Patterns

Article

May 2020

Short-term traffic forecasting is important for the development of an intelligent traffic management system. Critical to the performance of the traffic prediction model utilized in such a system is accurate representation of the spatiotemporal traffic characteristics. This can be achieved by integrating spatiotemporal traffic information or the dynamic traffic characteristics in the modeling process. The currently employed spatiotemporal k-nearest neighbor (STKNN) model is based on the spatial heterogeneity and adaptive spatiotemporal parameters of the traffic to improve the prediction accuracy. However, the non-stationary characteristics of the traffic cannot be fully represented by simply modeling the entire time range or all the time partitions based on experience. We therefore developed a dynamic STKNN model (D-STKNN) for short-term traffic forecasting based on the non-stationary spatiotemporal pattern of the road traffic. The different traffic patterns along the road are first automatically determined using an affinity propagation clustering algorithm. The Warped K-Means algorithm is then used to automatically partition the time periods for each traffic pattern. Finally, the D-STKNN model is developed based on the three-dimensional spatiotemporal tensor data models for the different road segments with different traffic patterns during different time periods. The D-STKNN model was verified through extensive experiments performed using actual vehicular speed datasets collected from city roads in Beijing, China, and expressways in California, U.S.A. The proposed model outperforms seven existing baselines in different time periods under different traffic patterns. The results confirm the importance of considering the non-stationary spatiotemporal traffic pattern in developing a model for short-term traffic prediction.
