Dynamic Demand Prediction for Expanding Electric Vehicle Sharing Systems: A Graph Sequence Learning Approach

Man Luo¹, Hongkai Wen¹, Yi Luo², Bowen Du¹, Konstantin Klemmer¹ and Hongming Zhu²

¹ Department of Computer Science, University of Warwick, UK

² School of Software Engineering, Tongji University, China

{m.luo.1, hongkai.wen, b.du, k.klemmer}@warwick.ac.uk, {1731530, zhu_hongming}@tongji.edu.cn

ABSTRACT

Electric Vehicle (EV) sharing systems have recently experienced unprecedented growth across the globe. Many car sharing service providers as well as automobile manufacturers are entering this competition by expanding both their EV fleets and renting/returning station networks, aiming to seize a share of the market and bring car sharing to the zero emissions level. During their fast expansion, one fundamental determinant for success is the capability of dynamically predicting the demand of stations as the entire system is evolving continuously. There are several challenges in this dynamic demand prediction problem. Firstly, unlike most of the existing work which predicts demand only for static systems or at a few stages of expansion, in the real world we often need to predict the demand as or even before stations are being deployed or closed, to provide information and support for decision making. Secondly, for the stations to be deployed, there is no historical record or additional mobility data available to help the prediction of their demand. Finally, the impact of deploying/closing stations on the remaining stations in the system can be very complex. To address these challenges, in this paper we propose a novel dynamic demand prediction approach based on graph sequence learning, which is able to model the dynamics during the system expansion and predict demand accordingly. We use a local temporal encoding process to handle the available historical data at individual stations, and a dynamic spatial encoding process to take correlations between stations into account with graph convolutional neural networks. The encoded features are fed to a multi-scale prediction network, which forecasts both the long-term expected demand of the stations and their instant demand in the near future. We evaluate the proposed approach on real-world data collected from a major EV sharing platform in Shanghai for one year. Experimental results demonstrate that our approach significantly outperforms the state of the art, showing up to three-fold performance gain in predicting demand for the rapidly expanding EV sharing system.

KEYWORDS

Electric Vehicle Sharing; Dynamic Demand Prediction; Expansion

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

KDD’19, August 3-7, 2019, Anchorage, Alaska, USA

ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. . . $15.00

https://doi.org/10.1145/nnnnnnn.nnnnnnn

ACM Reference Format:

Man Luo, Hongkai Wen, Yi Luo, Bowen Du, Konstantin Klemmer and Hongming Zhu. 2019. Dynamic Demand Prediction for Expanding Electric Vehicle Sharing Systems: A Graph Sequence Learning Approach. In Proceedings of KDD’19. ACM, New York, NY, USA, Article 4, 9 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

1 INTRODUCTION

Car sharing services have long been recognised as an environmentally friendly mobility option, reducing vehicles on the road while cutting out unnecessary CO₂ emissions. With the recent advances in battery technologies, a new generation of car sharing services is going one step further, by offering full electric vehicle (EV) fleets with fast expanding infrastructures in major cities, e.g. Bluecity¹ in London, WeShare² in Berlin, and BlueSG³ in Singapore. Traditional car sharing providers have also started to populate their EV fleets, e.g., ZipCar seeks to provide over 9,000 full electric vehicles across London by 2025⁴. According to a recent study [?], the global market of EV sharing services is poised for much faster growth in the near future, due to the incentives and regulations put in place by governments across the world to encourage overall EV usage.

Despite their increased popularity, the practicality and utility of EV sharing systems still rely heavily on the infrastructure at renting/returning stations. In particular, for systems that need to rapidly expand their station networks, it is paramount to be able to dynamically and accurately predict demand as, or even before, any expansion strategy is implemented. This is not only key for stakeholders to make informed decisions as to where and when to deploy new stations or close poorly performing ones, but also of great importance to the effective operation of currently used stations, since understanding the potential impact of a proposed expansion on their demand can provide valuable insights for a number of vital tasks such as scheduling and rebalancing.

However, this dynamic demand prediction problem is not trivial, especially in the context of fast expanding EV sharing systems. Most of the existing work on demand prediction [?] assumes the stations in the system are static, or only predicts demand after fixed expansion stages [?]. These assumptions often collapse in the real world. Fig. 1(a)-(c) visualise the expansion process of a major EV sharing platform in Shanghai during 2017. We see that in the beginning stations are scattered within limited areas, while at the end of 2017 the entire city has been densely covered. As shown in Fig. 1(d), within just 12 months the total number of stations in operation has doubled, as each month there are continuously hundreds of stations being deployed. In this context, predicting demand at those newly deployed or to be deployed stations is very challenging, since there is no sufficient historical data available as prior knowledge.

¹ https://www.blue-city.co.uk

² https://www.volkswagenag.com/en/news/2018/08/VW_Brand_We_Share.html

³ https://www.bluesg.com.sg

⁴ https://www.zipcar.co.uk/electric

arXiv:1903.04051v1 [cs.AI] 10 Mar 2019

Figure 1: The expansion process of an EV sharing system in Shanghai during the year 2017. Images better viewed in colour. Recoverable panels: (a) station distribution in Jan.; (b) station distribution in Jul.; (d) statistics of stations (monthly number of stations in operation, closed stations, newly deployed stations and net increase).

Figure 2: Different types of impact when deploying new stations to the current station network. Panels (a) and (c) show maps marking the new station; panels (b) and (d) plot the number of orders per month for stations A, B, C (Apr to Aug) and E, F, G (Feb to Jun) together with the new station.

On the other hand, dynamics introduced by the expansion process can have a very complex impact on the entire system. For example, as shown in Fig. 2, deploying stations at various places may have completely different consequences. The new station in Fig. 2(a) clearly ‘steals’ demand from one of its neighbours in the following months (station A, see the changes of their order numbers in Fig. 2(b)), because we found the new station was deployed at a shopping centre which may attract more users. In contrast, deploying the new station in Fig. 2(c) has increased orders of its neighbour stations collectively (see Fig. 2(d)). This is because the new station was deployed at the terminal on the east side of the airport, and many users rent vehicles for convenient short-range connections to/from stations E, F, and G which are on the west side. This makes accurate demand prediction for the remaining stations also very challenging in the presence of such dynamics, due to the non-trivial impact caused by the expansion.

To address those challenges, in this paper we propose a novel dynamic demand prediction approach, which models the expansion of EV sharing systems using graph-based sequence learning, and is able to accurately predict the demand of stations along with the expansion process. Specifically, for each station that comes into operation, we employ a local temporal encoding process to capture the correlations within the historical data. The extracted features from all stations are then compiled by a dynamic spatial encoding process, which considers the spatial dependencies between them as multiple time-varying graphs, and fuses the station-level features with Graph Convolutional Neural Networks (GCN). Based on the encoded information and the future expansion plan (i.e. which stations are to be deployed/closed), our prediction network predicts station demand at multiple scales, from the instant demand in the immediate near future, to the long term expected demand, for both stations to be deployed and the ones remaining. Hence, the technical contributions of this paper are as follows:

• To the best of our knowledge, this is the first work that identifies and formulates the dynamic demand prediction problem in expanding electric vehicle sharing systems.

• We propose a novel graph sequence learning approach, which employs temporal and spatial encoding in tandem to model the complex dynamics of the continuous system expansion.

• We design a new multi-scale prediction network, which is able to forecast not only the expected demand of stations in the long term, but also the instant future demand in subsequent timestamps.

• We evaluate the proposed approach on real-world data collected from a major EV sharing platform for one year. Extensive experiments have shown that our approach significantly outperforms the state of the art, offering up to three-fold improvement in prediction accuracy.

2 PROBLEM FORMULATION

In this section, we first introduce some key concepts used throughout the paper, then we formulate the problem of dynamic demand prediction and provide an overview of the proposed framework.

2.1 Preliminaries

EV Stations: Let $s_i$ be a station in the Electric Vehicle (EV) sharing system. In this paper, we assume $s_i$ can be represented as a tuple $(x_i, m_i)$, where $x_i$ are the coordinates (e.g. latitude and longitude) of $s_i$, and $m_i$ is the number of charging docks within $s_i$. We also assume that for a given $s_i$, we can extract a number of geospatial features based on its location $x_i$, such as nearby Points of Interest (POI) or the distribution of road networks within a certain radius.

Instant Station Demand: We define the instant demand of station $s_i$ at timestamp $t$ as the rent/return frequency of $s_i$ when it is available, denoted as $d_i(t)$. In this paper the granularity of timestamp $t$ is days, i.e., we focus on daily station demand, but it is straightforward to adopt other time granularity levels in our framework.

Expected Station Demand: For a station $s_i$, the expected demand over a period $[t_s, t_e]$ can be defined as the mean $\bar{d}_i(t_s, t_e) = |t_e - t_s|^{-1} \sum_{t=t_s}^{t_e} d_i(t)$. In practice, we often consider the expected demand from current time $t$ towards the future, and aggregate it according to some index, e.g., days of the week. Without loss of generality, in this paper we denote the future expected demand of station $s_i$ as $\bar{d}_i = [\bar{d}_i^{Mo}, \bar{d}_i^{Tu}, \ldots, \bar{d}_i^{Su}]$ for different days of the week.
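The day-of-week aggregation above can be illustrated with a minimal sketch. The function names, the start date and the demand values are hypothetical, and the sketch divides by the number of observed days, which is an assumption about how the normalising term is applied in practice.

```python
from datetime import date, timedelta


def expected_demand(daily):
    """Mean of a list of daily demand values (divides by the number of days)."""
    return sum(daily) / len(daily)


def expected_demand_by_weekday(start, daily):
    """Average demand per weekday, returned as a list indexed Mon=0 ... Sun=6.

    Weekdays that never occur in the window are reported as None.
    """
    totals = [0.0] * 7
    counts = [0] * 7
    for offset, demand in enumerate(daily):
        weekday = (start + timedelta(days=offset)).weekday()
        totals[weekday] += demand
        counts[weekday] += 1
    return [totals[w] / counts[w] if counts[w] else None for w in range(7)]


# Two weeks of hypothetical daily demand, starting on Monday 2 January 2017.
demand = [10, 12, 11, 13, 15, 30, 28, 12, 14, 11, 13, 17, 34, 32]
overall_mean = expected_demand(demand)
weekly_profile = expected_demand_by_weekday(date(2017, 1, 2), demand)
```

Here `weekly_profile` plays the role of the seven-entry expected demand vector, while `overall_mean` corresponds to the plain mean over the whole period.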

Station Network: The stations of the EV sharing system can be modelled as a graph $G = (S, A)$, where the nodes $s_i \in S$ are stations as defined above. An edge $a_{ij} \in A$ may encode a certain type of correlation between two stations $s_i$ and $s_j$, e.g., the spatial distance between them, or similarity between their POI/road network features. Sec. 3 will discuss in more detail how our approach constructs multiple graphs to capture such inter-station relationships.

Station Network Dynamics: Unlike existing work, in this paper we assume the station network is continuously evolving over time. More specifically, let $G_{t-1} = (S_{t-1}, A_{t-1})$ represent the station network at time $t-1$. We assume at time $t-1$, there is an expansion plan to be implemented before time $t$, which shall expand the current station network from $G_{t-1}$ to the planned network $G^P_{t-1}$. Assume that during this expansion a set of new stations $S^+$ will be deployed, while existing stations $S^-$ will be removed. If the expansion plan goes through, then at time $t$ the station network becomes $G_t = G^P_{t-1}$, where $G_t = (S_t, A_t)$,

$$S_t = (S_{t-1} - S^-) \cup S^+$$

and

$$A_t = (A_{t-1} - \{a_{ij} \mid s_i \in S^- \text{ or } s_j \in S^-\}) \cup \{a_{ij} \mid s_i \in S^+ \text{ or } s_j \in S^+\}.$$
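The set update for the station network can be sketched as follows. This is an illustration only: station identifiers, the `weight` callback and the use of undirected edges (stored as `frozenset` pairs) are assumptions made for the example, not details given in the text.

```python
def expand_network(stations, edges, closed, new, weight):
    """Return (S_t, A_t) after closing `closed` and deploying `new` stations.

    stations: set of station ids (S_{t-1})
    edges:    dict mapping frozenset({i, j}) -> weight (A_{t-1})
    weight:   callable (i, j) -> edge weight for pairs touching a new station
    """
    next_stations = (stations - closed) | new
    # Drop every edge that touches a closed station.
    next_edges = {pair: w for pair, w in edges.items() if not (pair & closed)}
    # Add edges between each new station and every other remaining station.
    for i in new:
        for j in next_stations:
            if i != j:
                next_edges[frozenset((i, j))] = weight(i, j)
    return next_stations, next_edges


S_prev = {"A", "B", "C"}
A_prev = {
    frozenset(("A", "B")): 0.5,
    frozenset(("A", "C")): 0.2,
    frozenset(("B", "C")): 0.8,
}
# Hypothetical plan: close station C, deploy station N with unit edge weights.
S_t, A_t = expand_network(S_prev, A_prev, closed={"C"}, new={"N"},
                          weight=lambda i, j: 1.0)
```

After the call, every edge touching the closed station is gone and the new station is connected to each remaining station, mirroring the two set expressions above.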

2.2 Dynamic Demand Prediction Problem

Suppose that at time $t$, we have the previous topology $G_1, \ldots, G_t$ and demand $D_1, \ldots, D_t$ of the station network, where $D_t = \{d_i(t) \mid s_i \in G_t\}$. Let $G^P_t$ be the planned station network. The dynamic demand prediction problem tackled in this paper is that given the historical data, for an arbitrary station in the planned network $s_i \in G^P_t$ (deployed or not yet deployed) we aim to estimate both its expected future demand $\hat{\bar{d}}_i$ and the subsequent $k$ instant demand values $[\hat{d}_i(t+1), \hat{d}_i(t+2), \ldots, \hat{d}_i(t+k)]$, which minimise the mean square errors with respect to the ground truth $\bar{d}_i$ and $d_i$:

$$\delta_{\bar{d}_i} = |\bar{d}_i|^{-1} \, \|\hat{\bar{d}}_i - \bar{d}_i\|^2, \quad \text{and} \quad \delta_{d_i} = k^{-1} \sum_{\tau=t+1}^{t+k} \|\hat{d}_i(\tau) - d_i(\tau)\|^2 \tag{1}$$
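As a small worked example of the two error terms in Eq. (1), the sketch below computes a mean squared error for a weekly expected-demand vector and for a short window of instant demand. All numbers and variable names are hypothetical.

```python
def mean_squared_error(predicted, actual):
    """Average of squared differences between two equally long sequences."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# Expected demand over the seven days of the week (Mon ... Sun).
weekly_true = [11, 13, 11, 13, 16, 32, 30]
weekly_pred = [12, 13, 10, 13, 18, 30, 30]
delta_expected = mean_squared_error(weekly_pred, weekly_true)

# Instant demand over a prediction window of k = 3 days.
instant_true = [14, 9, 20]
instant_pred = [12, 10, 20]
delta_instant = mean_squared_error(instant_pred, instant_true)
```

The first value averages over the seven entries of the weekly vector, the second over the $k$ future timestamps.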

In practice, the expected demand $\bar{d}_i$ can be viewed as a metric for the long-term performance of station $s_i$, e.g., if $s_i$ is a station to be deployed, $\bar{d}_i$ quantifies the average level of demand it may be able to attract. On the other hand, the sequence of instant demand $[\hat{d}_i(t+1), \hat{d}_i(t+2), \ldots, \hat{d}_i(t+k)]$ describes the immediate trend of station demand under the impact of the expansion plan, which can help to optimise key future operation strategies such as marketing and resource allocation.

Figure 3: Overview of the proposed framework for dynamic demand prediction. Local Temporal Encoding: LSTMs at individual stations over historical demand, days of week, weather condition, and events/holidays, with local features (number of charging docks, local POI, local road network). Dynamic Spatial Encoding: multi-graph construction from pairwise distance, POI distribution similarity and road network similarity, followed by a dynamic GCN. Multi-scale Demand Prediction: context generation, a fully connected network for expected demand, and a decoder LSTM with attention for instant demand.

2.3 Framework Overview

Fig. 3 shows the overview of the proposed framework for dynamic demand prediction, which consists of three major components:

Local Temporal Encoding: During the life cycle of a station $s_i$ (from being deployed to shut down), its demand can be viewed as a time series, where the current demand $d_i(t)$ should correlate with the local historical demand $d_i(t-1), \ldots$. In addition, there may exist other temporal factors that can influence the demand of individual stations, such as weather conditions, air pollution levels, days of the week and public holidays. To model such temporal dependencies, we assign a Long Short-Term Memory (LSTM) network to each individual station when it is deployed, and use these networks to encode local temporal information at station level.

Dynamic Spatial Encoding: Intuitively, the demand of a station $s_i$ should also be affected by the others in the station network. To capture the spatial correlations, at each time $t$ we construct multiple graphs to encode different spatial relationships between the stations, e.g., inter-station distances, POI similarity, and road network distributions. Then we use graph convolutional neural networks (GCN) to fuse those graphs and encode the previously computed local features of individual stations. In particular, as the station network is evolving over time, we consider a dynamic version of GCN which is able to process such time-varying graphs.

Multi-scale Demand Prediction: Based on the results of the above temporal and spatial encoding, we aim to predict both the expected demand and subsequent instant demand of stations after the planned expansion. To achieve that, we design a multi-scale prediction network, which firstly compiles the previously learned features into a context vector. For expected demand, it uses a fully connected branch to perform the prediction, while for instant demand it uses a decoder LSTM network with attention mechanism to forecast demand at multiple future timestamps.

We are now in a position to elaborate the proposed dynamic demand prediction approach in more detail.

Figure 4: The workflow of the proposed dynamic demand prediction approach. At each timestamp ($t-2$, $t-1$, $t$), local LSTMs encode existing and planned stations together with additional information; the multi-graph dynamic GCN fuses these features over the distance, POI and road network graphs; the multi-scale prediction networks (fully connected network and decoder LSTM) output the expected demand and the instant demand at $t+1$, $t+2$, $t+3$.

3 METHODOLOGY

3.1 Local Temporal Encoding

As in many other shared mobility systems, we observe that the demand of stations in the EV sharing platform exhibits strong temporal correlations, as shown later in Fig. 6(b). For instance, although it fluctuates largely over time, the demand at a station follows certain periodic patterns across the days of the week. In that sense, exploiting such knowledge can help significantly in accurately estimating the future demand of current stations, which will have a positive knock-on effect when predicting demand for new stations during expansion. However, those demand patterns are typically influenced by multiple complex factors such as weather, air quality and events, and individual stations may react to those factors very differently. Therefore, it is often not optimal to only incorporate the temporal information globally for the station network; instead, in this paper we model such microdynamics at station level.

Concretely, when a station $s_i$ is deployed, we instantiate a local LSTM network which keeps processing its demand records and the additional temporal information available, e.g. weather, days of the week and public holidays/events. In our implementation, we train the LSTMs with shared weights across stations. Then at a later time $t$ the LSTM encodes the station’s historical demand $d_i(t), d_i(t-1), \ldots$ as well as the auxiliary information into a temporal feature vector $f_i(t)$. Moreover, in this paper we also condition $f_i(t)$ with a static station feature $c_i$, which describes key attributes of $s_i$ such as its number of available charging docks $m_i$, nearby POIs and environmental characteristics. Therefore, $f_i(t)$ and $c_i$ carry important local information about individual stations since they started operating, which are then passed on as the input for spatial encoding. Fig. 4 shows the workflow of the proposed approach, where we see that at each timestamp we maintain a collection of local LSTMs to encode information of individual stations.

3.2 Dynamic Spatial Encoding

3.2.1 Constructing Multiple Graphs. As discussed in Sec. 2.1, at a given time $t$ we represent the station network as a graph $G_t = (S_t, A_t)$, where $S_t$ is the set of current stations and $A_t$ is the adjacency matrix describing the pairwise correlations between them. In practice there is often more than one type of correlation, which cannot be effectively captured by a single graph. Therefore in this paper we construct multiple graphs to encode the complex inter-station relationships, particularly the distance graph, the functional similarity graph, and the road accessibility graph (see Fig. 4).

Distance: In most cases, we observe that the demand of stations close to each other is highly correlated, e.g. they may be deployed around the same shopping centre, and thus tend to be used interchangeably. We capture such correlations with a distance graph $A^D$, whose elements are the reciprocal of station distance:

$$a^D_{ij} = \|x_i - x_j\|_2^{-1} \tag{2}$$

where $x_i$, $x_j$ are the station coordinates, and $\|\cdot\|_2$ is the Euclidean distance. We also set $\mathrm{diag}(A^D)$ to 1 to include self loops.
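A minimal sketch of the distance graph in Eq. (2) is given below. It assumes planar coordinates with distinct locations (co-located stations would cause a division by zero); raw latitude/longitude pairs would first need a projection or a geodesic distance, which is outside this illustration. The coordinates are hypothetical.

```python
import math


def distance_graph(coords):
    """Reciprocal-distance adjacency matrix with self loops set to 1."""
    n = len(coords)
    graph = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                graph[i][j] = 1.0  # self loop on the diagonal
            else:
                graph[i][j] = 1.0 / math.dist(coords[i], coords[j])
    return graph


# Hypothetical planar station coordinates, in kilometres.
coords = [(0.0, 0.0), (3.0, 4.0), (0.0, 2.0)]
A_D = distance_graph(coords)
```

Stations that are far apart receive weights close to zero, which is what later makes the graph convolution behave locally.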

Functional Similarity: Intuitively, stations deployed in areas with similar functionalities should share comparable demand patterns. For instance, stations close to university campuses typically have significantly higher demand during weekends. We characterise the functionalities of stations by considering the distributions of their surrounding POIs. Suppose we have a fixed number of POI categories in total, and let $p_i$ be the distribution of these POI types within a certain radius of station $s_i$. The functional similarity graph $A^F$ is then defined as:

$$a^F_{ij} = \mathrm{sim}(p_i, p_j) \tag{3}$$

where $\mathrm{sim}() \in [0, 1]$ is a similarity measure which quantifies the distance between feature vectors. In our experiments, we use the soft cosine function.
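The similarity graph of Eq. (3) can be sketched as below. Note the substitution: the text uses a soft cosine function, whereas this example uses the plain cosine similarity as a simpler stand-in. The POI distributions are hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Plain cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def similarity_graph(features):
    """Pairwise similarity matrix over a list of per-station feature vectors."""
    return [[cosine_similarity(p, q) for q in features] for p in features]


# Hypothetical POI category distributions (three categories per station).
poi = [
    [0.6, 0.3, 0.1],  # station 0
    [0.6, 0.3, 0.1],  # station 1: same POI mix as station 0
    [0.0, 0.0, 1.0],  # station 2: dominated by the third category
]
A_F = similarity_graph(poi)
```

For non-negative feature vectors such as POI distributions, the plain cosine similarity stays within $[0, 1]$. The same helper could be applied to any other per-station feature vector.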

Road Accessibility: Another important factor that affects station demand is the accessibility to road networks. Intuitively, stations close to major ring roads, or within areas that have densely connected streets would have higher demand. To model this, we consider the drivable streets in the vicinity of a station $s_i$ as a local road network, containing different types of road segments and their junctions. We extract a feature vector $r_i$ from the local road network, which encodes information such as the road segment density, average junction degree and mean centrality. Given those features, the road accessibility graph $A^R$ can be defined with a certain similarity function $\mathrm{sim}()$:

$$a^R_{ij} = \mathrm{sim}(r_i, r_j) \tag{4}$$

3.2.2 Dynamic Multi-graph Convolution. At time $t-1$, given the constructed graphs $\mathcal{A}_{t-1} = \{A^D_{t-1}, A^F_{t-1}, A^R_{t-1}\}$ which describe the inter-station relationships, we propose a dynamic multi-graph GCN to fuse such spatial knowledge with local features $f_i(t-1)$ and $c_i$ computed by the station-level temporal encoding. We perform multi-graph convolution as follows:

$$H^{(l)}_{t-1} = \sigma\Big( \sum_{A_{t-1} \in \mathcal{A}_{t-1}} f(A_{t-1}) \, H^{(l-1)}_{t-1} \, W^{(l-1)}_{t-1} \Big) \tag{5}$$

where $H^{(l-1)}_{t-1}$ and $H^{(l)}_{t-1}$ are the hidden features in layers $l-1$ and $l$ respectively, while $W^{(l-1)}_{t-1} \in \mathbb{R}^{U_{l-1} \times U_l}$ is the feature transformation matrix learned through end-to-end training. In particular, the input $H^{(0)}_{t-1}$ is the collection of local features computed at individual stations. $f(A_{t-1})$ is a function on graphs $A_{t-1}$, e.g. the symmetric normalized Laplacian [?] or a polynomial function of the Laplacian of a given order [?], and $\sigma$ is a non-linear activation function such as ReLU.
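One layer of the multi-graph convolution in Eq. (5) can be sketched in plain Python as follows. The choice of $f$ is an assumption: the example uses the symmetrically normalised adjacency $D^{-1/2} A D^{-1/2}$, one common option, and the matrices and weights are hypothetical toy values rather than trained parameters.

```python
import math


def matmul(a, b):
    """Dense matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalise(adj):
    """Symmetric normalisation D^{-1/2} A D^{-1/2}; zero-degree nodes stay zero."""
    inv = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in (sum(row) for row in adj)]
    n = len(adj)
    return [[inv[i] * adj[i][j] * inv[j] for j in range(n)] for i in range(n)]


def multi_graph_conv(graphs, hidden, weights):
    """One layer of Eq. (5): ReLU(sum over graphs of f(A) H W)."""
    hw = matmul(hidden, weights)  # W is shared by all graphs, so compute H W once
    rows, cols = len(hw), len(hw[0])
    total = [[0.0] * cols for _ in range(rows)]
    for adj in graphs:
        out = matmul(normalise(adj), hw)
        for i in range(rows):
            for j in range(cols):
                total[i][j] += out[i][j]
    return [[max(0.0, v) for v in row] for row in total]


# Two stations, two toy graphs, two input features, one output feature.
graphs = [
    [[1.0, 0.0], [0.0, 1.0]],  # self loops only
    [[1.0, 1.0], [1.0, 1.0]],  # fully connected
]
H = [[1.0, 2.0], [3.0, 4.0]]
W = [[1.0], [0.5]]
H_next = multi_graph_conv(graphs, H, W)
```

With these inputs the first graph passes each station's own transformed feature through unchanged, while the second averages over both stations, so the layer output mixes local and neighbourhood information.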

As discussed before, in our case the station network evolves over time, i.e. stations can be opened or closed at any time. For simplicity, suppose that at time $t$ only one new station has been deployed. To capture that, we recalculate the inter-station graphs $\mathcal{A}_{t-1}$ by appending new rows and columns to them, where the new graphs $\mathcal{A}_t$ contain pairwise correlations between the new station and each existing station. Note that the GCN input also changes, i.e. now $H^{(0)}_t$ has an extra feature for this newly deployed station, computed by the local encoding process.

On the other hand, let $s_j$ be the station closed at time $t$. In our implementation, instead of removing elements from the graphs, we simply apply a mask of zeros to the corresponding rows and columns of $\mathcal{A}_t$, and set the $j$-th row of the input $H^{(0)}_t$ to zeros since there won’t be local features generated from $s_j$ anymore. The intuition is that in our graph representation, $a_{:,j} = 0$ means station $s_j$ has no correlation with any other station at all, and thus won’t propagate information in the graph convolution. In addition, note that although $f(A_t)$ produces filters with the same size as the feature $H^{(l)}$ at each layer $l$, Eq. (5) can still be viewed as a local convolution given the graphs $\mathcal{A}_t$. The reason is that by definition many elements in $A_t$ are near zero (e.g. in the distance graph $A^D$), i.e. a given station will only be affected by features of stations with sufficiently high correlations (large non-zero elements in $A_t$) with it. Conceptually, the dynamic GCN operates on snapshots of the inter-station graphs which are constructed on-the-fly, and fuses the local temporal features at individual stations with the spatial dependencies encoded in those graphs.
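The two graph updates described in this subsection, appending a row and column for a newly deployed station and masking the row and column of a closed one, can be sketched as below. Function names and values are hypothetical, and the sketch assumes symmetric graphs stored as lists of rows.

```python
def append_station(adj, weights_to_existing, self_weight=1.0):
    """Append a row and a column for a newly deployed station."""
    grown = [list(row) + [w] for row, w in zip(adj, weights_to_existing)]
    grown.append(list(weights_to_existing) + [self_weight])
    return grown


def mask_station(adj, features, j):
    """Zero row and column j of the graph, and row j of the feature matrix."""
    n = len(adj)
    masked_adj = [
        [0.0 if (r == j or c == j) else adj[r][c] for c in range(n)]
        for r in range(n)
    ]
    masked_feat = [
        [0.0] * len(row) if r == j else list(row)
        for r, row in enumerate(features)
    ]
    return masked_adj, masked_feat


# Two existing stations; a third one is deployed, then station 0 is closed.
adj = [[1.0, 0.5], [0.5, 1.0]]
grown = append_station(adj, [0.2, 0.4])
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
masked_adj, masked_feat = mask_station(grown, feats, 0)
```

Keeping the matrix size fixed and zeroing entries means station indices stay stable over time, which is convenient when features from several timestamps are later combined.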

3.3 Multi-scale Demand Prediction

As discussed in Sec. 2.2, the dynamic prediction problem addressed in this paper is to forecast the future demand of arbitrary EV sharing stations under the planned expansion, given the historical data and previous dynamics of the station network. We have shown in the previous sections how we use local LSTMs and GCN to encode the spatial-temporal dynamics of the system, and in this section we explain how to make predictions at multiple scales based on the knowledge extracted from the models. Fig. 5 shows the architecture of the proposed multi-scale demand prediction network.

3.3.1 Predicting Expected Demand. Let $G^P_t$ be the planned station network at time $t$. Without loss of generality, we assume that compared to the current network $G_t$ we will deploy a candidate new station $s_N$, while we close an existing one. The goal is to predict the future demand of stations in $G^P_t$. We process this planned network $G^P_t$ with the same approach as discussed in previous sections. Note that there is no historical data for station $s_N$ since it is not deployed yet, and therefore in local encoding we only construct its static features $c_{s_N}$, while keeping $f_{s_N}(t)$ as zeros. Then we apply the same update to the inter-station graphs as discussed in Sec. 3.2.2 (adding and masking the corresponding rows/columns), and pass the new input $H^{(0)\prime}$ (containing features of $s_N$) through the multi-graph GCN, producing an output $H'$. We consider this $H'$ as the context for prediction, since it encodes both historical data of existing stations and information on the new candidate station $s_N$, together with their spatial correlations.

Figure 5: The proposed multi-scale demand prediction network. Left: Decoder LSTM with attention mechanism for instant demand prediction. Right: Fully connected network for expected demand prediction. Diagram labels: sequence of encoded features, attention, decoder LSTM, sequence of instant demand, FC network, expected demand.

In this paper, we consider the expected demand of station $s_i$ over different days of the week, i.e. $\bar{d}_i = [\bar{d}_i^{Mo}, \bar{d}_i^{Tu}, \ldots, \bar{d}_i^{Su}]$. To predict $\bar{d}_i$, we plug in a fully connected network to the context vector $H'$, which is trained to output the future expected demand for each station in the network $G^P_t$. For the station $s_N$, the predicted expected demand of itself and nearby stations indicates the potential benefits of deploying $s_N$ to the current station network. In Sec. 4.3 we will show that in real-world experiments our approach significantly outperforms the existing techniques in prediction accuracy.

3.3.2 Predicting Instant Demand. We also predict the future instant demand of stations in $G^P_t$ over a certain time window $[t+1, \ldots, t+k]$. This is also of great importance in practice, especially for the station $s_N$, since it forecasts the immediate impact and future trends of the station network once $s_N$ is in operation. However, it is more challenging than predicting the expected demand, because essentially for each station we need to predict a sequence of $k$ concrete demand values instead of the aggregated values.

To address that, we design a decoder LSTM network with atten-

tion architecture, which takes the sequence of features computed

by the dynamic multi-graph GCN as input, and estimates the fu-

ture

instant demand. In this case, conceptually the prediction

framework becomes an encoder-decoder architecture, where the

processes of local temporal encoding and dynamic spatial encod-

ing serve together as the encoder. Let

[Ht−n, .. ., Ht−1,H′

be the

sequence of features generated by our GCN. Unlike in the previ-

ous case where we only consider the last output feature

H′

as the

context for prediction, here for each timestamp

in the prediction

KDD’19, Auguest 3 - 7, 2019, Anchorage, Alaska, USA M. Luo et al.

window $[t+1, \ldots, t+k]$, we construct the context by fusing the feature sequence with an attention mechanism:

$$C_u = \sum_{v=t-n}^{t} \alpha_{uv} H_v \qquad (6)$$

where $\alpha_{uv}$ are the attention weights determining the contribution of a feature $H_v$ ($v \in [t-n, t]$) in predicting the demand at time $t+u$. Those weights $\alpha_{uv}$ are trained through back propagation in the end-to-end optimisation. Then the decoder LSTM consumes the context vectors and predicts the $k$ subsequent future demand values. We found in our experiments that the attention mechanism is very helpful, since the station demand patterns tend to have strong periodic components, e.g., demand on this Monday is highly correlated with previous Mondays, and a single context vector is too compressed to encode such correlation.
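The context fusion in Eq. (6) can be illustrated with a minimal sketch. The paper only states that the attention weights are learned by back propagation; the softmax normalisation and the raw `scores` input used here are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Normalise raw scores into non-negative weights that sum to one."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(features, scores):
    """Fuse a feature sequence into a single context vector.

    features: list of feature vectors H_v, one per encoder timestamp v
    scores:   one raw score per feature for a fixed prediction step u
              (how scores are produced is an assumption, not from the paper)
    """
    weights = softmax(scores)  # alpha_{uv}, assumed softmax-normalised
    dim = len(features[0])
    context = [0.0] * dim
    for alpha, h in zip(weights, features):
        for j in range(dim):
            context[j] += alpha * h[j]  # C_u = sum_v alpha_{uv} * H_v
    return weights, context


# Toy example: three encoder timestamps with two-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 0.0]  # equal scores give equal attention weights
weights, context = attention_context(features, scores)
```

With equal scores the context is simply the mean of the features; in a trained model the scores would differ per prediction step, so that, for example, a Monday forecast could weight earlier Mondays more heavily.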

4 EVALUATION

In this section, we evaluate the performance of the proposed dynamic demand prediction approach on a real electric vehicle sharing platform in Shanghai, China. We describe the datasets and baseline approaches considered in our experiments (Sec. 4.1 and 4.2), and then discuss the experimental results in Sec. 4.3.

4.1 Datasets

Electric Vehicle (EV) Sharing Data: Our EV data is collected from real-world operational records of an EV sharing platform for one year (Jan. to Dec. 2017), containing information on its renting/returning orders and the detailed expansion process of the station network. In particular, there were 1,705 stations and 4,725 electric vehicles at the beginning of 2017, while as of Dec. 2017 it had 3,127 stations with a fleet of 16,148 vehicles in operation. In total, the raw data contains 6,843,737 records, which were generated by approximately 0.36 million users. Fig. 6(a) visualises the spatial distribution of the orders (represented as lines between pick-up and return stations) in a month. Fig. 6(b) shows the number of orders on different days over a month, which exhibits clear periodic patterns with peaks at weekends.

POI Data: We also collect the Point Of Interest (POI) data from an online map service provider in China. In total we have extracted 4,126,844 POI entries in Shanghai, each of which consists of a GPS coordinate and a category label. In our experiments, for each station we consider the POIs within a 1 km radius. Table 1 shows the statistics of some POI categories.
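As an illustration of the 1 km POI neighbourhood, the sketch below counts POIs per category around a station. The paper does not say how distances are computed; the haversine great-circle distance, the function names and the toy coordinates are all assumptions.

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two GPS coordinates."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def poi_counts_near(station, pois, radius_km=1.0):
    """Count POIs per category within radius_km of a station.

    station: (lat, lon); pois: iterable of (lat, lon, category).
    """
    counts = Counter()
    for lat, lon, category in pois:
        if haversine_km(station[0], station[1], lat, lon) <= radius_km:
            counts[category] += 1
    return counts


# Hypothetical station and POIs (coordinates are made up for illustration).
station = (31.2304, 121.4737)
pois = [
    (31.2310, 121.4740, "Banks"),      # well under 1 km away
    (31.2350, 121.4737, "Hospitals"),  # roughly 0.5 km north
    (31.3000, 121.4737, "Hotels"),     # several km away, excluded
]
counts = poi_counts_near(station, pois)
```

The resulting per-category counts could then serve as a static feature vector for a station, including a planned one that has no demand history.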

Road Network Data: We extract road network data in Shanghai using OSMnx [1] from OpenStreetMap, which is formatted as a graph (visualised in Fig. 6(c)). As with the POIs, we consider the subgraphs within a 1 km radius of the stations. In our data, on average a subgraph contains road segments of length 13.85 km and approximately 39 junctions, with a mean degree of 4.28.

Meteorology Data: Finally, we collect the daily weather data in Shanghai for 2017 from publicly available sources. Each record describes the weather conditions of the day, which fall into four different categories: sunny, overcast/foggy, drizzling/light snow and heavy rain/snow. Fig. 6(d) shows the distribution of weather conditions in Shanghai over the 12 months.

Table 1: Statistics of some POI categories in our data.

POI Type               Number    POI Type            Number
Hospitals               4,745    Banks                2,988
Tourist attractions     2,696    Companies           89,747
Gov. organizations     16,425    Higher education     6,922
Airport services          126    Residences          51,089
Subway stations         1,729    Hotels              18,234
Bus stations           41,475    ...                    ...

4.2 Baselines and Metric

We evaluate two variants of the proposed dynamic demand prediction approach: 1) DDP-Exp, which predicts the future expected demand of stations; and 2) DDP-Seq, which forecasts the instant demand of stations in a subsequent time window. The two variants share the same local temporal and dynamic spatial encoding processes, but implement the two different branches in demand prediction (as discussed in Sec. 3.3).

In particular, we compare our DDP-Exp with the following baselines:

KNN, which uses a linear regressor to predict the expected demand of existing stations. For the planned stations, it estimates their demand with standard KNN, based on the similarity of features (e.g. POIs) between them and the existing stations.

Random Forest (RF), which follows the same idea as KNN, but trains a random forest as the predictor.

Functional Zone (FZ), which implements the state of the art demand prediction approach for system expansion in [12]. Note that we do not have taxi records in our data; instead we directly feed the ground truth check-in/out to favour this approach.
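The KNN step used by the baselines for planned stations can be sketched as follows. The paper only specifies "standard KNN" over feature similarity; the Euclidean distance, the unweighted mean over neighbours and the toy feature vectors are assumptions for illustration.

```python
import math


def knn_predict(query, train_features, train_demand, k=3):
    """Estimate demand for a planned station from its k most similar
    existing stations (Euclidean distance on feature vectors)."""
    ranked = sorted(
        (math.dist(query, feats), demand)
        for feats, demand in zip(train_features, train_demand)
    )
    nearest = ranked[:k]
    return sum(demand for _, demand in nearest) / len(nearest)


# Hypothetical feature vectors (e.g. normalised POI counts) and demands.
train_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]]
train_demand = [2.0, 4.0, 6.0, 100.0]
prediction = knn_predict([0.1, 0.1], train_features, train_demand, k=3)
```

Because the estimate depends only on static features, it cannot reflect how a new station changes demand at its neighbours, which is the effect the proposed approach is designed to capture.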

For DDP-Seq, which computes the instant demand, we consider three competing algorithms:

ARIMA + KNN, which uses Auto-Regressive Integrated Moving Average (ARIMA) [16] to forecast multi-step demand at existing stations, and then uses KNN to estimate demand at new stations based on station features such as POIs.

LSTM + KNN, which is similar to ARIMA + KNN, but trains LSTM networks for temporal modelling.

Multi-graph GCN (MGCN), which implements a framework similar to the state of the art in [ ]. To perform a fair comparison, here we use our dynamic multi-graph GCN implementation, which can handle new/closed stations, and consider the same data sources as in our approach.

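For all approaches, we adopt the Root Mean Squared Error (RMSE) and the Error Rate (ER) as the performance metrics:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(\hat{z}_i - z_i)^2}, \quad \text{and} \quad \mathrm{ER} = \frac{\sum_{i=1}^{N}|\hat{z}_i - z_i|}{\sum_{i=1}^{N} z_i} \qquad (7)$$

where $\hat{z}_i$ and $z_i$ are predicted and ground truth values respectively.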
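A minimal sketch of the two metrics in Eq. (7), assuming the standard definition of RMSE with a 1/N average inside the square root; the function names and the toy values are illustrative only.

```python
import math


def rmse(pred, truth):
    """Root Mean Squared Error over N paired predictions."""
    n = len(truth)
    return math.sqrt(sum((p - z) ** 2 for p, z in zip(pred, truth)) / n)


def error_rate(pred, truth):
    """Sum of absolute errors divided by the total ground truth demand."""
    return sum(abs(p - z) for p, z in zip(pred, truth)) / sum(truth)


# Toy demand values (made up for illustration).
truth = [10.0, 20.0, 30.0]
pred = [12.0, 18.0, 33.0]
toy_rmse = rmse(pred, truth)      # sqrt((4 + 4 + 9) / 3)
toy_er = error_rate(pred, truth)  # (2 + 2 + 3) / 60
```

RMSE is expressed in the units of demand and grows with its absolute level, whereas ER is normalised by the total ground truth, which makes it easier to compare across days with different demand volumes.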

We implement the deep neural networks in the proposed approach with TensorFlow 1.10.0, and use the Adam optimiser with a learning rate of 0.001. The networks are trained from scratch on a single Titan X GPU. For all approaches, we randomly select two months of data for training and use the subsequent month for testing, and report the average performance.

Figure 6: Visualisation of data used in the experiments. (a) Spatial distribution of orders in one month. (b) Number of orders in one month. (c) Road network in Shanghai. (d) Weather distribution of Shanghai in 2017.

Figure 7: Performance on predicting the expected demand. (a) RMSE and (b) ER of all stations across different days in the week.

Figure 8: Performance on predicting the instant demand. (a) RMSE and (b) ER of the competing approaches.

4.3 Evaluation Results

Accuracy of Predicting Expected Demand: The first set of experiments evaluates the overall accuracy when predicting the expected demand of stations. Fig. 7(a) and (b) show the RMSE and ER of the proposed approach (DDP-Exp) and competing algorithms over different days of the week. We see that compared to naive KNN, the random forest based approach (RF) can reduce the RMSE by about 30% and the ER by 20%. However, our approach (DDP-Exp) performs significantly better, and can achieve up to a three-fold improvement in both RMSE and ER. In particular, on average the RMSE of DDP-Exp is approximately 1.961, which means that when predicting a station's expected demand, the value estimated by our approach deviates by only about ±2 from the ground truth. This confirms that the proposed approach can effectively model the complex temporal and spatial dependencies within the evolving station network, and exploits that to make more accurate predictions. In addition, we observe that the RMSE tends to increase on weekends compared to weekdays for all algorithms. This is because in practice the absolute demand on weekends is larger, which often leads to bigger RMSE. Note that the ER remains relatively consistent across different days.

Figure 9: (a) RMSE and (b) ER of the predicted instant demand for different prediction lengths.

Planned vs. Existing Stations: This experiment investigates the prediction performance of different approaches on the planned stations which have not been deployed yet, and on existing stations which are already in operation. Fig. 7(c) and (d) show the average RMSE and ER of the proposed approach (DDP-Exp) and the competing algorithms on the planned, existing, and all stations respectively. We see that all approaches perform better on the existing stations than on the planned ones. This is expected because for existing stations we have access to their historical demand data, which is not available for planned stations. We also observe that although the functional zone based approach (FZ) performs better than the baselines for the planned stations, it fails on the existing stations (performs worse than RF). This is because by design FZ is tuned to predict demand of new stations in the context of system expansion, but not for existing ones. Finally, we see that for both planned and existing stations our approach (DDP-Exp) consistently performs the best. For the planned stations, it halves the errors compared to the state of the art approach FZ, while for the existing stations, it offers about a three-fold improvement over the baselines.

Figure 10: Sensitivity of our approach with different levels of augmented station network dynamics.

Accuracy of Predicting Instant Demand: This set of experiments evaluates the performance of different approaches when predicting the future instant demand. Here we only consider the planned stations, since it is straightforward to predict for the existing stations given their historical data. We ask all approaches to predict the instant demand over the next seven days, and report the average accuracy. Fig. 8 shows the RMSE and ER of the proposed approach (DDP-Seq) and the competing algorithms. We see that in this very challenging case, our approach (DDP-Seq) can still achieve an average RMSE of 2.903, which is over 30% lower than the baselines (a similar gap can be observed in ER). It is also superior to the state of the art MGCN approach, which also uses multi-graph GCN, with about 20% reduction in RMSE and ER. This confirms that even for the planned stations without historical data, our approach can still accurately predict their future instant demand within a certain time window. In addition, we find that the attention mechanism in our approach is very effective. Without the attention architecture in the decoder, the performance of our approach drops by approximately 15%, which is still better than the state of the art.

Accuracy vs. Prediction Length: This experiment studies the accuracy of competing approaches when predicting instant demand over different time intervals. As in the previous experiment, here we also only consider prediction performance for the planned stations. We vary the length of the prediction time window from 1 to 7, i.e. from predicting the demand of stations on the immediate next day $t+1$, to that over the subsequent seven days up to $t+7$. Fig. 9 shows the RMSE and ER of the approaches under different time windows. We observe that in general, the RMSE increases as the length of the time window grows, especially for our approach (DDP-Seq) and the state of the art MGCN. This makes sense because predicting demand over a longer time window is clearly more difficult. On the other hand, we see that the ER of the baselines is higher for short window lengths compared to MGCN or our approach. We find that this is because the baselines tend to report random estimations of the future demand: for shorter windows this can lead to larger ER, but it is averaged out for longer time windows as the ground truth demand grows in later days. Finally, we see that MGCN can offer comparable performance to our approach (DDP-Seq) when predicting for the immediate next timestamp. However, as the prediction length increases, our approach consistently outperforms MGCN, with a performance gap of up to 26%.

Augmentation of Station Network Dynamics: The last set of experiments investigates the impact of augmenting station network dynamics during training. As discussed in previous sections, one of the key challenges addressed in this paper is to forecast the demand of planned stations which have not been deployed, in the presence of a continuously evolving station network. This means that we have to explicitly learn the particular dynamics caused by deploying new stations in order to make accurate predictions. To account for that, in addition to the actual dynamics within the data, during training we artificially inject different levels of augmented dynamics into the station network, by simulating the process of deploying new stations. More concretely, at each timestamp we randomly pick a subset of existing stations according to a given probability, and ignore their previous demand records, i.e. we assume that those stations have just been deployed. We vary this probability from 0 to 1, indicating the least (no) augmentation to the most. As shown in Fig. 10, as the probability increases from zero, our approach tends to make more accurate predictions for both expected and instant demand. This confirms that by injecting the augmented dynamics, we essentially force the GCN to learn how to better react to the deployment of new stations. We also observe that for larger probability values, the errors (both RMSE and ER) increase for both types of demand. This is also expected because in those cases the excessive injected dynamics would mute the useful information coming from local LSTMs at individual stations and confuse the GCN, leading to deterioration of performance. Therefore empirically we set the probability to values around 0.4∼0.6 to achieve the desired balance.
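The augmentation step can be sketched as below for a single timestamp. Treating each station as an independent draw with the same probability is an assumption, as are the function name, the dictionary layout and the toy histories; the paper does not specify these details.

```python
import random


def augment_histories(histories, p, rng):
    """Simulate newly deployed stations for one training timestamp.

    Each existing station is picked independently with probability p and
    its previous demand records are ignored (replaced by an empty history).
    """
    augmented = {}
    for station_id, history in histories.items():
        if rng.random() < p:
            augmented[station_id] = []  # treated as just deployed
        else:
            augmented[station_id] = list(history)
    return augmented


rng = random.Random(0)  # fixed seed so the sketch is reproducible
histories = {"s1": [3, 5, 4], "s2": [7, 8, 6], "s3": [1, 0, 2]}

none_masked = augment_histories(histories, 0.0, rng)  # no augmentation
all_masked = augment_histories(histories, 1.0, rng)   # every history dropped
some_masked = augment_histories(histories, 0.5, rng)  # intermediate level
```

The two extremes match the ends of the range studied in Fig. 10: with probability 0 the model only sees the real expansion dynamics, while with probability 1 no station history reaches the spatial encoder at all.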

5 RELATED WORK

Demand Prediction for Shared Mobility: Predicting user demand in shared mobility services (e.g. taxi and bike- or vehicle-sharing systems) has received considerable interest in various research communities. Most of the existing work takes the historical usage (e.g. picking-up and returning records), geospatial data such as POIs, and other auxiliary information (e.g. weather) into account, and builds prediction models that can forecast demand over certain periods or aggregated time slots. They also predict the demand at different spatial granularity, e.g. over the entire systems [ ], grids/regions [ ], station clusters [ ], or individual stations [ ]. This paper falls into the last category since we aim to predict station-level demand of EV sharing platforms. However, our work is fundamentally different in that we assume the station network is not static, but dynamically evolving, i.e. stations can be deployed or closed at arbitrary times. In this case, state of the art station-level demand predictors (e.g. [ ]) will fail because they rely heavily on station historical data to make predictions, which are not available for newly deployed stations.

Shared Mobility Expansion: There is also a solid body of work focusing on modeling the expansion process of shared mobility systems, e.g. planning for optimal new stations [ ], or increasing the capacity of existing stations [ ]. However, all of them assume that the demand of the stations (renting and returning) is known, or can be estimated from other data sources such as taxi records, which is different from our work. On the other hand, the work in [12] proposes a functional zone based hierarchical demand predictor for shared bike systems, which can estimate the average demand at newly deployed stations across different expansion stages. Our work shares similar assumptions with [12], yet differs substantially: 1) instead of fixed stages, we can predict demand while the entire station network is dynamically expanding; 2) we are able to estimate both the instant and expected demand of new or existing stations, while [12] can only predict aggregated demand patterns; and finally 3) we do not require historical mobility data in the newly expanded areas, like the taxi trip records used in [12].

Graph-based Deep Learning: Due to their non-Euclidean nature, many real-world problems such as demand/traffic/air quality forecasting that require spatio-temporal analysis have been tackled with the emerging graph-based deep learning techniques [ ]. In particular, existing work often employs the graph convolutional neural network [ ] to capture the spatial correlations, where temporal dependencies are typically modelled with recurrent neural networks. For instance, [9] models the traffic flow as a diffusion process on directed graphs for traffic forecasting, while [ ] and [ ] propose frameworks that use multi-graph convolutional neural networks (CNNs) to predict demand for taxi and ride-hailing services. Another work in [3] uses an encoder-decoder structure on top of multi-graph CNNs to estimate flow between stations in bike sharing systems, which bears a close resemblance to this paper. However, unlike [3], which only outputs demand at the immediate next timestamp, our work considers a sequence to sequence model with attention mechanism to perform multi-step forecasting of future demand. In addition, none of the above approaches can work on new stations where historical data is not available.

6 CONCLUSION

In this paper, we propose a novel dynamic demand prediction approach for expanding electric vehicle (EV) sharing systems, which learns the complex system dynamics from the continuous expansion process, and is able to robustly predict demand for both existing stations and the planned stations which have not been deployed. Specifically, we first encode the local temporal information at individual station level, and then fuse the extracted features with dynamic graph convolutional neural networks (GCN) to account for the spatial dependencies between stations. The demand of stations is estimated by a multi-scale prediction network, which forecasts both the long-term expected demand and the instant future demand of the system. We evaluate our approach on data collected from a real-world EV sharing platform for a year. Extensive experiments have shown that our approach consistently outperforms the state of the art in predicting both long-term expected and immediate future demand of the fast expanding system.

REFERENCES

[1] Geoff Boeing. 2017. OSMnx: New methods for acquiring, constructing, analyzing, and visualizing complex street networks. Computers, Environment and Urban Systems 65 (2017), 126–139. https://doi.org/10.1016/j.compenvurbsys.2017.05.004
[2] Joan Bruna, Wojciech Zaremba, Arthur Szlam, and Yann LeCun. 2013. Spectral networks and locally connected networks on graphs. arXiv preprint arXiv:1312.6203 (2013).
[3] Di Chai, Leye Wang, and Qiang Yang. 2018. Bike flow prediction with multi-graph convolutional networks. In Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, 397–400.
[4] Bowen Du, Yongxin Tong, Zimu Zhou, Qian Tao, and Wenjun Zhou. 2018. Demand-Aware Charger Planning for Electric Vehicle Sharing. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 1330–1338.
[5] Jon Froehlich, Joachim Neumann, Nuria Oliver, et al. 2009. Sensing and predicting the pulse of the city through shared bicycling. In IJCAI, Vol. 9. 1420–1426.
[6] Xu Geng, Yaguang Li, Leye Wang, Lingyu Zhang, Qiang Yang, Jieping Ye, and Yan Liu. 2019. Spatiotemporal Multi-Graph Convolution Network for Ride-hailing Demand Forecasting. In 2019 AAAI Conference on Artificial Intelligence (AAAI'19).
[7] Pierre Hulot, Daniel Aloise, and Sanjay Dominik Jena. 2018. Towards Station-Level Demand Prediction for Effective Rebalancing in Bike-Sharing Systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 378–386.
[8] Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In International Conference on Learning Representations (ICLR).
[9] Yaguang Li, Rose Yu, Cyrus Shahabi, and Yan Liu. 2018. Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting. In International Conference on Learning Representations (ICLR'18).
[10] Yexin Li, Yu Zheng, Huichu Zhang, and Lei Chen. 2015. Traffic prediction in a bike-sharing system. In Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, 33.
[11] Junming Liu, Qiao Li, Meng Qu, Weiwei Chen, Jingyuan Yang, Hui Xiong, Hao Zhong, and Yanjie Fu. 2015. Station site optimization in bike sharing systems. In Data Mining (ICDM), 2015 IEEE International Conference on. IEEE, 883–888.
[12] Junming Liu, Leilei Sun, Qiao Li, Jingci Ming, Yanchi Liu, and Hui Xiong. 2017. Functional zone based hierarchical demand prediction for bike system expansion. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 957–966.
[13] Eoin O'Mahony and David B Shmoys. 2015. Data Analysis and Optimization for (Citi) Bike Sharing. In AAAI. 687–694.
[14] Susan Shaheen, Adam Cohen, and Mark Jaffee. 2018. Innovative Mobility: Carsharing Outlook. (2018).
[15] Wen Wang. 2016. Forecasting Bike Rental Demand Using New York Citi Bike Data. (2016).
[16] Billy M Williams and Lester A Hoel. 2003. Modeling and forecasting vehicular traffic flow as a seasonal ARIMA process: Theoretical basis and empirical results. Journal of Transportation Engineering 129, 6 (2003), 664–672.
[17] Yanhai Xiong, Jiarui Gan, Bo An, Chunyan Miao, and Ana LC Bazzan. 2015. Optimal Electric Vehicle Charging Station Placement. In IJCAI. 2662–2668.
[18] Zidong Yang, Ji Hu, Yuanchao Shu, Peng Cheng, Jiming Chen, and Thomas Moscibroda. 2016. Mobility modeling and prediction in bike-sharing systems. In Proceedings of the 14th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 165–178.
[19] Huaxiu Yao, Fei Wu, Jintao Ke, Xianfeng Tang, Yitian Jia, Siyu Lu, Pinghua Gong, Jieping Ye, and Zhenhui Li. 2018. Deep Multi-View Spatial-Temporal Network for Taxi Demand Prediction. In 2018 AAAI Conference on Artificial Intelligence (AAAI'18).
[20] Yu-Chun Yin, Chi-Shuen Lee, and Yu-Po Wong. 2014. Demand Prediction of Bicycle Sharing Systems. (2014).
[21] Ming Zeng, Tong Yu, Xiao Wang, Vincent Su, Le T Nguyen, and Ole J Mengshoel. 2016. Improving Demand Prediction in Bike Sharing System by Learning Global Features. Machine Learning for Large Scale Transportation Systems (LSTS) @ KDD-16 (2016).

ResearchGate has not been able to resolve any citations for this publication.

Data Analysis and Optimization for (Citi)Bike Sharing

Article

Full-text available

Feb 2015

Bike-sharing systems are becoming increasingly prevalent in urban environments. They provide a low-cost, environmentally-friendly transportation alternative for cities. The management of these systems gives rise to many optimization problems. Chief among these problems is the issue of bicycle rebalancing. Users imbalance the system by creating demand in an asymmetric pattern. This necessitates action to put the system back in balance with the requisite levels of bicycles at each station to facilitate future use. In this paper, we tackle the problem of maintaing system balance during peak rush-hour usageas well as rebalancing overnight to prepare the systemfor rush-hour usage. We provide novel problem formulationsthat have been motivated by both a close collaborationwith the New York City bike share (Citibike) and a careful analysisof system usage data. We analyze system data to discover the best placement of bikes tofacilitate usage. We solve routing problems forovernight shifts as well as clustering problems for handlingmid rush-hour usage. The tools developed from this research are currently in daily use at NYC Bike Share LLC, operators of Citibike.

Bike flow prediction with multi-graph convolutional networks

Conference Paper

Full-text available

Nov 2018

One fundamental issue in managing bike sharing systems is bike flow prediction. Due to the hardness of predicting flow for a single station, recent research often predicts flow at cluster-level. However, they cannot directly guide fine-grained system management issues at station-level. In this paper, we revisit the problem of the station-level bike flow prediction, aiming to boost the prediction accuracy using the breakthroughs of deep learning techniques. We propose a multi-graph convolutional neural network model to predict flow at station-level, where the key novelty is viewing the bike sharing system from the graph perspective. More specifically, we construct multiple graphs for a bike sharing system to reflect heterogeneous inter-station relationships. Afterward, we fuse multiple graphs and apply the convolutional layers to predict station-level future bike flow. The results on realistic bike flow datasets verify that our multi-graph model can outperform state-of-the-art prediction models by reducing up to 25.1% prediction error.

OSMNX: New Methods for Acquiring, Constructing, Analyzing, and Visualizing Complex Street Networks

Article

Full-text available

Jul 2017
COMPUT ENVIRON URBAN

Geoff Boeing

Urban scholars have studied street networks in various ways, but there are data availability and consistency limitations to the current urban planning/street network analysis literature. To address these challenges, this article presents OSMnx, a new tool to make the collection of data and creation and analysis of street networks simple, consistent, automatable and sound from the perspectives of graph theory, transportation, and urban design. OSMnx contributes five significant capabilities for researchers and practitioners: first, the automated downloading of political boundaries and building footprints; second, the tailored and automated downloading and constructing of street network data from OpenStreetMap; third, the algorithmic correction of network topology; fourth, the ability to save street networks to disk as shapefiles, GraphML, or SVG files; and fifth, the ability to analyze street networks, including calculating routes, projecting and visualizing networks, and calculating metric and topological measures. These measures include those common in urban design and transportation studies, as well as advanced measures of the structure and topology of the network. Finally, this article presents a simple case study using OSMnx to construct and analyze street networks in Portland, Oregon.

Modeling and Forecasting Vehicular Traffic Flow as a Seasonal ARIMA Process: Theoretical Basis and Empirical Results

Article

Full-text available

Nov 2003

This article presents the theoretical basis for modeling univariate traffic condition data streams as seasonal autoregressive integrated moving average processes. This foundation rests on the Wold decomposition theorem and on the assertion that a one-week lagged first seasonal difference applied to discrete interval traffic condition data will yield a weakly stationary transformation. Moreover, empirical results using actual intelligent transportation system data are presented and found to be consistent with the theoretical hypothesis. Conclusions are given on the implications of these assertions and findings relative to ongoing intelligent transportation systems research, deployment, and operations.

Demand-Aware Charger Planning for Electric Vehicle Sharing

Conference Paper

Jul 2018

Cars of the future have been predicted as shared and electric. There has been a rapid growth in electric vehicle (EV) sharing services worldwide in recent years. For EV-sharing platforms to excel, it is essential for them to offer private charging infrastructure for exclusive use that meets the charging demand of their clients. Particularly, they need to plan not only the places to build charging stations, but also the amounts of chargers per station, to maximally satisfy the requirements on global charging coverage and local charging demand. Existing research efforts are either inapplicable for their different problem formulations or are at a coarse granularity. In this paper, we formulate the \underlineE lectric \underlineV ehicle \underlineC harger \underlineP lanning (EVCP) problem especially for EV-sharing. We prove that the \shortpro problem is NP-hard, and design an approximation algorithm to solve the problem with a theoretical bound of $1-\frac1 e $. We also devise some optimization techniques to speed up the solution. Extensive experiments on real-world datasets validate the effectiveness and the efficiency of our proposed solutions.

Towards Station-Level Demand Prediction for Effective Rebalancing in Bike-Sharing Systems

Conference Paper

Jul 2018

Bike sharing systems continue gaining worldwide popularity as they offer benefits on various levels, from society to environment. Given that those systems tend to be unbalanced along time, bikes are typically redistributed throughout the day to better meet the demand. Reasonably accurate demand prediction is key to effective redistribution; however, it is has received only little attention in the literature. In this paper, we focus on predicting the hourly demand for demand rentals and returns at each station of the system. The proposed model uses temporal and weather features to predict demand mean and variance. It first extracts the main traffic behaviors from the stations. These simplified behaviors are then predicted and used to perform station-level predictions based on machine learning and statistical inference techniques. We then focus on determining decision intervals, which are often used by bike sharing companies for their online rebalancing operations. Our models are validated on a two-year period of real data from BIXI Montréal. A worst-case analysis suggests that the intervals generated by our models may decrease unsatisfied demands by 30% when compared to the current methodology employed in practice.

Functional Zone Based Hierarchical Demand Prediction For Bike System Expansion

Conference Paper

Aug 2017

Bike sharing systems, aiming at providing the missing links in public transportation systems, are becoming popular in urban cities. Many providers of bike sharing systems are ready to expand their bike stations from the existing service area to surrounding regions. A key to success for a bike sharing systems expansion is the bike demand prediction for expansion areas. There are two major challenges in this demand prediction problem: First. the bike transition records are not available for the expansion area and second. station level bike demand have big variances across the urban city. Previous research efforts mainly focus on discovering global features, assuming the station bike demands react equally to the global features, which brings large prediction error when the urban area is large and highly diversified. To address these challenges, in this paper, we develop a hierarchical station bike demand predictor which analyzes bike demands from functional zone level to station level. Specifically, we first divide the studied bike stations into functional zones by a novel Bi-clustering algorithm which is designed to cluster bike stations with similar POI characteristics and close geographical distances together. Then, the hourly bike check-ins and check-outs of functional zones are predicted by integrating three influential factors: distance preference, zone-to-zone preference, and zone characteristics. The station demand is estimated by studying the demand distributions among the stations within the same functional zone. Finally, the extensive experimental results on the NYC Citi Bike system with two expansion stages show the advantages of our approach on station demand and balance prediction for bike sharing system expansions.

Traffic prediction in a bike-sharing system

Conference Paper

Nov 2015

Bike-sharing systems are widely deployed in many major cities, providing a convenient transportation mode for citizens' commutes. As the rentals/returns of bikes at different stations in different periods are unbalanced, the bikes in a system need to be rebalanced frequently. Real-time monitoring cannot tackle this problem well as it takes too much time to reallocate the bikes after an imbalance has occurred. In this paper, we propose a hierarchical prediction model to predict the number of bikes that will be rented from/returned to each station cluster in a future period so that reallocation can be executed in advance. We first propose a bipartite clustering algorithm to cluster bike stations into groups, formulating a two-level hierarchy of stations. The total number of bikes that will be rented in a city is predicted by a Gradient Boosting Regression Tree (GBRT). Then a multi-similarity-based inference model is proposed to predict the rental proportion across clusters and the inter-cluster transition, based on which the number of bikes rented from/returned to each cluster can be easily inferred. We evaluate our model on two bike-sharing systems in New York City (NYC) and Washington D.C. (D.C.) respectively, confirming our model's advantage over baseline approaches (0.03 reduction of error rate), especially for anomalous periods (0.18/0.23 reduction of error rate).

Mobility Modeling and Prediction in Bike-Sharing Systems

Conference Paper

Jun 2016

As an innovative mobility strategy, public bike-sharing has grown dramatically worldwide. Though providing convenient, low-cost and environmentally friendly transportation, the unique features of bike-sharing systems give rise to problems for both users and operators. The primary issue among these problems is the uneven distribution of bicycles caused by the ever-changing usage and (available) supply. This bicycle imbalance issue necessitates efficient bike re-balancing strategies, which depend highly on bicycle mobility modeling and prediction. In this paper, for the first time, we propose a spatio-temporal bicycle mobility model based on historical bike-sharing data, and devise a traffic prediction mechanism on a per-station basis with sub-hour granularity. We extensively evaluated the performance of our design through a one-year dataset from the world's largest public bike-sharing system (BSS) with more than 2800 stations and over 103 million check-in/check-out records. Evaluation results show an 85th-percentile relative error of 0.6 for both check-in and check-out prediction. We believe this new mobility modeling and prediction approach can advance the bike re-balancing algorithm design and pave the way for the rapid deployment and adoption of bike-sharing systems across the globe.

Station Site Optimization in Bike Sharing Systems

Conference Paper

Nov 2015
