
Spatial interpolation using conditional generative adversarial neural networks

April 2019
International Journal of Geographical Information Science 34(3):1-24


DOI:10.1080/13658816.2019.1599122

Authors:

Di Zhu

University of Minnesota Twin Cities

Ximeng Cheng

Fraunhofer-Institut für Nachrichtentechnik, Heinrich-Hertz-Institut

Fan Zhang

Peking University

Xin Yao


Spatial interpolation is a traditional geostatistical operation that aims at predicting the attribute values of unobserved locations given a sample of data defined on point supports. However, the continuity and heterogeneity underlying spatial data are too complex to be approximated by classic statistical models. Deep learning models, especially the idea of conditional generative adversarial networks (CGANs), provide us with a perspective for formalizing spatial interpolation as a conditional generative task. In this article, we design a novel deep learning architecture named conditional encoder-decoder generative adversarial neural networks (CEDGANs) for spatial interpolation, therein combining the encoder-decoder structure with adversarial learning to capture deep representations of sampled spatial data and their interactions with local structural patterns. A case study on elevations in China demonstrates the ability of our model to achieve outstanding interpolation results compared to benchmark methods. Further experiments uncover the learned spatial knowledge in the model’s hidden layers and test the potential to generalize our adversarial interpolation idea across domains. This work is an endeavor to investigate deep spatial knowledge using artificial intelligence. The proposed model can benefit practical scenarios and enlighten future research in various geographical applications related to spatial prediction.

Figure 5. Generator’s performance on the validation set with a 10 × 10 uniform sampling configuration ($u_{100}$). (a) Average interpolation error for different epochs. (b) Visualization of the generated fake image $G(u(x))$ for different epochs on the same sampled image $u(x)$.


International Journal of Geographical Information Science

ISSN: 1365-8816 (Print) 1362-3087 (Online) Journal homepage: https://www.tandfonline.com/loi/tgis20

Di Zhu, Ximeng Cheng, Fan Zhang, Xin Yao, Yong Gao & Yu Liu

To cite this article: Di Zhu, Ximeng Cheng, Fan Zhang, Xin Yao, Yong Gao & Yu Liu (2019): Spatial interpolation using conditional generative adversarial neural networks, International Journal of Geographical Information Science, DOI: 10.1080/13658816.2019.1599122

To link to this article: https://doi.org/10.1080/13658816.2019.1599122

Published online: 16 Apr 2019.

RESEARCH ARTICLE

Di Zhu (a, b, c), Ximeng Cheng, Fan Zhang (a, d), Xin Yao, Yong Gao and Yu Liu

Institute of Remote Sensing and Geographical Information Systems, School of Earth and Space Sciences, Peking University, Beijing, China;

Beijing Key Lab of Spatial Information Integration and Its Applications, Peking University, Beijing, China;

SpaceTimeLab, Department of Civil, Environmental and Geomatic Engineering, University College London, London, UK;

Senseable City Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA

ARTICLE HISTORY

Received 18 April 2018

Accepted 20 March 2019

KEYWORDS

Spatial interpolation; generative adversarial networks; deep learning; encoder-decoder; spatial prediction

1. Introduction

When attempting to understand a geographical phenomenon, such as the spatial distribution of precipitation, we are often forced to collect a limited number of samples instead of acquiring information at every possible location (Cochran 1963, Hedayat and Sinha 1991, Goodchild et al. 1993, Thompson 1996, Fotheringham and Rogerson 2008). Spatial interpolation is a traditional geostatistical operation that aims at predicting the value $z(x)$ at an unobserved location $x$ given some sampled data $z(x)$ at observed locations $x$ (Atkinson and Lloyd 2009, Lam 2009). Tobler’s first law (TFL) of geography (Tobler 1970) describes the essential nature of the real world from a geographic view. Oliver and Webster (1990) further noted that spatially distributed data behave more like random variables, where stochastic models are required to characterize the underlying spatial autocorrelation (Hubert et al. 1981, Azaele et al. 2009, Fischer et al. 2010) and spatial non-stationarity (Anselin 1995, Marsily et al. 2005, Fotheringham et al. 2017).

CONTACT Yu Liu liuyu@urban.pku.edu.cn

The complex features of spatial distribution patterns necessitate the development of interpolation methods, of which kriging (Matheron 1963, Cressie 1990, Li and Heap 2011) is the most commonly used geostatistical method and can be roughly divided into two types that conceptually rely on different approaches to modeling the spatial variability. The first type of methods, such as simple kriging (SK), ordinary kriging (OK) and cokriging, characterizes the spatial structural features by estimating the semi-variogram cloud, which is a plot of the semi-variances $\gamma(h)$ for paired data against the distances $h$ separating the paired data points, and uses the fitted model to make spatial estimations (Matheron 1963, Diggle et al. 1998). The other type of methods, such as regression kriging (RK) and universal kriging (UK) (Appelhans et al. 2015, Li et al. 2015), makes predictions by combining a regression of the dependent variable on auxiliary variables with the SK of the regression residuals (Hengl et al. 2007), which further leads to the training-based multi-point geostatistics (MPS) (Mariethoz and Caers 2014).

Despite the above-mentioned endeavours in spatial interpolation, we have to admit that the nature of spatial continuity and heterogeneity in geographical digital representations (Goodchild 2004, Zhu et al. 2018) is substantially more complex than classic statistical models can capture (Shepard 1968, Oliver and Webster 1990). In recent years, deep learning approaches have been increasingly used to understand spatial processes from a data-driven perspective, as they can extract underlying patterns well given complex spatial contexts. Convolutional neural networks (CNNs) have been proven to be extremely efficient for high-dimensional data representation and function approximation (LeCun et al. 2015). Through the backpropagation of gradients in the linear transform layers combined with non-linear activations, these networks learn a way to transform the input into an ideal output representation by capturing the deep features of generation as the high-dimensional parameters (Le 2013, Schmidhuber 2014). More importantly, the characteristics of the CNN’s architecture (local connectivity and shared weights) enable the model to focus on features near to each other as well as far away features, which is consistent with the function approximation objective in many spatial analysis problems (Fischer 1998).

The workflow of spatial interpolation can be considered as a generative procedure: only limited data on point supports (the space on which each observation is defined) (Atkinson and Lloyd 2009) can be acquired. The objective is to generate an accurate global mapping of the spatial phenomenon through learning of observed reciprocities among location attributes. A deep learning framework named generative adversarial networks (GANs) (Goodfellow et al. 2014) was recently introduced as a powerful architecture for training generative models, therein sidestepping the difficulty of approximating many intractable probabilistic computations by adopting an adversarial structure to train the loss (Radford et al. 2015, Salimans et al. 2016). Conditional generative adversarial networks (CGANs) are an extension of GANs that enables us to direct the data generation process by conditioning the model on certain external information (Mirza and Osindero 2014). The CGAN has been widely used in various data generation applications such as image super-resolution (Chen et al. 2016, Zhao et al. 2019), image-to-image translation (Isola et al. 2016), face generation (Antipov et al. 2017) and terrain reconstruction (Gurin et al. 2017).

Previous research on CGANs mainly formalizes the deterministic conditions of the generation as some loosely coupled auxiliary features with no spatial information, and their objective is for the generator to create realistic-looking fake images that the discriminator is unable to identify. For example, Antipov et al. (2017) successfully simulated the face aging of people by using a random latent vector to represent a person’s identity and a conditional age term to control the generation. The accuracy of the generated fake images is often beyond the scope of consideration in related state-of-the-art CGANs (Lu et al. 2017, Laloy et al. 2018).

In contrast, spatial interpolation requires an accurate estimation of the real spatial pattern instead of simply a realistic-looking reproduction. Therefore, a spatial extension of state-of-the-art deep learning structures is needed to bridge the gap between CGANs and the task of spatial interpolation such that an accurate global estimation given certain spatial sampled data can be achieved.

This article introduces a novel idea of using conditional generative adversarial networks to capture deep spatial features underlying spatial distribution datasets and to perform spatial interpolation. To achieve this objective, we designed a deep learning model named conditional encoder-decoder generative adversarial neural networks (CEDGANs) with spatial consideration. Incorporating an encoder-decoder structure with the idea of adversarial learning, the proposed model can learn the deep features of input sampled spatial data and their complex interactions with local structural patterns. A case study on the terrains in China demonstrates the ability of our model to gain outstanding spatial interpolation results compared to benchmark methods. Further experiments investigate the learned complex spatial knowledge and demonstrate the potential of generalizing the CEDGAN-based spatial interpolation idea to more geographical applications.

2. Methodology

Considering the gaps between spatial interpolation and common conditional generation tasks, we need to explicitly consider both spatial structural patterns and interpolation accuracies in the generative adversarial model. The proposed model is assumed to take spatial sampled data as the only deterministic input (with no prior noise) and to perform accurate generation using the knowledge captured during the adversarial learning. For clarity, we will first briefly present the concept of GANs and the state-of-the-art CGANs, and then, we will show how to construct the adversarial spatial interpolation structure using a restructured CGAN.

2.1. Generative adversarial networks

Basically, the GAN framework introduced by Goodfellow et al. (2014) consists of two models (G, D): a generator G that attempts to capture the data distribution and a discriminator D estimating the probability that a sample comes from the real dataset rather than G. To learn a generator distribution $p_g$ similar to the distribution $p_{data}(x)$ of a dataset x, G usually maps a noise vector z from the prior distribution $p_z(z)$ to the data space as $G(z)$. The discriminator D outputs a single scalar representing the probability that the input data come from the training set rather than the generated samples of G.

G and D are trained following a two-player minimax game so that the parameters $\theta_g$ of G are adjusted to maximally confuse the discriminator, i.e. minimizing $\log(1 - D(G(z)))$, and the parameters $\theta_d$ of D are adjusted to make the best judgement, i.e. maximizing $\log D(x) + \log(1 - D(G(z)))$. The objective function of the minimax game is

$$\min_{\theta_g} \max_{\theta_d} \left( \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))] \right). \tag{1}$$

The training of the adversarial network can be conducted through simultaneously updating $\theta_d$ and $\theta_g$ by descending the stochastic gradient of logistic loss functions, i.e.

$$\nabla_{\theta_d} \frac{1}{2n} \sum_{i=1}^{n} \left[ \log(1 - D(x^{(i)})) + \log D(G(z^{(i)})) \right] \tag{2}$$

and

$$\nabla_{\theta_g} \frac{1}{n} \sum_{i=1}^{n} \left[ \log(1 - D(G(z^{(i)}))) \right], \tag{3}$$

respectively, where n is the number of samples in each data batch during training.
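As a small numerical illustration of Equation (1), the sketch below evaluates a sample estimate of the minimax value for hand-picked discriminator outputs. The function name `minimax_value` and the example probabilities are hypothetical and chosen only for illustration; natural logarithms are assumed. It uses the known result from Goodfellow et al. (2014) that a discriminator outputting 0.5 everywhere yields a value of $-\log 4$.

```python
import numpy as np

def minimax_value(d_real, d_fake):
    """Sample estimate of the objective in Equation (1).

    d_real: D(x) for a batch of real samples, each in (0, 1)
    d_fake: D(G(z)) for a batch of generated samples, each in (0, 1)
    """
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

# A discriminator that separates real from fake well gives a value close to 0.
sharp = minimax_value([0.9, 0.95], [0.1, 0.05])
# A maximally confused discriminator outputs 0.5 everywhere, giving -log(4).
confused = minimax_value([0.5, 0.5], [0.5, 0.5])
print(sharp, confused)
```

D pushes this value up, while G pushes it down toward the confused case.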

GANs can be extended to a conditional version named CGANs if both G and D are conditioned on the same auxiliary information y, which can restrict G in its generation process and D in its discrimination process. In previous works, the prior input noise vector z and the condition y have been combined jointly as low-dimensional inputs for G to generate different random fake data under the same condition, while the discriminator receives x (or $G(z, y)$) and y as inputs to make a determination based on y without considering z (Gauthier 2014, Mirza and Osindero 2014, Antipov et al. 2017). The objective function of a CGAN is formalized as Equation (4):

$$\min_{\theta_g} \max_{\theta_d} \left( \mathbb{E}_{x \sim p_{data}(x)}[\log D(x, y)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z, y), y))] \right). \tag{4}$$

2.2. Adversarial spatial interpolation using point supports as conditions

For spatial interpolation scenarios, however, the traditional adversarial strategy needs to be modified to ensure the stability of conditional generations. Specifically, the random noise vector z that is commonly used to generate random data samples should be removed such that the conditional generation could be considered to be determined by the sampled data as the only constraint.

Let the data space $V \triangleq \mathbb{R}^{C \times W \times H}$, where W and H represent the size of a spatial raster data (spatial image) and C is the number of data channels. A real spatial image is defined as $x \in V$. If the point supports (the space on which each observation is defined) of a sampling configuration f on x with m sampled locations is $f = [(c_1, r_1), (c_2, r_2), \ldots, (c_m, r_m)] \in \mathbb{R}^{2 \times m}$, where $(c_k, r_k)$ is the coordinate of the kth observed point, we can formalize the sampled spatial image $f(x) \in V$ as

$$f(x)(:, i, j) := \begin{cases} x(:, i, j) & \text{if } (i, j) \in f, \\ \text{N/A} & \text{otherwise.} \end{cases} \tag{5}$$

When training an adversarial spatial interpolation network, we need a generator G that takes the sampled image $f(x)$ as input and outputs a generated fake image $G(f(x)) \in V$ as close to the real image x as possible. In addition, a discriminator D needs to be trained to distinguish the fake image $G(f(x))$ from a real image x based on the sampled image $f(x)$. The objective function of adversarial spatial interpolation networks can be defined as

$$\min_{\theta_g} \max_{\theta_d} \left( \mathbb{E}_{x \sim p_{data}(x \mid f(x))}[\log D(x, f(x))] + \mathbb{E}_{x \sim p_{data}(x \mid f(x))}[\log(1 - D(G(f(x)), f(x)))] \right), \tag{6}$$

where G is a differentiable function representing the generator’s structure with parameters $\theta_g$ and D is a differentiable function representing the discriminator’s structure with parameters $\theta_d$. G attempts to approximate a conditional probability distribution $p_g(G(f(x)) \mid f(x))$ most similar to the conditional probability $p_{data}(x \mid f(x))$ in the real dataset, therein minimizing the second term of Equation (6). Meanwhile, D judges whether a spatial image came from $p_g(G(f(x)) \mid f(x))$ or $p_{data}(x \mid f(x))$, maximizing both terms in Equation (6).

Compared with GANs and CGANs (see Equations (1) and (4)), both terms of Equation (6) contain a spatial conditional data $f(x)$ deduced from the training data instead of some explicit auxiliary conditional data y. The most important thing is that our adversarial spatial interpolation learning is designed to approximate the conditional generative probability distribution given spatial sampled images ($p_{data}(x \mid f(x))$) rather than the probability distribution of data existence ($p_{data}(x)$).

By discarding the prior noise vector z, the adversarial network only takes a pre-defined sampling configuration function f to direct the generation, with no random feature affecting the result of the spatial interpolation such that the output can be stable given a sampled image. The basic requirement for spatial interpolation is that we will not obtain two different interpolated images given the same sampled image. However, if the scenario changes to where we allow multiple results given the same sampled image, Equation (6) is actually not contradictory to Equation (4), as we can add a term z to allow variations in the output.

We rephrase Equation (6) in the form of a binary cross-entropy (BCE) loss function $J(\theta)$ for clarity. Given a mini-batch $\{x^{(i)}\}_{i=1}^{n}$ of n training real spatial images, the loss function for D is defined to let D assign a true label to real spatial images $x^{(i)}$ based on point supports $f(x^{(i)})$ and a false label to generated fake spatial images $G(f(x^{(i)}))$ based on the same $f(x^{(i)})$:

$$J(\theta_d) = \frac{1}{2n} \left( \sum_{i=1}^{n} \log(1 - D(x^{(i)}, f(x^{(i)}))) + \sum_{i=1}^{n} \log D(G(f(x^{(i)})), f(x^{(i)})) \right). \tag{7}$$

The loss function for G is similar but relates only to the second term of Equation (6) and attempts to trick D:

$$J(\theta_g) = \frac{1}{n} \sum_{i=1}^{n} \log(1 - D(G(f(x^{(i)})), f(x^{(i)}))). \tag{8}$$

Then, we can minimize the loss function of the adversarial interpolation by simultaneously updating $\theta_d$ and $\theta_g$ using stochastic gradient descent, $\theta_g := \theta_g - \alpha \nabla J(\theta_g)$ and $\theta_d := \theta_d - \alpha \nabla J(\theta_d)$.
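To make the direction of the two losses concrete, the following sketch evaluates Equations (7) and (8), as read from the text, on hand-picked discriminator outputs, and applies one generic gradient-descent update. The function names `j_d`, `j_g` and `sgd_step`, the example probabilities, and the step size are hypothetical; no actual networks are involved, so the gradients are placeholders rather than backpropagated values.

```python
import numpy as np

def j_d(d_real, d_fake):
    """Discriminator loss as in Equation (7).

    d_real: D(x, f(x)) on real images; d_fake: D(G(f(x)), f(x)) on fake images.
    """
    d_real, d_fake = np.asarray(d_real, float), np.asarray(d_fake, float)
    n = d_real.size
    return (np.sum(np.log(1.0 - d_real)) + np.sum(np.log(d_fake))) / (2 * n)

def j_g(d_fake):
    """Generator loss as in Equation (8)."""
    d_fake = np.asarray(d_fake, float)
    return np.mean(np.log(1.0 - d_fake))

def sgd_step(theta, grad, alpha):
    """One update theta := theta - alpha * grad."""
    return theta - alpha * grad

# D's loss is lower when it labels real images high and fake images low.
good_d = j_d([0.9, 0.8], [0.2, 0.1])
confused_d = j_d([0.5, 0.5], [0.5, 0.5])
# G's loss is lower when D is fooled into scoring fake images high.
fooled = j_g([0.9, 0.8])
not_fooled = j_g([0.2, 0.1])
print(good_d < confused_d, fooled < not_fooled)

# Placeholder parameters and gradient, only to show the update rule.
theta_new = sgd_step(np.array([1.0, -1.0]), np.array([0.5, -0.5]), alpha=0.1)
```

Minimizing both quantities at once is what sets up the adversarial game: an improvement for D on the fake term is a deterioration for G, and vice versa.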

2.3. Conditional encoder-decoder generative adversarial networks for spatial interpolation

In Section 2.2, we have defined the input and the training objective of an adversarial spatial interpolation model. However, further considerations of how to capture local spatial structural patterns and how to make the model trainable are not mentioned.

As for the training, Radford et al. (2015) introduced a class of stable architectures for training GANs named deep convolutional GANs (DCGANs), where they replaced pooling layers with strided convolutions in the discriminator and fractional-strided convolutions in the generator to conserve the image continuity information (Kingma and Welling 2014, Mansimov et al. 2015). However, the DCGANs’ generator contains only a decoder structure to generate images from noise, with no attempt to link deep features with spatial constraints. Meanwhile, some encoder-decoder architectures, such as SegNet (Badrinarayanan et al. 2017) and U-Net (Ronneberger et al. 2015), use an encoder structure to obtain the deep feature maps from inputs and a decoder to upsample the deep features into full-size image representations (Long et al. 2015, Isola et al. 2016). Such architectures can be adopted to design our generative model, which needs to capture deep spatial representations.

Here, we propose a conditional encoder-decoder generative adversarial network (CEDGAN) to model adversarial spatial interpolation. The main structure of CEDGAN is illustrated in Figure 1(a). A CEDGAN consists of a generator G and a discriminator D. G attempts to learn the relationships between sampled spatial data and corresponding real spatial data and to achieve the objective of generating as accurate as possible fake spatial data. D attempts to capture the correspondence between spatial data and the sampled data, with the objective of determining whether the interpolated fake data can be considered correct based on the limited samples.

In Figure 1(b), we display the details of G and D. The generator G is designed to be a fully convolutional encoder-decoder structure that contains three two-dimensional convolution layers as the encoder (convs 1, 2 and 3) and three two-dimensional transposed convolution layers as the decoder (deconvs 1, 2 and 3). Each encoder layer performs a zero-padding convolution with the given convolving kernel and stride length. Each decoder layer implements the upsampling of the feature maps through a fractionally strided transposed convolution with the same settings as that of the encoder layers. The discriminator D is a convolutional neural network similar to typical models of image classification except that we use a concat operation to merge the sampled data $f(x)$ and the full-size real data x (or fake data $G(f(x))$) as the input. Each layer of D performs a zero-padding convolution with the same settings as the encoder layers in G. The output of D is a scalar indicating whether the input full-size image is a correct interpolation.

Batch normalization (BN) (Ioffe and Szegedy 2015) is applied to all layers except for the output layer of G and the input/output layer of D. This can avoid model instability and help gradients flow in the networks. The LeakyReLU activation (Xu et al. 2015) is used after convolutions, and the ReLU activation (Nair and Hinton 2010) is used after transposed convolutions. For the output layers of G and D, we use the Tanh and Sigmoid activation functions, respectively, according to Radford et al. (2015).
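To see how three strided convolutions and three transposed convolutions with matching settings can take a 32-pixel-wide image down to a compact feature map and back to full size, the sketch below traces the spatial size through the generator. The kernel size 4, stride 2 and padding 1 are assumptions in the style of DCGAN; the text above does not state the actual values, and the helper names are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a strided convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel, stride, padding):
    """Output length of a transposed convolution (no output padding, dilation 1)."""
    return (size - 1) * stride - 2 * padding + kernel

# Assumed DCGAN-style settings; not taken from the text.
KERNEL, STRIDE, PADDING = 4, 2, 1

sizes = [32]                       # width of the input sampled image
for _ in range(3):                 # encoder: convs 1, 2 and 3
    sizes.append(conv_out(sizes[-1], KERNEL, STRIDE, PADDING))
for _ in range(3):                 # decoder: deconvs 1, 2 and 3
    sizes.append(deconv_out(sizes[-1], KERNEL, STRIDE, PADDING))

print(sizes)   # [32, 16, 8, 4, 8, 16, 32]
```

Under these assumed settings each decoder layer exactly undoes the size change of the corresponding encoder layer, which is why sharing the same kernel and stride between the two halves restores the full-size output.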

3. Experiments on spatial data: case of the DEM interpolation

3.1. Data descriptions

We use a dataset of digital elevation models (DEMs) in China as an example to test the feasibility and effectiveness of the proposed CEDGAN model. However, the method is not limited to DEMs and can be applied to a broader range of spatial data. We select DEMs as our case study simply because the ground-truth terrain data can help us test the accuracy of interpolations and thus demonstrate the feasibility of our adversarial model in capturing deep spatial features.

Figure 1. An illustration of how a conditional encoder-decoder generative adversarial network (CEDGAN) works for spatial interpolation. (a) The main loop of training, where real images and fake images are discriminated by D conditioned on the same sampled data, and the gradients of D’s output are used to update the model’s parameters. (b) For G, sampled images $f(x)$ are encoded into spatial feature maps, and then, fractionally strided convolutions upsample the deep features into fake spatial images $G(f(x))$ of full size. For the discriminator D, the real spatial images x (or fake images $G(f(x))$) and the corresponding sampled images are merged as the input. The output of D is a scalar to determine a correct interpolation.

Four representative subregions in mainland China are selected as the ground truths, including the Shannan area of the Tibetan Plateau, the Sichuan Basin, the Pearl River Delta, and the Qinling Mountains. These regions consist of various terrains that have a diverse range of altitude and hypsography. An overview of the study areas is illustrated in Figure 2. The GDEM Version 2 data for these areas are collected as the raw DEM dataset. After preprocessing, single-channel DEM tiles (1 × 32 × 32) with no repetition are randomly cropped using Monte Carlo simulation as the DEM images. To address the concerns of over-fitting and memorization of training samples, we acquire 60,000 DEM images in total, with 48,000 images composing the training set and 12,000 images composing the validation set. For each subarea, there are 15,000 ground-truth images, of which 12,000 are for training and 3,000 are for validation. Each DEM image covers a 0.1° × 0.1° geographic tile. The terrain elevations in the dataset range from −7 m to 6,999 m. We first transform these images linearly into float tensor images ([0.0, 1.0]); then, we normalize the tensor images with a mean of 0.5 and a standard deviation of 0.5, which maps them to [−1.0, 1.0], for improved training efficiency. All elevations are mapped back to their original values in the reported accuracies.
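The two-step scaling and its inverse can be sketched as follows. This is a minimal sketch assuming the linear transform uses the dataset-wide minimum and maximum elevations reported above; the text does not say which bounds were actually used, and the function names are hypothetical.

```python
import numpy as np

# Elevation range reported in the text, in metres. Using it as the scaling
# bounds is an assumption of this sketch.
ELEV_MIN, ELEV_MAX = -7.0, 6999.0

def to_model_range(elev):
    """Map elevations linearly to [0, 1], then to [-1, 1] via (t - 0.5) / 0.5."""
    unit = (elev - ELEV_MIN) / (ELEV_MAX - ELEV_MIN)
    return (unit - 0.5) / 0.5

def to_metres(t):
    """Invert both steps so that errors can be reported in metres."""
    unit = t * 0.5 + 0.5
    return unit * (ELEV_MAX - ELEV_MIN) + ELEV_MIN

elev = np.array([-7.0, 1200.0, 6999.0])
scaled = to_model_range(elev)
restored = to_metres(scaled)
print(scaled, restored)
```

The [−1, 1] range matches the Tanh output of the generator, so generated and real images live on the same scale during training.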

Noting that there are many indicators that can measure the performance of a spatial interpolation method (Li and Heap 2011), we simply choose the root mean square error (RMSE) to calculate the interpolation error $\varepsilon$ at the pixel level, as it requires minimal auxiliary information to utilize:

$$\varepsilon = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (p_i - o_i)^2}, \tag{9}$$

where n is the total number of pixels, $p_i$ is the predicted value, and $o_i$ is the observed value.
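Equation (9) translates directly into NumPy. The function name `rmse` and the toy values below are illustrative only.

```python
import numpy as np

def rmse(predicted, observed):
    """Root mean square error over all pixels, as in Equation (9)."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

# Four pixels, one of which is off by 4 m: sqrt(16 / 4) = 2.
error = rmse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 8.0])
print(error)   # 2.0
```

Because the squared differences are averaged before the square root, a single badly interpolated pixel weighs more heavily than several slightly wrong ones.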

3.2. Adversarial training procedure

Figure 2. Study areas for the spatial interpolation of terrain elevations. Subregion A is the Shannan area of the Tibetan Plateau. Subregion B is the Sichuan Basin. Subregion C is the Pearl River Delta. Subregion D is the Qinling Mountains. We omit the north arrow and the map scale for simplicity.

In this section, we show the results of spatial interpolation with a 10 × 10 uniform sampling configuration $u_{100}$ on 1 × 32 × 32 DEM images as an example. A 10 × 10 uniformly sampled DEM image contains the elevations of 100 locations that are evenly distributed, and the other locations are null (Figure 3).
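One way to build such a configuration is to spread 10 indices evenly along each axis of the 32 × 32 grid and keep only their intersections. The exact placement of the 100 points is not given in the text, so the `linspace`-based spacing below is an assumption, as are the random stand-in image and the use of NaN for null pixels.

```python
import numpy as np

SIZE, PER_AXIS = 32, 10

# Assumed placement: 10 indices spread evenly over 0..31 and rounded.
idx = np.round(np.linspace(0, SIZE - 1, PER_AXIS)).astype(int)

# Boolean mask that is True only at the 10 x 10 grid intersections.
mask = np.zeros((SIZE, SIZE), dtype=bool)
mask[np.ix_(idx, idx)] = True

# Random stand-in for a normalized 1 x 32 x 32 DEM image.
rng = np.random.default_rng(0)
dem = rng.uniform(-1.0, 1.0, size=(1, SIZE, SIZE))

# Keep observed pixels and mark every other pixel as null (NaN here).
sampled = np.where(mask, dem, np.nan)
print(int(mask.sum()), int(np.isnan(sampled).sum()))   # 100 924
```

Only 100 of the 1,024 pixels carry information, so the generator has to infer roughly 90% of each image from the learned spatial structure.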

The network is trained using mini-batch stochastic gradient descent (SGD) with a batch size of 64. The training dataset with 48,000 DEM images is randomly divided into 750 batches, with each batch containing 64 images (dropping the last batch with fewer than 64 images). Based on the parameter suggestions of Radford et al. (2015), for layers with LeakyReLU activation, we set the slope of the leak to be 0.2. In addition, we use the Adam optimizer, where $\beta_1 = 0.5$ and $\beta_2 = 0.999$, and the learning rate $\alpha$ for backpropagation is set to 0.0002. All gradients are computed using Equations (7) and (8).

The details of the adversarial training are shown in Figure 4. The evolution of our model can be easily identified in the main plot of Figure 4(a), where the RMSE between the generated fake data and real data is computed to plot the gray error curve. The error curve shows that the accuracy of our model is evidently increasing during the first 60,000 batches (80 epochs) of training; however, after that, the improvement is not very significant. We train on 150,000 batches (200 epochs) and find that the average interpolation error per pixel gradually stabilizes at about 2.5 m, which is notably small given that the elevations range from −7 m to 6,999 m.
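The batch and epoch counts quoted in this section follow from simple arithmetic, which the sketch below reproduces together with the reported hyperparameters. The variable names are hypothetical, and the final ratio is only a rough way of putting the 2.5 m error in context, not a metric used in the article.

```python
# Figures taken from the text of Section 3.2.
n_train, batch_size = 48_000, 64
hyperparams = {"lr": 0.0002, "beta1": 0.5, "beta2": 0.999, "leaky_slope": 0.2}

batches_per_epoch = n_train // batch_size     # full batches only
batches_at_80 = 80 * batches_per_epoch        # where the error curve flattens
batches_at_200 = 200 * batches_per_epoch      # end of training

# Reported error relative to the full elevation range (-7 m to 6,999 m).
elev_range = 6999 - (-7)
relative_error = 2.5 / elev_range

print(batches_per_epoch, batches_at_80, batches_at_200)   # 750 60000 150000
print(f"{relative_error:.4%}")
```

Since 48,000 is an exact multiple of 64, no images are actually dropped here, and the relative error comes out at roughly 0.04% of the elevation range.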

Apart from the relatively stable decreasing trend, we can see some sudden rises in the gray error curve, which reflect the adversarial nature of our model: when a local optimum is reached whereby G cannot further deceive D, G will jump out of the local parameter space and attempt to find a more optimal solution. However, these jump-out attempts usually return worse results. The sub-plot in the upper right of Figure 4(a) illustrates the variation in the binary cross entropy loss (BCELoss) for D and G (Equations (7) and (8)) throughout the training procedure. It can be observed that D (blue curve) trends toward maximal confusion, with its loss approximating 0.5 and G’s loss (orange curve) continually improving.

Figure 3. Illustration of a 10 × 10 uniform sampling configuration ($u_{100}$) on a single-channel 32 × 32 DEM image (panels: real image x; sampled image $u(x)$). Elevations are represented by gray-scale colors so that the whiter a pixel is, the higher its elevation (all DEM images shown in this article share the same colorbar if not indicated). Pixels in the sampled image with null value are displayed in blue.

A more comprehensible visualization of the combat between G and D is shown in Figure 4(b), where the result after each trained batch is drawn as a scatter point. The color of a point represents the number of trained batches, with the x value being the BCELoss of D and the y value being the BCELoss of G. In this scatter plot, yellow points roughly cluster in the small area where BCELoss(D) ∈ [0.1, 0.5] and BCELoss(G) ∈ [3, 5], indicating that our proposed adversarial model tends to converge to its game equilibrium during the training.

3.3. Validation of the trained generator

To demonstrate that our generator is not producing high-quality interpolation results by simply over-fitting or memorizing training samples, we apply the trained G of the 10 × 10 uniform sampling configuration on the validation set with 12,000 DEM images different from the training data.

By randomly choosing mini-batches of real DEM images x from the validation set, we invoke the generator model G every 10 epochs during the training process, input the sampled DEM images u(x) into G and calculate the error ε between the generated fake DEM images G(u(x)) and their corresponding x. The decreasing trend of the generator's average interpolation error (Figure 5(a)) is similar to that of Figure 4(a), with no sign of over-fitting. A generator trained for 200 epochs can also achieve an interpolation error ε ≈ 2.5 m on the validation DEM images collected in the same area.
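The validation check described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: ε is taken here to be the mean absolute per-pixel difference in meters (the formal definition is given earlier in the paper), `average_error` is a hypothetical helper name, and the "generated" batch is synthetic rather than the output of a trained generator.

```python
import numpy as np

def average_error(fake, real):
    """Mean absolute per-pixel difference between two batches of images.

    Assumption: used here as a stand-in for the interpolation error epsilon;
    both inputs have shape (batch, height, width) and hold elevations in meters.
    """
    fake = np.asarray(fake, dtype=float)
    real = np.asarray(real, dtype=float)
    if fake.shape != real.shape:
        raise ValueError("fake and real batches must have the same shape")
    return float(np.mean(np.abs(fake - real)))

# Toy validation mini-batch: 4 "real" 32 x 32 images and synthetic "fake"
# images that are off by 2.5 m at every pixel.
rng = np.random.default_rng(0)
real_batch = rng.uniform(0.0, 1000.0, size=(4, 32, 32))
fake_batch = real_batch + 2.5
print(average_error(fake_batch, real_batch))
```

In an actual run, `fake_batch` would be replaced by G(u(x)) for validation images x that were never seen during training, evaluated every 10 epochs.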

Figure 5(b) displays the evolution of our generator regarding visual fidelity. We list the fake images G(u(x)) generated at epochs 0, 10, 20, 50, 100, and 200 for the same real image x as an example. The generator G has no knowledge at the very beginning and produces a noise image at epoch 0. Then, after 10 epochs of training, G quickly learns some fundamental knowledge of spatial interpolation, such as the basic mapping between elevations and colors as well as a coarse spatial continuity, and can produce a blurry fake image based on the given observations. After that, G gradually achieves more accurate generation by adding more terrain details that seem to be correct, as we can see more valleys in the displayed images during the evolution. Finally, by epoch 200, G is capable of producing a high-quality fake image that is almost visually indistinguishable from the real image. Some differences remain, however: in the lower-left part of the real image we can see two nearby branches of the valley, whereas no branching can be identified in the fake image at epoch 200 (Figure 5(b)). The ultimate accuracy may be limited by the spatial resolution of the given sampling configuration; further discussion can be found in Section 3.4.

Figure 4. Training details of CEDGAN with a 10 × 10 uniform sampling configuration. (a) Variation of the model accuracy and the BCELoss for G and D during the training procedure; early training iterations with ε > 15 m are not shown in the plot. (b) Illustration of the adversarial game between G and D, where the model tends to converge to a game equilibrium during the training process.

3.4. Diﬀerent spatial sampling conﬁgurations

The proposed CEDGAN model requires a separate training process for each spatial sampling configuration. In practice, typical scenarios that require spatial interpolation may have fewer sampled locations, and the distribution of the sampled locations can be irregular. To address these concerns, we change the sampling configuration in two respects, namely the ratio of sampled locations under uniform sampling and the use of random sampling, to see how different spatial sampling configurations affect the performance of the model.

3.4.1. Ratio of sampled locations

We modify the ratio of the sampled locations, i.e. the number of sampled locations m given a fixed spatial image size, under the circumstances of systematic sampling (uniform sampling). Formally, $(c_i, r_j)$ is the coordinate of an observed value in an image of size $W \times H$, which can be defined as

$$c_i = c_1 + (i-1)\,\delta_W, \quad r_j = r_1 + (j-1)\,\delta_H, \quad \forall\, i, j = 1, \ldots, \sqrt{m}, \tag{10}$$

where the initial sampled point is $(c_1, r_1)$, the interval $\delta_W = (W-1)/(\sqrt{m}-1)$, and $\delta_H = (H-1)/(\sqrt{m}-1)$. An illustration of the sampled images f(x) using different ratios of uniform sampling on the 32 × 32 DEM image is shown in Figure 6.

Figure 5. Generator's performance on the validation set with a 10 × 10 uniform sampling configuration (u100). (a) Average interpolation error for different epochs. (b) Visualization of the generated fake image G(u(x)) for different epochs on the same sampled image u(x).
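The uniform grid of Equation (10) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the first sampled point is assumed to be the pixel (0, 0), fractional coordinates are assumed to be snapped to the nearest pixel, and null pixels are represented by NaN.

```python
import numpy as np

def uniform_sample_coords(width, height, m, c1=0.0, r1=0.0):
    """Column/row coordinates of an m-point uniform grid following Equation (10).

    Assumption: m is a perfect square and (c1, r1) is the first sampled point.
    """
    k = int(round(np.sqrt(m)))
    if k * k != m or k < 2:
        raise ValueError("m must be a perfect square with at least 4 points")
    delta_w = (width - 1) / (k - 1)
    delta_h = (height - 1) / (k - 1)
    cols = c1 + np.arange(k) * delta_w
    rows = r1 + np.arange(k) * delta_h
    return cols, rows, delta_w, delta_h

def sample_image(image, m):
    """Keep only the uniformly sampled pixels; all other pixels become NaN."""
    height, width = image.shape
    cols, rows, _, _ = uniform_sample_coords(width, height, m)
    # Assumption: fractional grid coordinates are rounded to the nearest pixel.
    ci = np.rint(cols).astype(int)
    ri = np.rint(rows).astype(int)
    sampled = np.full(image.shape, np.nan)
    sampled[np.ix_(ri, ci)] = image[np.ix_(ri, ci)]
    return sampled

image = np.arange(32 * 32, dtype=float).reshape(32, 32)
_, _, delta_w, _ = uniform_sample_coords(32, 32, 100)
print(round(delta_w, 2))                                  # interval for m = 100
print(int(np.sum(~np.isnan(sample_image(image, 100)))))   # number of kept pixels
```

With W = H = 32, the formula gives intervals of 31/5 = 6.2 for m = 36 and 31/9 ≈ 3.44 for m = 100.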

We set the number of sampled locations m to be 36, 49, 64, 81, 100, 121, 144, 169, and 196, and we train the corresponding CEDGAN-based interpolation model separately. The training processes with different sampling ratios are illustrated in Figure 7. By the end of training, each model exhibits a near-convergent status with different final accuracies. As the ratio of uniformly sampled locations increases, the final accuracy increases as well: when m = 36, ε ≈ 4.2 m, while when m = 196, ε ≈ 2.0 m. In addition, we can see a more unstable curve when increasing the sampling ratio, which indicates that the learning ability of the generator G in our model becomes more dominant compared to that of the discriminator D when more observed values are given, therein showing more attempts to jump out of local optima of the parameter space.

Figure 6. Sampled images u(x) using different ratios of uniform sampling. (a) x (32 × 32); (b) m = 36, δ ≈ 6.20; (c) m = 100, δ ≈ 3.44; (d) m = 196, δ ≈ 2.07.

Figure 7. Training processes based on nine uniform sampling configurations with different sampled location ratios.

Meanwhile, Figure 8 shows the evolution of ε for generators with different m on the validation set. The decreasing trends of the interpolation error on the validation set are similar to those of Figure 7 given different uniform sampling configurations, which further demonstrates the usability of our model in common interpolation tasks. The multiple generation processes on the validation set also indicate that our model does not produce high-quality interpolation results by simply over-fitting or memorizing training samples.

3.4.2. Random sampling

As for the random sampling r, we find that the final accuracy with m = 100 is similar to that of a uniform sampling with m = 36, as both yield ε ≈ 4.2 m (Figure 9(a)). However, the produced spatial pattern can be problematic when we randomly choose the sampled locations for each input image, as shown in Figure 9(b). This is caused mainly by the variation in inputs during the CEDGAN's training.
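The random sampling r described above can be sketched as follows. This is a minimal illustration under stated assumptions: `random_sample_image` is a hypothetical helper, null pixels are represented by NaN, and the toy batch is synthetic. Because a new set of locations is drawn for every image, the layout of observed pixels differs from one input to the next, which is the input variation referred to in the text.

```python
import numpy as np

def random_sample_image(image, m, rng):
    """Keep m randomly chosen pixels of a 2-D image; the rest become NaN."""
    flat_idx = rng.choice(image.size, size=m, replace=False)
    rows, cols = np.unravel_index(flat_idx, image.shape)
    sampled = np.full(image.shape, np.nan)
    sampled[rows, cols] = image[rows, cols]
    return sampled

rng = np.random.default_rng(42)
batch = rng.uniform(0.0, 1000.0, size=(3, 32, 32))
# A fresh set of 100 locations is drawn for each image in the batch.
sampled_batch = np.stack([random_sample_image(img, 100, rng) for img in batch])
masks = ~np.isnan(sampled_batch)
print(masks.sum(axis=(1, 2)))  # number of observed pixels per image
```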

If we undersample in some areas, the local spatial variation patterns may not be captured. Oversampling, on the other hand, may result in redundant data. Figure 9(b) displays the differences between the interpolated results G(r(x)) and some selected DEM images x. The CEDGAN-based interpolation method can generate visually appealing fake images regardless of how the sampled locations are distributed, even if the generated terrains in certain local areas may not be correct. In fact, all interpolation methods suffer from the influence of an improper spatial sampling configuration to some extent. Although our model with random sampling still performs well in terms of accuracy, the problem of improper sampling cannot be fully addressed since the interpolated result may not be correct in some local structural patterns.

Figure 8. Generation process based on nine uniform sampling configurations with different sampled location ratios.

4. Discussion

4.1. Comparison with benchmark interpolation methods

To show how the CEDGAN-based spatial interpolation method outperforms classic interpolation methods, we choose the inverse distance weighted (IDW) interpolation (Shepard 1968) and ordinary kriging (OK) (Matheron 1963, Cressie 1990) as benchmarks to test our method's performance in terms of accuracy, computing speed, batch processing and visual fidelity. The CEDGAN-based model is implemented using PyTorch, a deep learning framework in Python with GPU acceleration. IDW and OK are also implemented in PyTorch by converting all variables into tensors for GPU acceleration. In this way, all reported results in this section are computed using the same NVIDIA 1080 Ti GPU and can be compared.

The comparisons between CEDGAN and IDW are listed in Table 1, where all CEDGANs are trained for 200 epochs and the distance decay parameter of IDW is set to 2.0. We apply these two methods under different uniform sampling ratios, check their average interpolation errors at the pixel level and record the corresponding computing time. The results show that CEDGAN achieves lower average errors (AE) than IDW. For CEDGANs, the improvement in accuracy when increasing the sampling ratio is also more significant than with IDW. In terms of the average interpolation time (AS), CEDGAN (≈ 1.5e−3 s per mini-batch) is approximately 1000 times faster than IDW (≈ 1.5 s), and with increasing sampling ratio, CEDGAN does not exhibit an obvious slow-down. Note that we do not take the average training time (AT) into consideration when comparing the computing speed because, for a pre-trained CEDGAN model, training is only ever performed once and can be done beforehand. Given a mini-batch of 64 spatial images (32 × 32), the training time in our experiment is shown in the second column of Table 1. AT increases as the sampling ratio increases, mainly because we need time to sample from the real images. For the u100 sampling configuration, the total training time for 200 epochs is 45,840 seconds (0.3056 s × 150,000 mini-batches), approximately 12.7 hours.

Figure 9. Experiment based on a random sampling with 100 sampled locations.
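For reference, the IDW benchmark with a distance decay parameter of 2.0 can be sketched as below. This is a plain NumPy illustration of the standard IDW estimator, not the authors' PyTorch implementation; the function name `idw_interpolate`, the use of NaN for unobserved pixels and the toy 8 × 8 input are assumptions made for the example.

```python
import numpy as np

def idw_interpolate(sampled, power=2.0):
    """Fill the NaN pixels of a 2-D image by inverse distance weighting.

    Every observed pixel contributes with weight 1 / distance**power;
    observed pixels keep their original value.
    """
    known = ~np.isnan(sampled)
    if not known.any():
        raise ValueError("at least one observed pixel is required")
    known_rc = np.argwhere(known).astype(float)   # (n_known, 2), row-major order
    known_values = sampled[known]                 # same row-major order
    unknown_rc = np.argwhere(~known)
    result = sampled.copy()
    if unknown_rc.shape[0] == 0:
        return result
    # Pairwise Euclidean distances between unknown and known pixels.
    diff = unknown_rc[:, None, :].astype(float) - known_rc[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    weights = 1.0 / dist ** power
    estimates = (weights * known_values).sum(axis=1) / weights.sum(axis=1)
    result[unknown_rc[:, 0], unknown_rc[:, 1]] = estimates
    return result

# Toy input: an 8 x 8 image observed only at its four corners.
sampled = np.full((8, 8), np.nan)
sampled[::7, ::7] = [[10.0, 20.0], [30.0, 40.0]]
filled = idw_interpolate(sampled, power=2.0)
print(np.round(filled[3:5, 3:5], 2))
```

Since every estimate is a weighted average of the observations, IDW output always stays within the range of the sampled values, which is consistent with the smooth, blurry results reported for this benchmark.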

The comparison in Table 1 does not consider ordinary kriging because kriging methods are naturally not suitable for batch processing due to the problem of semivariogram fitting. Among a batch of spatial images, the shapes of experimental semivariograms can vary significantly from image to image; thus, a single arbitrary fitting curve is insufficient to capture the complex spatial structures, and it is difficult to determine a fitting function a priori. In fact, the computing speed and pixel-level accuracy of OK are both inferior to those of IDW when applied to batches of spatial images.
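To illustrate why one fitting curve rarely suits a whole batch, the sketch below computes an experimental semivariogram along one grid axis with the classical estimator γ(h) = ½ · mean[(z(s + h) − z(s))²]. It is an illustration only: the function name and the two toy "terrains" are assumptions, and the paper's kriging benchmark relies on PyKrige rather than on this code.

```python
import numpy as np

def experimental_semivariogram(image, max_lag, axis=1):
    """Experimental semivariogram of a gridded image along one axis.

    gamma(h) = 0.5 * mean((z(s + h) - z(s))**2) for lags h = 1..max_lag,
    with pairs taken along `axis` (1 = along columns, 0 = along rows).
    """
    image = np.asarray(image, dtype=float)
    n = image.shape[axis]
    if not 1 <= max_lag < n:
        raise ValueError("max_lag must be between 1 and the axis length minus 1")
    gammas = []
    for h in range(1, max_lag + 1):
        head = np.take(image, np.arange(h, n), axis=axis)
        tail = np.take(image, np.arange(0, n - h), axis=axis)
        gammas.append(0.5 * np.mean((head - tail) ** 2))
    return np.array(gammas)

# Two toy "terrains": a smooth ramp and spatially uncorrelated noise.
ramp = np.tile(np.arange(32, dtype=float), (32, 1))
noise = np.random.default_rng(1).normal(0.0, 1.0, size=(32, 32))
print(experimental_semivariogram(ramp, 4))    # keeps growing with the lag
print(experimental_semivariogram(noise, 4))   # roughly flat across lags
```

The two curves have very different shapes, so a model fitted to one image would describe the other poorly.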

Figure 10 visualizes a batch of ground-truth DEM images and fake DEM images generated by CEDGAN, IDW and OK under a 10 × 10 uniform sampling. Again, the CEDGAN has been trained for 200 epochs, and the distance decay parameter of IDW is set to 2.0. OK is implemented based on the PyKrige 1.3.2 package (see Note 2), and we set the fitting curve to be spherical. The visual comparison between our method and the benchmark methods is highly encouraging: given a relatively low sampling ratio (< 10%), both IDW and OK can only produce fake images that are very blurry (Figure 10(c) and 10(d)), whereas CEDGAN can generate fake images (Figure 10(b)) that are very similar to the real images (Figure 10(a)).

Table 1. Comparisons of the CEDGAN-based and inverse distance weighted interpolation.

f       AT (s)    AE CEDGAN (m)    AE IDW (m)    AS CEDGAN (s)    AS IDW (s)
u36     0.1257    4.117            4.432         0.001471         1.397
u49     0.1620    3.433            3.539         0.001511         1.402
u64     0.2035    3.192            3.321         0.001611         1.428
u81     0.2445    2.693            3.160         0.001459         1.447
u100    0.3056    2.587            2.951         0.001448         1.408
u121    0.3629    2.455            2.794         0.001579         1.547
u144    0.4327    2.060            2.758         0.001588         1.582
u169    0.5006    2.156            2.708         0.001702         1.659
u196    0.5981    1.977            2.636         0.001659         1.711

f is the sampling configuration, AT is the average training time for a mini-batch in CEDGAN, AE is the average interpolation error (ε) at the pixel level, and AS is the average time for interpolating a mini-batch of spatial images (each mini-batch contains 64 spatial images).

Figure 10. Visual comparison of the interpolation results of CEDGAN, IDW and OK based on a 10 × 10 uniform sampling configuration. No data augmentation is applied to any DEM images to show the difference in terrains within a mini-batch; thus, the contrast ratio in some images may not be high enough to be visible.

Note that it is actually unfair to compare different spatial interpolation methods under the same circumstances: kriging and IDW are powerful when we do not have training data; training-based methods, such as MPS (Mariethoz and Caers 2014), are powerful when we have already acquired a physical model for the spatial process; and a well-trained CEDGAN can provide satisfactory results without prior domain knowledge. We treat each spatial interpolation method given its corresponding advantage; one should choose the most suitable method in practice.

4.2. Investigation of the learned spatial knowledge

The pre-trained generator G outperforms the benchmark spatial interpolation methods in both accuracy and visual fidelity (Figure 10) because we train the generator through an encoder-decoder structure that can capture the local geographical structural patterns underlying the spatial distribution dataset after multiple adversarial learning processes. Basically, the encoder module in G looks for the relationships among the sampled locations, while the decoder module assembles structural spatial patterns with the determined sampled locations and outputs the most convincing spatial distribution. In the case of a DEM, the structural patterns may be valleys and ridges of various morphometric types (Wang et al. 2010). Thus, the generated fake spatial images can be very similar to the real images because of these combined local patterns, and the accuracy is ensured by a suspicious discriminator that makes judgements based on the a priori sampled images.

4.2.1. Visualization of feature maps in a pre-trained generator

To further understand what spatial knowledge the CEDGAN-based spatial interpolation model has learned, we use a pre-trained generator to visualize the feature maps in the hidden layers during the generation process. Eight typical DEM images with different terrains are selected from the validation set. After a 10 × 10 uniform sampling, we input the sampled DEM images into a G pre-trained with u100 for 200 epochs. Here, we only display some representative feature maps captured in the first hidden layer (layer 1) and the last hidden layer (layer 5) of G (see Figure 1(b)). These two layers belong to the encoder and decoder modules, respectively, making their feature maps worth investigating. In addition, since layer 1 and layer 5 are the layers closest to the input and output layers, respectively, their feature maps are easier to interpret (Figure 11).

It can be observed that the feature maps in layer 1 (fm(1)) capture the local continuities around certain sampled locations as well as the relationships among sampled locations, showing grid-style patterns with local hotspots and linear connections. After encoding and decoding, the feature maps in layer 5 (fm(5)) appear to have captured some structural patterns related to different terrain features such as valleys and mountains. More importantly, the 1st and 3rd feature maps (from the left) in fm(1) are very similar; however, their corresponding feature maps in fm(5) are significantly different. A similar situation occurs for the 5th and 8th images. This phenomenon shows that the pre-trained generator achieves good interpolation by learning many possible local terrain patterns and that it manages to merge these local patterns with the deterministically sampled locations.

Therefore, instead of remembering training samples, our CEDGAN-based spatial interpolation model captures complex spatial features underlying the given spatial dataset and can perhaps be generalized to other domains with different distribution patterns but similar deep spatial features (a further discussion of the model's generalization ability is given in Section 4.3).

4.2.2. Slope analysis

In addition, we investigate the relationship between local slopes and the interpolation accuracy at the pixel level. The results are illustrated in Figure 12. For each pixel of a DEM image, we calculate the plane slope of the 3 × 3 neighborhood around it (with fewer neighbors for edge pixels) using the average maximum technique introduced by Burrough and Mcdonnel (1999). The lower the slope value, the flatter the terrain, and the elevations are considered more spatially continuous; the higher the slope value, the steeper the terrain, and the elevations are less spatially continuous.

Figure 11. Deep feature maps in a pre-trained generator (200 epochs) with a 10 × 10 uniform sampling. Each image is visualized using an independent color scale.

Figure 12. Correlations between local slopes and the accuracy of CEDGAN-based spatial interpolation.
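The per-pixel slope computation described above can be sketched as follows. Several assumptions are made and should be checked against the cited technique: the slope is estimated with the third-order finite difference form on a 3 × 3 window that GIS software commonly associates with the average maximum technique, `cell_size` is a placeholder for the pixel spacing, and edge pixels are handled by replicating the border, which may differ from the edge treatment used in the paper.

```python
import numpy as np

def slope_degrees(dem, cell_size=1.0):
    """Slope (in degrees) of each pixel from its 3 x 3 neighborhood.

    Window layout:   a b c
                     d e f
                     g h i
    Assumptions: `cell_size` is the pixel spacing in the same unit as the
    elevations; border pixels are padded by edge replication.
    """
    z = np.pad(np.asarray(dem, dtype=float), 1, mode="edge")
    a, b, c = z[:-2, :-2], z[:-2, 1:-1], z[:-2, 2:]
    d, f = z[1:-1, :-2], z[1:-1, 2:]
    g, h, i = z[2:, :-2], z[2:, 1:-1], z[2:, 2:]
    # Weighted differences of the outer columns (x) and outer rows (y).
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8.0 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8.0 * cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

# A plane rising by one elevation unit per pixel along the columns:
# the gradient magnitude in the interior is 1.
dem = np.tile(np.arange(32, dtype=float), (32, 1))
slope = slope_degrees(dem, cell_size=1.0)
print(slope[16, 16])
```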

On the left side of Figure 12, we display the real DEM images (Real), the corresponding slope images (Slope), the fake DEM images generated with u100 after 200 epochs (Fake), and the error images (Error) for a mini-batch of the validation set. The variation pattern of the error images seems to be correlated with that of the slope images. We then plot the relationship between the slopes and errors as a kernel density estimate on the right side of Figure 12. The slopes and errors are normalized to [0, 1] for the visualization. The Pearson correlation coefficient ρ = 0.44 and the Spearman correlation coefficient r = 0.52 indicate a positive correlation between the local slope of terrains and the interpolation error.
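The normalization and the two correlation coefficients used above can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's slope and error values: the helper names are hypothetical, and the Spearman coefficient is computed as the Pearson correlation of ranks under the assumption that there are no tied values.

```python
import numpy as np

def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        raise ValueError("values must not all be equal")
    return (values - values.min()) / span

def pearson(x, y):
    """Pearson correlation coefficient of two 1-D samples."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman rank correlation (assumption: no tied values)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return pearson(rank_x, rank_y)

# Synthetic per-pixel data: errors grow non-linearly with slope, plus noise.
rng = np.random.default_rng(7)
slopes = rng.uniform(0.0, 40.0, size=5000)
errors = 0.1 * slopes ** 1.5 + rng.normal(0.0, 5.0, size=slopes.size)
s = min_max_normalize(slopes)
e = min_max_normalize(errors)
print(pearson(s, e), spearman(s, e))
```

Min-max normalization is a linear rescaling, so it changes neither coefficient; it only affects how the density plot is drawn.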

Since the CEDGAN-based model is designed to capture spatial dependencies as basic knowledge through the convolutional layers in G's encoder-decoder structure, it is naturally more difficult to predict values at locations where the spatial attribute values vary too quickly. The slope analysis is also consistent with our model learning some typical local spatial patterns of attributes (as shown in Section 4.2.1); thus, an abnormal pattern, e.g. a very steep slope, is not easy to reproduce.

4.3. Potentials and limitations

4.3.1. Potential to apply pre-trained models across domains

Assuming the pre-trained CEDGAN model has learned enough DEM deep features because our training dataset covers various terrains in mainland China, including plateau mountain areas, basin areas, high-altitude plains as well as river deltas (Section 3.1), we hope to answer the question of how to solve problems in new domains through the transfer of the learned spatial knowledge. Applying the pre-trained model outside our study area can help test the model's generalization ability. If the deep spatial features captured before are capable of describing patterns in a new area, the generator should achieve satisfactory interpolation results without any parameter fine-tuning (Yosinski et al. 2014).

We choose data from Florence, Italy as a case study to conduct the transfer experiment. Florence is the capital of the Italian region of Tuscany. It lies in a basin surrounded by hills, with several rivers flowing through it. The elevations in this area range from 22 m to 1,626 m, which is quite different from the range of the selected areas in China (−7 m to 6,999 m). Meanwhile, the terrain of the Florence area can be considered a basin-mountain area with relatively low altitude, a terrain type that was not explicitly included in the previous model training for China.

Figure 13 illustrates how we use a generator pre-trained for China with a 6 × 6 uniform sampling configuration (u36) and 200 epochs of training to interpolate the DEM data of Florence, Italy. We cropped 3,000 real DEM images of size 1 × 32 × 32 using the same method explained in Section 3.1. The overall interpolation error is approximately 9.1 m per pixel before any model fine-tuning is performed. Some fake DEM images are displayed to help understand the result.

The accuracy for Florence is not as high as that obtained with u36 in Figure 7 (an error of 9.1 m versus 4.2 m), although it is acceptable because the terrains of Florence are indeed very different from those of the previous training set. The fake images in Figure 13 indicate that when transferring the pre-trained model to a new domain, the generator can still generate realistic-looking DEM images with local terrain structural patterns similar to those of the real images.


This experiment demonstrates that the deep feature maps captured by our model for China can be transferred to address new terrains in Florence. The pre-trained model can be applied across domains if the spatial features in the new domain can be considered roughly similar to those of the previous domain. However, if the features of the two domains are too different, e.g. when transferring from a DEM dataset to a meteorological dataset, the pre-trained model may need additional training data in the new domain to improve its performance. Depending on the domains, completely new training might be necessary if data are available.

4.3.2. Limitations and future directions

In this methodology-oriented paper, we use a large DEM dataset that contains various

ground-truth terrains to validate the feasibility and test the stability of our method.

Admittedly, spatial interpolation based on CEDGAN, as proposed in this research, has some limitations that future work should investigate.

This adversarial deep learning framework requires training data to capture the complex spatial patterns in certain domains and perform interpolation based on this learned knowledge. However, in most traditional GIS scenarios that need spatial interpolation, we often do not have access to sufficient ground-truth data to train the CEDGAN model. In Sections 4.2 and 4.3.1, we investigated the learned spatial deep features of our pre-trained model and tested its ability to be transferred across domains. In practice, spatial deep features may be highly different when we transfer a model pre-trained in a domain with sufficient spatial data into another domain with little ground truth. It is unrealistic to expect the pre-trained model to perform well in an unknown domain with no fine-tuning of the model's parameters.

As good spatial coverage of sampled data is essential to retaining local spatial variabilities in spatial interpolation, a lower sampling density leads to a worse interpolation result. This limitation cannot be overcome by existing methods. With the emergence of geospatial big data, the acquisition of historical spatial distribution datasets with very high spatiotemporal resolution has become much easier. It is possible to use these historical data as training sets and train our CEDGAN model to capture spatial knowledge about geographical phenomena of interest, and thus reproduce spatial patterns with more realistic details.

Figure 13. Using the pre-trained generator for China to interpolate data on Florence, Italy.

For example, if we have the historical precipitation data of N meteorological observatories, it is practical to train a CEDGAN with only a small number of observatories as the sampled locations. In this way, we can reduce the number of active observatories to achieve cost savings. Moreover, if the captured deep spatial patterns of precipitation are representative, we can directly transfer the pre-trained generator into a new area with insufficient meteorological observatories. Similarly, in a smart city, multiple types of sensors are deployed with high spatial resolution to record the activities of urban citizens. The CEDGAN-based spatial interpolation idea can help significantly reduce the number of sensors and contribute to the development of a smart city.

5. Conclusions

Deep learning approaches are increasingly used to understand spatial processes from a data-driven perspective, as they are powerful in terms of their ability to extract underlying patterns given complex spatial contexts. The remarkable characteristics of convolutional neural networks, namely local connectivity and shared weights, enable deep learning models to better focus on both nearby and far-away features and thus provide a way to non-linearly approximate the complex functions describing spatial patterns.

Spatial interpolation is a family of geostatistical methods that attempt to capture the spatial variation patterns underlying a limited set of observed spatial samples and make a reasonable estimation of spatial patterns based on both spatial continuity and heterogeneity. Since the workflow of spatial interpolation can basically be regarded as a generative procedure, we demonstrate, for the first time, the feasibility of spatial interpolation based on a modern deep learning framework named conditional generative adversarial neural networks. We design a conditional encoder-decoder generative adversarial network (CEDGAN) that can capture the complex properties of input spatial data distributions and perform spatial interpolation tasks under different circumstances. A CEDGAN consists of a generator G and a discriminator D. The generator G attempts to learn the relationships among sampled spatial data and the corresponding real spatial data, and it uses the learned spatial knowledge to generate fake spatial data as accurately as possible. The discriminator D captures the correspondences among spatial data and their sampled data, with the objective of determining whether the generated fake data from G can be considered correct.

A case study on terrain interpolation for China showed that the CEDGAN-based method can achieve an error of approximately 2.5 meters per location even when the sampling ratio is less than 10%. Different sampling configurations were adopted to test the stability of our proposed method. The CEDGAN-based spatial interpolation outperforms benchmark approaches, such as inverse distance weighted (IDW) interpolation and ordinary kriging (OK), in terms of accuracy, batching capability, computing speed and visual fidelity. In addition, multiple experiments were conducted to investigate the learned complex spatial knowledge in pre-trained models, and we discussed the potential of generalizing the CEDGAN-based spatial interpolation idea to a broader range of GIS applications.


Our work is a positive attempt to incorporate artificial intelligence into discovering deep spatial features of geographical patterns. We introduce the idea of using conditional adversarial generation to model the workflow of spatial interpolation and hope to enlighten future work concerning spatial prediction. With the rapid development of big geo-data and artificial intelligence, the CEDGAN framework can potentially be adopted in various geographic applications that are related to spatial estimation, including both natural phenomena (precipitation, air temperature, air pressure, etc.) and socio-economic phenomena (population, poverty, traffic, etc.).

Notes

1. METI of Japan and NASA released a second version of the Global Digital Elevation Model (GDEM) from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) in mid-October, 2011 (https://lpdaac.usgs.gov/). GDEM V2 has an overall accuracy of approximately 17 m at the 95% confidence level, and we consider these data as the ground-truth elevations in this work.

2. https://pypi.python.org/pypi/PyKrige.

Acknowledgments

The authors would like to thank Dr. Lei Dong, Dr. Michael Goodchild, Dr. Tao Cheng, Dr. Krzysztof

Janowicz, Dr. May Yuan and the anonymous referees for their insightful comments.

Disclosure statement

No potential conflict of interest was reported by the authors.

Funding

This research was supported by the National Natural Science Foundation of China [41625003 and 41830645], the National Key Research and Development Program of China [2017YFB0503602] and the Open Project Fund of the Institute for China Sustainable Urbanization, Tsinghua University (TUCSU-K-17026-01).

Notes on contributors

Di Zhu received his B.S. in Geographic Information Systems from Peking University and a dual B.S. in Economics also from Peking University. He is currently a PhD candidate at the Institute of Remote Sensing and Geographical Information Systems, Peking University. His research interests include geospatial modelling, social sensing and applied artificial intelligence.

Ximeng Cheng received the B.S. and M.S. degrees from China University of Geosciences (Beijing). He is currently a PhD candidate in GIScience at the Institute of Remote Sensing and Geographical Information Systems, Peking University. His research interests include spatiotemporal data mining, deep learning and urban analysis.

Fan Zhang received his B.S. degree from Beijing Normal University, Zhuhai and his M.Sc. and PhD degrees from the Chinese University of Hong Kong. He is currently a postdoctoral fellow at the Institute of Remote Sensing and Geographical Information Systems, Peking University. His research interests include spatiotemporal data mining, machine learning and computer vision.

Xin Yao received his B.S. degree from Wuhan University in 2015. He is currently pursuing the PhD degree in GIScience with the Institute of Remote Sensing and Geographical Information Systems, Peking University. His primary research interest lies in spatial data mining and geographic information visualization.

Yong Gao received the B.S. degree from Beijing Normal University in 1997 and the M.S. and PhD degrees from Peking University in 2000 and 2003, respectively. He is currently an Associate Professor of GIScience with the Institute of Remote Sensing and Geographical Information Systems, Peking University. His research interests lie in spatial data mining, geographic information retrieval, and high-performance computing with geographical data.

Yu Liu received the B.S., M.S. and PhD degrees from Peking University in 1994, 1997 and 2003, respectively. He is currently a professor at the Institute of Remote Sensing and Geographical Information Systems, Peking University. His research interests mainly concentrate on humanities and social science based on big geo-data.

ORCID

Di Zhu http://orcid.org/0000-0002-3237-6032

Ximeng Cheng http://orcid.org/0000-0001-9923-7240

Yu Liu http://orcid.org/0000-0002-0016-2902

References

Anselin, L., 1995. Local indicators of spatial association–LISA. Geographical Analysis, 27 (2), 93–115.

doi:10.1111/j.1538-4632.1995.tb00338.x

Antipov, G., Baccouche, M., and Dugelay, J.L., 2017. Face aging with conditional generative

adversarial networks. In:2017 IEEE International Conference on Image Processing (ICIP),

2089–2093, Beijing, China.

Appelhans, T., et al., 2015. Evaluating machine learning approaches for the interpolation of

monthly air temperature at Mt. Kilimanjaro, Tanzania. Spatial Statistics, 14, 91–113.

doi:10.1016/j.spasta.2015.05.008

Atkinson, P.M. and Lloyd, C.D., 2009. Geostatistics and spatial interpolation. In:The SAGE handbook

of spatial analysis. 159–181. London, United Kingdom: SAGE Publications.

Azaele, S., et al., 2009. Predicting spatial similarity of freshwater ﬁsh biodiversity. Proceedings of the

National Academy of Sciences, 106 (17), 7058–7062. doi:10.1073/pnas.0805845106

Badrinarayanan, V., Kendall, A., and Cipolla, R., 2017. SegNet: a deep convolutional

encoder-decoder architecture for scene segmentation. IEEE Transactions on Pattern Analysis &

Machine Intelligence, 39 (12), 2481–2495.

Burrough, P.A. and Mcdonnel, R.A., 1999. Principles of geographical information systems - spatial

information systems and geostatistics. Landscape & Urban Planning, 15 (3), 357–358.

Chen, Z., et al., 2016. Convolutional neural network based DEM super resolution. ISPRS -

International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences,

XLI-B3, 247–250. doi:10.5194/isprsarchives-XLI-B3-247-2016

Cochran, W.G., 1963.Sampling techniques. Hoboken, New Jersey, US: Wiley.

Cressie, N., 1990. The origins of kriging. Mathematical Geology,22,239–252. doi:10.1007/BF00889887

Diggle, P.J., Tawn, J.A., and Moyeed, R.A., 1998. Model-based geostatistics. Journal of the Royal

Statistical Society: Series C (Applied Statistics), 47 (3), 299–350. doi:10.1111/1467-9876.00113

Fischer, M.M., 1998. Computational neural networks: a new paradigm for spatial analysis.

Environment and Planning A, 30 (10), 1873–1891. doi:10.1068/a301873

Fischer, M.M., Reismann, M., and And Scherngell, T., 2010. Spatial interaction and spatial auto-

correlation. In: L. Anselin and S.J. Rey, eds. Perspectives on spatial data analysis. Berlin,

Heidelberg: Springer Berlin Heidelberg, 61–79. doi:10.1007/978-3-642-01976-0_5

22 D. ZHU ET AL.

Fotheringham, A.S. and Rogerson, P.A., 2008.The SAGE handbook of spatial analysis. London,

United Kingdom: SAGE Publications.

Fotheringham, A.S., Yang, W., and Kang, W., 2017. Multiscale geographically weighted regression

(MGWR). Annals of the American Association of Geographers, 107 (6), 1247–1265. doi:10.1080/24694452.2017.1352480

Gauthier, J., 2014. Conditional generative adversarial nets for convolutional face generation. In:

Class project for Stanford CS231N: convolutional neural networks for visual recognition, Winter

semester. 5. Stanford, CA, US.

Goodchild, M.F., 2004. GIScience, geography, form, and process. Annals of the Association of

American Geographers, 94 (4), 709–714.

Goodchild, M.F., Anselin, L., and Deichmann, U., 1993. A framework for the areal interpolation of

socioeconomic data. Environment & Planning A, 25 (3), 383–397. doi:10.1068/a250383

Goodfellow, I.J., et al., 2014. Generative adversarial nets. In: Advances in Neural Information

Processing Systems, 2672–2680. Montréal, Canada.

adversarial networks. ACM Transactions on Graphics, 36 (6), Article No. 228.

Hedayat, A. and Sinha, B.K., 1991. Design and inference in finite population sampling. Hoboken, New

Jersey, US: Wiley.

Hengl, T., Heuvelink, G.B., and Rossiter, D.G., 2007. About regression-kriging: from equations to

case studies. Computers & Geosciences, 33 (10), 1301–1315. doi:10.1016/j.cageo.2007.05.001

Hubert, L.J., Golledge, R.G., and Costanzo, C.M., 1981. Generalized procedures for evaluating spatial

autocorrelation. Geographical Analysis, 13 (3), 224–233. doi:10.1111/j.1538-4632.1981.tb00731.x

Ioffe, S. and Szegedy, C., 2015. Batch normalization: accelerating deep network training by

reducing internal covariate shift. International Conference on Machine Learning, 448–456. Lille,

France.

Isola, P., et al., 2016. Image-to-image translation with conditional adversarial networks. arXiv

preprint, p. arXiv:1611.07004.

Kingma, D.P. and Welling, M., 2014. Auto-encoding variational bayes. In: International Conference

on Learning Representations (ICLR) 2014. Banff, Canada.

Laloy, E., et al., 2018. Training-image based geostatistical inversion using a spatial generative

adversarial neural network. Water Resources Research, 54, 381–406. doi:10.1002/2017WR022148

Lam, N., 2009. Spatial interpolation. International Encyclopedia of Human Geography, 10 (2), 369–376.

Le, Q.V., 2013. Building high-level features using large scale unsupervised learning. In: IEEE

International Conference on Acoustics, Speech and Signal Processing, 8595–8598. Vancouver,

Canada.

LeCun, Y., Bengio, Y., and Hinton, G., 2015. Deep learning. Nature, 521 (7553), 436–444.

doi:10.1038/nature14539

Li, J. and Heap, A.D., 2011. A review of comparative studies of spatial interpolation methods in

environmental sciences: performance and impact factors. Ecological Informatics, 6 (3), 228–241.

doi:10.1016/j.ecoinf.2010.12.003

Li, L., Romary, T., and Caers, J., 2015. Universal kriging with training images. Spatial Statistics, 14,

240–268. doi:10.1016/j.spasta.2015.04.004

Long, J., Shelhamer, E., and Darrell, T., 2015. Fully convolutional networks for semantic segmentation.

In: Proceedings of the IEEE conference on computer vision and pattern recognition.

3431–3440. Boston, MA, US.

Lu, Y., Tai, Y.W., and Tang, C.K., 2017. Conditional CycleGAN for attribute guided face image

generation. arXiv preprint, p. arXiv:1705.09966.

Mansimov, E., Parisotto, E., Ba, J.L., and Salakhutdinov, R., 2015. Generating images from captions

with attention. arXiv preprint, p. arXiv:1511.02793.

Mariethoz, G. and Caers, J., 2014. Multiple-point geostatistics: stochastic modeling with training

images. Hoboken, New Jersey, US: Wiley.

Marsily, G.D., et al., 2005. Dealing with spatial heterogeneity. Hydrogeology Journal, 13 (1), 161–183.

doi:10.1007/s10040-004-0432-3

Matheron, G., 1963. Principles of geostatistics. Economic Geology, 58 (8), 1246–1266. doi:10.2113/gsecongeo.58.8.1246

Mirza, M. and Osindero, S., 2014. Conditional generative adversarial nets. arXiv preprint, p.

arXiv:1411.1784.

Nair, V. and Hinton, G.E., 2010. Rectified linear units improve restricted Boltzmann machines.

International Conference on Machine Learning, 807–814. Haifa, Israel.

Oliver, M.A. and Webster, R., 1990. Kriging: a method of interpolation for geographical information

systems. International Journal of Geographical Information Systems, 4 (3), 313–332. doi:10.1080/02693799008941549

Radford, A., Metz, L., and Chintala, S., 2015. Unsupervised representation learning with deep

convolutional generative adversarial networks. arXiv preprint, p. arXiv:1511.06434.

Ronneberger, O., Fischer, P., and Brox, T., 2015. U-Net: convolutional networks for biomedical

image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted

Intervention, 234–241. Munich, Germany.

Salimans, T., et al., 2016. Improved techniques for training GANs. In: Advances in Neural Information

Processing Systems. 2234–2242. Barcelona, Spain.

Schmidhuber, J., 2014. Deep learning in neural networks: an overview. Neural Networks, 61,

85–117. doi:10.1016/j.neunet.2014.09.003

Shepard, D., 1968. A two-dimensional interpolation function for irregularly-spaced data. In: ACM

National Conference, 517–524. New York, NY, US. doi:10.1055/s-0028-1105114

Thompson, S.K., 1996. Adaptive sampling. Hoboken, New Jersey, US: Wiley.

Tobler, W.R., 1970. A computer movie simulating urban growth in the Detroit region. Economic

Geography, 46, 234–240. doi:10.2307/143141

Wang, D., et al., 2010. Morphometric characterisation of landform from DEMs. International Journal

of Geographical Information Science, 24 (2), 305–326. doi:10.1080/13658810802467969

Xu, B., et al., 2015. Empirical evaluation of rectified activations in convolutional network. arXiv

preprint, p. arXiv:1505.00853.

Yosinski, J., et al., 2014. How transferable are features in deep neural networks? In: Advances in

Neural Information Processing Systems. 3320–3328. Montréal, Canada.

Zhao, L., et al., 2019. Simultaneous color-depth super-resolution with conditional generative

adversarial networks. Pattern Recognition, 88, 356–369. doi:10.1016/j.patcog.2018.11.028

Zhu, D., et al., 2018. Inferring spatial interaction patterns from sequential snapshots of spatial

distributions. International Journal of Geographical Information Science, 32 (4), 783–805.

doi:10.1080/13658816.2017.1413192


