

Citation: Wang, C.; Zhang, Y.; Xie, T.; Guo, L.; Chen, S.; Li, J.; Shi, F. A Detection Method for Collapsed Buildings Combining Post-Earthquake High-Resolution Optical and Synthetic Aperture Radar Images. Remote Sens. 2022, 14, 1100. https://doi.org/10.3390/rs14051100

Academic Editors: Wojciech Drzewiecki, Beata Hejmanowska and Sławomir Mikrut

Received: 20 December 2021; Accepted: 21 February 2022; Published: 23 February 2022

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Article
A Detection Method for Collapsed Buildings Combining
Post-Earthquake High-Resolution Optical and Synthetic
Aperture Radar Images
Chao Wang 1, Yan Zhang 1, Tao Xie 2,3,*, Lin Guo 4,5, Shishi Chen 2, Junyong Li 1 and Fan Shi 1

1 School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China; chaowang@nuist.edu.cn (C.W.); 20191218023@nuist.edu.cn (Y.Z.); 20211249174@nuist.edu.cn (J.L.); 20191219098@nuist.edu.cn (F.S.)
2 School of Remote Sensing and Geomatics Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China; 20201235001@nuist.edu.cn
3 Laboratory for Regional Oceanography and Numerical Modeling, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
4 Research and Development Center of Postal Industry Technology, School of Modern Posts, Institute of Modern Posts, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; guolin@njupt.edu.cn
5 National Laboratory of Solid State Microstructures, Nanjing University, Nanjing 210093, China
* Correspondence: xietao@nuist.edu.cn
Abstract:
The detection of collapsed buildings based on post-earthquake remote sensing images eliminates the dependence on pre-earthquake data, which is of great significance for carrying out emergency response in time. The difficulty of obtaining, or the outright lack of, elevation information, which is strong evidence for determining whether buildings have collapsed, is the main challenge in the practical application of this approach. On the one hand, the introduction of double bounce features in synthetic aperture radar (SAR) images is helpful for judging whether buildings have collapsed. On the other hand, because SAR images are limited by their imaging mechanism, it is necessary to introduce the spatial details of optical images as a supplement in the detection of collapsed buildings. Therefore, a detection method for collapsed buildings combining post-earthquake high-resolution optical and SAR images was proposed by mining complementary information between traditional visual features and double bounce features from multi-source data. In this method, a strategy of optical and SAR object set extraction based on an inscribed center (OpticalandSAR-ObjectsExtraction) was first put forward to extract a unified optical-SAR object set. Based on this, a quantitative representation of collapse semantic knowledge in double bounce (DoubleBounceCollapseSemantic) was designed to bridge the semantic gap between double bounce and the collapse features of buildings. Ultimately, the final detection results were obtained based on improved active learning support vector machines (SVMs). Multi-group experimental results on post-earthquake multi-source images show that the overall accuracy (OA) and the detection accuracy for collapsed buildings (Pcb) of the proposed method can reach more than 82.39% and 75.47%, respectively. The proposed method is therefore significantly superior to many advanced methods used for comparison.
Keywords: remote sensing images; multi-source data; collapsed buildings; double bounce
1. Introduction
Timely and accurate evaluation of earthquake damage to buildings is an important part of disaster surveillance [1]. Compared with traditional field survey methods, remote sensing technology, which adopts a remote imaging mode, has many advantages, such as timely acquisition of information and not being limited by field conditions, so it has become the main technical means for extracting earthquake damage information of buildings [2,3].
In recent years, the detection of buildings subjected to earthquake damage based on remote sensing images has mainly focused on the identification of collapsed buildings [4,5]. The reason is that collapsed buildings are usually severely damaged and people may be trapped inside, making them primary targets in post-earthquake emergency response and rescue [6]. In complex post-earthquake scenarios, collapsed and non-collapsed buildings generally differ significantly in height. Therefore, introducing elevation information into traditional high-resolution remote sensing images can provide direct evidence for judging whether buildings have collapsed. Even so, the acquisition of digital elevation data, such as light detection and ranging (LiDAR) data, usually requires extracting ground control points, with high computational complexity and time costs; it is therefore difficult to meet the timeliness requirements for the detection of collapsed buildings after earthquakes [7,8]. For this reason, it is necessary to design a reliable detection method for collapsed buildings in the event of a lack of elevation data. According to the data sources used, detection methods for collapsed buildings can be classified into three categories: (1) methods based on pre-earthquake and post-earthquake images; (2) methods based on post-earthquake images only; and (3) methods combining elevation data.
(1) Methods based on pre-earthquake and post-earthquake images: Such methods evaluate the damage degree of buildings by extracting changes of typical features from pre-earthquake/post-earthquake image pairs [9]. Because pre-earthquake data are introduced for reference, other ground objects with features similar to collapsed buildings that already existed before the earthquake can generally be effectively eliminated from the detection results. In spite of this, normal urban evolution may also produce abundant changes in addition to earthquake impact. Furthermore, the lack of pre-earthquake data after earthquakes is often the bottleneck that restricts the popularization and application of such methods [10–12].
(2) Methods based on post-earthquake images: Such methods eliminate the dependence on pre-earthquake data and have stronger universality than methods based on pre-earthquake and post-earthquake images [13]. Collapsed buildings are depicted by manually defined or automatically extracted features, such as spectral, texture and spatial features, and an appropriate classifier is then selected for prediction [14]. Even so, the diversity of collapsed buildings and the complexity of post-earthquake scenarios aggravate the problems of different objects with the same spectra and the same object with different spectra, which requires more discriminative classification models. Furthermore, the lack of elevation information, the direct evidence for determining whether buildings have collapsed, is still the main challenge in the practical application of such methods [15].
(3) Methods combining elevation data: Based on remote sensing images, elevation information provided by elevation data, such as LiDAR and digital elevation models (DEMs), is used in such methods as a strong basis for determining whether buildings have collapsed [16–18]. Although remote sensing images are strongly complementary with elevation data, it is not common practice to specially collect and produce elevation data only for the detection of collapsed buildings. In addition, there is no reliable method for scanning and measuring collapsed buildings at present.
Compared with traditional machine learning, deep learning adopts a deep nonlinear network structure to approximate complex functions through hierarchical learning, thus extracting high-level features [19]. In recent years, scholars have carried out research on semantic segmentation with the deep learning technique Mask Region-Based Convolutional Neural Network (Mask RCNN) and obtained a great deal of research findings in remote sensing applications. For example, Li et al. [20] proposed a novel Histogram Thresholding Mask Region-Based Convolutional Neural Network (HTMask R-CNN), which utilizes the significant differences between old and new buildings in the grayscale histogram to improve the classification ability of the model. Mahmoud et al. [21] proposed an adaptive Mask RCNN framework to detect multi-scale objects in optical remote sensing images; in this method, the standard convolutional neural network in Mask RCNN is replaced by ResNet50 to overcome the vanishing gradient problem. Zhao et al. [22] proposed a method combining Mask RCNN with building boundary regularization, which produces better regularized polygons. Beyond building detection, many state-of-the-art Mask RCNN semantic segmentation methods have been proposed for other applications. For example, Bhuiyan et al. [23] developed a high-throughput mapping workflow to automatically detect and classify ice-wedge polygons (IWPs), and Witharana et al. [24] gauged the influence of spectral and spatial artifacts on the prediction accuracies of CNN models by using Mask RCNN. Despite this, training with deep learning methods is usually carried out on training samples from specific areas, so the portability of the models remains unclear. In the meantime, the production and manual annotation of sample sets after earthquakes are very time consuming and laborious, which seriously restricts the application of such methods in the detection of collapsed buildings.
In conclusion, machine learning methods based on post-earthquake images neither rely on pre-earthquake data nor require a large number of training samples, so they have unique advantages in usability and timeliness. In view of the lack of elevation information in such methods, introducing double bounce features can provide supplementary information. Among the different scattering contributions present in high-resolution synthetic aperture radar (SAR) images, the double bounce (caused by the corner reflector formed by the front wall and its adjacent ground), with its linear characteristics, indicates the presence of a building or other artificial target. However, double bounce features are influenced by the orientation angle of the buildings and the ground material, as well as by the polarization. Ferro [25] demonstrated that the double bounce effect has a strong power signature for buildings whose wall on the side closest to the sensor is almost parallel to the SAR azimuth direction. In polarimetric SAR images, the double bounce intensity of cross polarization is usually weaker than that of co-polarization, and the double bounce intensity decreases as the angle between the orientation of the buildings and the SAR azimuth direction increases. In the post-earthquake scenario, the collapse of a building results in a reduction of the 'ground-wall' dihedral structure. The main difference in the scattering mechanism before and after building collapse is the change from a predominantly double-bounce scattering mechanism to a predominantly single-bounce scattering mechanism. This is embodied in real post-earthquake SAR data, in which the double bounce generally appears as a bright line parallel to the wall of a non-collapsed building. In comparison, the double bounce of collapsed buildings is not significant or exhibits disorderly distributed speckle noise [26]. Taking SAR and optical satellite images acquired after the 2011 Sendai, Japan earthquake as examples, the different manifestations of the double bounce of collapsed and non-collapsed buildings are displayed in Figure 1. Therefore, double bounce may indirectly reflect elevation information and improve the accuracy of collapsed building detection.
However, SAR images inevitably have problems, such as lack of spectral information,
complex noise, and blur degradation, so it is not reliable to detect collapsed buildings
by only relying on SAR images. Meanwhile, the spectral and spatial details contained in
high-resolution optical images are favorable for accurate location and profile extraction of
buildings. Therefore, based on the combination of post-earthquake high-resolution optical
and SAR images, traditional visual features of optical images such as spectra, texture and
morphology features can be combined with double bounce features. This can provide a
new technical path for accurately and reliably detecting collapsed buildings with a lack of
elevation information. In particular, in complex environments such as urban areas, many
scattering contributions from small structures with possibly different materials interfere,
which are not considered in the currently reported theoretical models.
Figure 1. The optical images and the corresponding SAR images: (a) optical images of non-collapsed buildings; (b) SAR images of non-collapsed buildings; (c) optical images of collapsed buildings; (d) SAR images of collapsed buildings. (In (b,d), green boxes represent double bounce of non-collapsed buildings, while red boxes indicate double bounce of collapsed buildings.)
To achieve the complementary advantages of high-resolution optical and SAR images, the priority is to establish a unified object set from the multi-source data. However, due to the great difference in imaging mechanisms between optical and SAR images, the same object may have significantly different manifestations in the two types of data, so it is difficult to extract profile pairs belonging to the same object from heterologous images. In addition, at present there are few quantitative representations and analysis methods for the collapse semantic knowledge contained in double bounce. Finally, the combination of multi-source data means that the annotation of training samples is more time consuming and laborious. Therefore, a reliable effectiveness measure is needed to fully mine and select representative training samples, so as to improve the detection efficiency and accuracy of collapsed buildings.

In view of the above challenges, a non-deep learning method for detecting collapsed buildings by combining post-earthquake high-resolution optical and SAR images was proposed in this research. Firstly, a strategy of optical and SAR object set extraction based on the inscribed center (OpticalandSAR-ObjectsExtraction) was designed to provide unified analysis elements for the subsequent feature modeling and detection of collapsed buildings. The inscribed center is the center of the circle with the largest radius inside the boundary of the object. Based on this, a quantitative representation of collapse semantic knowledge in double bounce (DoubleBounceCollapseSemantic) was constructed according to the spatial distribution of double bounce. After that, feature modeling of collapsed buildings was performed based on traditional visual features and double bounce features. Finally, the samples were refined based on a category uncertainty index (CUI) between the samples to be tagged and the tagged samples to optimize the active learning process, thus detecting collapsed buildings.

The novel contributions of the proposed method are as follows: (1) The proposed OpticalandSAR-ObjectsExtraction overcomes the imaging differences between heterologous images and extracts a unified object set from optical and SAR images. (2) The proposed DoubleBounceCollapseSemantic provides a way to quantitatively extract double bounce features from SAR images, which can significantly improve the accuracy of collapsed building detection. (3) The CUI is put forward to improve the training process of active learning support vector machines (SVMs), which is conducive to fully mining and selecting representative training samples.
2. Methodology
The proposed method mainly included four steps: building the unified optical-SAR
object set based on OpticalandSAR-ObjectsExtraction; extracting double bounce features
based on DoubleBounceCollapseSemantic; extracting traditional visual features based
on morphological attribute profiles (MAPs); and detecting collapsed buildings based on
improved active learning SVMs. The specific realization process is displayed in Figure 2.
Figure 2. Specific flow of the method.
2.1. Construction of the Unified Optical-SAR Object Set Based on OpticalandSAR-ObjectsExtraction
To construct the unified optical-SAR object set, the proposed OpticalandSAR-ObjectsExtraction comprises three main steps: image segmentation, establishment of a coarse registration-based affine transformation equation, and projection of the inscribed centers of objects followed by region growing.
2.1.1. Image Segmentation
Firstly, the two images were segmented, and the inscribed center of each object was taken as a feature point in the segmentation results to establish the coarse registration-based affine transformation equation. The optical image was segmented using the well-known commercial software eCognition to obtain the object set R_opt of the optical image [27]. The segmentation was performed with the following parameters: scale parameter, 30; shape, 0.5; compactness, 0.2. Furthermore, an iterated conditional modes (ICM) algorithm based on Markov random fields is conducive to highlighting foreground targets, including buildings, in SAR image segmentation, so this method was used to obtain the object set R_SAR of the SAR image [28]. This method was adopted because, on the one hand, it is a statistics-based image segmentation algorithm that is spatially constrained and has few model parameters; on the other hand, it has been successfully applied, with good results, to SAR images of different polarizations, such as single-, dual- and full-polarimetric data.
2.1.2. Establishment of the Coarse Registration-Based Affine Transformation Equation
In R_opt and R_SAR, matched object pairs are searched as the basis for establishing the affine transformation equation. Because moment invariants are invariant to translation, rotation and scaling, the seven Hu moment invariants are taken as the measure of similarity between objects [29]. The specific steps are as follows:

Step 1: Based on Equation (1), the distance between the moment invariants of the i-th object in R_opt and the j-th object in R_SAR is calculated, and all possible combinations are traversed:

d_{ij} = \sqrt{ \sum_{n=1}^{7} [ \varphi_i(n) - \psi_j(n) ]^2 }    (1)

where \varphi_i(n) and \psi_j(n) represent the n-th moment invariant of the i-th object in the optical image and the n-th moment invariant of the j-th object in the SAR image, respectively.

Step 2: The object with the smallest moment-invariant distance is selected from R_SAR for each object in R_opt to construct a set of matched object pairs R_opt-SAR. Likewise, the object with the minimum moment-invariant distance is selected from R_opt for each object in R_SAR to constitute another set of matched object pairs R_SAR-opt.

Step 3: The matched object pairs common to R_opt-SAR and R_SAR-opt are retained as the final set of matched object pairs R_match.

Step 4: Because the inscribed circle of each object is bound to exist and to be located inside the object, the inscribed centers of each object in R_match can be calculated. On this basis, each matched object pair yields a pair of matched inscribed centers (feature points), thus providing the set of matched feature points P_match required for establishing the affine transformation equation.

Step 5: By combining P_match with Equation (2), the affine transformation equation between the optical and SAR images can be established:

x' = a_0 + a_1 x + a_2 y
y' = b_0 + b_1 x + b_2 y    (2)
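As an illustration of Steps 1–5, the following Python sketch (OpenCV and NumPy assumed; all function names are hypothetical, not from the original paper) matches objects by the Hu-moment distance of Equation (1), keeps only mutual nearest neighbors as R_match, and fits the affine coefficients of Equation (2) by ordinary least squares:

```python
import cv2
import numpy as np

def hu_distance(mask_a, mask_b):
    """Equation (1): Euclidean distance between the seven Hu moment
    invariants of two binary object masks."""
    phi = cv2.HuMoments(cv2.moments(mask_a, binaryImage=True)).ravel()
    psi = cv2.HuMoments(cv2.moments(mask_b, binaryImage=True)).ravel()
    return np.sqrt(np.sum((phi - psi) ** 2))

def mutual_matches(optical_masks, sar_masks):
    """Steps 2-3: keep only pairs that are each other's nearest
    neighbor in Hu-moment distance (R_match)."""
    d = np.array([[hu_distance(o, s) for s in sar_masks]
                  for o in optical_masks])
    opt_to_sar = d.argmin(axis=1)          # R_opt-SAR
    sar_to_opt = d.argmin(axis=0)          # R_SAR-opt
    return [(i, j) for i, j in enumerate(opt_to_sar)
            if sar_to_opt[j] == i]         # R_match

def fit_affine(src_pts, dst_pts):
    """Step 5 / Equation (2): least-squares affine coefficients
    (a0, a1, a2) and (b0, b1, b2) from matched inscribed centers."""
    src = np.asarray(src_pts, dtype=float)
    a = np.column_stack([np.ones(len(src)), src])   # design matrix [1, x, y]
    coeff_x, *_ = np.linalg.lstsq(a, np.asarray(dst_pts, dtype=float)[:, 0], rcond=None)
    coeff_y, *_ = np.linalg.lstsq(a, np.asarray(dst_pts, dtype=float)[:, 1], rcond=None)
    return coeff_x, coeff_y
```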
2.1.3. Projection of Inscribed Centers of Objects and Region Growing
The objects in the SAR image that match each object in the optical image are then searched. Based on the coarse registration results, the inscribed centers of each object in R_opt are directly projected into the SAR image according to the affine transformation equation, thus obtaining a set of projection points in the SAR image. Based on region growing from the projection points, the SAR image is divided into connected regions corresponding to each object in R_opt [30], so as to finally acquire the unified optical-SAR object set R_uni.
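A minimal sketch of the inscribed-center computation and its projection, under the assumption that the maximum of the distance transform gives the center of the largest inscribed circle (function names are hypothetical; the region-growing step is omitted):

```python
import cv2
import numpy as np

def inscribed_center(mask):
    """Center of the largest circle inscribed in the object: the pixel
    with the maximum distance-transform value inside the binary mask."""
    dist = cv2.distanceTransform(mask.astype(np.uint8), cv2.DIST_L2, 5)
    y, x = np.unravel_index(np.argmax(dist), dist.shape)
    return x, y

def project_point(pt, coeff_x, coeff_y):
    """Project an optical-image point into the SAR image with the
    affine coefficients of Equation (2)."""
    x, y = pt
    return (coeff_x[0] + coeff_x[1] * x + coeff_x[2] * y,
            coeff_y[0] + coeff_y[1] * x + coeff_y[2] * y)
```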
2.2. Extraction of Double Bounce Features Based on DoubleBounceCollapseSemantic

To extract the collapse semantic features contained in double bounce, two parts were designed: detection of potential double bounce pixels (PDBPs) and construction of a collapse semantic histogram.
2.2.1. Detection of PDBPs
Since double bounce appears as a bright line in the SAR image, the Hough transform was first used for line detection [31], with LOG operators used for edge detection, to obtain a set of initial potential double bounce pixels (IPDBPs). On this basis, for any pixel e in the IPDBPs, pixels belonging to the IPDBPs are searched in its eight-neighborhood. If there is only one pixel meeting the condition, the pixel e is regarded as an endpoint. In this case, pixels belonging to the IPDBPs are searched continuously in a 5 × 5 window centered on e, and the pixels shared by the eight-neighborhoods of these pixels and the eight-neighborhood of e are all taken as PDBPs. By traversing all pixels, the final set of PDBPs is extracted from the SAR image.
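The IPDBP stage can be sketched as follows with OpenCV, using a Laplacian-of-Gaussian edge response followed by a probabilistic Hough transform; the thresholds are illustrative assumptions, and the endpoint/5 × 5 window refinement described above is omitted:

```python
import cv2
import numpy as np

def detect_ipdbps(sar, sigma=1.0, edge_thresh=0.1):
    """Initial potential double bounce pixels (IPDBPs): LoG edge
    response on the SAR amplitude image, then Hough line detection so
    that only pixels lying on bright linear structures survive."""
    img = cv2.normalize(sar.astype(np.float32), None, 0, 1, cv2.NORM_MINMAX)
    log = cv2.Laplacian(cv2.GaussianBlur(img, (0, 0), sigma), cv2.CV_32F)
    edges = (np.abs(log) > edge_thresh).astype(np.uint8) * 255
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=30,
                            minLineLength=10, maxLineGap=2)
    mask = np.zeros_like(edges)
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            cv2.line(mask, (x1, y1), (x2, y2), 255, 1)  # rasterize lines
    return mask > 0
```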
2.2.2. Construction of the Collapse Semantic Histogram
In the SAR image, a visual vocabulary based on collapse semantics was designed, and the collapse semantic histogram was constructed by combining the spatial relationship between R_uni and the PDBPs. Let the total number of objects in R_uni be N. For any object R_uni^c (c = 1, 2, 3, ..., N), the set of visual words and the DoubleBounceCollapseSemantic rules are defined as follows:
Pixel set 1 of non-collapsed buildings. The double bounce of a non-collapsed building usually appears as a bright line at the corner of the building. Therefore, double bounce line segments with the features of non-collapsed buildings overlap or are adjacent to the profiles of the objects, showing a similar curvature and direction and a certain length. The specific search and discrimination steps are as follows:

Step 1: The blurred line segment L~ that overlaps or is adjacent to the R_uni^c profile is searched first. Starting from any pixel g on the profile, PDBPs are searched in the eight-neighborhood of g. If there is a PDBP, defined as r, PDBPs in the eight-neighborhood of r are searched. The newly found PDBPs and r are retained, and the fitted line L^ is obtained from these pixels. On this basis, new PDBPs are searched continuously in the eight-neighborhoods of the newly found PDBPs. If one exists, its distance to L^ is calculated; when the distance is smaller than w, this PDBP is retained. In a similar way, all possible pixels are traversed, and all retained PDBPs constitute the blurred line segment L~.

Step 2: For the next pixel g' on the profile, the blurred line segment corresponding to g' can be obtained by repeating Step 1. All points on the profile are traversed to form a candidate blurred line-segment set S1. All blurred line segments longer than T_a are retained to constitute the blurred line-segment set S2.

Step 3: For the foot points of the two endpoints of any line segment L_S2 in the set S2 on the profile of the object, the segment L'_S2 of the profile between the foot points is intercepted, and the line segments L_S2 that simultaneously meet the following two conditions constitute a blurred line-segment set S3: (1) the difference between the average curvatures of L_S2 and L'_S2 is smaller than the threshold T_b; (2) L_S2 and L'_S2 are fitted by straight lines using ordinary least squares, and the slope difference of the two straight lines is smaller than the threshold T_c. S3 is the constructed visual word. It should be pointed out that, to improve the degree of automation of the proposed method, the following adaptive extraction strategy is adopted for T_a, T_b and T_c. Compared with collapsed buildings, the double bounce of non-collapsed buildings is usually longer and more complete. Based on this assumption, an objective function F_S3(T_a, T_b, T_c) is constructed, representing the number of pixels extracted into S3 for an object under different combinations of T_a, T_b and T_c. T_a, T_b and T_c take values in the intervals [0, t], [0, 1] and [0, 1], respectively, where t is the length of the diagonal of the bounding rectangle of the object R_uni^c. When F_S3 is maximized, the corresponding T_a, T_b and T_c constitute the optimal parameter combination.
Pixel set 1 of locally collapsed buildings. In S1, the blurred line segments with a length smaller than or equal to T_a are retained as the constructed visual word.

Pixel set 1 of completely collapsed buildings. Except for the PDBPs that have already been defined as visual words, the other PDBPs located on the R_uni^c profile or within one pixel outside the profile form the constructed visual word.

Pixel set 2 of non-collapsed buildings. Within the pixels inside the R_uni^c profile, a candidate blurred line-segment set, denoted inner, is searched starting from any pixel u. Except for the different starting points and search scopes, the steps are exactly the same as for S1 above. Because the blurred line segments in inner are located inside R_uni^c, inner is directly regarded as the constructed visual word.

Pixel set 2 of locally collapsed buildings. In R_uni^c, the PDBPs not yet defined as visual words are denoted PDBP_res, and the ratio of the number of PDBP_res to the total number of pixels is \varphi. Furthermore, the ratio of the total number of PDBPs in the SAR image to the total number of pixels is defined as \varphi_SAR. If \varphi ≥ \varphi_SAR, PDBP_res is the constructed visual word.

Pixel set 2 of completely collapsed buildings. In R_uni^c, the PDBPs not defined as visual words form the constructed visual word.

Based on the above six-dimensional visual words, the collapse semantic histogram I_csh of the double bounce of R_uni^c can be obtained.
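Once the six pixel sets have been extracted for an object, assembling I_csh reduces to counting. The sketch below assumes, as one plausible choice not stated in the paper, that the counts are normalized by the object area so that histograms of objects of different sizes are comparable:

```python
import numpy as np

def collapse_semantic_histogram(pixel_sets, n_object_pixels):
    """Assemble the six-dimensional collapse semantic histogram I_csh
    for one object R_uni^c. `pixel_sets` maps each visual word
    ('nc1', 'lc1', 'cc1', 'nc2', 'lc2', 'cc2') to its set of PDBPs."""
    order = ['nc1', 'lc1', 'cc1', 'nc2', 'lc2', 'cc2']
    counts = np.array([len(pixel_sets.get(k, ())) for k in order], float)
    # Normalization by object size is an assumption made here, not a
    # detail given in the paper.
    return counts / max(n_object_pixels, 1)
```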
2.3. Extraction of Traditional Visual Features Based on MAPs
The area, diagonal, normalized moment of inertia (NMI) and standard deviation attributes in MAPs have been proven to have strong discrimination ability in the detection of buildings. To this end, the traditional visual features in the optical and SAR images are extracted from these four attributes through our previously proposed automatic building detection from high-resolution remote sensing images based on joint optimization and decision fusion of MAPs (for detailed steps, refer to previous studies [32]). On this basis, the multi-scale MAP sets corresponding to the optical and SAR images, MAPs_opt and MAPs_SAR, are obtained. In MAPs_opt, the mean gray values of R_uni^c in each attribute profile (AP) are calculated, thus obtaining the visual histogram I_osh of the optical image corresponding to R_uni^c. Similarly, the visual histogram I_ssh of the SAR image can be obtained.
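A reduced sketch of the visual-histogram computation with scikit-image, restricted to the area attribute (the diagonal, NMI and standard deviation attributes of [32] would be handled analogously); the thresholds are illustrative assumptions:

```python
import numpy as np
from skimage.morphology import area_opening, area_closing

def visual_histogram(image, object_mask, area_thresholds=(100, 500, 1000)):
    """Mean gray value of one object in each attribute profile (AP).
    Only the area attribute is sketched here."""
    profiles = []
    for t in area_thresholds:
        profiles.append(area_opening(image, area_threshold=t))
        profiles.append(area_closing(image, area_threshold=t))
    # One histogram bin per AP: the mean gray value inside the object.
    return np.array([p[object_mask].mean() for p in profiles])
```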
2.4. Detection of Collapsed Buildings Based on Improved Active Learning SVMs
In the classification stage, R_uni is classified into non-collapsed buildings, collapsed buildings and others by using active learning SVMs. The decision function of each SVM classifier is:

f(h_k) = \mathrm{sign}\left( \sum_{m=1}^{M} y_m \alpha_m K(x_m, h_k) + b \right)    (3)

Furthermore, when annotating samples by active learning SVMs, it is difficult to annotate the samples that lie on the category boundary and have the greatest uncertainties. Therefore, this study proposed the CUI, whose calculation process is as follows:
Step 1: The probabilities that the sample h_k to be annotated belongs to the annotated positive samples w_l^p and to the annotated negative samples v_l^q are separately calculated:

D(w_l^p / h_k) = \frac{1}{P} \sum_{p=1}^{P} \frac{ \langle h_k, w_l^p \rangle }{ \| h_k \| \, \| w_l^p \| }    (4)

D(v_l^q / h_k) = \frac{1}{Q} \sum_{q=1}^{Q} \frac{ \langle h_k, v_l^q \rangle }{ \| h_k \| \, \| v_l^q \| }    (5)

where w_l^p indicates the p-th (p = 1, 2, ..., P) sample in the l-th category of positive samples, and v_l^q represents the q-th (q = 1, 2, ..., Q) sample in the l-th category of negative samples.
Step 2: On this basis, the CUI of h_k on the l-th classifier is calculated by the following formula:

CUI(w_l^p, v_l^q) = 2 - \left[ \left( D^2(w_l^p / h_k) - 1 \right) \times \left( D^2(v_l^q / h_k) - 1 \right) \right]    (6)

where the larger the CUI, the greater the classification uncertainty of the sample h_k.
Step 3: Based on this, the categorical decision function f_l(h_k) of the sample h_k is calculated in accordance with Equation (3). When the CUI is the minimum and f_l(h_k) is the maximum, the sample h_k is annotated. The annotated samples are added to the training samples to re-train the model. By repeating the above steps, the samples are refined to obtain the final detection results of collapsed buildings.
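The CUI of Equations (4)–(6) can be sketched as follows (NumPy only; Equation (6) is written as reconstructed above, and the surrounding sample-selection and re-training loop is omitted):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def mean_similarity(h, samples):
    """Equations (4)-(5): mean cosine similarity between a candidate
    h_k and the annotated positive (or negative) samples of class l."""
    return np.mean([cosine(h, s) for s in samples])

def cui(h, positives, negatives):
    """Equation (6) as reconstructed in the text: larger values mean
    the candidate lies closer to the class boundary."""
    dw = mean_similarity(h, positives)
    dv = mean_similarity(h, negatives)
    return 2 - (dw ** 2 - 1) * (dv ** 2 - 1)
```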
3. Experiments and Evaluation
3.1. Study Area and Dataset Description
The study area is located in Sendai, Japan. An earthquake (Mw 9.0) occurred on 11 March 2011. The epicenter was located in the Pacific Ocean east of Miyagi Prefecture, Japan, with a focal depth of 20 km. Sendai is one of the cities most seriously hit by the earthquake. The earthquake and tsunami damaged a large number of buildings, including 9877 collapsed buildings.

The post-earthquake high-resolution optical images adopted in this study are IKONOS satellite images of Sendai, Japan, collected on 24 March 2011 with a spatial resolution of 1 m, as displayed in Figure 3a. The post-earthquake high-resolution SAR images are TerraSAR-X satellite images of the same area, collected on 23 March 2011 with a spatial resolution of 3 m; they were acquired in HH polarization in stripmap mode, as demonstrated in Figure 3b. In the experiments, to handle the difference in resolution between the optical and SAR images, the lower-resolution images were resampled so that the multi-source images had the same resolution. On this basis, three representative regions were selected for the experiments. Dataset 1 is located in an industrial zone where the buildings are large and sparsely distributed, as shown in Figure 4a. Compared with the industrial zone, residential areas are usually the most severely affected and are usually the primary targets of post-earthquake emergency response and post-disaster reconstruction. To this end, Datasets 2 and 3 are both located in residential areas, as displayed in Figure 4b,c. Buildings in these areas are usually densely distributed and neatly arranged. Due to the significant radiometric and geometric differences between optical and SAR data, their exact registration and high-precision geometric correction are not only very complex but also unlikely to yield the desired results. In addition, since each object is extracted separately from the optical and SAR images in this paper, only the matching set of objects needs to be found in both datasets. For this reason, the following strategy was adopted for producing the datasets: taking the cropped and segmented optical image as a reference, we cropped the corresponding area in the SAR image that could completely cover all the objects in the optical image according to visual interpretation.
Figure 3. (a,b) Study area.
Figure 4. The optical images for the three datasets: (a) Dataset 1; (b) Dataset 2; (c) Dataset 3.
3.2. Experimental Settings and Methods for Comparison
In the experiments, four advanced methods were selected for comparison. The first is a detection method for optical images based on a sparse dictionary (SD-OPT). This method introduces spatial context information by constructing pairs of same and different words, so as to construct multi-visual features to model collapsed buildings [33]. The second is a detection method for SAR images based on multi-texture feature fusion (RF-SAR). In this method, texture features are extracted by comprehensively utilizing the gray-level histogram, gray-level co-occurrence matrix (GLCM), local binary pattern (LBP) and Gabor filtering, and the post-earthquake collapse information of buildings is then obtained by Random Forest [34]. The third is a deep learning method based on object context and boundary enhanced loss (OCR-BE). In this method, a novel loss function, BE loss, is designed according to the distance between pixels and the boundary, forcing the network to pay more attention to learning boundary pixels [35]. The fourth is a deep learning method based on UNet 3+ (UNet 3+). UNet 3+ takes advantage of full-scale skip connections and deep supervision to make full use of multi-scale features [36]. Among the four methods, the first two are single-source imaging methods based on traditional machine learning; comparing against them verifies the complementarity and combined advantages of the two data sources (optical and SAR images) in the detection of collapsed buildings. The latter two are deep learning methods combining multi-source data; comparing against them helps analyze the difference in performance between the proposed method and deep learning methods in the detection of collapsed buildings, especially under small sample conditions.

All experiments are based on the three datasets in Section 3.1. To ensure the consistency of accuracy evaluation indexes across the different methods, the semantic segmentation results obtained by OCR-BE and UNet 3+ were converted into object-level detection results according to the proportion of pixels belonging to different categories, as sketched below. In the experiments, Matlab 2018 was used as the simulation platform for all traditional machine learning methods. The two deep learning methods were based on the PyTorch-1.3.1 framework and implemented in Ubuntu 16.04.
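The pixel-to-object conversion mentioned above can be sketched as a per-object majority vote, which is one plausible reading of "the proportion of pixels belonging to different categories" (names are hypothetical):

```python
import numpy as np

def pixels_to_objects(seg_map, object_labels, n_classes=3):
    """Convert a pixel-level semantic segmentation map into object-level
    detection results: each object takes the class that covers the
    largest proportion of its pixels."""
    results = {}
    for obj_id in np.unique(object_labels):
        if obj_id == 0:                       # 0 = background / no object
            continue
        pix = seg_map[object_labels == obj_id]
        results[obj_id] = np.bincount(pix, minlength=n_classes).argmax()
    return results
```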
3.3. General Results and Analysis
Based on the three datasets, the detection results of collapsed buildings by the different methods are demonstrated in Figures 5–7. In addition, the ground truth map made through visual interpretation is regarded as the basis for accuracy evaluation, in which white, gray and black represent collapsed buildings, non-collapsed buildings and others, respectively.
Figure 5. Detection results of collapsed buildings based on Dataset 1: (a) reference map; (b) proposed method; (c) SD-OPT; (d) RF-SAR; (e) OCR-BE; (f) UNet 3+.
As shown in the above figures, the detection results obtained by the proposed method are significantly superior to those of the four comparison methods as a whole. As traditional machine learning methods, SD-OPT adopts optical images, while RF-SAR uses SAR images. Compared with the proposed method, SD-OPT and RF-SAR have prominent false negatives (FNs) and false positives (FPs), respectively, because they rely only on single-source data, as displayed in (c) and (d) of Figures 5–7. As two deep learning methods, OCR-BE and UNet 3+ need massive training samples to fully train a deep network; otherwise, it is difficult to obtain ideal detection effects. In the experiments, the sample sizes of the three datasets are 1880, 2036 and 2058, and the proportions of collapsed building samples in the total number of samples are only 9.2%, 10.6% and 12.8%. This results in serious over-fitting and poor generalization of the models on the test set, which is the main reason why the detection accuracies for collapsed buildings of OCR-BE and UNet 3+ are significantly lower than those of the traditional machine learning methods. It is expected that, as the sample size of collapsed buildings increases, the accuracy of the deep learning methods would gradually improve until the models converge. In addition, for the plants with a large size and low detection difficulty in the industrial zone (Figure 5), except for the RF-SAR method, which has many FNs and FPs, all methods have good detection effects on large buildings. For the densely distributed small buildings that are difficult to detect in the residential areas (Figures 6 and 7), the proposed method and the SD-OPT method are significantly superior to the other methods regarding the FN rate and
FP rate. This indicates that rich spatial details provided by optical images are favorable
for accurate characterization of collapsed buildings in the complex context compared with
SAR images.
Figure 6. Detection results of collapsed buildings based on Dataset 2: (a) reference map; (b) proposed method; (c) SD-OPT; (d) RF-SAR; (e) OCR-BE; (f) UNet 3+.
Furthermore, six evaluation indexes, including the overall accuracy (OA), FP rate and FN rate, as well as the detection accuracies of non-collapsed buildings (Pnb), collapsed buildings (Pcb) and others (Po), were used for quantitative accuracy evaluation; the results are shown in Tables 1–3. In the three experiments, the OAs of the proposed method reach 82.39%, 80.60% and 78.61%, respectively. In particular, the Pcb of concern here reaches more than 73.94%, the best performance among all experimental methods and consistent with the conclusions of the visual analysis. In comparison with the proposed method, SD-OPT and RF-SAR depend only on single-source data, and their FN rates and FP rates increase by more than 3.77% and 6.94%, respectively. As deep learning methods, OCR-BE and UNet 3+ only obtain better detection effects for non-collapsed buildings than the proposed method under small sample sizes, while their other accuracy indexes decrease significantly, with the lowest Pcb being only 9.43%. Nevertheless, the detection effects of the two deep learning methods would be greatly improved given sufficient training samples. Therefore, the strategy of combining optical and SAR images proposed in this research is necessary, feasible and effective in the detection of collapsed buildings and can achieve ideal effects under small sample conditions.
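For reference, the indexes can be computed from a confusion matrix roughly as follows; the paper does not spell out its exact FP/FN rate definitions, so the per-class versions below are an assumption:

```python
import numpy as np

def evaluation_indexes(confusion):
    """Sketch of the quantitative indexes from a 3x3 confusion matrix
    (rows: reference, columns: prediction; class order: non-collapsed,
    collapsed, others). OA and the per-class accuracies Pnb/Pcb/Po
    follow the usual definitions; FP/FN are per-class rates here."""
    total = confusion.sum()
    oa = np.trace(confusion) / total                       # overall accuracy
    p_class = confusion.diagonal() / confusion.sum(axis=1) # Pnb, Pcb, Po
    fp = (confusion.sum(axis=0) - confusion.diagonal()) / confusion.sum(axis=0)
    fn = 1 - p_class                                       # omission per class
    return oa, p_class, fp, fn
```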
Figure 7. Detection results of collapsed buildings based on Dataset 3: (a) reference map; (b) proposed method; (c) SD-OPT; (d) RF-SAR; (e) OCR-BE; (f) UNet 3+.
Table 1. Detection accuracy based on Dataset 1. (OA, Pnb, Pcb and Po: the higher the better; FP and FN: the lower the better.)

Method           OA (%)   FP (%)   FN (%)   Pnb (%)   Pcb (%)   Po (%)
Proposed method  82.39     9.65    17.61    52.92     74.57     88.55
SD-OPT           75.85    14.59    25.15    41.35     46.82     92.62
RF-SAR           63.99    21.96    36.01    25.29     44.51     73.17
OCR-BE           78.46    13.31    15.35    78.94     29.48     90.10
UNet 3+          77.28    15.80    19.96    80.21     14.45     92.52
Table 2. Detection accuracy based on Dataset 2. (OA, Pnb, Pcb and Po: the higher the better; FP and FN: the lower the better.)

Method           OA (%)   FP (%)   FN (%)   Pnb (%)   Pcb (%)   Po (%)
Proposed method  80.60    10.74    19.40    50.22     73.94     85.68
SD-OPT           74.66    14.51    26.34    17.77     47.22     87.14
RF-SAR           63.41    22.39    36.59    24.38     29.63     74.02
OCR-BE           77.14    12.49    22.20    76.03     22.30     87.26
UNet 3+          76.80    11.87    24.59    50.00     21.76     89.73
Table 3. Detection accuracy based on Dataset 3. (OA, Pnb, Pcb and Po: the higher the better; FP and FN: the lower the better.)

Method           OA (%)   FP (%)   FN (%)   Pnb (%)   Pcb (%)   Po (%)
Proposed method  78.61    12.80    22.69    55.08     75.47     83.51
SD-OPT           66.81    19.90    33.19    45.72     41.13     77.17
RF-SAR           59.23    25.60    40.77    22.73     27.55     74.77
OCR-BE           76.39    13.92    22.89    63.98     33.32     85.04
UNet 3+          75.11    14.43    25.55    65.24      9.43     92.88
3.4. Visual Comparison of Representative Patches
For further detailed visual analysis and discussion, representative patches were selected from the three datasets, as shown in Figures 8–10. Yellow and red boxes represent collapsed buildings and non-collapsed buildings, respectively.
Figure 8. Detection results of collapsed buildings in the representative patches in Dataset 1: (a) original drawing of the representative patches; (b) reference diagram of the representative patches; (c) proposed method; (d) SD-OPT; (e) RF-SAR; (f) OCR-BE; (g) UNet 3+.
Figure 9. Detection results of collapsed buildings in the representative patches in Dataset 2: (a) original drawing of the representative patches; (b) reference diagram of the representative patches; (c) proposed method; (d) SD-OPT; (e) RF-SAR; (f) OCR-BE; (g) UNet 3+.
Figure 10. Detection results of collapsed buildings in the representative patches in Dataset 3: (a) original drawing of the representative patches; (b) reference diagram of the representative patches; (c) proposed method; (d) SD-OPT; (e) RF-SAR; (f) OCR-BE; (g) UNet 3+.
The above figures demonstrate that buildings in the industrial zone are easy to detect because of their large sizes and sparse distribution, so all experimental methods acquire good detection results for collapsed buildings there. The exceptions are UNet 3+, which shows FNs (yellow box in Figure 8g), and SD-OPT, which presents FPs (yellow box in Figure 8d). For non-collapsed buildings in the industrial zone (red boxes in Figure 8), the proposed method and the two deep learning methods do not incur FPs, while SD-OPT and RF-SAR show FPs (red box in Figure 8d) and FNs (red box in Figure 8e), respectively. In the residential area with neatly arranged and densely distributed buildings, only the proposed method yields completely correct detection results for collapsed buildings (yellow boxes in Figures 9 and 10). By contrast, RF-SAR (yellow box in Figure 10e) and OCR-BE (yellow box in Figure 10f) present FPs, while SD-OPT (yellow box in Figure 9d) and UNet 3+ (yellow boxes in Figures 9g and 10g) show FNs. For non-collapsed buildings, the visual analysis results are similar to those of the industrial zone and good results are achieved by the different methods; only SD-OPT and RF-SAR have obvious FPs and FNs. In conclusion, all five methods detect non-collapsed buildings well, while the proposed method, by combining optical and SAR images, achieves a higher Pcb with fewer FPs and FNs, which is consistent with the quantitative analysis results.
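To make this kind of qualitative comparison reproducible, a minimal sketch is given below (not the authors' released code) of how a TP/FP/FN map can be rendered by comparing object-level predictions with the reference map; the object-ID image and the two label dictionaries are hypothetical inputs.

```python
# A minimal sketch (not the authors' released code): render a per-object
# TP/FP/FN map by comparing predicted and reference collapse labels.
import numpy as np

def error_map(label_img: np.ndarray, pred: dict, ref: dict) -> np.ndarray:
    """label_img: integer object-ID image (0 = background);
    pred/ref: object ID -> 1 (collapsed) or 0 (non-collapsed)."""
    out = np.zeros(label_img.shape, dtype=np.uint8)
    for obj_id in np.unique(label_img):
        if obj_id == 0:
            continue  # skip background
        p, r = pred.get(obj_id, 0), ref.get(obj_id, 0)
        if p == 1 and r == 1:
            code = 1  # TP: collapsed building correctly detected
        elif p == 1 and r == 0:
            code = 2  # FP: non-collapsed building flagged as collapsed
        elif p == 0 and r == 1:
            code = 3  # FN: collapsed building missed
        else:
            code = 4  # TN: non-collapsed building correctly kept
        out[label_img == obj_id] = code
    return out

# tiny demo: object 1 is a TP, object 2 is an FN
demo = np.array([[1, 1, 0], [2, 2, 0]])
print(error_map(demo, pred={1: 1, 2: 0}, ref={1: 1, 2: 1}))
```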
4. Discussion
4.1. Validity Analysis of Combined Optical and SAR Images
To further verify the validity of combining optical and SAR images, experiments based on single-source data (optical or SAR images only) were conducted using the proposed method. The accuracy evaluation of the results obtained by combining optical and SAR images, by the optical image alone and by the SAR image alone is shown in Table 4.
As shown in Table 4, after combining optical and SAR images, the OA on the three datasets increases by 5.89~7.71% and Pcb rises by 12.56~19.03% compared with the better single-source results (SAR). Therefore, the earthquake damage characteristics of buildings are depicted from multiple perspectives by combining post-earthquake optical and SAR images, and the extracted complementary information significantly improves the detection accuracy of collapsed buildings. In particular, double bounce in the SAR image provides key evidence for judging whether buildings collapse or not, so Pcb in the experiment based on the SAR image alone is always significantly higher than that based on the optical image, which only carries traditional visual features.
Table 4. Comparison of detection accuracies of combining optical and SAR images with those based on single-source data. Evaluation criteria: the higher the better for OA, Pnb, Pcb and Po; the lower the better for FP and FN.

Datasets    Method             OA (%)   FP (%)   FN (%)   Pnb (%)   Pcb (%)   Po (%)
Dataset 1   Optical and SAR    82.39     9.65    17.61    52.92     74.57     88.55
            Optical            66.49    30.56    27.12    40.86     46.59     75.79
            SAR                74.68    14.49    23.32    50.19     57.23     81.10
Dataset 2   Optical and SAR    80.60    10.74    19.40    50.22     73.94     85.68
            Optical            64.62    20.76    30.38    38.43     47.66     72.24
            SAR                74.71    14.48    25.29    45.04     54.91     83.33
Dataset 3   Optical and SAR    78.61    12.80    22.69    55.08     75.47     83.51
            Optical            65.40    19.95    31.65    69.79     50.57     67.02
            SAR                72.30    16.07    25.70    45.45     62.91     79.76
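The indicators reported above can be computed from the object-level agreement between predictions and reference labels. The sketch below uses one plausible set of definitions (OA as the share of correctly classified objects, FP and FN rates with respect to the collapsed class, and Pnb, Pcb and Po as per-class recall); the exact formulas are those given in the paper's accuracy evaluation section and may differ in detail.

```python
# A sketch under assumed definitions of the indicators in Tables 3-6;
# class codes (assumed): 0 = other, 1 = non-collapsed, 2 = collapsed.
import numpy as np

def indicators(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    def recall(c: int) -> float:  # per-class detection accuracy
        truth = y_true == c
        return float(np.sum(truth & (y_pred == c))) / max(int(np.sum(truth)), 1)

    pos_pred, pos_true = y_pred == 2, y_true == 2
    fp = float(np.sum(pos_pred & ~pos_true)) / max(int(np.sum(pos_pred)), 1)
    fn = float(np.sum(~pos_pred & pos_true)) / max(int(np.sum(pos_true)), 1)
    return {"OA": float(np.mean(y_true == y_pred)), "FP": fp, "FN": fn,
            "Pnb": recall(1), "Pcb": recall(2), "Po": recall(0)}

# tiny demo with seven hypothetical objects
print(indicators(np.array([2, 2, 1, 1, 0, 0, 2]), np.array([2, 1, 1, 2, 0, 0, 2])))
```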
In addition, two representative sub-patches were selected for further visual analysis, as shown in Figures 11 and 12. For collapsed buildings with fragmented distribution in both images (yellow boxes in Figure 11), correct results are obtained by all three methods. For collapsed buildings with well-preserved roofs in the optical image (yellow boxes in Figure 12), correct judgements are made only by the proposed method and the method based on SAR images, because the double bounce in the SAR image shows the typical collapse semantic features; FPs appear when using the method based on optical images (yellow box in Figure 12d). Non-collapsed buildings shown in red boxes in Figure 11 present complete profiles and homogeneous texture in both the optical and SAR images, so correct results are obtained by all three methods. For non-collapsed buildings with intact roofs in the optical image but regionally fragmented distribution in the SAR image, only the proposed method and the method based on optical images make correct judgements, while the method based on SAR images has obvious FPs. Therefore, it is feasible and effective to improve Pcb by complementing the advantages of optical and SAR images.
Figure 11. Detection results of collapsed buildings in the representative sub-patch 1: (a) original image of the representative sub-patch 1; (b) reference map; (c) proposed method; (d) only optical images; (e) only SAR images.
Figure 12. Detection results of collapsed buildings in the representative sub-patch 2: (a) original image of the representative sub-patch 2; (b) reference map; (c) proposed method; (d) only optical images; (e) only SAR images.
4.2. Validity Analysis of DoubleBounceCollapseSemantic
To verify the validity of the constructed DoubleBounceCollapseSemantic, comparative experiments were carried out by either adding or omitting the double bounce features extracted by DoubleBounceCollapseSemantic alongside the traditional visual features of the combined optical and SAR images. The results are shown in Table 5, after the sketch below.
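The ablation protocol can be restated in a few lines of code; in this sketch the arrays are synthetic stand-ins for the traditional visual features and the DoubleBounceCollapseSemantic histograms of the object set, and a plain SVM replaces the improved active learning SVMs used in this study.

```python
# A minimal sketch of the Table 5 ablation: train with and without the
# DoubleBounceCollapseSemantic features concatenated to the visual features.
# All data here are synthetic placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
visual_feats = rng.normal(size=(200, 24))  # spectral/texture/geometry features
dbcs_feats = rng.random(size=(200, 6))     # double-bounce visual-word histograms
y = rng.integers(0, 2, size=200)           # 1 = collapsed, 0 = non-collapsed

for use_dbcs in (True, False):
    X = np.hstack([visual_feats, dbcs_feats]) if use_dbcs else visual_feats
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    oa = SVC(kernel="rbf").fit(X_tr, y_tr).score(X_te, y_te)
    print(f"DoubleBounceCollapseSemantic {'added' if use_dbcs else 'omitted'}: OA = {oa:.3f}")
```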
Table 5. Validity analysis of DoubleBounceCollapseSemantic. "✓" and "–" separately represent that a feature is added and not added. Evaluation criteria: the higher the better for OA, Pnb, Pcb and Po; the lower the better for FP and FN.

Datasets    DoubleBounceCollapseSemantic   OA (%)   FP (%)   FN (%)   Pnb (%)   Pcb (%)   Po (%)
Dataset 1   ✓                              82.39     9.65    17.61    52.92     74.57     88.55
            –                              78.83    11.86    20.81    48.64     67.63     85.52
Dataset 2   ✓                              80.60    10.74    19.40    50.22     73.94     85.68
            –                              77.26    12.44    23.01    46.28     64.46     84.48
Dataset 3   ✓                              78.61    12.80    22.69    55.08     75.47     83.51
            –                              74.69    15.29    25.40    42.25     68.68     81.47
As illustrated in Table 5, the OA in the case of adding DoubleBounceCollapseSemantic increases by 3.34~3.92%, while the FP rate and FN rate decrease by 1.70~2.49% and 2.71~3.61%, respectively, compared with the case without DoubleBounceCollapseSemantic. Pcb rises by 6.94%, 9.48% and 6.79% on the three datasets, respectively. Therefore, the proposed DoubleBounceCollapseSemantic is effective. On this basis, six collapsed buildings and six non-collapsed buildings were selected from the three datasets, and histogram statistics were computed on the pixels of double bounce belonging to different visual words, as displayed in Figure 13a,b.
(a) [Grouped bar chart: proportion of double bounce pixels assigned to each of the six visual words (features of non-collapse 1 and 2, locally collapse 1 and 2, complete collapse 1 and 2) for collapsed buildings 1–6; y-axis: proportion of pixels of double bounce.]
(b) [The corresponding chart for non-collapsed buildings 1–6.]
Figure 13. Histograms of pixels of double bounce belonging to different visual words for (a) collapsed buildings and (b) non-collapsed buildings.
As shown in Figure 13a, the histograms of the collapsed buildings show similar distributions, with low intra-class separability. Meanwhile, the proportion of pixels assigned to the collapse-related visual words is significantly higher for collapsed buildings than for non-collapsed ones, which is conducive to obtaining correct identification results. For non-collapsed buildings (Figure 13b), the proportions of collapse-related to non-collapse words are the opposite. Therefore, collapsed and non-collapsed buildings have good inter-class separability in the above histograms. Furthermore, the proportion of pixels belonging to the locally-collapse words is significantly higher in collapsed buildings than in non-collapsed buildings, which further enhances the inter-class separability between the two classes.
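The per-building statistics in Figure 13 follow the usual bag-of-visual-words pattern, which the sketch below illustrates: double bounce pixel descriptors are quantized against a six-word dictionary and the word proportions are accumulated per building. The descriptors, the use of k-means to build the dictionary and the object layout are illustrative assumptions, not the exact construction used in the proposed method.

```python
# A sketch of the visual-word histograms in Figure 13 (details assumed):
# quantize double-bounce pixel descriptors into 6 words, count per building.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
pixel_feats = rng.normal(size=(5000, 8))      # synthetic double-bounce descriptors
building_id = rng.integers(1, 13, size=5000)  # pixels of 12 sample buildings

words = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(pixel_feats)

def word_histogram(obj_id: int) -> np.ndarray:
    """Proportion of the building's double bounce pixels per visual word."""
    w = words[building_id == obj_id]
    hist = np.bincount(w, minlength=6).astype(float)
    return hist / max(hist.sum(), 1.0)

print(word_histogram(1))  # six proportions, as plotted for one building
```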
4.3. Validity Analysis of CUI
To verify the validity of CUI, comparative experiments were conducted by adding or
not adding CUI to the active learning SVMs, and the accuracy was evaluated. The results
are shown in Table 6.
Table 6. Validity analysis of CUI. "✓" and "–" separately represent that an index is added and not added. Evaluation criteria: the higher the better for OA; the lower the better for FP and FN.

Datasets    CUI   OA (%)   FP (%)   FN (%)
Dataset 1   ✓     82.39     9.65    17.61
            –     81.58     9.81    18.33
Dataset 2   ✓     80.60    10.74    19.40
            –     79.07    12.45    20.01
Dataset 3   ✓     78.61    12.80    22.69
            –     76.90    13.27    23.78
As listed in Table 6, the OAs in the three experiments increase by 0.81%, 1.53% and 1.71%, while the FP rate and FN rate reduce by 0.16~1.71% and 0.61~1.09%, respectively. This suggests that the proposed CUI is conducive to selecting more representative samples for model training, which can significantly improve the classification accuracy.
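For orientation, the sketch below shows the general shape of a pool-based active learning SVM in which the query score combines margin uncertainty with a representativeness term. This scoring rule is only a placeholder for the CUI defined earlier in the paper, not its actual formula, and the data are synthetic.

```python
# A sketch of active learning SVMs with a CUI-like query score (placeholder
# formula, not the paper's definition of CUI); data are synthetic.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = (X[:, 0] + 0.3 * rng.normal(size=300) > 0).astype(int)
# small initial training set with both classes represented
labeled = list(np.flatnonzero(y == 0)[:10]) + list(np.flatnonzero(y == 1)[:10])

for _ in range(10):  # ten query rounds
    clf = SVC(kernel="rbf").fit(X[labeled], y[labeled])
    pool = np.setdiff1d(np.arange(len(X)), labeled)
    margin = np.abs(clf.decision_function(X[pool]))  # low margin = uncertain
    density = rbf_kernel(X[pool], X).mean(axis=1)    # representativeness
    score = density / (margin + 1e-6)                # placeholder CUI-like score
    labeled.append(int(pool[np.argmax(score)]))      # query the top sample

print(SVC(kernel="rbf").fit(X[labeled], y[labeled]).score(X, y))
```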
4.4. Analysis of Effects of the Number of Initial Training Samples
To verify the performance of the improved active learning SVMs proposed in this study under different numbers of initial training samples, the number of initial training samples in each category was varied in the interval [5, 50] with a step of 5. The trend of the OA with the increasing number of training samples is shown in Figure 14.
Figure 14. Effects of the number of initial training samples on OA.
As shown in Figure 14, with the increase of the number of initial training samples, the OA rises rapidly in the interval [0, 20] and then tends to be stable. The OAs on Datasets 1 and 2 reach their peaks (83.05% and 81.43%) when the sample size is 45, and the OA on Dataset 3 reaches its peak of 79.14% when the sample size is 50. Although the peak OA is 0.53~0.83% higher than that under a sample size of 20, the number of training samples required is more than doubled. Accordingly, the number of initial training samples in each category is suggested to be 20.
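The sweep behind Figure 14 is straightforward to restate in code; in this sketch, synthetic data and a plain SVM stand in for the object features and the improved active learning SVMs.

```python
# A sketch of the Figure 14 experiment: sweep the number of initial training
# samples per class over [5, 50] in steps of 5 and record the resulting OA.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
y = (X[:, 0] > 0).astype(int)  # synthetic collapsed / non-collapsed labels

for n_per_class in range(5, 55, 5):
    idx = np.concatenate([rng.choice(np.flatnonzero(y == c), n_per_class, replace=False)
                          for c in (0, 1)])
    oa = SVC(kernel="rbf").fit(X[idx], y[idx]).score(X, y)
    print(f"{n_per_class:2d} samples/class -> OA = {oa:.3f}")
```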
5. Conclusions
Aiming at the lack of pre-earthquake data, a detection method for collapsed buildings relying only on post-earthquake high-resolution optical and SAR images was proposed. To address the challenge of accurately judging whether buildings collapse or not in the absence of elevation data, the method indirectly acquires the elevation information of buildings by mining the double bounce features in SAR images, and automated detection of collapsed buildings is achieved by further combining multi-source traditional visual features. To this end, this research first designed the OpticalandSAR-ObjectsExtraction strategy to construct the unified optical-SAR object set. Based on this, the DoubleBounceCollapseSemantic was constructed, thus bridging the semantic gap between double bounce and collapse features of buildings. In the classification stage, the CUI was put forward, which is conducive to selecting more representative samples to optimize the active learning SVMs and finally detect collapsed buildings automatically. In multi-group comparative experiments on post-earthquake remote sensing images of different regions, the proposed method shows excellent performance in both visual and quantitative analysis: the OA and Pcb reach up to 82.39% and 75.47%, so the proposed method is superior to the multiple methods used for comparison. However, the proposed model does not investigate the influence of factors such as the orientation angle of the building and the polarization mode on double bounce intensity. In the future, we will focus on these issues to develop more refined and advanced models.
Author Contributions: Conceptualization, C.W.; methodology, C.W. and Y.Z.; software, Y.Z.; validation, T.X., Y.Z. and S.C.; formal analysis, Y.Z. and L.G.; investigation, F.S. and J.L.; resources, C.W.; writing—original draft preparation, Y.Z.; writing—review and editing, C.W.; visualization, C.W. and Y.Z.; supervision, C.W., T.X. and S.C.; project administration, C.W. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported in part by the Natural Science Foundation of Jiangsu Province (Grant No. JSZRHYKJ202114), the Natural Science Foundation of Jiangsu Province (Grant No. YJGL-YF-2020-16), the National Natural Science Foundation of China (Grant No. 42176180), the Postdoctoral Fund of Jiangsu Province (Grant No. 2021K013A), the Universities Natural Science Research Project of Jiangsu Province (Grant No. 19KJB510048), the Opening Fund of the National Key Laboratory of Solid Microstructure Physics at Nanjing University (Grant No. M30006), the Postdoctoral Fund of Jiangsu Province (Grant No. 1701132B) and the Six Talent-Peak Project of Jiangsu Province (Grant No. 2019XYDXX135).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this study are available on request from the corresponding author. The data are not publicly available due to other ongoing research.
Conflicts of Interest: All authors have reviewed the manuscript and approved submission to this journal. The authors declare that there is no conflict of interest regarding the publication of this article.