

Citation: Wang, C.; Zhang, Y.; Xie, T.; Guo, L.; Chen, S.; Li, J.; Shi, F. A Detection Method for Collapsed Buildings Combining Post-Earthquake High-Resolution Optical and Synthetic Aperture Radar Images. Remote Sens. 2022, 14, 1100. https://doi.org/10.3390/rs14051100

Academic Editors: Wojciech Drzewiecki, Beata Hejmanowska and Sławomir Mikrut

Received: 20 December 2021; Accepted: 21 February 2022; Published: 23 February 2022

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Article
A Detection Method for Collapsed Buildings Combining
Post-Earthquake High-Resolution Optical and Synthetic
Aperture Radar Images
Chao Wang 1, Yan Zhang 1, Tao Xie 2,3,*, Lin Guo 4,5, Shishi Chen 2, Junyong Li 1 and Fan Shi 1

1 School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China; chaowang@nuist.edu.cn (C.W.); 20191218023@nuist.edu.cn (Y.Z.); 20211249174@nuist.edu.cn (J.L.); 20191219098@nuist.edu.cn (F.S.)
2 School of Remote Sensing and Geomatics Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China; 20201235001@nuist.edu.cn
3 Laboratory for Regional Oceanography and Numerical Modeling, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
4 Research and Development Center of Postal Industry Technology, School of Modern Posts, Institute of Modern Posts, Nanjing University of Posts and Telecommunications, Nanjing 210003, China; guolin@njupt.edu.cn
5 National Laboratory of Solid State Microstructures, Nanjing University, Nanjing 210093, China
* Correspondence: xietao@nuist.edu.cn
Abstract:
The detection of collapsed buildings based on post-earthquake remote sensing images eliminates the dependence on pre-earthquake data, which is of great significance for carrying out emergency response in time. The difficulty of obtaining, or the outright lack of, elevation information, which is strong evidence for determining whether buildings have collapsed, is the main challenge in the practical application of this approach. On the one hand, the introduction of double bounce features in synthetic aperture radar (SAR) images is helpful for judging whether buildings have collapsed. On the other hand, because SAR images are limited by their imaging mechanism, it is necessary to introduce the spatial details of optical images as a supplement in the detection of collapsed buildings. Therefore, a detection method for collapsed buildings combining post-earthquake high-resolution optical and SAR images was proposed by mining complementary information between traditional visual features and double bounce features from multi-source data. In this method, a strategy of optical and SAR object set extraction based on an inscribed center (OpticalandSAR-ObjectsExtraction) was first put forward to extract a unified optical-SAR object set. Based on this, a quantitative representation of collapse semantic knowledge in double bounce (DoubleBounceCollapseSemantic) was designed to bridge the semantic gap between double bounce and the collapse features of buildings. Ultimately, the final detection results were obtained based on improved active learning support vector machines (SVMs). Multi-group experimental results on post-earthquake multi-source images show that the overall accuracy (OA) and the detection accuracy for collapsed buildings (Pcb) of the proposed method can reach more than 82.39% and 75.47%, respectively. The proposed method is therefore significantly superior to many advanced methods used for comparison.
Keywords: remote sensing images; multi-source data; collapsed buildings; double bounce
1. Introduction
Timely and accurate evaluation of earthquake damage to buildings is an important part of disaster surveillance [1]. Compared with traditional field survey methods, remote sensing technology, which adopts a remote imaging mode, has many advantages, such as timely acquisition of information and not being limited by field conditions, so it has become the main technical means for extracting earthquake damage information of buildings [2,3].
In recent years, the detection of buildings subjected to earthquake damage based on remote sensing images has mainly focused on the identification of collapsed buildings [4,5]. The reason is that collapsed buildings are usually severely damaged and people may be trapped inside, making them primary targets in post-earthquake emergency response and rescue [6]. In complex post-earthquake scenarios, collapsed and non-collapsed buildings generally differ significantly in height. Therefore, introducing elevation information into traditional high-resolution remote sensing images can provide direct evidence for judging whether buildings have collapsed. Even so, the acquisition of digital elevation data, such as light detection and ranging (LiDAR) data, usually requires extracting ground control points, with high computational complexity and time costs; it is therefore difficult to meet the timeliness requirements for the detection of collapsed buildings after earthquakes [7,8]. For this reason, it is necessary to design a reliable detection method for collapsed buildings in the event of a lack of elevation data. According to the data sources used, detection methods for collapsed buildings can be classified into three categories: (1) methods based on pre-earthquake and post-earthquake images; (2) methods based on post-earthquake images only; and (3) methods combining elevation data.
(1) Methods based on pre-earthquake and post-earthquake images: Such methods evaluate the damage degree of buildings by extracting changes of typical features from pre-earthquake/post-earthquake image pairs [9]. Because pre-earthquake data are introduced for reference, other ground objects with features similar to collapsed buildings that already existed before the earthquake can generally be effectively eliminated from the detection results. In spite of this, normal urban evolution may also produce abundant changes in addition to earthquake impact. Furthermore, the lack of pre-earthquake data after earthquakes is often the bottleneck that restricts the popularization and application of such methods [10–12].
(2) Methods based on post-earthquake images: Such methods eliminate the dependence on pre-earthquake data and have stronger universality than methods based on pre-earthquake and post-earthquake images [13]. Collapsed buildings are depicted by manually defined or automatically extracted features, such as spectral, texture and spatial features, and an appropriate classifier is then selected for prediction [14]. Even so, the diversity of collapsed buildings and the complexity of post-earthquake scenarios aggravate the problems of different objects with the same spectra and the same object with different spectra, which requires more discriminative classification models. Furthermore, the lack of elevation information, the direct evidence for determining whether buildings have collapsed, is still the main challenge in the practical application of such methods [15].
(3) Methods combining elevation data: Based on remote sensing images, elevation information provided by elevation data, such as LiDAR and digital elevation models (DEMs), is used in such methods as a strong basis for determining whether buildings have collapsed [16–18]. Although remote sensing images are strongly complementary with elevation data, it is not common practice to specially collect and produce elevation data only for the detection of collapsed buildings. In addition, there is no reliable method for scanning and measuring collapsed buildings at present.
Compared with traditional machine learning, deep learning adopts a deep nonlinear network structure to approximate complex functions through hierarchical learning, thus extracting high-level features [19]. In recent years, scholars have carried out research on semantic segmentation with the deep learning technique Mask Region-Based Convolutional Neural Network (Mask RCNN) and obtained a great deal of research findings in remote sensing applications. For example, Li et al. [20] proposed a novel Histogram Thresholding Mask Region-Based Convolutional Neural Network (HTMask R-CNN), which utilizes the significant differences between old and new buildings in the grayscale histogram to improve the classification ability of the model. Mahmoud et al. [21] proposed an adaptive Mask RCNN framework to detect multi-scale objects in optical remote sensing images; in this method, the standard convolutional neural network in Mask RCNN is replaced by ResNet50 to overcome the vanishing gradient problem. Zhao et al. [22] proposed a method combining Mask RCNN with building boundary regularization, which produces better regularized polygons. Beyond building detection, many state-of-the-art Mask RCNN semantic segmentation methods have been proposed for other applications. For example, Bhuiyan et al. [23] developed a high-throughput mapping workflow to automatically detect and classify ice-wedge polygons (IWPs), and Witharana et al. [24] gauged the influence of spectral and spatial artifacts on the prediction accuracies of CNN models by using Mask RCNN. Despite this, training with deep learning methods is usually carried out on training samples from specific areas, so the portability of the models remains unclear. In the meantime, the production and manual annotation of sample sets after earthquakes are very time consuming and laborious, which seriously restricts the application of such methods in the detection of collapsed buildings.
In conclusion, machine learning methods based on post-earthquake images neither rely on pre-earthquake data nor require a large number of training samples, so they have unique advantages in usability and timeliness. In view of the lack of elevation information in such methods, introducing double bounce features can provide supplementary information. Among the different scattering contributions present in high-resolution synthetic aperture radar (SAR) images, the double bounce (caused by the corner reflector formed by the front wall and its adjacent ground), with its linear characteristics, indicates the presence of a building or other artificial target. However, double bounce features are influenced by the orientation angle of the buildings and the ground material, as well as by the polarization. Ferro [25] demonstrated that the double bounce effect has a strong power signature for buildings whose wall on the side closest to the sensor is almost parallel to the SAR azimuth direction. In polarimetric SAR images, the double bounce intensity of cross polarization is usually weaker than that of co-polarization, and the double bounce intensity decreases as the angle between the orientation of the buildings and the SAR azimuth direction increases. In the post-earthquake scenario, the collapse of a building results in a reduction of the 'ground-wall' dihedral structure. The main difference in the scattering mechanism before and after building collapse is the change from a predominantly double-bounce scattering mechanism to a predominantly single-bounce scattering mechanism. This is embodied in real post-earthquake SAR data, in which the double bounce generally appears as a bright line parallel to the wall of a non-collapsed building. In comparison, the double bounce of collapsed buildings is not significant or exhibits disorderly distributed speckle noise [26]. Taking SAR and optical satellite images acquired after the 2011 Sendai, Japan earthquake as examples, the different manifestations of the double bounce of collapsed and non-collapsed buildings are displayed in Figure 1. Therefore, double bounce may indirectly reflect elevation information and improve the accuracy of collapsed building detection.
However, SAR images inevitably have problems, such as lack of spectral information,
complex noise, and blur degradation, so it is not reliable to detect collapsed buildings
by only relying on SAR images. Meanwhile, the spectral and spatial details contained in
high-resolution optical images are favorable for accurate location and profile extraction of
buildings. Therefore, based on the combination of post-earthquake high-resolution optical
and SAR images, traditional visual features of optical images such as spectra, texture and
morphology features can be combined with double bounce features. This can provide a
new technical path for accurately and reliably detecting collapsed buildings with a lack of
elevation information. In particular, in complex environments such as urban areas, many
scattering contributions from small structures with possibly different materials interfere,
which are not considered in the currently reported theoretical models.
Figure 1. The optical images and the corresponding SAR images: (a) optical images of non-collapsed buildings; (b) SAR images of non-collapsed buildings; (c) optical images of collapsed buildings; (d) SAR images of collapsed buildings. (In (b,d), green boxes represent double bounce of non-collapsed buildings, while red boxes indicate double bounce of collapsed buildings.)
To achieve the complementary advantages of high-resolution optical and SAR images, the priority is to establish a unified object set from the multi-source data. However, due to the great difference in imaging mechanisms between optical and SAR images, the same object may have significantly different manifestations in the two types of data, so it is difficult to extract profile pairs belonging to the same object from heterologous images. In addition, at present there are few quantitative representations and analysis methods for the collapse semantic knowledge contained in double bounce. Finally, the combination of multi-source data means that the annotation of training samples is more time consuming and laborious. Therefore, a reliable effectiveness measure is needed to fully mine and select representative training samples, so as to improve the detection efficiency and accuracy of collapsed buildings.

In view of the above challenges, a non-deep learning method for detecting collapsed buildings by combining post-earthquake high-resolution optical and SAR images was proposed in this research. Firstly, a strategy of optical and SAR object set extraction based on the inscribed center (OpticalandSAR-ObjectsExtraction) was designed to provide unified analysis elements for the subsequent feature modeling and detection of collapsed buildings. The inscribed center is the center of the circle with the largest radius inside the boundary of the object. Based on this, a quantitative representation of collapse semantic knowledge in double bounce (DoubleBounceCollapseSemantic) was constructed according to the spatial distribution of double bounce. After that, feature modeling of collapsed buildings was performed based on traditional visual features and double bounce features. Finally, the samples were refined based on a category uncertainty index (CUI) between the samples to be tagged and the tagged samples to optimize the active learning process, thus detecting collapsed buildings.

The novel contributions of the proposed method are as follows: (1) The proposed OpticalandSAR-ObjectsExtraction overcomes the imaging differences between heterologous images and extracts a unified object set from optical and SAR images. (2) The proposed DoubleBounceCollapseSemantic provides a way to quantitatively extract double bounce features from SAR images, which can significantly improve the accuracy of collapsed building detection. (3) The CUI is put forward to improve the training process of active learning support vector machines (SVMs), which is conducive to fully mining and selecting representative training samples.
2. Methodology
The proposed method mainly included four steps: building the unified optical-SAR
object set based on OpticalandSAR-ObjectsExtraction; extracting double bounce features
based on DoubleBounceCollapseSemantic; extracting traditional visual features based
on morphological attribute profiles (MAPs); and detecting collapsed buildings based on
improved active learning SVMs. The specific realization process is displayed in Figure 2.
Figure 2. Specific flow of the method.
2.1. Construction of the Unified Optical-SAR Object Set Based on OpticalandSAR-ObjectsExtraction
To construct the unified optical-SAR object set, the proposed OpticalandSAR-ObjectsExtraction comprises three main steps: image segmentation, establishment of a coarse registration-based affine transformation equation, and projection of the inscribed centers of objects followed by region growing.
2.1.1. Image Segmentation
Firstly, the two images were segmented, and the inscribed center of each object was taken as a feature point in the segmentation results to establish the coarse registration-based affine transformation equation. The optical image was segmented using the well-known commercial software eCognition to obtain the object set R_opt of the optical image [27]. The segmentation was performed with the following parameters: scale parameter, 30; shape, 0.5; compactness, 0.2. Furthermore, an iterated conditional modes (ICM) algorithm based on Markov random fields is conducive to highlighting foreground targets, including buildings, in SAR image segmentation, so this method was used to obtain the object set R_SAR of the SAR image [28]. This method was adopted because, on the one hand, it is a statistics-based image segmentation algorithm that is spatially constrained and has few model parameters; on the other hand, it has been successfully applied, with good results, to SAR images of different polarizations, such as single-, dual- and full-polarimetric data.
2.1.2. Establishment of the Coarse Registration-Based Affine Transformation Equation
In R_opt and R_SAR, matched object pairs are searched as the basis for establishing the affine transformation equation. Because moment invariants are invariant to translation, rotation and scaling, the seven Hu moment invariants are taken as the measure of similarity between objects [29]. The specific steps are as follows:

Step 1: Based on Equation (1), the distance between the moment invariants of the i-th object in R_opt and the j-th object in R_SAR is calculated, and all possible combinations are traversed:

d_{ij} = \sqrt{ \sum_{n=1}^{7} [ \varphi_i(n) - \psi_j(n) ]^2 }    (1)

where \varphi_i(n) and \psi_j(n) represent the n-th moment invariant of the i-th object in the optical image and the n-th moment invariant of the j-th object in the SAR image, respectively.

Step 2: The object with the smallest moment-invariant distance is selected from R_SAR for each object in R_opt to construct a set of matched object pairs R_opt-SAR. Likewise, the object with the minimum moment-invariant distance is selected from R_opt for each object in R_SAR to constitute another set of matched object pairs R_SAR-opt.

Step 3: The matched object pairs common to R_opt-SAR and R_SAR-opt are retained as the final set of matched object pairs R_match.

Step 4: Because the inscribed circle of each object is bound to exist and to be located inside the object, the inscribed centers of each object in R_match can be calculated. On this basis, each matched object pair yields a pair of matched inscribed centers (feature points), thus providing the set of matched feature points P_match required for establishing the affine transformation equation.

Step 5: By combining P_match with Equation (2), the affine transformation equation between the optical and SAR images can be established:

x' = a_0 + a_1 x + a_2 y
y' = b_0 + b_1 x + b_2 y    (2)
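As an illustration of Steps 1–5, the following Python sketch (OpenCV and NumPy assumed; all function names are hypothetical, not from the original paper) matches objects by the Hu-moment distance of Equation (1), keeps only mutual nearest neighbors as R_match, and fits the affine coefficients of Equation (2) by ordinary least squares:

```python
import cv2
import numpy as np

def hu_distance(mask_a, mask_b):
    """Equation (1): Euclidean distance between the seven Hu moment
    invariants of two binary object masks."""
    phi = cv2.HuMoments(cv2.moments(mask_a, binaryImage=True)).ravel()
    psi = cv2.HuMoments(cv2.moments(mask_b, binaryImage=True)).ravel()
    return np.sqrt(np.sum((phi - psi) ** 2))

def mutual_matches(optical_masks, sar_masks):
    """Steps 2-3: keep only pairs that are each other's nearest
    neighbor in Hu-moment distance (R_match)."""
    d = np.array([[hu_distance(o, s) for s in sar_masks]
                  for o in optical_masks])
    opt_to_sar = d.argmin(axis=1)          # R_opt-SAR
    sar_to_opt = d.argmin(axis=0)          # R_SAR-opt
    return [(i, j) for i, j in enumerate(opt_to_sar)
            if sar_to_opt[j] == i]         # R_match

def fit_affine(src_pts, dst_pts):
    """Step 5 / Equation (2): least-squares affine coefficients
    (a0, a1, a2) and (b0, b1, b2) from matched inscribed centers."""
    src = np.asarray(src_pts, dtype=float)
    a = np.column_stack([np.ones(len(src)), src])   # design matrix [1, x, y]
    coeff_x, *_ = np.linalg.lstsq(a, np.asarray(dst_pts, dtype=float)[:, 0], rcond=None)
    coeff_y, *_ = np.linalg.lstsq(a, np.asarray(dst_pts, dtype=float)[:, 1], rcond=None)
    return coeff_x, coeff_y
```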
2.1.3. Projection of Inscribed Centers of Objects and Region Growing
The objects in the SAR image that match each object in the optical image are then searched. Based on the coarse registration results, the inscribed centers of each object in R_opt are directly projected into the SAR image according to the affine transformation equation, thus obtaining a set of projection points in the SAR image. Based on region growing from the projection points, the SAR image is divided into connected regions corresponding to each object in R_opt [30], so as to finally acquire the unified optical-SAR object set R_uni.
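A minimal sketch of the inscribed-center computation and its projection, under the assumption that the maximum of the distance transform gives the center of the largest inscribed circle (function names are hypothetical; the region-growing step is omitted):

```python
import cv2
import numpy as np

def inscribed_center(mask):
    """Center of the largest circle inscribed in the object: the pixel
    with the maximum distance-transform value inside the binary mask."""
    dist = cv2.distanceTransform(mask.astype(np.uint8), cv2.DIST_L2, 5)
    y, x = np.unravel_index(np.argmax(dist), dist.shape)
    return x, y

def project_point(pt, coeff_x, coeff_y):
    """Project an optical-image point into the SAR image with the
    affine coefficients of Equation (2)."""
    x, y = pt
    return (coeff_x[0] + coeff_x[1] * x + coeff_x[2] * y,
            coeff_y[0] + coeff_y[1] * x + coeff_y[2] * y)
```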
2.2. Extraction of Double Bounce Features Based on DoubleBounceCollapseSemantic

To extract the collapse semantic features contained in double bounce, two parts were designed: detection of potential double bounce pixels (PDBPs) and construction of a collapse semantic histogram.
2.2.1. Detection of PDBPs
Since double bounce appears as a bright line in the SAR image, the Hough transform was first used for line detection [31], with LOG operators used for edge detection, to obtain a set of initial potential double bounce pixels (IPDBPs). On this basis, for any pixel e in the IPDBPs, pixels belonging to the IPDBPs are searched in its eight-neighborhood. If there is only one pixel meeting the condition, the pixel e is regarded as an endpoint. In this case, pixels belonging to the IPDBPs are searched continuously in a 5 × 5 window centered on e, and the pixels shared by the eight-neighborhoods of these pixels and the eight-neighborhood of e are all taken as PDBPs. By traversing all pixels, the final set of PDBPs is extracted from the SAR image.
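The IPDBP stage can be sketched as follows with OpenCV, using a Laplacian-of-Gaussian edge response followed by a probabilistic Hough transform; the thresholds are illustrative assumptions, and the endpoint/5 × 5 window refinement described above is omitted:

```python
import cv2
import numpy as np

def detect_ipdbps(sar, sigma=1.0, edge_thresh=0.1):
    """Initial potential double bounce pixels (IPDBPs): LoG edge
    response on the SAR amplitude image, then Hough line detection so
    that only pixels lying on bright linear structures survive."""
    img = cv2.normalize(sar.astype(np.float32), None, 0, 1, cv2.NORM_MINMAX)
    log = cv2.Laplacian(cv2.GaussianBlur(img, (0, 0), sigma), cv2.CV_32F)
    edges = (np.abs(log) > edge_thresh).astype(np.uint8) * 255
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=30,
                            minLineLength=10, maxLineGap=2)
    mask = np.zeros_like(edges)
    if lines is not None:
        for x1, y1, x2, y2 in lines[:, 0]:
            cv2.line(mask, (x1, y1), (x2, y2), 255, 1)  # rasterize lines
    return mask > 0
```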
2.2.2. Construction of the Collapse Semantic Histogram
In the SAR image, a visual vocabulary based on collapse semantics was designed, and the collapse semantic histogram was constructed by combining the spatial relationship between R_uni and the PDBPs. Let the total number of objects in R_uni be N. For any object R_uni^c (c = 1, 2, 3, ..., N), the set of visual words and the DoubleBounceCollapseSemantic rules are defined as follows:
Pixel set 1 of non-collapsed buildings. The double bounce of a non-collapsed building usually appears as a bright line at the corner of the building. Therefore, double bounce line segments with the features of non-collapsed buildings overlap or are adjacent to the profiles of the objects, showing a similar curvature and direction and a certain length. The specific search and discrimination steps are as follows:

Step 1: The blurred line segment L~ that overlaps or is adjacent to the R_uni^c profile is searched first. Starting from any pixel g on the profile, PDBPs are searched in the eight-neighborhood of g. If there is a PDBP, defined as r, PDBPs in the eight-neighborhood of r are searched. The newly found PDBPs and r are retained, and the fitted line L^ is obtained from these pixels. On this basis, new PDBPs are searched continuously in the eight-neighborhoods of the newly found PDBPs. If one exists, its distance to L^ is calculated; when the distance is smaller than w, this PDBP is retained. In a similar way, all possible pixels are traversed, and all retained PDBPs constitute the blurred line segment L~.

Step 2: For the next pixel g' on the profile, the blurred line segment corresponding to g' can be obtained by repeating Step 1. All points on the profile are traversed to form a candidate blurred line-segment set S1. All blurred line segments longer than T_a are retained to constitute the blurred line-segment set S2.

Step 3: For the foot points of the two endpoints of any line segment L_S2 in the set S2 on the profile of the object, the segment L'_S2 of the profile between the foot points is intercepted, and the line segments L_S2 that simultaneously meet the following two conditions constitute a blurred line-segment set S3: (1) the difference between the average curvatures of L_S2 and L'_S2 is smaller than the threshold T_b; (2) L_S2 and L'_S2 are fitted by straight lines using ordinary least squares, and the slope difference of the two straight lines is smaller than the threshold T_c. S3 is the constructed visual word. It should be pointed out that, to improve the degree of automation of the proposed method, the following adaptive extraction strategy is adopted for T_a, T_b and T_c. Compared with collapsed buildings, the double bounce of non-collapsed buildings is usually longer and more complete. Based on this assumption, an objective function F_S3(T_a, T_b, T_c) is constructed, representing the number of pixels extracted into S3 for an object under different combinations of T_a, T_b and T_c. T_a, T_b and T_c take values in the intervals [0, t], [0, 1] and [0, 1], respectively, where t is the length of the diagonal of the bounding rectangle of the object R_uni^c. When F_S3 is maximized, the corresponding T_a, T_b and T_c constitute the optimal parameter combination.
Pixel set 1 of locally collapsed buildings. In S1, the blurred line segments with a length smaller than or equal to T_a are retained as the constructed visual word.

Pixel set 1 of completely collapsed buildings. Except for the PDBPs that have already been defined as visual words, the other PDBPs located on the R_uni^c profile or within one pixel outside the profile form the constructed visual word.

Pixel set 2 of non-collapsed buildings. Within the pixels inside the R_uni^c profile, a candidate blurred line-segment set, denoted inner, is searched starting from any pixel u. Except for the different starting points and search scopes, the steps are exactly the same as for S1 above. Because the blurred line segments in inner are located inside R_uni^c, inner is directly regarded as the constructed visual word.

Pixel set 2 of locally collapsed buildings. In R_uni^c, the PDBPs not yet defined as visual words are denoted PDBP_res, and the ratio of the number of PDBP_res to the total number of pixels is \varphi. Furthermore, the ratio of the total number of PDBPs in the SAR image to the total number of pixels is defined as \varphi_SAR. If \varphi ≥ \varphi_SAR, PDBP_res is the constructed visual word.

Pixel set 2 of completely collapsed buildings. In R_uni^c, the PDBPs not defined as visual words form the constructed visual word.

Based on the above six-dimensional visual words, the collapse semantic histogram I_csh of the double bounce of R_uni^c can be obtained.
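Once the six pixel sets have been extracted for an object, assembling I_csh reduces to counting. The sketch below assumes, as one plausible choice not stated in the paper, that the counts are normalized by the object area so that histograms of objects of different sizes are comparable:

```python
import numpy as np

def collapse_semantic_histogram(pixel_sets, n_object_pixels):
    """Assemble the six-dimensional collapse semantic histogram I_csh
    for one object R_uni^c. `pixel_sets` maps each visual word
    ('nc1', 'lc1', 'cc1', 'nc2', 'lc2', 'cc2') to its set of PDBPs."""
    order = ['nc1', 'lc1', 'cc1', 'nc2', 'lc2', 'cc2']
    counts = np.array([len(pixel_sets.get(k, ())) for k in order], float)
    # Normalization by object size is an assumption made here, not a
    # detail given in the paper.
    return counts / max(n_object_pixels, 1)
```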
2.3. Extraction of Traditional Visual Features Based on MAPs
The area, diagonal, normalized moment of inertia (NMI) and standard deviation attributes in MAPs have been proven to have strong discrimination ability in the detection of buildings. To this end, the traditional visual features in the optical and SAR images are extracted from these four attributes through our previously proposed automatic building detection from high-resolution remote sensing images based on joint optimization and decision fusion of MAPs (for detailed steps, refer to previous studies [32]). On this basis, the multi-scale MAP sets corresponding to the optical and SAR images, MAPs_opt and MAPs_SAR, are obtained. In MAPs_opt, the mean gray values of R_uni^c in each attribute profile (AP) are calculated, thus obtaining the visual histogram I_osh of the optical image corresponding to R_uni^c. Similarly, the visual histogram I_ssh of the SAR image can be obtained.
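A reduced sketch of the visual-histogram computation with scikit-image, restricted to the area attribute (the diagonal, NMI and standard deviation attributes of [32] would be handled analogously); the thresholds are illustrative assumptions:

```python
import numpy as np
from skimage.morphology import area_opening, area_closing

def visual_histogram(image, object_mask, area_thresholds=(100, 500, 1000)):
    """Mean gray value of one object in each attribute profile (AP).
    Only the area attribute is sketched here."""
    profiles = []
    for t in area_thresholds:
        profiles.append(area_opening(image, area_threshold=t))
        profiles.append(area_closing(image, area_threshold=t))
    # One histogram bin per AP: the mean gray value inside the object.
    return np.array([p[object_mask].mean() for p in profiles])
```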
2.4. Detection of Collapsed Buildings Based on Improved Active Learning SVMs
In the classification stage, R_uni is classified into non-collapsed buildings, collapsed buildings and others by using active learning SVMs. The decision function of each SVM classifier is:

f(h_k) = \mathrm{sign}\left( \sum_{m=1}^{M} y_m \alpha_m K(x_m, h_k) + b \right)    (3)

Furthermore, when annotating samples by active learning SVMs, it is difficult to annotate the samples that lie on the category boundary and have the greatest uncertainties. Therefore, this study proposed the CUI, whose calculation process is as follows:
Step 1: The probabilities that the sample h_k to be annotated belongs to the annotated positive samples w_l^p and to the annotated negative samples v_l^q are separately calculated:

D(w_l^p / h_k) = \frac{1}{P} \sum_{p=1}^{P} \frac{ \langle h_k, w_l^p \rangle }{ \| h_k \| \, \| w_l^p \| }    (4)

D(v_l^q / h_k) = \frac{1}{Q} \sum_{q=1}^{Q} \frac{ \langle h_k, v_l^q \rangle }{ \| h_k \| \, \| v_l^q \| }    (5)

where w_l^p indicates the p-th (p = 1, 2, ..., P) sample in the l-th category of positive samples, and v_l^q represents the q-th (q = 1, 2, ..., Q) sample in the l-th category of negative samples.
Step 2: On this basis, the CUI of h_k on the l-th classifier is calculated by the following formula:

CUI(w_l^p, v_l^q) = 2 - \left[ \left( D^2(w_l^p / h_k) - 1 \right) \times \left( D^2(v_l^q / h_k) - 1 \right) \right]    (6)

where the larger the CUI, the greater the classification uncertainty of the sample h_k.
Step 3: Based on this, the categorical decision function f_l(h_k) of the sample h_k is calculated in accordance with Equation (3). When the CUI is the minimum and f_l(h_k) is the maximum, the sample h_k is annotated. The annotated samples are added to the training samples to re-train the model. By repeating the above steps, the samples are refined to obtain the final detection results of collapsed buildings.
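The CUI of Equations (4)–(6) can be sketched as follows (NumPy only; Equation (6) is written as reconstructed above, and the surrounding sample-selection and re-training loop is omitted):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def mean_similarity(h, samples):
    """Equations (4)-(5): mean cosine similarity between a candidate
    h_k and the annotated positive (or negative) samples of class l."""
    return np.mean([cosine(h, s) for s in samples])

def cui(h, positives, negatives):
    """Equation (6) as reconstructed in the text: larger values mean
    the candidate lies closer to the class boundary."""
    dw = mean_similarity(h, positives)
    dv = mean_similarity(h, negatives)
    return 2 - (dw ** 2 - 1) * (dv ** 2 - 1)
```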
3. Experiments and Evaluation
3.1. Study Area and Dataset Description
The study area is located in Sendai, Japan. An earthquake (Mw 9.0) occurred on 11 March 2011. The epicenter was located in the Pacific Ocean east of Miyagi Prefecture, Japan, with a focal depth of 20 km. Sendai is one of the cities most seriously hit by the earthquake. The earthquake and tsunami damaged a large number of buildings, including 9877 collapsed buildings.

The post-earthquake high-resolution optical images adopted in this study are IKONOS satellite images of Sendai, Japan, collected on 24 March 2011 with a spatial resolution of 1 m, as displayed in Figure 3a. The post-earthquake high-resolution SAR images are TerraSAR-X satellite images of the same area, collected on 23 March 2011 with a spatial resolution of 3 m; they were acquired in HH polarization in stripmap mode, as demonstrated in Figure 3b. In the experiments, to handle the difference in resolution between the optical and SAR images, the lower-resolution images were resampled so that the multi-source images had the same resolution. On this basis, three representative regions were selected for the experiments. Dataset 1 is located in an industrial zone where the buildings are large and sparsely distributed, as shown in Figure 4a. Compared with the industrial zone, residential areas are usually the most severely affected and are usually the primary targets of post-earthquake emergency response and post-disaster reconstruction. To this end, Datasets 2 and 3 are both located in residential areas, as displayed in Figure 4b,c. Buildings in these areas are usually densely distributed and neatly arranged. Due to the significant radiometric and geometric differences between optical and SAR data, their exact registration and high-precision geometric correction are not only very complex but also unlikely to yield the desired results. In addition, since each object is extracted separately from the optical and SAR images in this paper, only the matching set of objects needs to be found in both datasets. For this reason, the following strategy was adopted for producing the datasets: taking the cropped and segmented optical image as a reference, we cropped the corresponding area in the SAR image that could completely cover all the objects in the optical image according to visual interpretation.
Figure 3. (a,b) Study area.
Figure 4. The optical images for the three datasets: (a) Dataset 1; (b) Dataset 2; (c) Dataset 3.
3.2. Experimental Settings and Methods for Comparison
In the experiments, four advanced methods were selected for comparison. The first is a detection method for optical images based on a sparse dictionary (SD-OPT). This method introduces spatial context information by constructing pairs of same and different words, so as to construct multi-visual features to model collapsed buildings [33]. The second is a detection method for SAR images based on multi-texture feature fusion (RF-SAR). In this method, texture features are extracted by comprehensively utilizing the gray-level histogram, gray-level co-occurrence matrix (GLCM), local binary pattern (LBP) and Gabor filtering, and the post-earthquake collapse information of buildings is then obtained by Random Forest [34]. The third is a deep learning method based on object context and boundary enhanced loss (OCR-BE). In this method, a novel loss function, BE loss, is designed according to the distance between pixels and the boundary, forcing the network to pay more attention to learning boundary pixels [35]. The fourth is a deep learning method based on UNet 3+ (UNet 3+). UNet 3+ takes advantage of full-scale skip connections and deep supervision to make full use of multi-scale features [36]. Among the four methods, the first two are single-source imaging methods based on traditional machine learning; comparing against them verifies the complementarity and combined advantages of the two data sources (optical and SAR images) in the detection of collapsed buildings. The latter two are deep learning methods combining multi-source data; comparing against them helps analyze the difference in performance between the proposed method and deep learning methods in the detection of collapsed buildings, especially under small sample conditions.

All experiments are based on the three datasets in Section 3.1. To ensure the consistency of accuracy evaluation indexes across the different methods, the semantic segmentation results obtained by OCR-BE and UNet 3+ were converted into object-level detection results according to the proportion of pixels belonging to different categories, as sketched below. In the experiments, Matlab 2018 was used as the simulation platform for all traditional machine learning methods. The two deep learning methods were based on the PyTorch-1.3.1 framework and implemented in Ubuntu 16.04.
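The pixel-to-object conversion mentioned above can be sketched as a per-object majority vote, which is one plausible reading of "the proportion of pixels belonging to different categories" (names are hypothetical):

```python
import numpy as np

def pixels_to_objects(seg_map, object_labels, n_classes=3):
    """Convert a pixel-level semantic segmentation map into object-level
    detection results: each object takes the class that covers the
    largest proportion of its pixels."""
    results = {}
    for obj_id in np.unique(object_labels):
        if obj_id == 0:                       # 0 = background / no object
            continue
        pix = seg_map[object_labels == obj_id]
        results[obj_id] = np.bincount(pix, minlength=n_classes).argmax()
    return results
```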
3.3. General Results and Analysis
Based on the three datasets, the detection results of collapsed buildings by the different methods are demonstrated in Figures 5–7. In addition, the ground truth map made through visual interpretation is regarded as the basis for accuracy evaluation, in which white, gray and black represent collapsed buildings, non-collapsed buildings and others, respectively.
Figure 5. Detection results of collapsed buildings based on Dataset 1: (a) reference map; (b) proposed method; (c) SD-OPT; (d) RF-SAR; (e) OCR-BE; (f) UNet 3+.
As shown in the above figures, the detection results obtained by the proposed method are significantly superior to those of the four comparison methods as a whole. As traditional machine learning methods, SD-OPT adopts optical images, while RF-SAR uses SAR images. Compared with the proposed method, SD-OPT and RF-SAR have prominent false negatives (FNs) and false positives (FPs), respectively, because they rely only on single-source data, as displayed in (c) and (d) of Figures 5–7. As two deep learning methods, OCR-BE and UNet 3+ need massive training samples to fully train a deep network; otherwise, it is difficult to obtain ideal detection effects. In the experiments, the sample sizes of the three datasets are 1880, 2036 and 2058, and the proportions of collapsed building samples in the total number of samples are only 9.2%, 10.6% and 12.8%. This results in serious over-fitting and poor generalization of the models on the test set, which is the main reason why the detection accuracies for collapsed buildings of OCR-BE and UNet 3+ are significantly lower than those of the traditional machine learning methods. It is expected that, as the sample size of collapsed buildings increases, the accuracy of the deep learning methods would gradually improve until the models converge. In addition, for the plants with a large size and low detection difficulty in the industrial zone (Figure 5), except for the RF-SAR method, which has many FNs and FPs, all methods have good detection effects on large buildings. For the densely distributed small buildings that are difficult to detect in the residential areas (Figures 6 and 7), the proposed method and the SD-OPT method are significantly superior to the other methods regarding the FN rate and
FP rate. This indicates that rich spatial details provided by optical images are favorable
for accurate characterization of collapsed buildings in the complex context compared with
SAR images.
Figure 6. Detection results of collapsed buildings based on Dataset 2: (a) reference map; (b) proposed method; (c) SD-OPT; (d) RF-SAR; (e) OCR-BE; (f) UNet 3+.
Furthermore, six evaluation indexes, including the overall accuracy (OA), FP rate and FN rate, as well as the detection accuracies of non-collapsed buildings (Pnb), collapsed buildings (Pcb) and others (Po), were used for quantitative accuracy evaluation; the results are shown in Tables 1–3. In the three experiments, the OAs of the proposed method reach 82.39%, 80.60% and 78.61%, respectively. In particular, the Pcb of concern here reaches more than 73.94%, the best performance among all experimental methods and consistent with the conclusions of the visual analysis. In comparison with the proposed method, SD-OPT and RF-SAR depend only on single-source data, and their FN rates and FP rates increase by more than 3.77% and 6.94%, respectively. As deep learning methods, OCR-BE and UNet 3+ only obtain better detection effects for non-collapsed buildings than the proposed method under small sample sizes, while their other accuracy indexes decrease significantly, with the lowest Pcb being only 9.43%. Nevertheless, the detection effects of the two deep learning methods would be greatly improved given sufficient training samples. Therefore, the strategy of combining optical and SAR images proposed in this research is necessary, feasible and effective in the detection of collapsed buildings and can achieve ideal effects under small sample conditions.
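For reference, the indexes can be computed from a confusion matrix roughly as follows; the paper does not spell out its exact FP/FN rate definitions, so the per-class versions below are an assumption:

```python
import numpy as np

def evaluation_indexes(confusion):
    """Sketch of the quantitative indexes from a 3x3 confusion matrix
    (rows: reference, columns: prediction; class order: non-collapsed,
    collapsed, others). OA and the per-class accuracies Pnb/Pcb/Po
    follow the usual definitions; FP/FN are per-class rates here."""
    total = confusion.sum()
    oa = np.trace(confusion) / total                       # overall accuracy
    p_class = confusion.diagonal() / confusion.sum(axis=1) # Pnb, Pcb, Po
    fp = (confusion.sum(axis=0) - confusion.diagonal()) / confusion.sum(axis=0)
    fn = 1 - p_class                                       # omission per class
    return oa, p_class, fp, fn
```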
Figure 7. Detection results of collapsed buildings based on Dataset 3: (a) reference map; (b) proposed method; (c) SD-OPT; (d) RF-SAR; (e) OCR-BE; (f) UNet 3+.
Table 1. Detection accuracy based on Dataset 1. (OA, Pnb, Pcb and Po: the higher the better; FP and FN: the lower the better.)

Method           OA (%)   FP (%)   FN (%)   Pnb (%)   Pcb (%)   Po (%)
Proposed method  82.39     9.65    17.61    52.92     74.57     88.55
SD-OPT           75.85    14.59    25.15    41.35     46.82     92.62
RF-SAR           63.99    21.96    36.01    25.29     44.51     73.17
OCR-BE           78.46    13.31    15.35    78.94     29.48     90.10
UNet 3+          77.28    15.80    19.96    80.21     14.45     92.52
Table 2. Detection accuracy based on Dataset 2. (OA, Pnb, Pcb and Po: the higher the better; FP and FN: the lower the better.)

Method           OA (%)   FP (%)   FN (%)   Pnb (%)   Pcb (%)   Po (%)
Proposed method  80.60    10.74    19.40    50.22     73.94     85.68
SD-OPT           74.66    14.51    26.34    17.77     47.22     87.14
RF-SAR           63.41    22.39    36.59    24.38     29.63     74.02
OCR-BE           77.14    12.49    22.20    76.03     22.30     87.26
UNet 3+          76.80    11.87    24.59    50.00     21.76     89.73
Table 3. Detection accuracy based on Dataset 3. (OA, Pnb, Pcb and Po: the higher the better; FP and FN: the lower the better.)

Method           OA (%)   FP (%)   FN (%)   Pnb (%)   Pcb (%)   Po (%)
Proposed method  78.61    12.80    22.69    55.08     75.47     83.51
SD-OPT           66.81    19.90    33.19    45.72     41.13     77.17
RF-SAR           59.23    25.60    40.77    22.73     27.55     74.77
OCR-BE           76.39    13.92    22.89    63.98     33.32     85.04
UNet 3+          75.11    14.43    25.55    65.24      9.43     92.88
3.4. Visual Comparison of Representative Patches
For further detailed visual analysis and discussion, representative patches were selected from the three datasets, as shown in Figures 8–10. Yellow and red boxes represent collapsed buildings and non-collapsed buildings, respectively.
Figure 8. Detection results of collapsed buildings in the representative patches in Dataset 1: (a) original drawing of the representative patches; (b) reference diagram of the representative patches; (c) proposed method; (d) SD-OPT; (e) RF-SAR; (f) OCR-BE; (g) UNet 3+.
Figure 9. Detection results of collapsed buildings in the representative patches in Dataset 2: (a) original drawing of the representative patches; (b) reference diagram of the representative patches; (c) proposed method; (d) SD-OPT; (e) RF-SAR; (f) OCR-BE; (g) UNet 3+.
Figure 10. Detection results of collapsed buildings in the representative patches in Dataset 3: (a) original drawing of the representative patches; (b) reference diagram of the representative patches; (c) proposed method; (d) SD-OPT; (e) RF-SAR; (f) OCR-BE; (g) UNet 3+.
The above figures demonstrate that buildings in the industrial zone are easy to detect because of their large sizes and sparse distribution, so all experimental methods acquire good detection results for collapsed buildings there. The exceptions are UNet 3+, which shows FNs (yellow box in Figure 8g), and SD-OPT, which presents FPs (yellow box in Figure 8d). For non-collapsed buildings in the industrial zone (red boxes in Figure 8), the proposed method and the two deep learning methods do not incur FPs, while SD-OPT and RF-SAR show FPs (red box in Figure 8d) and FNs (red box in Figure 8e), respectively. In the residential area with neatly arranged and densely distributed buildings, only the proposed method yields completely correct detection results for collapsed buildings (yellow boxes in Figures 9 and 10). By contrast, RF-SAR (yellow box in Figure 10e) and OCR-BE (yellow box in Figure 10f) present FPs, while SD-OPT (yellow box in Figure 9d) and UNet 3+ (yellow boxes in Figures 9g and 10g) show FNs. For non-collapsed buildings, the visual analysis results are similar to those of the industrial zone and good results are achieved by the different methods; only SD-OPT and RF-SAR have obvious FPs and FNs. In conclusion, all five methods detect non-collapsed buildings well, while the proposed method, by combining optical and SAR images, achieves a higher Pcb with fewer FPs and FNs, which is consistent with the quantitative analysis results.
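To make this kind of qualitative comparison reproducible, a minimal sketch is given below (not the authors' released code) of how a TP/FP/FN map can be rendered by comparing object-level predictions with the reference map; the object-ID image and the two label dictionaries are hypothetical inputs.

```python
# A minimal sketch (not the authors' released code): render a per-object
# TP/FP/FN map by comparing predicted and reference collapse labels.
import numpy as np

def error_map(label_img: np.ndarray, pred: dict, ref: dict) -> np.ndarray:
    """label_img: integer object-ID image (0 = background);
    pred/ref: object ID -> 1 (collapsed) or 0 (non-collapsed)."""
    out = np.zeros(label_img.shape, dtype=np.uint8)
    for obj_id in np.unique(label_img):
        if obj_id == 0:
            continue  # skip background
        p, r = pred.get(obj_id, 0), ref.get(obj_id, 0)
        if p == 1 and r == 1:
            code = 1  # TP: collapsed building correctly detected
        elif p == 1 and r == 0:
            code = 2  # FP: non-collapsed building flagged as collapsed
        elif p == 0 and r == 1:
            code = 3  # FN: collapsed building missed
        else:
            code = 4  # TN: non-collapsed building correctly kept
        out[label_img == obj_id] = code
    return out

# tiny demo: object 1 is a TP, object 2 is an FN
demo = np.array([[1, 1, 0], [2, 2, 0]])
print(error_map(demo, pred={1: 1, 2: 0}, ref={1: 1, 2: 1}))
```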
4. Discussion
4.1. Validity Analysis of Combined Optical and SAR Images
To further verify the validity of combining optical and SAR images, experiments based on single-source data (optical or SAR images only) were conducted using the proposed method. The accuracy evaluation of the results obtained by combining optical and SAR images, by the optical image alone and by the SAR image alone is shown in Table 4.
As shown in Table 4, after combining optical and SAR images, the OA on the three datasets increases by 5.89~7.71% and Pcb rises by 12.56~19.03% compared with the better single-source results (SAR). Therefore, the earthquake damage characteristics of buildings are depicted from multiple perspectives by combining post-earthquake optical and SAR images, and the extracted complementary information significantly improves the detection accuracy of collapsed buildings. In particular, double bounce in the SAR image provides key evidence for judging whether buildings collapse or not, so Pcb in the experiment based on the SAR image alone is always significantly higher than that based on the optical image, which only carries traditional visual features.
Table 4. Comparison of detection accuracies of combining optical and SAR images with those based on single-source data. Evaluation criteria: the higher the better for OA, Pnb, Pcb and Po; the lower the better for FP and FN.

Datasets    Method             OA (%)   FP (%)   FN (%)   Pnb (%)   Pcb (%)   Po (%)
Dataset 1   Optical and SAR    82.39     9.65    17.61    52.92     74.57     88.55
            Optical            66.49    30.56    27.12    40.86     46.59     75.79
            SAR                74.68    14.49    23.32    50.19     57.23     81.10
Dataset 2   Optical and SAR    80.60    10.74    19.40    50.22     73.94     85.68
            Optical            64.62    20.76    30.38    38.43     47.66     72.24
            SAR                74.71    14.48    25.29    45.04     54.91     83.33
Dataset 3   Optical and SAR    78.61    12.80    22.69    55.08     75.47     83.51
            Optical            65.40    19.95    31.65    69.79     50.57     67.02
            SAR                72.30    16.07    25.70    45.45     62.91     79.76
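The indicators reported above can be computed from the object-level agreement between predictions and reference labels. The sketch below uses one plausible set of definitions (OA as the share of correctly classified objects, FP and FN rates with respect to the collapsed class, and Pnb, Pcb and Po as per-class recall); the exact formulas are those given in the paper's accuracy evaluation section and may differ in detail.

```python
# A sketch under assumed definitions of the indicators in Tables 3-6;
# class codes (assumed): 0 = other, 1 = non-collapsed, 2 = collapsed.
import numpy as np

def indicators(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    def recall(c: int) -> float:  # per-class detection accuracy
        truth = y_true == c
        return float(np.sum(truth & (y_pred == c))) / max(int(np.sum(truth)), 1)

    pos_pred, pos_true = y_pred == 2, y_true == 2
    fp = float(np.sum(pos_pred & ~pos_true)) / max(int(np.sum(pos_pred)), 1)
    fn = float(np.sum(~pos_pred & pos_true)) / max(int(np.sum(pos_true)), 1)
    return {"OA": float(np.mean(y_true == y_pred)), "FP": fp, "FN": fn,
            "Pnb": recall(1), "Pcb": recall(2), "Po": recall(0)}

# tiny demo with seven hypothetical objects
print(indicators(np.array([2, 2, 1, 1, 0, 0, 2]), np.array([2, 1, 1, 2, 0, 0, 2])))
```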
In addition, two representative sub-patches were selected for further visual analysis, as shown in Figures 11 and 12. For collapsed buildings with fragmented distribution in both images (yellow boxes in Figure 11), correct results are obtained by all three methods. For collapsed buildings with well-preserved roofs in the optical image (yellow boxes in Figure 12), correct judgements are made only by the proposed method and the method based on SAR images, because the double bounce in the SAR image shows the typical collapse semantic features; FPs appear when using the method based on optical images (yellow box in Figure 12d). Non-collapsed buildings shown in red boxes in Figure 11 present complete profiles and homogeneous texture in both the optical and SAR images, so correct results are obtained by all three methods. For non-collapsed buildings with intact roofs in the optical image but regionally fragmented distribution in the SAR image, only the proposed method and the method based on optical images make correct judgements, while the method based on SAR images has obvious FPs. Therefore, it is feasible and effective to improve Pcb by complementing the advantages of optical and SAR images.
Figure 11. Detection results of collapsed buildings in the representative sub-patch 1: (a) original image of the representative sub-patch 1; (b) reference map; (c) proposed method; (d) only optical images; (e) only SAR images.
Figure 12. Detection results of collapsed buildings in the representative sub-patch 2: (a) original image of the representative sub-patch 2; (b) reference map; (c) proposed method; (d) only optical images; (e) only SAR images.
4.2. Validity Analysis of DoubleBounceCollapseSemantic
To verify the validity of the constructed DoubleBounceCollapseSemantic, comparative experiments were carried out by either adding or omitting the double bounce features extracted by DoubleBounceCollapseSemantic alongside the traditional visual features of the combined optical and SAR images. The results are shown in Table 5, after the sketch below.
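The ablation protocol can be restated in a few lines of code; in this sketch the arrays are synthetic stand-ins for the traditional visual features and the DoubleBounceCollapseSemantic histograms of the object set, and a plain SVM replaces the improved active learning SVMs used in this study.

```python
# A minimal sketch of the Table 5 ablation: train with and without the
# DoubleBounceCollapseSemantic features concatenated to the visual features.
# All data here are synthetic placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
visual_feats = rng.normal(size=(200, 24))  # spectral/texture/geometry features
dbcs_feats = rng.random(size=(200, 6))     # double-bounce visual-word histograms
y = rng.integers(0, 2, size=200)           # 1 = collapsed, 0 = non-collapsed

for use_dbcs in (True, False):
    X = np.hstack([visual_feats, dbcs_feats]) if use_dbcs else visual_feats
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    oa = SVC(kernel="rbf").fit(X_tr, y_tr).score(X_te, y_te)
    print(f"DoubleBounceCollapseSemantic {'added' if use_dbcs else 'omitted'}: OA = {oa:.3f}")
```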
Table 5. Validity analysis of DoubleBounceCollapseSemantic. "✓" and "–" separately represent that a feature is added and not added. Evaluation criteria: the higher the better for OA, Pnb, Pcb and Po; the lower the better for FP and FN.

Datasets    DoubleBounceCollapseSemantic   OA (%)   FP (%)   FN (%)   Pnb (%)   Pcb (%)   Po (%)
Dataset 1   ✓                              82.39     9.65    17.61    52.92     74.57     88.55
            –                              78.83    11.86    20.81    48.64     67.63     85.52
Dataset 2   ✓                              80.60    10.74    19.40    50.22     73.94     85.68
            –                              77.26    12.44    23.01    46.28     64.46     84.48
Dataset 3   ✓                              78.61    12.80    22.69    55.08     75.47     83.51
            –                              74.69    15.29    25.40    42.25     68.68     81.47
As illustrated in Table 5, the OA in the case of adding DoubleBounceCollapseSemantic increases by 3.34~3.92%, while the FP rate and FN rate decrease by 1.70~2.49% and 2.71~3.61%, respectively, compared with the case without DoubleBounceCollapseSemantic. Pcb rises by 6.94%, 9.48% and 6.79% on the three datasets, respectively. Therefore, the proposed DoubleBounceCollapseSemantic is effective. On this basis, six collapsed buildings and six non-collapsed buildings were selected from the three datasets, and histogram statistics were computed on the pixels of double bounce belonging to different visual words, as displayed in Figure 13a,b.
(a) [Grouped bar chart: proportion of double bounce pixels assigned to each of the six visual words (features of non-collapse 1 and 2, locally collapse 1 and 2, complete collapse 1 and 2) for collapsed buildings 1–6; y-axis: proportion of pixels of double bounce.]
(b) [The corresponding chart for non-collapsed buildings 1–6.]
Figure 13. Histograms of pixels of double bounce belonging to different visual words for (a) collapsed buildings and (b) non-collapsed buildings.
As shown in Figure 13a, the histograms of the collapsed buildings show similar distributions, with low intra-class separability. Meanwhile, the proportion of pixels assigned to the collapse-related visual words is significantly higher for collapsed buildings than for non-collapsed ones, which is conducive to obtaining correct identification results. For non-collapsed buildings (Figure 13b), the proportions of collapse-related to non-collapse words are the opposite. Therefore, collapsed and non-collapsed buildings have good inter-class separability in the above histograms. Furthermore, the proportion of pixels belonging to the locally-collapse words is significantly higher in collapsed buildings than in non-collapsed buildings, which further enhances the inter-class separability between the two classes.
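The per-building statistics in Figure 13 follow the usual bag-of-visual-words pattern, which the sketch below illustrates: double bounce pixel descriptors are quantized against a six-word dictionary and the word proportions are accumulated per building. The descriptors, the use of k-means to build the dictionary and the object layout are illustrative assumptions, not the exact construction used in the proposed method.

```python
# A sketch of the visual-word histograms in Figure 13 (details assumed):
# quantize double-bounce pixel descriptors into 6 words, count per building.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
pixel_feats = rng.normal(size=(5000, 8))      # synthetic double-bounce descriptors
building_id = rng.integers(1, 13, size=5000)  # pixels of 12 sample buildings

words = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(pixel_feats)

def word_histogram(obj_id: int) -> np.ndarray:
    """Proportion of the building's double bounce pixels per visual word."""
    w = words[building_id == obj_id]
    hist = np.bincount(w, minlength=6).astype(float)
    return hist / max(hist.sum(), 1.0)

print(word_histogram(1))  # six proportions, as plotted for one building
```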
4.3. Validity Analysis of CUI
To verify the validity of CUI, comparative experiments were conducted by adding or
not adding CUI to the active learning SVMs, and the accuracy was evaluated. The results
are shown in Table 6.
Table 6. Validity analysis of CUI. "✓" and "–" separately represent that an index is added and not added. Evaluation criteria: the higher the better for OA; the lower the better for FP and FN.

Datasets    CUI   OA (%)   FP (%)   FN (%)
Dataset 1   ✓     82.39     9.65    17.61
            –     81.58     9.81    18.33
Dataset 2   ✓     80.60    10.74    19.40
            –     79.07    12.45    20.01
Dataset 3   ✓     78.61    12.80    22.69
            –     76.90    13.27    23.78
As listed in Table 6, the OAs in the three experiments increase by 0.81%, 1.53% and 1.71%, while the FP rate and FN rate reduce by 0.16~1.71% and 0.61~1.09%, respectively. This suggests that the proposed CUI is conducive to selecting more representative samples for model training, which can significantly improve the classification accuracy.
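For orientation, the sketch below shows the general shape of a pool-based active learning SVM in which the query score combines margin uncertainty with a representativeness term. This scoring rule is only a placeholder for the CUI defined earlier in the paper, not its actual formula, and the data are synthetic.

```python
# A sketch of active learning SVMs with a CUI-like query score (placeholder
# formula, not the paper's definition of CUI); data are synthetic.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = (X[:, 0] + 0.3 * rng.normal(size=300) > 0).astype(int)
# small initial training set with both classes represented
labeled = list(np.flatnonzero(y == 0)[:10]) + list(np.flatnonzero(y == 1)[:10])

for _ in range(10):  # ten query rounds
    clf = SVC(kernel="rbf").fit(X[labeled], y[labeled])
    pool = np.setdiff1d(np.arange(len(X)), labeled)
    margin = np.abs(clf.decision_function(X[pool]))  # low margin = uncertain
    density = rbf_kernel(X[pool], X).mean(axis=1)    # representativeness
    score = density / (margin + 1e-6)                # placeholder CUI-like score
    labeled.append(int(pool[np.argmax(score)]))      # query the top sample

print(SVC(kernel="rbf").fit(X[labeled], y[labeled]).score(X, y))
```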
4.4. Analysis of Effects of the Number of Initial Training Samples
To verify the performance of the improved active learning SVMs proposed in this study under different numbers of initial training samples, the number of initial training samples in each category was varied in the interval [5, 50] with a step of 5. The trend of the OA with the increasing number of training samples is shown in Figure 14.
Figure 14. Effects of the number of initial training samples on OA.
As shown in Figure 14, with the increase of the number of initial training samples, the OA rises rapidly in the interval [0, 20] and then tends to be stable. The OAs on Datasets 1 and 2 reach their peaks (83.05% and 81.43%) when the sample size is 45, and the OA on Dataset 3 reaches its peak of 79.14% when the sample size is 50. Although the peak OA is 0.53~0.83% higher than that under a sample size of 20, the number of training samples required is more than doubled. Accordingly, the number of initial training samples in each category is suggested to be 20.
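The sweep behind Figure 14 is straightforward to restate in code; in this sketch, synthetic data and a plain SVM stand in for the object features and the improved active learning SVMs.

```python
# A sketch of the Figure 14 experiment: sweep the number of initial training
# samples per class over [5, 50] in steps of 5 and record the resulting OA.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
y = (X[:, 0] > 0).astype(int)  # synthetic collapsed / non-collapsed labels

for n_per_class in range(5, 55, 5):
    idx = np.concatenate([rng.choice(np.flatnonzero(y == c), n_per_class, replace=False)
                          for c in (0, 1)])
    oa = SVC(kernel="rbf").fit(X[idx], y[idx]).score(X, y)
    print(f"{n_per_class:2d} samples/class -> OA = {oa:.3f}")
```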
5. Conclusions
Aiming at the lack of pre-earthquake data, a detection method for collapsed buildings relying only on post-earthquake high-resolution optical and SAR images was proposed. To address the challenge of accurately judging whether buildings collapse or not in the absence of elevation data, the method indirectly acquires the elevation information of buildings by mining the double bounce features in SAR images, and automated detection of collapsed buildings is achieved by further combining multi-source traditional visual features. To this end, this research first designed the OpticalandSAR-ObjectsExtraction strategy to construct the unified optical-SAR object set. Based on this, the DoubleBounceCollapseSemantic was constructed, thus bridging the semantic gap between double bounce and collapse features of buildings. In the classification stage, the CUI was put forward, which is conducive to selecting more representative samples to optimize the active learning SVMs and finally detect collapsed buildings automatically. In multi-group comparative experiments on post-earthquake remote sensing images of different regions, the proposed method shows excellent performance in both visual and quantitative analysis: the OA and Pcb reach up to 82.39% and 75.47%, so the proposed method is superior to the multiple methods used for comparison. However, the proposed model does not investigate the influence of factors such as the orientation angle of the building and the polarization mode on double bounce intensity. In the future, we will focus on these issues to develop more refined and advanced models.
Author Contributions: Conceptualization, C.W.; methodology, C.W. and Y.Z.; software, Y.Z.; validation, T.X., Y.Z. and S.C.; formal analysis, Y.Z. and L.G.; investigation, F.S. and J.L.; resources, C.W.; writing—original draft preparation, Y.Z.; writing—review and editing, C.W.; visualization, C.W. and Y.Z.; supervision, C.W., T.X. and S.C.; project administration, C.W. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported in part by the Natural Science Foundation of Jiangsu Province (Grant No. JSZRHYKJ202114), the Natural Science Foundation of Jiangsu Province (Grant No. YJGL-YF-2020-16), the National Natural Science Foundation of China (Grant No. 42176180), the Postdoctoral Fund of Jiangsu Province (Grant No. 2021K013A), the Universities Natural Science Research Project of Jiangsu Province (Grant No. 19KJB510048), the Opening Fund of the National Key Laboratory of Solid Microstructure Physics at Nanjing University (Grant No. M30006), the Postdoctoral Fund of Jiangsu Province (Grant No. 1701132B) and the Six Talent-Peak Project of Jiangsu Province (Grant No. 2019XYDXX135).
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this study are available on request from the corresponding author. The data are not publicly available due to other ongoing research.
Conflicts of Interest: All authors have reviewed the manuscript and approved submission to this journal. The authors declare that there is no conflict of interest regarding the publication of this article.