ArticlePDF Available

Self-Incremental Learning for Rapid Identification of Collapsed Buildings Triggered by Natural Disasters

Remote Sensing

August 2023
15(15):3909

DOI:10.3390/rs15153909

License
CC BY 4.0

Authors:

Jiayi Ge

Chinese Academy of Tropical Agricultural Sciences

Hong Tang

Beijing Normal University

The building damage caused by natural disasters seriously threatens human security. Applying deep learning algorithms to identify collapsed buildings from remote sensing images is crucial for rapid post-disaster emergency response. However, the diversity of buildings, limited training dataset size, and lack of ground-truth samples after sudden disasters can significantly reduce the generalization of a pre-trained model for building damage identification when applied directly to non-preset locations. To address this challenge, a self-incremental learning framework (i.e., SELF) is proposed in this paper, which can quickly improve the generalization ability of the pre-trained model in disaster areas by self-training an incremental model using automatically selected samples from post-disaster images. The effectiveness of the proposed method is verified on the 2010 Yushu earthquake, 2023 Turkey earthquake, and other disaster types. The experimental results demonstrate that our approach outperforms state-of-the-art methods in terms of collapsed building identification, with an average increase of more than 6.4% in the Kappa coefficient. Furthermore, the entire process of the self-incremental learning method, including sample selection, incremental learning, and collapsed building identification, can be completed within 6 h after obtaining the post-disaster images. Therefore, the proposed method is effective for emergency response to natural disasters, which can quickly improve the application effect of the deep learning model to provide more accurate building damage results.

The SELF framework. The dotted line defines before and after a disaster.

…

Geographical location of images in DREAM-B+ dataset, where each rectangle contains several sampled remote sensing images in this area.

…

Example images from the DREAM-B+ dataset.

…

Location map and main data of the test areas. Yushu test area (a), and Turkey test area (b).

…

+22

Collapsed building identification under the SELF framework.

…

Figures - available from: Remote Sensing

This content is subject to copyright.

Access to this full-text is provided by MDPI.

Content available from Remote Sensing

This content is subject to copyright.

Citation: Ge, J.; Tang, H.; Ji, C.

Self-Incremental Learning for Rapid

Identiﬁcation of Collapsed Buildings

Triggered by Natural Disasters.

Remote Sens. 2023,15, 3909. https://

doi.org/10.3390/rs15153909

Academic Editors: Raffaele Albano,

Ivanka Pelivan and Reza Arghandeh

Received: 5 July 2023

Revised: 27 July 2023

Accepted: 4 August 2023

Published: 7 August 2023

Licensee MDPI, Basel, Switzerland.

This article is an open access article

distributed under the terms and

conditions of the Creative Commons

Attribution (CC BY) license (https://

creativecommons.org/licenses/by/

4.0/).

remote sensing

Article

Self-Incremental Learning for Rapid Identiﬁcation of Collapsed

Buildings Triggered by Natural Disasters

Jiayi Ge 1,2, Hong Tang 1,2 ,* and Chao Ji 1,2

State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University,

Beijing 100875, China; 202021051203@mail.bnu.edu.cn (J.G.); jichao@mail.bnu.edu.cn (C.J.)

2Beijing Key Laboratory for Remote Sensing of Environment and Digital Cities, Faculty of Geographical

Science, Beijing Normal University, Beijing 100875, China

*Correspondence: hongtang@bnu.edu.cn

Abstract:

The building damage caused by natural disasters seriously threatens human security.

Applying deep learning algorithms to identify collapsed buildings from remote sensing images is

crucial for rapid post-disaster emergency response. However, the diversity of buildings, limited

training dataset size, and lack of ground-truth samples after sudden disasters can signiﬁcantly

reduce the generalization of a pre-trained model for building damage identiﬁcation when applied

directly to non-preset locations. To address this challenge, a self-incremental learning framework

(i.e., SELF) is proposed in this paper, which can quickly improve the generalization ability of the

pre-trained model in disaster areas by self-training an incremental model using automatically selected

samples from post-disaster images. The effectiveness of the proposed method is veriﬁed on the

2010 Yushu earthquake, 2023 Turkey earthquake, and other disaster types. The experimental results

demonstrate that our approach outperforms state-of-the-art methods in terms of collapsed building

identiﬁcation, with an average increase of more than 6.4% in the Kappa coefﬁcient. Furthermore,

the entire process of the self-incremental learning method, including sample selection, incremental

learning, and collapsed building identiﬁcation, can be completed within 6 h after obtaining the

post-disaster images. Therefore, the proposed method is effective for emergency response to natural

disasters, which can quickly improve the application effect of the deep learning model to provide

more accurate building damage results.

Keywords:

building damage; remote sensing; self-incremental learning; sample selection; disaster

emergency response

1. Introduction

The frequent occurrence of extreme natural disasters seriously threatens the safety of

human life. Timely access to the distribution information of collapsed buildings is crucial to

emergency response and post-disaster rescue efforts [

]. Currently, remote sensing technol-

ogy provides an efﬁcient solution for the accurate and rapid extraction of building damage.

As a result, post-disaster remote sensing images with high spatial resolution have become

indispensable basic data for identifying disaster damage in numerous studies [

]. Among

these, optical imagery stands out as a common and accessible source of remote sensing

data [

], with a wide variety of sensors facilitating easy data acquisition. Some studies have

also utilized radar equipment mounted on drones to scan post-disaster buildings [

], which

remain unaffected by post-disaster weather conditions and can be combined with optical

images for comprehensive analysis [

]. Moreover, LiDAR data proves useful in detecting

height changes in buildings, enabling precise extraction of collapsed parts [2].

The vast diversity of buildings in different regions presents a signiﬁcant challenge in

accurately identifying buildings and assessing their damage using a pre-trained model [

Currently, deep learning technology, particularly convolutional neural networks, has

Remote Sens. 2023,15, 3909. https://doi.org/10.3390/rs15153909 https://www.mdpi.com/journal/remotesensing

Remote Sens. 2023,15, 3909 2 of 26

achieved state-of-the-art results in the task of building damage extraction [

]. Most of

the research in this area focuses on proposing or improving a model for change detec-

tion to extract building damage information from paired bitemporal images of pre- and

post-disaster [

–

]. However, in the context of emergency response scenarios, relying

on pre-disaster imagery can signiﬁcantly impact both the effectiveness and the efﬁciency

of damage assessment. Therefore, an alternative approach worth considering is that the

distribution maps of buildings extracted from pre-disaster images should be prepared

before any disaster occurs.

The building distribution maps can ﬁlter out complex background categories and

provide key information, including building location and shape, which is obviously helpful

for building damage identiﬁcation. Currently, there are only a few studies that exclusively

utilize pre-disaster building distribution maps in combination with post-disaster imagery,

despite the availability of building footprint or rooftop data that covers the vast majority of

the world [

]. Notable examples include Open Street Map (http://www.openstreetmap.

org/, accessed on 13 October 2022) and Bing maps of Microsoft (https://github.com/

microsoft/GlobalMLBuildingFootprints, accessed on 21 February 2023) open-access data.

Admittedly, there is a real problem that these building distribution data cannot guarantee a

high update frequency at present, resulting in long time intervals between the availability

of pre-disaster building distribution maps and post-disaster images, potentially spanning

several years.

In addition, it is still difﬁcult to accurately identify buildings from post-disaster images

because the training data may come from different sensors or from different geographical

regions [

]. Therefore, simply applying a pre-trained model to post-disaster scenarios can

lead to a considerable drop in generalization performance and poor recognition results [

Transfer learning is a common solution to adapt the original model for better perfor-

mance on the target domain. Data-based transfer learning has been shown to improve the

model’s application by utilizing target domain data [

]. Hu et al. [

] demonstrated that

using post-disaster samples effectively enhances the identiﬁcation accuracy of damaged

buildings. On the other hand, incremental learning is a model-based transfer learning

approach that improves generalization in speciﬁc scenarios by adding new base learn-

ers [

]. Ge et al. [

] conﬁrmed that incremental learning signiﬁcantly saves transfer

time during emergency response, as it focuses on training only on new data containing

post-disaster information. Therefore, learning sufﬁcient post-disaster samples incremen-

tally can effectively and rapidly improve the model’s performance. In this process, the

key technology lies in selecting high-quality post-disaster samples with the assistance of

building distribution maps.

To enhance both the accuracy and efﬁciency of building damage extraction during

disaster emergency response, we propose the Self-incremental Learning Framework (SELF).

This framework utilizes post-disaster samples selected from optical remote sensing images

to rapidly improve the identiﬁcation accuracy of collapsed buildings. As illustrated in

Figure 1, the preparation involves building distribution maps and a building recognition

model before any disaster occurs. Subsequently, after the disaster event, essential samples

of disaster imagery are automatically selected based on the knowledge of building distri-

bution and predicted probability maps generated by the pre-trained model. The model’s

generalization ability is then swiftly improved through self-supervised training using these

selected samples in an incremental learning manner. This process enables us to obtain

reliable building damage results efﬁciently and effectively.

Remote Sens. 2023,15, 3909 3 of 26

Remote Sens. 2023, 15, x FOR PEER REVIEW 3 of 28

Figure 1. The SELF framework. The doed line deﬁnes before and after a disaster.

This paper is organized as follows. The literature review related to this study is sum-

marized in Section 2. The data and methods are introduced in Sections 3 and 4, respec-

tively. The experimental results are shown in Section 5, and a discussion is conducted in

Section 6. Some conclusions are drawn in Section 7.

2. Related Work

2.1. Building Damage Identiﬁcation Methods

Most studies have focused on using the paired images, i.e., pre- and post-disaster

images, to identify building damage [8,17]. Durnov [18] proposed a change detection

method utilizing a Siamese structure that achieved top-ranking results in a competition

focused on building damage identiﬁcation. Subsequently, several similar change detec-

tion models were introduced [3,7], with a primary focus on optimizing the model struc-

ture. However, these methods necessitate the use of bitemporal images for damage detec-

tion, thus limiting the eﬃciency of disaster emergency response due to the reliance on pre-

disaster images. Additionally, methods that combine multi-source images with various

auxiliary data have been employed to extract high-precision disaster results [5]. For in-

stance, Wang et al. [2] employed multiple types of data, including LiDAR and optical im-

ages, to extract the collapsed areas through the changes of the information of building

height and corner points. However, the reality is that many types of speciﬁc data may be

diﬃcult to obtain in the short time after a disaster.

Methods that solely rely on post-disaster images aim to eﬃciently identify damaged

buildings [19,20]. Based on the morphological and spectral characteristics of post-earth-

quake buildings, Ma et al. [21] proposed a method for depicting collapsed buildings using

only post-disaster high-resolution images. Munsif et al. [22] achieved a lightweight CNN

model, occupying just 3 MB, which can be deployed on Unmanned Aerial Vehicles (UAVs)

with limited hardware resources, by utilizing several data augmentation techniques to

enhance the eﬃciency and accuracy of multi-hazard damage identiﬁcation. Nia et al. [23]

introduced a deep model based on ground-level post-disaster images and demonstrated

that using semantic segmentation results as the foreground positively impacted building

damage assessment. Miura et al. [20] developed a collapsed building identiﬁcation

method using a CNN model and post-disaster aerial images, and achieved a damage dis-

tribution that was basically consistent with the inventories in earthquakes. However, ex-

isting studies have shown that the separability between collapsed buildings and the back-

ground is relatively low [24]. Solely using post-disaster information often falls short in

accurately locating the damaged areas.

Figure 1. The SELF framework. The dotted line deﬁnes before and after a disaster.

This paper is organized as follows. The literature review related to this study is

summarized in Section 2. The data and methods are introduced in Sections 3and 4,

respectively. The experimental results are shown in Section 5, and a discussion is conducted

in Section 6. Some conclusions are drawn in Section 7.

2. Related Work

2.1. Building Damage Identiﬁcation Methods

Most studies have focused on using the paired images, i.e., pre- and post-disaster

images, to identify building damage [

]. Durnov [

] proposed a change detection

method utilizing a Siamese structure that achieved top-ranking results in a competition

focused on building damage identiﬁcation. Subsequently, several similar change detection

models were introduced [

], with a primary focus on optimizing the model structure.

However, these methods necessitate the use of bitemporal images for damage detection,

thus limiting the efﬁciency of disaster emergency response due to the reliance on pre-

disaster images. Additionally, methods that combine multi-source images with various

auxiliary data have been employed to extract high-precision disaster results [

]. For

instance, Wang et al. [

] employed multiple types of data, including LiDAR and optical

images, to extract the collapsed areas through the changes of the information of building

height and corner points. However, the reality is that many types of speciﬁc data may be

difﬁcult to obtain in the short time after a disaster.

Methods that solely rely on post-disaster images aim to efﬁciently identify dam-

aged buildings [

]. Based on the morphological and spectral characteristics of post-

earthquake buildings, Ma et al. [

] proposed a method for depicting collapsed buildings

using only post-disaster high-resolution images. Munsif et al. [

] achieved a lightweight

CNN model, occupying just 3 MB, which can be deployed on Unmanned Aerial Vehicles

(UAVs) with limited hardware resources, by utilizing several data augmentation techniques

to enhance the efﬁciency and accuracy of multi-hazard damage identiﬁcation. Nia et al. [

]

introduced a deep model based on ground-level post-disaster images and demonstrated

that using semantic segmentation results as the foreground positively impacted building

damage assessment. Miura et al. [

] developed a collapsed building identiﬁcation method

using a CNN model and post-disaster aerial images, and achieved a damage distribution

that was basically consistent with the inventories in earthquakes. However, existing stud-

ies have shown that the separability between collapsed buildings and the background is

relatively low [

]. Solely using post-disaster information often falls short in accurately

locating the damaged areas.

It is not easy to meet both the accuracy and efﬁciency requirements under emergency

conditions by relying on bitemporal images or only post-disaster images. Therefore, com-

Remote Sens. 2023,15, 3909 4 of 26

bining key pre-disaster knowledge (such as pre-disaster building distribution maps, a

pre-trained model for buildings identiﬁcation, and so on) with post-disaster images to

identify damaged buildings quickly and accurately is a solution that is being developed

in some studies [

]. For example, Galanis et al. [

] introduced the DamageMap model

for wildﬁre disasters, which leverages pre-disaster building segmentation results and post-

disaster aerial or satellite imagery for a classiﬁcation task to determine whether buildings

are damaged. At present, there are few studies that make full use of pre-disaster building

distribution maps. Even though the building distribution data may not strictly correspond

to each building in the post-disaster images, it can still provide much effective information

about the location and shape of the buildings. Therefore, it is a promising way to devise

methods to better apply the pre-disaster information in the future disaster response tasks.

2.2. Transfer Learning Methods

When a pre-trained model is directly applied to a target domain with signiﬁcantly

different features from the training data, there can be a considerable drop in accuracy.

Transfer learning is used to address this practical problem. Current transfer learning

methods can be categorized into the following three categories: (1) Data-based transfer

learning [

] usually uses some samples of the target domain to enhance the model’s

performance in target applications. An example is the self-training method [

], which

improves the model’s generalization ability by automatically generating pseudo-labels.

(2) Feature-based transfer learning [

] transforms the data of two domains into the

same feature space, reducing the distance between the features of the source domain and

the target domain, such as domain adversarial networks [

]. (3) Model-based transfer

learning [

] usually adds new layers or integrates new base learners to optimize the

original model, such as incremental learning [16].

Transfer learning in building damage extraction tasks aims to improve the performance

of models in post-disaster scenes. However, these methods encounter challenges in practical

applications, such as the scarcity of post-disaster samples, variations in image styles, and

the unique features of buildings themselves. Hu et al. [

] conducted a comparison of

three transfer learning methods for post-disaster building recognition and discovered that

utilizing samples from disaster areas can signiﬁcantly boost the recognition accuracy for

various types of disasters. On the other hand, Lin et al. [

] proposed a novel method to

ﬁlter historical data relevant to the target task from earthquake cases, aiming to improve

the reliability of classiﬁcation results.

In addition to transfer learning, data augmentation is often used to improve the gener-

alization performance of the models [

], including applying various transformations to

existing images, such as rotations, ﬂips, or zooms, so that the model becomes more robust.

Data synthesis is another valuable strategy that can address data scarcity by combining

real data with computer simulations or generative models [

]. In fact, these methods

can be combined with transfer learning to provide more precise and timely disaster infor-

mation in emergency missions. Ge et al. [

] employed the generative network to transfer

the style of remote sensing images under an incremental learning framework, and used

data augmentation strategy to train the models, which improved the accuracy of building

damage recognition.

2.3. Contributions of This Research

However, there is insufﬁcient exploration on how to obtain and utilize important

samples from post-disaster images efﬁciently and effectively. The aim of this paper is to

ﬁll this gap. The main contributions of this paper are twofold. (1) A knowledge-guided

sample selection method is present, which uses a pre-trained model and pre-disaster

building distribution maps to assist in sample selection from post-disaster images. (2) A

self-incremental learning method is proposed by assembling self-training and incremental

learning, which uses selected samples to realize the growth of the original model to quickly

improve the accuracy of building damage extraction.

Remote Sens. 2023,15, 3909 5 of 26

3. Data

3.1. Training Data: DREAM-B+

The DREAM-B+ dataset [

] is a large-scale building dataset comprising sampled

remote sensing images and corresponding labels from over 100 cities worldwide. The

dataset consists of 18,876 image tiles, each captured with RGB bands and having a high

spatial resolution of either 0.5 m or 0.3 m. Each image tile has a size of 1024 ×1024 pixels.

There are two categories in the ground-truth: building and background. The location of the

images in this dataset is shown in Figure 2, and some examples are showcased in Figure 3.

Remote Sens. 2023, 15, x FOR PEER REVIEW 5 of 28

self-incremental learning method is proposed by assembling self-training and incremental

learning, which uses selected samples to realize the growth of the original model to

quickly improve the accuracy of building damage extraction.

3. Data

3.1. Training Data: DREAM-B+

The DREAM-B+ dataset [6,16] is a large-scale building dataset comprising sampled

remote sensing images and corresponding labels from over 100 cities worldwide. The da-

taset consists of 18,876 image tiles, each captured with RGB bands and having a high spa-

tial resolution of either 0.5 m or 0.3 m. Each image tile has a size of 1024 × 1024 pixels.

There are two categories in the ground-truth: building and background. The location of

the images in this dataset is shown in Figure 2, and some examples are showcased in Fig-

ure 3.

The DREAM-B+ dataset is split into two sets for training and validation purposes.

Speciﬁcally, 90% of the dataset is allocated for training a building recognition model,

which serves as the prepared model in stage 1 before any disaster occurs. The remaining

10% of the dataset is used as a validation set to assess and validate the training process.

Figure 2. Geographical location of images in DREAM-B+ dataset, where each rectangle contains

several sampled remote sensing images in this area.

Figure 2.

Geographical location of images in DREAM-B+ dataset, where each rectangle contains

several sampled remote sensing images in this area.

Remote Sens. 2023, 15, x FOR PEER REVIEW 6 of 28

Figure 3. Example images from the DREAM-B+ dataset.

3.2. Tes t Data

The Yushu earthquake (Mw 6.9) occurred on April 14, 2010, with the epicenter very

close to the urban area. This earthquake eventually resulted in about 14,700 deaths and

many densely distributed houses were destroyed [35]. As shown in Figure 4a, the hardest-

hit urban region of Yushu is used as an emergent disaster event to test both the eﬀective-

ness and eﬃciency of the proposed method. We obtained the post-disaster aerial images

of this area with a resolution of 0.5 m. Due to the lack of available satellite images before

the event, the building distribution map was visually interpreted from the pre-disaster

images captured in 2004, and cross-validated by multiple domain experts in order to min-

imize the uncertainty of the map.

The Turkey earthquake (Mw 7.8) occurred on 6 February 2023, with the epicenter at

37.15°N, 36.95°E. This earthquake killed more than 40,000 people in Turkey and Syria. We

obtained the post-disaster remote sensing images captured by Worldview-3, which have

a spatial resolution of 0.3 m. As shown in Figure 4b, the Islahiye town serves as the second

test area, which is close to the epicenter and has been severely aﬀected. The pre-disaster

building distribution map are from Microsoft’s products [36], and the ground-truth of col-

lapsed buildings are obtained by visual interpretation. The data details of Yushu and Tur-

key test areas are shown in Table 1.

Figure 3. Example images from the DREAM-B+ dataset.

Remote Sens. 2023,15, 3909 6 of 26

The DREAM-B+ dataset is split into two sets for training and validation purposes.

Speciﬁcally, 90% of the dataset is allocated for training a building recognition model, which

serves as the prepared model in stage 1 before any disaster occurs. The remaining 10% of

the dataset is used as a validation set to assess and validate the training process.

3.2. Test Data

The Yushu earthquake (Mw 6.9) occurred on April 14, 2010, with the epicenter very

close to the urban area. This earthquake eventually resulted in about 14,700 deaths and

many densely distributed houses were destroyed [

]. As shown in Figure 4a, the hardest-

hit urban region of Yushu is used as an emergent disaster event to test both the effectiveness

and efﬁciency of the proposed method. We obtained the post-disaster aerial images of this

area with a resolution of 0.5 m. Due to the lack of available satellite images before the

event, the building distribution map was visually interpreted from the pre-disaster images

captured in 2004, and cross-validated by multiple domain experts in order to minimize the

uncertainty of the map.

Figure 4.

Location map and main data of the test areas. Yushu test area (

), and Turkey test area (

The Turkey earthquake (Mw 7.8) occurred on 6 February 2023, with the epicenter at

37.15

◦

N, 36.95

◦

E. This earthquake killed more than 40,000 people in Turkey and Syria. We

obtained the post-disaster remote sensing images captured by Worldview-3, which have a

spatial resolution of 0.3 m. As shown in Figure 4b, the Islahiye town serves as the second

test area, which is close to the epicenter and has been severely affected. The pre-disaster

building distribution map are from Microsoft’s products [

], and the ground-truth of

collapsed buildings are obtained by visual interpretation. The data details of Yushu and

Turkey test areas are shown in Table 1.

Table 1. Details of the test data.

Cases Data Source Bands Acquisition Time Resolution

Yushu

Post-disaster image Aerial platform RGB April 2010 0.5 m

Pre-disaster image Quickbird 6 November 2004 0.6 m

Pre-disaster building distribution map Visual interpretation / 0.5 m

Turkey Post-disaster image Worldview-3 RGB February 2023 0.3 m

Pre-disaster building distribution map Microsoft / 2023

Remote Sens. 2023,15, 3909 7 of 26

4. Methodology

4.1. Overview

The purpose of the proposed self-incremental learning framework (SELF) aims to

rapidly enhance the identiﬁcation accuracy of a pre-trained model by selecting and utilizing

new samples from post-disaster images. The speciﬁc application process of the framework

is shown in Figure 5. First, we need to prepare both the building distribution map and a

pre-trained model (i.e., stage 1 model) for building identiﬁcation. When the post-disaster

images are available, the stage 1 model is then used to produce the probability maps of

buildings on the post-disaster images. To improve the accuracy of identifying post-disaster

buildings, the framework employs the knowledge-guided sample selection (K-SS) method

to select new samples from the post-disaster images. The new samples then incrementally

learned a new model, i.e., the stage 2 model, through an end-to-end gradient boosting

algorithm (i.e., EGB-A). The stage 2 model is speciﬁcally designed to identify buildings from

post-disaster images. Finally, pixel-level collapsed buildings are identiﬁed by comparing

both pre- and post-disaster building maps.

Remote Sens. 2023, 15, x FOR PEER REVIEW 8 of 28

Figure 5. Collapsed building identiﬁcation under the SELF framework.

4.2. Knowledge-Guided Sample Selection Method

As presented in Table 2, the pixels in both the pre-disaster building distribution maps

and the post-disaster images are classiﬁed into two categories: building (positive class)

and background (negative class). The post-disaster category “building” consists of build-

ings that have not collapsed, other buildings that consist of new buildings that only ap-

pear in post-disaster images, and some buildings that were missed in the building distri-

bution maps. The post-disaster category “background” refers to pixels of both collapsed

buildings and the pre-disaster background.

Table 2. Categories of pre- and post-disaster data.

Data Category Class Label Detailed Category

Pre-disaster Building Positive /

Background Negative

Post-disaster

Building Positive

Not collapsed building

Other building

Background Negative Collapsed building

Original background

The location and shape information of each building provided by the pre-disaster

building distribution maps should be fully utilized in the process of sample selection. The

ﬁrst core idea of the K-SS sample selection method is to analyze each building object indi-

vidually. In existing studies, the entropy-based sample selection methods often screen an

entire image or region, such as selecting the top 10% of the image with the highest proba-

bility value as positive samples. We believe that conducting a detailed analysis for each

individual building can beer consider the capabilities of the model for buildings with

various features. In addition, buildings and their nearby background pixels are relatively

critical samples, because the pixels near the classiﬁcation boundary are often easily con-

fused by the model, such as the edge of buildings and their junction with the background.

Therefore, another idea of the K-SS method is to use the contrast between the probability

Figure 5. Collapsed building identiﬁcation under the SELF framework.

4.2. Knowledge-Guided Sample Selection Method

As presented in Table 2, the pixels in both the pre-disaster building distribution maps

and the post-disaster images are classiﬁed into two categories: building (positive class) and

background (negative class). The post-disaster category “building” consists of buildings

that have not collapsed, other buildings that consist of new buildings that only appear in

post-disaster images, and some buildings that were missed in the building distribution

maps. The post-disaster category “background” refers to pixels of both collapsed buildings

and the pre-disaster background.

Remote Sens. 2023,15, 3909 8 of 26

Table 2. Categories of pre- and post-disaster data.

Data Category Class Label Detailed Category

Pre-disaster Building Positive /

Background Negative

Post-disaster

Building Positive Not collapsed building

Other building

Background Negative Collapsed building

Original background

The location and shape information of each building provided by the pre-disaster

building distribution maps should be fully utilized in the process of sample selection.

The ﬁrst core idea of the K-SS sample selection method is to analyze each building object

individually. In existing studies, the entropy-based sample selection methods often screen

an entire image or region, such as selecting the top 10% of the image with the highest

probability value as positive samples. We believe that conducting a detailed analysis for

each individual building can better consider the capabilities of the model for buildings

with various features. In addition, buildings and their nearby background pixels are

relatively critical samples, because the pixels near the classiﬁcation boundary are often

easily confused by the model, such as the edge of buildings and their junction with the

background. Therefore, another idea of the K-SS method is to use the contrast between the

probability values of the building and its surrounding area as the basis for selecting samples.

Speciﬁcally, the probability values within a certain range, including buildings, are counted,

and threshold segmentation is performed to maximize the variance between classes.

The complete K-SS sample selection method is shown in Algorithm 1. Please note that

Figure 6might be helpful for understanding the algorithm in a more intuitive way. One of

the important steps is to double the minimum enclosing rectangle of each building object

and use the Otsu algorithm [

] to perform threshold segmentation on the probability map

in the enlarged area. The Otsu algorithm has the advantages of fast calculation speed and is

not affected by image contrast. The principle of Otsu is to maximize the variance between

classes and automatically generate the best segmentation threshold:

T=Otsu({P})(1)

where

{P}

is the set of image pixel values in the region to be segmented. The Otsu

algorithm binarizes

{P}

and returns a threshold

. Pixels whose value are greater than

are classiﬁed as foreground (i.e., not collapsed buildings and other buildings); otherwise,

they are background (i.e., collapsed buildings and original background). In addition, three

speciﬁc modules are designed for sample selection of categories: not collapsed buildings,

collapsed buildings, and other buildings, respectively.

Building selection module is utilized to select samples of the category of not collapsed

buildings and their surrounding background. The rationale behind designing this module

is that the probability value of the building predicted by the stage 1 model is generally

higher than the surrounding background. The Otsu algorithm can roughly distinguish

the foreground and the background. Combining the threshold segmentation results and

the pre-disaster building information, pixels with high conﬁdence are selected as positive

samples—-that is, take the intersection of the post-disaster threshold segmentation results

and pre-disaster building distribution maps in the same category. To minimize the inclusion

of erroneous samples, other pixels are ignored because it is difﬁcult to determine their

actual classes.

Collapsed building selection module is designed to select samples of collapsed build-

ings and their surrounding background. In general, the features of building ruins are close

to those of the background, so the probability value of the collapsed buildings predicted by

the model is close to that of their surrounding areas. If there is no obvious contrast between

the foreground and background probability values, the ratio of the two categories after

Remote Sens. 2023,15, 3909 9 of 26

threshold segmentation is likely to be unbalanced. For not collapsed buildings, the areas of

the foreground and background after segmentation should be relatively similar, because

we doubled the minimum enclosing rectangle of each building. The very unbalanced

area ratio of the two classes gives us greater conﬁdence that the building has collapsed.

Here, we assume that if the area ratio of two categories is more than four times, it is very

unbalanced. Similarly, some samples of collapsed buildings are selected in combination

with the pre-disaster building distribution maps, and other pixels are ignored.

There may be missing buildings in the pre-disaster building distribution maps or

newly built buildings. Background screening module serves the purpose of ﬁltering out

possible buildings in the background area of the building distribution maps. First, ﬁnd the

average value

of the post-disaster probability corresponding to the pixels of the building

category before the disaster is calculated. Then, ignore the pixels whose probability value

is greater than

in the background category before the disaster, and the remaining areas

have a great conﬁdence that they belong to the background category in all the pre- and

post-disaster data.

As depicted in Figure 6, after the screening by using the three modules, the ﬁnal

effective samples are labeled as positive or negative samples, and other pixels are ignored

or invalid.

Algorithm 1 The K-SS method for post-disaster sample selection.

Deﬁnition:

The minimum enclosing rectangle of N

building objects : r1,r2, . . ., rN.

Expand the area of rnto get : R1,R2, . . ., RN. (AreaRn=2Arearn).

In probability map : the value of pixels {Pi}, and the average value of the pixels

corresponding to the building category pre-disaster : Pb

Sample selection:

for n=1 to Ndo:

T=Otsu(Pi∈Rn)

As for Pi∈Rn:

Building selection:

(1)if Pi>Tand corresponds to the pre-disaster building category:

Not collapsed.

(2)if Pi<Tand corresponds to the pre-disaster background category:

Background.

(3) other regions: Ignored.

Collapsed building selection:

when Numb er o f Pi>T

Numb er o f Pi<T<1

4or Numb er o f Pi>T

Numb er o f Pi<T>4:

(1) if Pi<T: Collapsed.

(2) other regions: Ignored.

end for.

Background screening:

As for the regions except R1SR2S. . .SRN:

(1) if Pi<Pb: Background.

(2) other regions: Ignored (may be buildings).

Output: Positive samples: Not collapsed.

Negative samples: Collapsed and background.

Invalid samples: Ignored.

Remote Sens. 2023,15, 3909 10 of 26

Remote Sens. 2023, 15, x FOR PEER REVIEW 11 of 28

Figure 6. Schematic of the K-SS method. Buildings that existed in the pre-disaster images but not in

the post-disaster images were considered as collapsed buildings, and vice versa were considered

other buildings. The blue and red boxes are the minimum enclosing rectangles and enlarged rectan-

gular areas of pre-disaster building objects, respectively.

4.3. Incremental Learning Using the EGB-A

The end-to-end gradient boosting (EGB) algorithm achieves incremental learning by

integrating multiple base learners together [16]. The new base learner is trained on the

newly collected data based on all existing base learners. This method has certain ad-

vantages in the disaster emergency process because it can utilize post-disaster data to ur-

gently train a base learner and incorporate it into the original model in order to achieve

rapid transfer learning for speciﬁc applications. The EGB-A method [6] is an improved

version of the EGB for the building damage classiﬁcation task, which alleviates the

knowledge forgeing problem and optimizes the ability of adaptive learning. The training

algorithm of the EGB-A method is shown in Algorithm 2. For additional application de-

tails of the method, we recommend referring to the papers of Ge et al. [6] and Yang and

Tang [ 16].

Algorithm 2 Training algorithm of EGB-A [4].

Input: Training data, X={x

,…,x}, and labels, Y={

,…,

}; base learner,

𝑓

(x;𝜃);

learning rate of base learner, 𝑣; and softmax function, 𝜎.

1: 𝐹(x)=𝜎

𝑓

(x;𝜃)

𝑓

(x)=𝑎𝑟𝑔𝑚𝑖𝑛

𝐿

, 𝐹(x)

3: for 𝑚=1 to 𝑀 do

4: 𝐹(𝑥)=𝜎

(

𝑓

(𝑥)+∑𝑣⋅

𝑓

(𝑥)



 )

𝑓

(𝑥;𝜃

)=𝑎𝑟𝑔𝑚𝑖𝑛

,,…, 𝐿(𝑦,𝐹

(𝑥))

6: end for

Output: 𝐹(𝑥)= 𝜎

󰇡∑𝑣⋅





𝑓

(𝑥;𝜃)󰇢, 𝑣=1

In the SELF framework, the EGB-A method is used to incrementally train a new base

learner based on the existing stage 1 model, utilizing the selected post-disaster samples.

This process results in an ensemble model with two base learners in stage 2. One of the

signiﬁcant advantages of this approach is that the training process does not need to reuse

Figure 6.

Schematic of the K-SS method. Buildings that existed in the pre-disaster images but not in

the post-disaster images were considered as collapsed buildings, and vice versa were considered other

buildings. The blue and red boxes are the minimum enclosing rectangles and enlarged rectangular

areas of pre-disaster building objects, respectively.

4.3. Incremental Learning Using the EGB-A

The end-to-end gradient boosting (EGB) algorithm achieves incremental learning by

integrating multiple base learners together [

]. The new base learner is trained on the

newly collected data based on all existing base learners. This method has certain advantages

in the disaster emergency process because it can utilize post-disaster data to urgently train

a base learner and incorporate it into the original model in order to achieve rapid transfer

learning for speciﬁc applications. The EGB-A method [

] is an improved version of the

EGB for the building damage classiﬁcation task, which alleviates the knowledge forgetting

problem and optimizes the ability of adaptive learning. The training algorithm of the

EGB-A method is shown in Algorithm 2. For additional application details of the method,

we recommend referring to the papers of Ge et al. [6] and Yang and Tang [16].

Algorithm 2 Training algorithm of EGB-A [4].

Input: Training data, X ={x0, . . . , xM}, and labels, Y =y0, . . . , yM; base learner, f(x; θ);

learning rate of base learner, v; and softmax function, σ.

1: F0(x0)=σ(f0(x0;θ0))

2: f0(x0)=argminθ0Ly0,F0(x0)

3: for m=1 to Mdo

4: Fm(xm)=σ(fm(xm)+∑m−1

i=0vi·fi(xm))

5: fm(xm;θm)=argminθm,v0,...,vm−1L(ym,Fm(xm))

6: end for

Output: FM(x)=σ(∑M

i=0vi·fi(x;θ)),vM=1

In the SELF framework, the EGB-A method is used to incrementally train a new base

learner based on the existing stage 1 model, utilizing the selected post-disaster samples.

This process results in an ensemble model with two base learners in stage 2. One of

the signiﬁcant advantages of this approach is that the training process does not need

to reuse pre-disaster datasets (e.g., DREAM-B+), which can save valuable time during

emergency response.

The network architecture of each base learner in the SELF framework is based on

U-NASNetMobile [

], as illustrated in Figure 7. It combines the neural architecture search

structure in NASNet [

] and the upsampling module in the classic U-Net model [

]

to perform semantic segmentation tasks. The U-NASNetMobile is suitable for ensemble

models and disaster scenarios due to its small number of parameters and fast training

speed [16].

Remote Sens. 2023,15, 3909 11 of 26

Remote Sens. 2023, 15, x FOR PEER REVIEW 12 of 28

pre-disaster datasets (e.g., DREAM-B+), which can save valuable time during emergency

response.

The network architecture of each base learner in the SELF framework is based on U-

NASNetMobile [16], as illustrated in Figure 7. It combines the neural architecture search

structure in NASNet [38] and the upsampling module in the classic U-Net model [39] to

perform semantic segmentation tasks. The U-NASNetMobile is suitable for ensemble

models and disaster scenarios due to its small number of parameters and fast training

speed [16].

Figure 7. The structure of U-NASNetMobile [16].

4.4. Experimental Seings and Evaluation Metrics

The Adam optimizer [40] and the cosine learning rate annealing schedule [41] are

employed to update the weights of the model. The default batch size is 4, and the maxi-

mum learning rate is 3 × 10−4. During the training process, the parameters of the laer base

learner are initialized with the parameters of the previous base learner to speed up the

convergence. In addition, traditional data augmentation methods are also applied to pre-

vent overﬁing, including brightness variation, ﬂipping, and random rotation. The exper-

iments were run on the hardware device of NVIDIA Tesla K80 GPU.

For the identiﬁcation of post-disaster buildings, the IoU metric [42] of the building

category is employed to evaluate the accuracy. The F1 score, recall, precision, and OA

(overall accuracy) are employed as reference evaluation metrics:

IoU = Prediction⋂GroundTruth

Prediction⋃GroundTruth (2)

F score = 2TP

2TP + FP + FN (3)

Recall = TP

TP + FN (4)

Precision = TP

TP + FP (5)

OA = TP + TN

TP + TN + FP + FN (6)

Figure 7. The structure of U-NASNetMobile [16].

4.4. Experimental Settings and Evaluation Metrics

The Adam optimizer [

] and the cosine learning rate annealing schedule [

] are

employed to update the weights of the model. The default batch size is 4, and the maximum

learning rate is 3

−4

. During the training process, the parameters of the latter base

learner are initialized with the parameters of the previous base learner to speed up the

convergence. In addition, traditional data augmentation methods are also applied to

prevent overﬁtting, including brightness variation, ﬂipping, and random rotation. The

experiments were run on the hardware device of NVIDIA Tesla K80 GPU.

For the identiﬁcation of post-disaster buildings, the IoU metric [

] of the building

category is employed to evaluate the accuracy. The F1 score, recall, precision, and OA

(overall accuracy) are employed as reference evaluation metrics:

IoU =Prediction TGroundTruth

Prediction SGroundTruth (2)

F1score =2TP

2TP +FP +FN (3)

Recall =TP

TP +FN (4)

Precision =TP

TP +FP (5)

OA =TP +TN

TP +TN +FP +FN (6)

where TP, FP, TN, and FN are the pixel numbers of true positive, false positive, true negative,

and false negative, respectively.

For the building damage extraction result, the Kappa metric [

] is employed to

represent the evaluation accuracy. In addition, the OA, PA (producer accuracy of collapsed

buildings), and UA (user accuracy of collapsed buildings) are provided for reference:

Kappa =p0−pe

1−pe(7)

Remote Sens. 2023,15, 3909 12 of 26

PA =a1Tb1

(8)

UA =a1Tb1

(9)

pe=a1×b1+a2×b2+a3×b3+a4×b4

n2(10)

where

is equal to the value of OA, that is, the number of pixels correctly classiﬁed

divided by the total number of pixels;

and

are the real pixels of collapsed

buildings, not collapsed buildings, other buildings, and the background, respectively;

and

are the predicted pixels of collapsed buildings, not collapsed buildings, other

buildings, and background, respectively; and nis the total number of pixels.

5. Experimental Results

On the one hand, we quantitatively evaluate the results of our method in Section 5.1.

On the other hand, the difference in building damage extraction is qualitatively highlighted

in Section 5.2. Furthermore, failure examples are analyzed in Section 5.3.

5.1. Quantitative Evaluation

5.1.1. Post-Disaster Building Recognition

Table 3presents the post-disaster building recognition accuracy on the test data using

the stage 1 and stage 2 model, respectively. The results show that utilizing the selected

post-disaster samples leads to a signiﬁcant improvement in the IoU value after incremental

learning. In the Yushu case, the IoU increased by 14%, and in the Turkey case, it increased

by 7.23%. This improvement can be primarily attributed to the substantial increase in the

recall metric of building recognition, although the precision might be slightly sacriﬁced.

The optimized model can identify more buildings that have not collapsed after the disaster,

which is the premise to ensure a better effect of building damage extraction.

Table 3.

Post-disaster building recognition accuracy in test cases predicted by the pre-trained model

(stage 1 model) and the incrementally learned model (stage 2 model) using selected samples.

Cases Stages IoU F1 Score Recall Precision OA

Yushu Stage 1 0.4286 0.6001 0.4953 0.7609 0.9603

Stage 2 0.5686 0.7249 0.7108 0.7397 0.9676

Turkey Stage 1 0.4998 0.6665 0.5971 0.7542 0.9788

Stage 2 0.5721 0.7278 0.6705 0.7957 0.9822

5.1.2. Building Damage Extraction

Table 4presents the building damage extraction accuracy on the test cases of stage 1

and stage 2. The Kappa coefﬁcient and OA represent the comprehensive situation of the four

categories of collapsed buildings, not collapsed buildings, other buildings, and background,

and PA and UA speciﬁcally measure the accuracy of collapsed buildings. Due to the

signiﬁcant improvement in the accuracy of post-disaster buildings in stage 2, the damage

results of Yushu and Turkey cases become more reliable, with Kappa values reaching 0.8267

and 0.7688, respectively. The UA metric shows that the accuracy of collapsed building

identiﬁcation has increased signiﬁcantly in the Yushu case. The UA value of the Turkey

case is relatively low because the not collapsed buildings identiﬁed by the model have

incomplete edges, resulting in some extra collapsed building pixels. In addition, the number

of collapsed buildings in the Turkey case was less compared to the Yushu case. Therefore,

the inﬂuence of these misclassiﬁed pixels on the UA value will be more obvious.

Remote Sens. 2023,15, 3909 13 of 26

Table 4.

Building damage extraction accuracy in test cases predicted by the pre-trained model (stage

1 model) and the incrementally learned model (stage 2 model) using selected samples.

Cases Stages Kappa OA PA UA

Yushu stage 1 0.7379 0.9303 0.9126 0.6931

stage 2 0.8267 0.9676 0.9021 0.7998

Turkey stage 1 0.7281 0.9788 0.8656 0.2741

stage 2 0.7688 0.9880 0.8387 0.2852

Overall, the proposed building damage extraction method is feasible. The K-SS method

provides key post-disaster samples, which can effectively improve the performance of the

model in disaster areas to obtain high-precision results.

5.2. Qualitative Analysis

As shown in Figure 8, we have visualized the results of post-disaster building and

damage recognition in partial areas of Yushu to intuitively analyze the signiﬁcance of

incremental learning using post-disaster samples. It is evident that the pre-trained model

was directly applied to post-disaster images, that some buildings were missed, and that

the effect of damage extraction needs to be improved furthermore. The proposed sample

selection method effectively identiﬁes buildings that were not recognized in stage 1. The

model learned the features of these samples in an incremental manner of EGB-A, which

can keep the buildings that have been correctly identiﬁed in stage 1 as much as possible,

and continue to optimize the recognition results. In the white boxes, we can see that the

results in stage 2 have been signiﬁcantly improved, and most of the missed buildings

were identiﬁed.

Figure 9displays the results of post-disaster building and damage identiﬁcation at

different stages in some areas of the Turkey case. Comparing the results, it is evident that

the improved stage 2 model using post-disaster samples can more completely identify

building edges and small buildings with white roofs. As a result, the model, after self-

incremental learning, can predict a more accurate distribution of post-disaster buildings.

Referring to the ground-truth, the stage 2 model using the selected samples has achieved

more reliable building damage results.

Remote Sens. 2023, 15, x FOR PEER REVIEW 15 of 28

(a) (b) (c)

(d) (e) (f)

(g) (h) (i)

Figure 8. Comparison of post-disaster building identiﬁcation (second row) and damage results

(third row) at diﬀerent stages in the Yushu test area. Results before optimization (stage 1), results

after optimization (stage 2). The white boxes indicate noteworthy details. (a) Pre-disaster image; (b)

post-disaster image; (c) selected samples; (d) stage 1; (e) stage 2; (f) ground-truth; (g) stage 1; (h)

stage 2; (i) ground-truth.

(a) (b) (c)

Figure 8. Cont.

Remote Sens. 2023,15, 3909 14 of 26

Remote Sens. 2023, 15, x FOR PEER REVIEW 15 of 28

(a) (b) (c)

(d) (e) (f)

(g) (h) (i)

Figure 8. Comparison of post-disaster building identiﬁcation (second row) and damage results

(third row) at diﬀerent stages in the Yushu test area. Results before optimization (stage 1), results

after optimization (stage 2). The white boxes indicate noteworthy details. (a) Pre-disaster image; (b)

post-disaster image; (c) selected samples; (d) stage 1; (e) stage 2; (f) ground-truth; (g) stage 1; (h)

stage 2; (i) ground-truth.

(a) (b) (c)

Figure 8.

Comparison of post-disaster building identiﬁcation (second row) and damage results

(third row) at different stages in the Yushu test area. Results before optimization (stage 1), results

after optimization (stage 2). The white boxes indicate noteworthy details. (

) Pre-disaster image;

(

) post-disaster image; (

) selected samples; (

) stage 1; (

) stage 2; (

) ground-truth; (

) stage 1;

(h) stage 2; (i) ground-truth.

Remote Sens. 2023, 15, x FOR PEER REVIEW 15 of 28

(d) (e) (f)

(g) (h) (i)

Figure 8. Comparison of post-disaster building identiﬁcation (second row) and damage results

(third row) at diﬀerent stages in the Yushu test area. Results before optimization (stage 1), results

after optimization (stage 2). The white boxes indicate noteworthy details. (a) Pre-disaster image; (b)

post-disaster image; (c) selected samples; (d) stage 1; (e) stage 2; (f) ground-truth; (g) stage 1; (h)

stage 2; (i) ground-truth.

(a) (b) (c)

(d) (e) (f)

Remote Sens. 2023, 15, x FOR PEER REVIEW 16 of 28

(g)

(h)

(i)

Figure 9. Comparison of post-disaster building identiﬁcation (second row) and damage results

(third row) at diﬀerent stages in the Turkey test area. Results before optimization (stage 1), results

after optimization (stage 2). The white boxes indicate noteworthy details. (a) Pre-disaster image; (b)

post-disaster image; (c) selected samples; (d) stage 1; (e) stage 2; (f) ground-truth; (g) stage 1; (h)

stage 2; (i) ground-truth.

The ﬁnal damage extraction results (stage 2) of the entire Yushu test area are shown

in Figure 10. We can see that the distribution of collapsed buildings identiﬁed by the

model is similar to the ground-truth. From the map, it is evident that the buildings in the

southwest of the urban area, speciﬁcally subﬁgure (1) in Figure 10, have sustained severe

damage, with extensive areas of ruins. In contrast, the buildings near the center exhibit

less concentrated collapse and seem to have experienced relatively less damage. At a sub-

tle level, the results of building damage extraction have more red parts, indicating that the

model has identiﬁed some undamaged buildings as collapsed buildings. Overall, the pro-

posed method obtains results that are basically consistent with the real situation at the

macro level.

The ﬁnal damage extraction results (stage 2) of the entire Turkey test area are shown

in Figure 11. On the whole, there are not many collapsed buildings, and the building dam-

age results show more collapsed pixels than the ground-truth. There is an area of concen-

trated damage in the middle of the town, which is the enlarged subﬁgure (1). It can be

seen that some edge pixels of intact buildings are misclassiﬁed as collapsed because there

is still space for improvement in the recall value of post-disaster building identiﬁcation.

The collapsed buildings can basically be completely extracted.

Figure 9.

Comparison of post-disaster building identiﬁcation (second row) and damage results

(third row) at different stages in the Turkey test area. Results before optimization (stage 1), results

after optimization (stage 2). The white boxes indicate noteworthy details. (

) Pre-disaster image;

(

) post-disaster image; (

) selected samples; (

) stage 1; (

) stage 2; (

) ground-truth; (

) stage 1;

(h) stage 2; (i) ground-truth.

Remote Sens. 2023,15, 3909 15 of 26

The ﬁnal damage extraction results (stage 2) of the entire Yushu test area are shown

in Figure 10. We can see that the distribution of collapsed buildings identiﬁed by the

model is similar to the ground-truth. From the map, it is evident that the buildings in

the southwest of the urban area, speciﬁcally subﬁgure (1) in Figure 10, have sustained

severe damage, with extensive areas of ruins. In contrast, the buildings near the center

exhibit less concentrated collapse and seem to have experienced relatively less damage. At

a subtle level, the results of building damage extraction have more red parts, indicating

that the model has identiﬁed some undamaged buildings as collapsed buildings. Overall,

the proposed method obtains results that are basically consistent with the real situation at

the macro level.

Remote Sens. 2023, 15, x FOR PEER REVIEW 17 of 28

Figure 10. Building damage results of the Yushu case extracted by the SELF method (top) and the

corresponding ground-truth (boom). To highlight collapsed areas, we combined the categories of

not collapsed buildings and other buildings in blue. The orange boxes show a severely damaged

area. (1) The region enlarged from the building damage results; (2) The region enlarged from the

ground-truth.

Figure 10.

Building damage results of the Yushu case extracted by the SELF method (

top

) and the

corresponding ground-truth (

bottom

). To highlight collapsed areas, we combined the categories of

not collapsed buildings and other buildings in blue. The orange boxes show a severely damaged

area. (1) The region enlarged from the building damage results; (2) The region enlarged from the

ground-truth.

The ﬁnal damage extraction results (stage 2) of the entire Turkey test area are shown in

Figure 11. On the whole, there are not many collapsed buildings, and the building damage

results show more collapsed pixels than the ground-truth. There is an area of concentrated

damage in the middle of the town, which is the enlarged subﬁgure (1). It can be seen that

Remote Sens. 2023,15, 3909 16 of 26

some edge pixels of intact buildings are misclassiﬁed as collapsed because there is still

space for improvement in the recall value of post-disaster building identiﬁcation. The

collapsed buildings can basically be completely extracted.

Remote Sens. 2023, 15, x FOR PEER REVIEW 18 of 28

Figure 11. Building damage results of the Turkey case extracted by the SELF method (left) and the

corresponding ground-truth (right). The orange boxes show a severely damaged area. (1) The region

enlarged from the building damage results; (2) The region enlarged from the ground-truth.

5.3. Failure Example Analysis

Despite the improvements achieved by the stage 2 model using the SELF method,

there are still some recognition errors. As shown in the ﬁrst row of Figure 12, although the

post-disaster samples are correctly selected, some buildings are still missed in the recog-

nition results. This is related to the lack of buildings with similar features in the training

set, and smaller buildings are generally harder to identify. The second row shows a case

where the K-SS sample selection method fails. The reason is that although the building is

damaged, there are still some building roof features that lead to a higher activation value

in this area. The recognition results in this example are not aﬀected by the wrong samples,

indicating that impure sample does not necessarily reduce the recognition eﬀect, and we

need to avoid overﬁing when using these samples for training. In addition, it is usually

diﬃcult for the model to identify roofs covered by shadows of high-rise buildings (the

third line in Figure 12). To address this problem, it may be feasible to design data enhance-

ment strategies or use generative networks to remove shadows.

(a) (b) (c) (d)

Figure 11.

Building damage results of the Turkey case extracted by the SELF method (

left

) and the

corresponding ground-truth (

right

). The orange boxes show a severely damaged area. (1) The region

enlarged from the building damage results; (2) The region enlarged from the ground-truth.

5.3. Failure Example Analysis

Despite the improvements achieved by the stage 2 model using the SELF method,

there are still some recognition errors. As shown in the ﬁrst row of Figure 12, although

the post-disaster samples are correctly selected, some buildings are still missed in the

recognition results. This is related to the lack of buildings with similar features in the

training set, and smaller buildings are generally harder to identify. The second row shows a

case where the K-SS sample selection method fails. The reason is that although the building

is damaged, there are still some building roof features that lead to a higher activation value

in this area. The recognition results in this example are not affected by the wrong samples,

indicating that impure sample does not necessarily reduce the recognition effect, and we

need to avoid overﬁtting when using these samples for training. In addition, it is usually

difﬁcult for the model to identify roofs covered by shadows of high-rise buildings (the third

line in Figure 12). To address this problem, it may be feasible to design data enhancement

strategies or use generative networks to remove shadows.

Remote Sens. 2023,15, 3909 17 of 26