Remote Sensing Letters
ISSN: 2150-704X (Print), 2150-7058 (Online). Journal homepage: https://www.tandfonline.com/loi/trsl20

To cite this article: Huu-Duy Nguyen, Vu-Dong Pham, Quoc-Huy Nguyen, Van-Manh Pham, Minh Hai Pham, Van Manh Vu & Quang-Thanh Bui (2020) An optimal search for neural network parameters using the Salp swarm optimization algorithm: a landslide application, Remote Sensing Letters, 11:4, 353-362, DOI: 10.1080/2150704X.2020.1716409

To link to this article: https://doi.org/10.1080/2150704X.2020.1716409

Published online: 06 Feb 2020.
An optimal search for neural network parameters using the Salp swarm optimization algorithm: a landslide application

Huu-Duy Nguyen (a), Vu-Dong Pham (b), Quoc-Huy Nguyen (b), Van-Manh Pham (a), Minh Hai Pham (c), Van Manh Vu (d) and Quang-Thanh Bui (b)

(a) Faculty of Geography, VNU University of Science, Ha Noi, Viet Nam; (b) Center for Applied Research in Remote Sensing and GIS (CarGIS), Faculty of Geography, VNU University of Science, Ha Noi, Viet Nam; (c) Vietnam Institute of Geodesy and Cartography, Ha Noi, Viet Nam; (d) Faculty of Environmental Science, VNU University of Science, Ha Noi, Viet Nam
ABSTRACT
This study investigates the balance between the exploration and exploitation search capability of the newly developed Salp swarm optimization algorithm (SSA) for fine-tuning the parameters of a three-hidden-layer neural network. A landslide study was selected as the thematic application, and a mountainous area of Vietnam was chosen as the case study. A training dataset with thirteen predictor variables and historical landslide occurrences from the study area was used to train and validate the model. The experiments showed an improvement in several statistical measures, namely Root mean square error = 0.3732, Overall accuracy = 79.35%, Mean absolute error = 0.3075, and Area under the Receiver operating characteristic curve = 0.886, in comparison to conventional benchmark methods. Based on these results, the use of SSA enhances search efficiency, and the algorithm could serve as an alternative optimizer for multiple-hidden-layer neural networks in landslide applications as well as in other natural hazard analyses.
ARTICLE HISTORY
Received 11 October 2019
Accepted 4 January 2020
1. Introduction
Landslides are a specific form of gravitational mass movement, a geological disaster caused by natural processes and human activities that occurs mainly on the slopes of mountainous areas (Motagh et al. 2013). It is, therefore, necessary for local governments to map areas that are susceptible to landslides to support damage mitigation and rescue plans. Landslide susceptibility has been investigated with a wide variety of methods, among which machine learning approaches have proven robust in improving mapping accuracy. Examples can be found in the works of Chen et al. (2019), Ghorbanzadeh et al. (2019) and He et al. (2019), which use artificial neural networks, support vector machines, fuzzy weight of evidence, logistic regression, kernel logistic regression, logistic model trees, naïve Bayes and neural-fuzzy models, as well as in the application of meta-heuristic optimization algorithms to search for the optimal parameters of classifiers (Pham et al. 2019; Nguyen et al. 2019).
The no-free-lunch theorem (Wolpert and Macready 1997) states that no single model can solve all problems because of their inherent complexity; the search for potential models
to address real-world applications, or specifically landslide problems, is therefore needed. This study investigated a novel method, named SSA-MLNN, that combines the Salp swarm optimization algorithm (SSA) with a multiple-hidden-layer neural network (MLNN), using SSA to fine-tune the network parameters for landslide analysis. Sinho, a mountainous district in a mountainous province (Lai Chau) of Vietnam (Figure 1), was selected as the case study because the landslides there in 2018 reportedly caused significant human loss and infrastructure damage.
2. Data and method
2.1. Historical landslides and predictor variables
The machine learning application for the evaluation and construction of the landslide map requires knowledge of past landslides, including their type, location, and date of occurrence (Jaafari et al. 2019). In this study, the historical locations of landslides were collected from field surveys during the last several years and were extracted from http://www.canhbaotruotlo.vn/. The landslide locations from the website are the results of a national project coordinated by the Vietnam Ministry of Natural Resources and Environment. These landslide locations were verified and updated by the authors through field surveys and satellite data. The detected locations were, in some cases, represented by points (for small areas) or by polygons if the areas were large enough. The proposed model was based on a point dataset; therefore, for the polygons, the centre points were extracted and combined with the other points to build up the historical landslide set. Since susceptibility mapping is a binary classification problem, a similar number of non-landslide points was also randomly defined across the study area. In total, 784 points, including 392 historical landslides and 392 non-landslide points, were used for training and validation of the proposed model.

Figure 1. Study area (a); historical landslides and predictor variables in the remaining figures.
In landslide analysis, it is essential to select suitable conditioning factors for the assessment. From the literature review, three groups of variables were used for the analysis. The first, elevation-derived group includes the ASTER Digital elevation model (DEM), downloadable from https://earthexplorer.usgs.gov/, and DEM-derivable parameters such as slope, aspect, curvature, and the Compound topographic index (CTI). The second set of variables relates to the distribution of rivers and streams, including distance to river, river density, and the Stream power index (SPI). The third consists of satellite-derived indices such as the Normalized difference vegetation index (NDVI), Normalized difference built-up index (NDBI), Normalized difference moisture index (NDMI), and Normalized difference water index (NDWI). These indices are good indicators of land cover conditions, and they were calculated from Landsat 8 (Operational Land Imager instrument) imagery, as sketched below. The last predictor variable is rainfall, one of the leading causes of landslides in mountainous regions because precipitation can weaken slope surfaces; heavy rain may also reactivate landslide movements that have occurred in the past (Wang and Sassa 2006). The rainfall layer was interpolated from 8 national meteorological stations in the study area and represents the accumulated rainfall during the rainy season of 2018. These variables were preliminarily processed and converted into a common data format: 30 m × 30 m rasters in WGS84/UTM zone 48N.
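The specific band combinations are not spelled out in the text; the sketch below shows how these four indices are commonly derived from Landsat 8 OLI surface-reflectance bands. The band assignments and the McFeeters NDWI variant are assumptions, not details taken from the paper.

```python
import numpy as np

def normalized_difference(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Generic normalized difference (a - b) / (a + b), guarding against division by zero."""
    denom = a + b
    return np.where(denom == 0, 0.0, (a - b) / denom)

# Landsat 8 OLI bands (surface reflectance), loaded elsewhere as 2-D float arrays:
# b3 = green, b4 = red, b5 = near infrared (NIR), b6 = shortwave infrared 1 (SWIR1)
def spectral_indices(b3, b4, b5, b6):
    ndvi = normalized_difference(b5, b4)   # vegetation: (NIR - Red) / (NIR + Red)
    ndwi = normalized_difference(b3, b5)   # water (McFeeters variant): (Green - NIR) / (Green + NIR)
    ndmi = normalized_difference(b5, b6)   # moisture: (NIR - SWIR1) / (NIR + SWIR1)
    ndbi = normalized_difference(b6, b5)   # built-up: (SWIR1 - NIR) / (SWIR1 + NIR)
    return ndvi, ndwi, ndmi, ndbi
```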
2.2. Neural network optimized by the Salp swarm optimization algorithm
The neural network is a common machine learning method that has been widely used for both classification and regression applications. The network learns from the training dataset and retains the contribution of each predictor variable after every iteration. Several previously published works have verified that gradient descent can be replaced by meta-heuristic algorithms for fine-tuning the parameters of the network. This study investigated the potential use of the newly developed SSA optimizer to search for the optimal weights of an MLNN (Figure 2). Through a trial-and-error process, the structure of this network was determined to have one input layer, three hidden layers with 10, 9 and 8 nodes (neurons), respectively, and one output layer. Structurally, three hundred (300) weights connect the input layer to the hidden layers, the hidden layers to each other, and the last hidden layer to the output. This network produces a susceptibility value P_i in the [0, 1] range, which is used together with the observed value O_i to formulate the objective function (Root mean square error, RMSE) for the SSA algorithm. A brief description is shown in Figure 2.
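As an illustration of how the 300 connecting weights can be mapped onto the 13-10-9-8-1 architecture and scored, the following sketch decodes a flat weight vector, runs the forward pass, and returns the RMSE objective. The sigmoid activations and the absence of bias terms are assumptions (with biases omitted, 13·10 + 10·9 + 9·8 + 8·1 = 300, matching the stated number of connecting weights); the paper does not specify these details.

```python
import numpy as np

LAYERS = [13, 10, 9, 8, 1]  # input layer, three hidden layers, output layer
# 13*10 + 10*9 + 9*8 + 8*1 = 300 connecting weights = the 300-dimensional SSA search space
N_WEIGHTS = sum(a * b for a, b in zip(LAYERS[:-1], LAYERS[1:]))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def decode(weights_vec):
    """Split a flat 300-element vector into one weight matrix per layer (no bias terms assumed)."""
    mats, start = [], 0
    for n_in, n_out in zip(LAYERS[:-1], LAYERS[1:]):
        mats.append(weights_vec[start:start + n_in * n_out].reshape(n_in, n_out))
        start += n_in * n_out
    return mats

def predict(weights_vec, X):
    """Forward pass; returns susceptibility values P_i in [0, 1] for each sample in X."""
    a = X
    for W in decode(weights_vec):
        a = sigmoid(a @ W)
    return a.ravel()

def rmse_objective(weights_vec, X, y):
    """Objective minimized by SSA: RMSE between predicted P_i and observed O_i (0/1 labels)."""
    p = predict(weights_vec, X)
    return float(np.sqrt(np.mean((p - y) ** 2)))
```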
SSA is used as the optimizer for the proposed neural network. It is a bio-inspired algorithm that mimics the swarming behaviour of salp chains and was mathematically formulated by Mirjalili et al. (2017). Like other swarm intelligence methods, a population of n artificial salps (x^i = [x^i_1, x^i_2, ..., x^i_d], i = 1, ..., n) is randomly initialized in a d-dimensional search space bounded in each dimension j by a lower bound l_j and an upper bound u_j. The salps move around, searching for the location of the food source (f) in the search space. The swarm is led by a leader (x^1), and the others are considered followers who gradually (to avoid stagnation) update their positions according to their neighbouring salps and, consequently, to the leader (Equation 2). The leader updates itself towards the food source by Equation 1, which also alternates its direction (exploration step) during each iteration of the optimization process:
x^1_j = f_j + c_1 [ (u_j − l_j) c_2 + l_j ],   c_3 ≥ 0
x^1_j = f_j − c_1 [ (u_j − l_j) c_2 + l_j ],   c_3 < 0        (Equation 1)

x^i_j = (x^i_j + x^{i−1}_j) / 2                               (Equation 2)

where n is the population size, i indicates the i-th salp, j indicates the j-th dimension, f_j is the position of the food source, c_1 is the coefficient that balances the exploration and exploitation search, c_2 and c_3 are random numbers, l is the current iteration, and L is the predefined maximum number of iterations.
The best position of the food source is remembered during the updating process and is used as the optimal solution of the objective function. In this study, SSA is used to search for the optimal parameters of the MLNN (d = 300 connecting weights), which are encoded as a 300-dimensional search space in SSA. The initial step creates a swarm of 30 salps, each of which is positioned in the search space as described above. The algorithm iteratively searches for optimal salp positions, and this process terminates when the iterations reach the predefined maximum (1000 in this case) or when the desired RMSE is achieved. The position of the optimal salp (300-dimensional values in SSA, or 300 connecting weights in the MLNN) is used to generate the landslide susceptibility map for the entire study area.
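A compact sketch of the search loop described above follows. It reuses the hypothetical rmse_objective helper from the previous sketch and follows the update rules of the cited SSA paper (Mirjalili et al. 2017), including c1 = 2·exp(−(4l/L)²); the 0.5 threshold on c3 for the leader's sign choice mirrors the reference implementation rather than the inequality as printed in Equation 1.

```python
import numpy as np

def salp_swarm_optimize(objective, dim=300, n_salps=30, max_iter=1000,
                        lower=-1.0, upper=1.0, target_rmse=None, rng=None):
    """Minimal SSA sketch (after Mirjalili et al. 2017) minimizing `objective` over a box-bounded space."""
    rng = np.random.default_rng(rng)
    lb, ub = np.full(dim, lower), np.full(dim, upper)
    salps = rng.uniform(lb, ub, size=(n_salps, dim))          # random initial positions
    fitness = np.array([objective(s) for s in salps])
    best = fitness.argmin()
    food, food_fit = salps[best].copy(), fitness[best]         # best food source found so far

    for l in range(1, max_iter + 1):
        c1 = 2.0 * np.exp(-(4.0 * l / max_iter) ** 2)          # exploration/exploitation balance
        for j in range(dim):                                   # leader update (Equation 1)
            c2, c3 = rng.random(), rng.random()
            step = c1 * ((ub[j] - lb[j]) * c2 + lb[j])
            salps[0, j] = food[j] + step if c3 >= 0.5 else food[j] - step
        for i in range(1, n_salps):                            # follower update (Equation 2)
            salps[i] = 0.5 * (salps[i] + salps[i - 1])
        salps = np.clip(salps, lb, ub)                         # keep salps inside the search boundary

        fitness = np.array([objective(s) for s in salps])
        if fitness.min() < food_fit:                           # remember the best food source
            food_fit = fitness.min()
            food = salps[fitness.argmin()].copy()
        if target_rmse is not None and food_fit <= target_rmse:
            break                                              # early stop at the desired RMSE
    return food, food_fit
```

A typical call, under the same assumptions, would be best_weights, best_rmse = salp_swarm_optimize(lambda w: rmse_objective(w, X_train, y_train)).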
Figure 2. Neural network optimized by the Salp swarm optimization algorithm. m is the number of validation data points.
3. Results and discussion
3.1. Conditioning factors using Mean Decrease Gini and Mean Decrease Accuracy
Data exploration plays a vital role in evaluating the influence of predictor variables in landslide analysis. From the literature review, there is currently no broadly accepted guidance for selecting appropriate landslide influence factors. In general, these factors are selected from field observation, landslide type analysis, and data availability, which are considered the most essential elements (Chen et al. 2018). This step evaluates the relationships between the factors and the historical landslide locations. Chen et al. (2017) and Bui et al. (2015) showed that the influence of conditioning factors is not the same for each model: some elements contribute significantly to the prediction, while others might deteriorate the overall accuracy and should be filtered out during the preliminary step.
In this article, conditioning factors were ranked using the Random Forest algorithm with the Mean Decrease Gini and Mean Decrease Accuracy indicators. These techniques are considered among the best methods for prioritizing variable influence levels and are widely used by researchers. The approach assigns a weight to each factor that reflects its predictive ability: factors with higher scores are more important for the model, while factors with scores equal to zero do not contribute to the model. Among the 13 factors (Figure 3), the topography-derived factors received the highest rank values with respect to the locations of landslide occurrences. The ranking order continues with rain, river density, NDWI, CTI, distance to river, NDVI, slope, aspect, SPI, NDBI, NDMI, and curvature. This ordering of factors agrees with the data analysis results of previous studies (Pham et al. 2016; Pham, Prakash, and Bui 2018), in which DEM, rainfall, and NDVI are the most critical factors for landslide occurrence.
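The paper does not state which software was used for this ranking; the sketch below uses scikit-learn, where feature_importances_ plays the role of the mean decrease in Gini and permutation_importance approximates the mean decrease in accuracy. The factor names and ordering here are placeholders, not results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

FACTORS = ["DEM", "Rain", "River density", "NDWI", "CTI", "Distance to river", "NDVI",
           "Slope", "Aspect", "SPI", "NDBI", "NDMI", "Curvature"]

def rank_factors(X, y, n_trees=500, seed=42):
    """Rank conditioning factors with a Random Forest: Gini-based and permutation-based importance."""
    rf = RandomForestClassifier(n_estimators=n_trees, random_state=seed).fit(X, y)
    gini = rf.feature_importances_                                   # mean decrease in Gini
    perm = permutation_importance(rf, X, y, n_repeats=10,
                                  random_state=seed).importances_mean  # mean decrease in accuracy
    order = np.argsort(perm)[::-1]
    return [(FACTORS[i], float(perm[i]), float(gini[i])) for i in order]
```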
Figure 3. Ranking of predictor variables: (a) Mean decrease in accuracy coefficient; (b) Mean decrease in Gini coefficient.

3.2. Model performance comparison

Usually, when RMSE is used as the objective function for optimization algorithms, over-fitting issues might occur. This problem arises when a machine learning model produces a higher RMSE on the validation dataset even though it has been trained to best fit the training dataset. Since it is hard to collect more landslide locations due to time and budget constraints, proper resampling methods should be applied to reduce over-fitting. 10-fold cross-validation has been used successfully in previous works (Bui et al. 2019a; Shahabi et al. 2014) by averaging results from all folds, and this method was also used here for resampling the training data (a sketch follows below). In addition, prescreening and normalization of the training dataset, the selection of a dropout rate of 20%, and the limitation of the search boundary were applied before and during training. The desired RMSE value, the number of iterations, and the neural network dropout rate, which were considered means of avoiding over-fitting, were defined after several trials.
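As a sketch of this resampling step (reusing the hypothetical rmse_objective and salp_swarm_optimize helpers from the earlier sketches), the 10-fold procedure can be wired around the optimizer as follows; averaging the per-fold validation RMSE mirrors the description above.

```python
import numpy as np
from sklearn.model_selection import KFold

def cross_validated_rmse(X, y, n_splits=10, seed=42):
    """Train SSA-MLNN on each fold and average the validation RMSE over the folds."""
    fold_rmse = []
    for train_idx, val_idx in KFold(n_splits=n_splits, shuffle=True, random_state=seed).split(X):
        weights, _ = salp_swarm_optimize(
            lambda w: rmse_objective(w, X[train_idx], y[train_idx]), max_iter=1000)
        fold_rmse.append(rmse_objective(weights, X[val_idx], y[val_idx]))
    return float(np.mean(fold_rmse))
```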
The performance of this model was evaluated and compared to several benchmark methods, namely Random Forest (RF), Random Subspace (RS) and Bagging, which have been commonly used in landslide analysis (He et al. 2019; Pham, Prakash, and Bui 2018). Table 1 shows several statistical measures used to evaluate the performance of the proposed model in comparison to RF, RS, and Bagging. The experiments ended with SSA-MLNN at RMSE = 0.3732, Mean absolute error (MAE) = 0.3075, Overall accuracy (OA) = 79.35%, and Area under the Receiver operating characteristic curve (AUC) = 0.886, against RF (RMSE = 0.3879, MAE = 0.3294, OA = 78.61%, AUC = 0.868), RS (RMSE = 0.4274, MAE = 0.3930, OA = 74.21%, AUC = 0.810) and Bagging (RMSE = 0.4216, MAE = 0.3566, OA = 74.32%, AUC = 0.809). The results of SSA-MLNN are also satisfactory in comparison to recent works: the multiboost-based naïve Bayes trees of Nguyen et al. (2019) (AUC = 0.824), the ensemble methods of Pham et al. (2019) (best AUC = 0.836), and the radial basis function classifier (AUC = 0.881) in the study of He et al. (2019). Although these works used different datasets, it can be seen that SSA-MLNN performs well relative to several methods in the referenced landslide susceptibility mapping studies.
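For reference, the four reported measures can be computed from the validation labels and the model's susceptibility scores as sketched below; the 0.5 cut-off used for the overall accuracy is an assumption, since the paper does not state the classification threshold.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, accuracy_score, roc_auc_score

def evaluate(y_true, p_scores, threshold=0.5):
    """RMSE, MAE, overall accuracy (%) and AUC for susceptibility scores in [0, 1]."""
    rmse = float(np.sqrt(mean_squared_error(y_true, p_scores)))
    mae = float(mean_absolute_error(y_true, p_scores))
    oa = 100.0 * accuracy_score(y_true, (p_scores >= threshold).astype(int))
    auc = roc_auc_score(y_true, p_scores)
    return {"RMSE": rmse, "MAE": mae, "OA (%)": oa, "AUC": auc}
```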
The performance of the models was also visualized by plotting the False positive rate on the x-axis against the True positive rate on the y-axis for each method, as shown in Figure 4(b). The AUC measures the discrimination ability of the algorithms, with meaningful values ranging between 0.5 and 1; an AUC equal to 1 indicates a perfect model. The results show that SSA-MLNN was more efficient than the other three models (AUC = 0.886), followed by RF (AUC = 0.868), RS (AUC = 0.81), and finally Bagging (AUC = 0.809). The best-performing model could then be used to generate the landslide susceptibility map for the entire study area.
Figure 4(a) visualizes the search mechanism of SSA, in which the x-axis shows the iteration number and the y-axis plots the average RMSE (over the 10 cross-validation folds) during the search operation. Apart from a sudden jump at the beginning (caused by the randomization of the initial salp positions), RMSE gradually decreases. This variation reflects the movement of the first salp (the leader), in which the exploration and exploitation processes are balanced by the coefficient c_1 as described in the previous section. It is this search mechanism that makes meta-heuristic algorithms distinctive, as mentioned in previous works (Bui et al. 2019b, 2019c, 2019d), and the verification of new algorithms in more diverse applications is therefore needed.
Table 1. Statistical measurements of the proposed method and benchmarked classifiers.

Classifier         RMSE    MAE     OA (%)  AUC
Random Forest      0.3879  0.3294  78.61   0.868
Random Subspace    0.4274  0.3930  74.21   0.810
Bagging            0.4216  0.3566  74.32   0.809
SSA-MLNN           0.3732  0.3075  79.35   0.886
After validation, the SSA-MLNN was used to produce the landslide susceptibility map for the entire study area. The process was implemented by feeding the whole study area (represented as a raster) with the associated 13 variables through the SSA-MLNN. The output values, in the [0, 1] range, were reclassified into five classes, namely Very low, Low, Moderate, High and Very high, as shown in Figure 5. Based on the analysis of the landslide susceptibility map, more than 22.1% of the area is at very low risk, 32% at low risk, 22.4% at moderate risk, 13.1% at high risk, and 10.4% at very high risk. The southeastern and southwestern areas are highly susceptible to landslide hazard because of deforestation and infrastructure construction. This susceptibility map is essential for assisting land-use decision-makers and for proposing risk management measures to protect the population through future landslide predictions.
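A sketch of this reclassification step is given below; the equal-interval class breaks are an assumption, as the paper does not report which thresholds separate the five classes.

```python
import numpy as np

CLASS_NAMES = ["Very low", "Low", "Moderate", "High", "Very high"]
BREAKS = [0.2, 0.4, 0.6, 0.8]          # assumed equal-interval breaks on the [0, 1] output

def reclassify(susceptibility: np.ndarray) -> np.ndarray:
    """Map continuous susceptibility values in [0, 1] to class indices 0 (Very low) ... 4 (Very high)."""
    return np.digitize(susceptibility, BREAKS)

def class_shares(raster: np.ndarray) -> dict:
    """Share of the study area (in %) falling in each class; NaN cells outside the raster are ignored."""
    valid = raster[~np.isnan(raster)]
    classes = reclassify(valid)
    return {name: float(np.mean(classes == k)) * 100.0 for k, name in enumerate(CLASS_NAMES)}
```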
Figure 4. Performance of SSA-MLNN: (a) variation of RMSE in SSA-MLNN; (b) ROC curves and AUC values of SSA-MLNN and the benchmark methods.
Figure 5. The final landslide susceptibility map of the study area.
3.3. Discussion
The prediction of surface sensitivity to landslide hazard is essential for territorial planning, especially in mountainous regions. The combination of advanced geospatial techniques and machine learning allows landslide susceptibility maps to be produced more accurately and rapidly. Accuracy and speed are the ultimate objectives of previous and ongoing work on the analysis of natural hazards (Akgun et al. 2012; Bui et al. 2019a, 2020). The first point to mention is the collection and preliminary processing of the predictor variables. DEM, rain, NDWI, NDVI, river density, NDBI, CTI, distance to river, NDMI, SPI, slope, aspect, and curvature were selected because they are free and collectable from global portals, which means the proposed model can be reproduced elsewhere. Moreover, overlaying the susceptibility map on these variables would provide useful information on which value range of each variable is most vulnerable to landslides, so that appropriate landslide mitigation measures can be coordinated.
Another concern is how the training dataset is normalized, because the original data are measured in various units (degrees, percentages, mm). Data can be regrouped into ordinal classes as in Bui et al. (2019a), which causes a loss of detail, or can be normalized into the [0, 1] value range. The second method was used in this study, with the argument that the distribution of the original data remains unchanged and the susceptibility values are estimated from the numerical values of the input data. In this study, we compared four machine learning methods, Random Forest, Random Subspace, Bagging, and SSA-MLNN, on this normalized dataset. SSA-MLNN outperformed the others, with a higher AUC (0.886) and a smaller RMSE (0.3732). Statistically, the differences between the results of SSA-MLNN and the other methods were significant according to the Wilcoxon signed-rank test.
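A minimal sketch of this min-max normalization is shown below; fitting the ranges on the training split and reusing them for the validation data and the full-area raster is an assumed good practice rather than a detail stated in the paper.

```python
import numpy as np

def fit_minmax(X_train: np.ndarray):
    """Per-variable minimum and range, estimated on the training data only."""
    lo = X_train.min(axis=0)
    rng = X_train.max(axis=0) - lo
    rng[rng == 0] = 1.0                       # avoid division by zero for constant variables
    return lo, rng

def apply_minmax(X: np.ndarray, lo: np.ndarray, rng: np.ndarray) -> np.ndarray:
    """Rescale each predictor variable into [0, 1] without changing the shape of its distribution."""
    return np.clip((X - lo) / rng, 0.0, 1.0)
```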
Although there are numerous studies on landslide analysis, they differ from each other in algorithms, datasets, or even the structures/configurations of specific methods, i.e., neural networks customized to fit specific problems. In this study, the structure of the neural network (the number of hidden layers and the number of neurons in each layer) was determined based on previous works and on a trial-and-error process with the training dataset from the study area. However, the no-free-lunch theorem states that no model fits all problems, which also means that the neural network structure is subject to change under different input datasets. Future work might focus on both the adaptive determination of the neural network structure and the fine-tuning of its parameters. This approach would be more problem-independent but would require more substantial computational capability.
Since RMSE is used as the objective function for SSA, and generally for meta-heuristic optimization algorithms, the over-fitting issue should be taken into consideration. This study applied several techniques (k-fold cross-validation, the setting of a dropout rate, and limitation of the search boundary) to minimize its impact, but other methods can also be tried, such as the inclusion of more training data from different geographic locations and data augmentation. The collection of more training data plays a crucial role in examining the performance of any classification method: the more data the models are tested on, the more reliable they become, and the more accurate the information that decision-makers can rely on for their management activities.
4. Conclusion
This study investigated the potential use of the Salp swarm optimization algorithm for fine-tuning neural network parameters for landslide susceptibility mapping in a mountainous area of Vietnam. The experiment was successful in that the proposed hybrid model outperformed the other benchmark methods in all statistical measures (RMSE = 0.3732, MAE = 0.3075, OA = 79.35%, AUC = 0.886). The Wilcoxon signed-rank test confirmed that this improvement is statistically significant, which means the model is a more accurate method and can be used as an alternative solution for landslide studies, or as a potential method for other natural hazard analyses.
The innovation of machine learning and geoinformation technology makes the extraction of knowledge from spatial data more accurate and rapid, which is particularly useful in natural hazard management and post-disaster response. This study verified a newly developed optimization algorithm for landslide susceptibility mapping, and the fast-growing family of new meta-heuristic optimization algorithms provides more opportunities for investigating new classification methods in natural hazard analysis. However, such methods might not be widely applicable if the training dataset is limited to specific areas. In this regard, more experiments with diverse training data are crucial for verifying the applicability of new methods to more extensive problems.
Funding
This research is funded by Asia Research Center, Vietnam National University - Hanoi and Korea
Foundation for Advanced Studies under grant number [CA.19.8A].
ORCID
Quang-Thanh Bui http://orcid.org/0000-0002-5059-9731
References
Akgun, A., E. A. Sezer, H. A. Nefeslioglu, C. Gokceoglu, and B. Pradhan. 2012. "An Easy-to-use MATLAB Program (Mamland) for the Assessment of Landslide Susceptibility Using a Mamdani Fuzzy Algorithm." Computers & Geosciences 38 (1): 23–34. doi:10.1016/j.cageo.2011.04.012.
Bui, D. T., T. Tuan, H. Klempe, B. Pradhan, and I. Revhaug. 2015. "Spatial Prediction Models for Shallow Landslide Hazards: A Comparative Assessment of the Efficacy of Support Vector Machines, Artificial Neural Networks, Kernel Logistic Regression, and Logistic Model Tree." Landslides: 1–18. doi:10.1007/s10346-015-0557-6.
Bui, Q.-T., Q.-H. Nguyen, X. L. Nguyen, V. D. Pham, H. D. Nguyen, and V.-M. Pham. 2020. "Verification of Novel Integrations of Swarm Intelligence Algorithms into Deep Learning Neural Network for Flood Susceptibility Mapping." Journal of Hydrology 581: 124379. doi:10.1016/j.jhydrol.2019.124379.
Bui, Q.-T., Q.-H. Nguyen, V. M. Pham, M. H. Pham, and A. T. Tran. 2019a. "Understanding Spatial Variations of Malaria in Vietnam Using Remotely Sensed Data Integrated into GIS and Machine Learning Classifiers." Geocarto International 34 (12): 1300–1314. doi:10.1080/10106049.2018.1478890.
Bui, Q.-T., Q.-H. Nguyen, V. M. Pham, V. D. Pham, M. H. Tran, T. T. H. Tran, H. D. Nguyen, X. L. Nguyen, and H. M. Pham. 2019b. "A Novel Method for Multispectral Image Classification by Using Social Spider Optimization Algorithm Integrated to Fuzzy C-Mean Clustering." Canadian Journal of Remote Sensing 45 (1): 42–53. doi:10.1080/07038992.2019.1610369.
Bui, Q.-T., M. V. Pham, Q.-H. Nguyen, L. X. Nguyen, and H. M. Pham. 2019c. "Whale Optimization Algorithm and Adaptive Neuro-Fuzzy Inference System: A Hybrid Method for Feature Selection and Land Pattern Classification." International Journal of Remote Sensing 40 (13): 5078–5093. doi:10.1080/01431161.2019.1578000.
Bui, Q.-T., M. Pham Van, N. T. T. Hang, Q.-H. Nguyen, N. X. Linh, P. M. Hai, T. A. Tuan, and P. Van Cu. 2019d. "Hybrid Model to Optimize Object-based Land Cover Classification by Meta-heuristic Algorithm: An Example for Supporting Urban Management in Ha Noi, Viet Nam." International Journal of Digital Earth 12 (10): 1118–1132. doi:10.1080/17538947.2018.1542039.
Chen, W., H. Shahabi, A. Shirzadi, H. Hong, A. Akgun, Y. Tian, J. Liu, A.-X. Zhu, and S. Li. 2019. "Novel Hybrid Artificial Intelligence Approach of Bivariate Statistical-methods-based Kernel Logistic Regression Classifier for Landslide Susceptibility Modeling." Bulletin of Engineering Geology and the Environment 78 (6): 4397–4419. doi:10.1007/s10064-018-1401-8.
Chen, W., H. Shahabi, S. Zhang, K. Khosravi, A. Shirzadi, K. Chapi, B. Pham, et al. 2018. "Landslide Susceptibility Modeling Based on GIS and Novel Bagging-Based Kernel Logistic Regression." Applied Sciences 8: 2540. doi:10.3390/app8122540.
Chen, W., X. Xie, J. Peng, J. Wang, Z. Duan, and H. Haoyuan. 2017. "GIS-based Landslide Susceptibility Modelling: A Comparative Assessment of Kernel Logistic Regression, Naïve-Bayes Tree, and Alternating Decision Tree Models." Geomatics, Natural Hazards and Risk 8: 950–973. doi:10.1080/19475705.2017.1289250.
Ghorbanzadeh, O., T. Blaschke, K. Gholamnia, S. R. Meena, D. Tiede, and J. Aryal. 2019. "Evaluation of Different Machine Learning Methods and Deep-learning Convolutional Neural Networks for Landslide Detection." Remote Sensing 11 (2): 196. doi:10.3390/rs11020196.
He, Q., H. Shahabi, A. Shirzadi, S. Li, W. Chen, N. Wang, H. Chai, et al. 2019. "Landslide Spatial Modelling Using Novel Bivariate Statistical Based Naïve Bayes, RBF Classifier, and RBF Network Machine Learning Algorithms." Science of the Total Environment 663: 1–15. doi:10.1016/j.scitotenv.2019.01.329.
Jaafari, A., M. Panahi, B. T. Pham, H. Shahabi, D. T. Bui, F. Rezaie, and S. Lee. 2019. "Meta Optimization of an Adaptive Neuro-fuzzy Inference System with Grey Wolf Optimizer and Biogeography-based Optimization Algorithms for Spatial Prediction of Landslide Susceptibility." CATENA 175: 430–445. doi:10.1016/j.catena.2018.12.033.
Mirjalili, S., A. H. Gandomi, S. Z. Mirjalili, S. Saremi, H. Faris, and S. M. Mirjalili. 2017. "Salp Swarm Algorithm: A Bio-inspired Optimizer for Engineering Design Problems." Advances in Engineering Software 114: 163–191. doi:10.1016/j.advengsoft.2017.07.002.
Motagh, M., H.-U. Wetzel, S. Roessner, and H. Kaufmann. 2013. "A TerraSAR-X InSAR Study of Landslides in Southern Kyrgyzstan, Central Asia." Remote Sensing Letters 4 (7): 657–666. doi:10.1080/2150704X.2013.782111.
Nguyen, P. T., T. Tuyen, A. Shirzadi, B. Pham, H. Shahabi, E. Omidvar, A. Amini, et al. 2019. "Development of a Novel Hybrid Intelligence Approach for Landslide Spatial Prediction." Applied Sciences 9: 2824. doi:10.3390/app9142824.
Pham, B. T., B. Pradhan, D. T. Bui, I. Prakash, and M. B. Dholakia. 2016. "A Comparative Study of Different Machine Learning Methods for Landslide Susceptibility Assessment: A Case Study of Uttarakhand Area (India)." Environmental Modelling & Software 84: 240–250. doi:10.1016/j.envsoft.2016.07.005.
Pham, B. T., I. Prakash, and D. T. Bui. 2018. "Spatial Prediction of Landslides Using a Hybrid Machine Learning Approach Based on Random Subspace and Classification and Regression Trees." Geomorphology 303: 256–270. doi:10.1016/j.geomorph.2017.12.008.
Pham, B. T., A. Shirzadi, H. Shahabi, E. Omidvar, S. K. Singh, M. Sahana, D. T. Asl, B. B. Ahmad, N. K. Quoc, and S. Lee. 2019. "Landslide Susceptibility Assessment by Novel Hybrid Machine Learning Algorithms." Sustainability 11 (16): 4386. doi:10.3390/su11164386.
Shahabi, H., S. Khezri, B. B. Ahmad, and M. Hashim. 2014. "Landslide Susceptibility Mapping at Central Zab Basin, Iran: A Comparison between Analytical Hierarchy Process, Frequency Ratio and Logistic Regression Models." CATENA 115: 55–70. doi:10.1016/j.catena.2013.11.014.
Wang, H., and K. Sassa. 2006. "Rainfall-induced Landslide Hazard Assessment Using Artificial Neural Networks." Earth Surface Processes and Landforms 31: 235–247. doi:10.1002/esp.1236.
Wolpert, D. H., and W. G. Macready. 1997. "No Free Lunch Theorems for Optimization." IEEE Transactions on Evolutionary Computation 1 (1): 67–82. doi:10.1109/4235.585893.
362 H.-D. NGUYEN ET AL.
... With a third of its land area consisting of hills and mountains, Vietnam is one of the countries most affected. According to data from the General Department of Disaster Reduction (Ministry of Agriculture and Rural Development), between 2000 and 2015, Vietnam was affected by 250 flash floods and landslides, causing 779 deaths Nguyen et al., 2020). ...
... In many cases, authors select all factors available in the study area, and then use algorithms to assess the importance of each element, as well as their permutability. In this study, the RF method was used to determine the importance of factors (Chang et al., 2022;Nguyen et al., 2020;Pham et al., 2020). The results showed that topographic factors (slope, direction and elevation) were the most important in determining T A B L E 3 The performance of the eight models proposed. ...
Article
Landslides lead to widespread devastation and significant loss of life in mountainous regions around the world. Susceptibility assessments can provide critical data to help decision‐makers, for example, local authorities and other organizations, mitigating the landslide risk, although the accuracy of existing studies needs to be improved. This study aims to assess landslide susceptibility in the Thua Thien Hue province of Vietnam using deep neural networks (DNNs) and swarm‐based optimization algorithms, namely Adam, stochastic gradient descent (SGD), Artificial Rabbits Optimization (ARO), Tuna Swarm Optimization (TSO), Sand Cat Swarm Optimization (SCSO), Honey Badger Algorithm (HBA), Marine Predators Algorithm (MPA) and Particle Swarm Optimization (PSO). The locations of 945 landslides occurring between 2012 and 2022, along with 14 conditioning factors, were used as input data to build the DNN and DNN‐hybrid models. The performance of the proposed models was evaluated using the statistical indices receiver operating characteristic curve, area under the curve (AUC), root mean square error, mean absolute error (MAE), R ² and accuracy. All proposed models had a high accuracy of prediction. The DNN‐MPA model had the highest AUC value (0.95), followed by DNN‐HBA (0.95), DNN‐ARO (0.95), DNN‐Adam (0.95), DNN‐SGD (0.95), DNN‐TSO (0.93), DNN‐PSO (0.9) and finally DNN‐SCSO (0.83). High‐precision models have identified that the majority of the western region of Thua Thien Hue province is very highly susceptible to landslides. Models like the aforementioned ones can support decision‐makers in updating large‐scale sustainable land‐use strategies.
... Previous to the application of the modeling algorithm, the identification of explanatory factors is relevant. It is based on literature review, expert knowledge, availability of information, and data exploration [20]. Some examples of data exploration techniques are Logistic Regression (LR) [21][22][23], heuristics based on expert opinion [24,25], Discriminant Analysis [26], Markov Chain [18], machine learning techniques [27], and Exploratory Factor Analysis (EFA). ...
... This study shows the importance of NDWI for the characterization of landslide occurrence. It is consistent with Zhang et al. [125], Maqsoom et al. [126], and Nguyen et al. [20]. The water infiltrates when soil moisture uptake exceeds its water-holding capacity, resulting in subsurface runoff. ...
Article
Full-text available
Landslides are one of the natural phenomena with more negative impacts on landscape, natural resources, and human health worldwide. Andean geomorphology, urbanization, poverty, and inequality make it more vulnerable to landslides. This research focuses on understanding explanatory landslide factors and promoting quantitative susceptibility mapping. Both tasks supply valuable knowledge for the Andean region, focusing on territorial planning and risk management support. This work addresses the following questions using the province of Azuay-Ecuador as a study area: (i) How do EFA and LR assess the significance of landslide occurrence factors? (ii) Which are the most significant landslide occurrence factors for susceptibility analysis in an Andean context? (iii) What is the landslide susceptibility map for the study area? The methodological framework uses quantitative techniques to describe landslide behavior. EFA and LR models are based on a historical inventory of 665 records. Both identified NDVI, NDWI, altitude, fault density, road density, and PC2 as the most significant factors. The latter factor represents the standard deviation, maximum value of precipitation, and rainfall in the wet season (January, February, and March). The EFA model was built from 7 latent factors, which explained 55% of the accumulated variance, with a medium item complexity of 1.5, a RMSR of 0.02, and a TLI of 0.89. This technique also identified TWI, fault distance, plane curvature, and road distance as important factors. LR's model, with AIC of 964.63, residual deviance of 924.63, AUC of 0.92, accuracy of 0.84, and Kappa of 0.68, also shows statistical significance for slope, roads density, geology, and land cover factors. This research encompasses a time-series analysis of NDVI, NDWI, and precipitation, including vegetation and weather dynamism for landslide occurrence. Finally, this methodological framework replaces traditional qualitative models based on expert knowledge, for quantitative approaches for the study area and the Andean region.
... The value of NDVI is inversely proportional to the probability of the occurrence of natural hazards. In recent years in the study area, the forested area has rapidly diminished, leading to an increase in the number and intensity of natural hazards such as flooding and landslides (Nguyen, 2022c;Nguyen et al., 2020). The NDVI value in the study area ranged from −0.3 to 0.8. ...
... NDBI describes the density of infrastructure in a region and is a key factor in predicting natural hazard susceptibility because construction influences soil water permeability, flow velocity, and geomorphological structure (Nguyen, 2022c, Nguyen et al., 2020. The value of NDBI ranged from −0.41 to 0.7 in the study area. ...
Article
Natural hazards constitute a diverse category and are unevenly distributed in time and space. This hinders predictive efforts, leading to significant impacts on human life and economies. Multi‐hazard prediction is vital for any natural hazard risk management plan. The main objective of this study was the development of a multi‐hazard susceptibility mapping framework, by combining two natural hazards—flooding and landslides—in the North Central region of Vietnam. This was accomplished using support vector machines, random forest, and AdaBoost. The input data consisted of 4591 flood points, 1315 landslide points, and 13 conditioning factors, split into training (70%), and testing (30%) datasets. The accuracy of the models' predictions was evaluated using the statistical indices root mean square error, area under curve (AUC), mean absolute error, and coefficient of determination. All proposed models were good at predicting multi‐hazard susceptibility, with AUC values over 0.95. Among them, the AUC value for the support vector machine model was 0.98 and 0.99 for landslide and flood, respectively. For the random forest model, these values were 0.98 and 0.98, and for AdaBoost, they were 0.99 and 0.99. The multi‐hazard maps were built by combining the landslide and flood susceptibility maps. The results showed that approximately 60% of the study area was affected by landslides, 30% by flood, and 8% by both hazards. These results illustrate how North Central is one of the regions of Vietnam that is most severely affected by natural hazards, particularly flooding, and landslides. The proposed models adapt to evaluate multi‐hazard susceptibility at different scales, although expert intervention is also required, to optimize the algorithms. Multi‐hazard maps can provide a valuable point of reference for decision makers in sustainable land‐use planning and infrastructure development in regions faced with multiple hazards, and to prevent and reduce more effectively the frequency of floods and landslides and their damage to human life and property.
... The network parameter (Parameters) [84] is the sum of the parameter quantities of all convolutional layers and fully-connected layers in the network, which represents the size of the network model. ...
Article
Full-text available
Infrared ship target segmentation is the important basis of infrared guided weapon in the sea-air context. Typically, accurate infrared ship target segmentation relies on a large number of pixel-level labels. However, it is difficult to obtain them. To this end, we present a method of Semi-supervised Infrared Ship Target Segmentation with Dual Branch (SeISTS-DB), which utilizes a small amount of labeled data and a large amount of unlabeled data to train model and improve segmentation performance. There are three main contributions. First, we design a target segmentation branch to generate the pseudo labels for unlabeled data. It consists of a dual learning network and a segmentation network. The dual learning network generates pseudo labels with weights for unlabeled data. The segmentation network is trained using both labeled data and unlabeled data with pseudo labels to achieve target segmentation of infrared ship, obtaining the preliminary segmentation results. Secondly, we introduce an error segmentation pixel correction branch, which contains a student network and a teacher network, to modify the pixel category error of the preliminary segmentation map. Finally, the outputs of the two branches are combined to obtain the final segmentation result. The SeISTS-DB is compared with other fully-supervised and semi-supervised methods on the infrared ship images dataset. Experimental results demonstrate that when the labeled data accounts for 1/8 of the training data, the mean Intersection over Union (mIou) is respectively improved by 15.35% and 6.19% at most. Besides, it is also compared with other methods on the public IRSTD-1k dataset, when the proportion of labeled images is 1/8, the mIoU is respectively improved by 11.76% at most compared to the state-of-the-art semi-supervised methods, demonstrating its effectiveness.
... Finally, the extrapolation problem is considered a major problem when using machine learning to predict the flood risk or estimate the flood depth. Several studies have tried different techniques to solve this problem such as augmenting training data in different geographic locations Nguyen et al. 2020). However, this method is very expensive and is not feasible in areas that are difficult to access and in regions with limited data. ...
Article
Full-text available
Flood prediction is an important task, which helps local decision-makers in taking effective measures to reduce damage to the people and economy. Currently, most studies use machine learning to predict flooding in a given region; however, the extrapolation problem is considered a major challenge when using these techniques and is rarely studied. Therefore, this study will focus on an approach to resolve the extrapolation problem in flood depth prediction by integrating machine learning (XGBoost, Extra-Trees (EXT), CatBoost (CB), and light gradient boost machines (LightGBM)) and hydraulic modeling under MIKE FLOOD. The results show that the hydraulic model worked well in providing the flood depth data needed to build the machine learning model. Among the four proposed machine learning models, XGBoost was found to be the best at solving the extrapolation problem in the estimation of flood depth, followed by EXT, CB, and LightGBM. Quang Binh province was hit by floods with depths ranging from 0 to 3.2 m. Areas with high flood depths are concentrated along and downstream of the two major rivers (Gianh and Nhat Le – Kien Giang).
... Redundant factors can heighten model complexity, leading to diminished performance (Nguyen et al. 2020a). Although there isn't a universal guide for factor selection, our study identified 11 conditioning factors based on the geo-environmental conditions of the study area, and a comprehensive literature review. ...
Article
Full-text available
Globally, coastal erosion significantly impacts the socio-economic conditions and infrastructure development of coastal regions, with Vietnam facing considerable challenges due to its extensive coastline. This study focuses on developing innovative hybrid machine learning models, namely BLWL and CGLWL, which combine Locally Weighted Learning (LWL) and two optimization techniques, namely Bagging and Cascade Generalization, respectively. Quang Nam Province in Vietnam consistently affected by coastal erosions, serves as the case study. For model development, a set of historical coastal erosions and the affecting factors, such as magnitude of horizontal flow (sea currents), wave height, wave direction, distance to fault, geology, river density, elevation, curvature, aspect, slope degree, and topographic wetness index were collected and used for generation of the database. For the selection and prioritization of affecting coastal erosion factors, Correlation Attribute Evaluation (CAE) method was used. Performance of the models was evaluated using standard statistical measures: Accuracy Assessment (ACC), Sensitivity (SST), Specificity (SPF), Root Mean Squared Errors (RMSE), Kappa (K), Positive Predictive Value (PPV), and Negative Predictive Value (NPV), and Area Under the ROC Curve (AUC). Results indicated that the BLWL model (AUC: 0.978) was the best, followed by CGLWL (AUC: 0.968) and LWL (AUC: 0.963) models in accurately predicting coastal erosion susceptible areas. Therefore, it can be concluded that BLWL is a promising tool for the development of coastal erosion susceptibility maps, facilitating effective planning and management to mitigate the impact of coastal erosion. Keywords Coastal erosion · Machine learning · Locally weighted learning · GIS · Vietnam
... To reduce the problem's effects, several techniques were applied in this study, such as setting a dropout rate and limiting the search range. However, other methods can also be tried, such as collecting training data from different geographical locations (Nguyen et al. 2020). ...
Article
Full-text available
Flood damage is becoming increasingly severe in the context of climate change and changes in land use. Assessing the effects of these changes on floods is important, to help decision-makers and local authorities understand the causes of worsening floods and propose appropriate measures. The objective of this study was to evaluate the effects of climate and land use change on flood susceptibility in Thua Thien Hue province, Vietnam, using machine learning techniques (support vector machine (SVM) and random forest (RF)) and remote sensing. The machine learning models used a flood inventory including 1,864 flood locations and 11 conditional factors in 2017 and 2021, as the input data. The predictive capacity of the proposed models was assessed using the area under the curve (AUC), the root mean square error (RMSE), and the mean absolute error (MAE). Both proposed models were successful, with AUC values exceeding 0.95 in predicting the effects of climate and land use change on flood susceptibility. The RF model, with AUC = 0.98, outperformed the SVM model (AUC = 0.97). The areas most susceptible to flooding increased between 2017 and 2021 due to increased built-up area. HIGHLIGHTS Machine learning algorithms were applied for flood susceptibility modeling.; The RF model had the highest AUC value (0.98).; The areas highly flood susceptibility increased between 2017 and 2021.;
... RMSE and MAE are two popular statistical indices for analyzing landslide susceptibility model performance. They measure the errors between the prediction value and the observation value (Pham et al. 2019b;Nguyen et al. 2020;Pham et al. 2020), as in following equation: ...
Article
Full-text available
Understanding the negative effects of climate change and changes to land use/land cover on natural hazards is an important feature of sustainable development worldwide, as these phenomena are inextricably linked with natural hazards such as landslides. The contribution of this study is an attempt to develop a state-of-the-art method to assess the effects of climate change and changes in land use/land cover on landslide susceptibility in the Tra Khuc river basin in Vietnam. The method is based on machine learning and remote sensing algorithms, namely radial basis function neural networks–search and rescue optimization (RBFNN–SARO), radial basis function neural network–queuing search algorithm (RBFNN–QSA), radial basis function neural network–life choice-based optimizer (RBFNN–LCBO), radial basis function neural network–dragonfly optimization (RBFNN–DO). All proposed models performed well, with AUC value of >0.9. The RBFNN–QSA model performed best, with an AUC value of 0.98, followed by RBFNN–SARO (AUC = 0.97), RBFNN–LCBO (AUC = 0.95), RBFNN–DO (AUC = 0.93), and support vector machine (SVM; AUC = 0.92). The results show that both climate and land use/land cover change greatly in the future: Precipitation increases 18% by 2030 and 25.1% by 2050; the total production forest, protected forest and built-up area change considerably between 2010 and 2050. These changes influence landslide susceptibility: The area of high and very high landslide susceptibility decrease by approximately 100 and 300 km² respectively in the study area from 2010 to 2050. The findings of this study can support decision-makers in formulating appropriate strategies to reduce damage from landslides, such as limiting construction in areas where future landslides are predicted. Although this study applies to a particular region of Vietnam, the findings can be applied in other mountainous regions around the world.
Article
Full-text available
Recently, machine learning models have received huge attention for environmental risk modeling. One of these applications is landslide susceptibility mapping which is a necessary primary step for dealing with the landslide risk in prone areas. In this study, a conventional machine learning model called multi-layer perceptron (MLP) neural network is built upon advanced optimization algorithms to achieve a firm prediction of landslide susceptibility in Ardal County, West of Iran. The used geospatial dataset consists of fourteen conditioning factors and 170 landslide events. The used optimizers are electromagnetic field optimization (EFO), symbiotic organisms search (SOS), shuffled complex evolution (SCE), and electrostatic discharge algorithm (ESDA) that contribute to tuning MLP’s internal parameters. The competency of the models is evaluated using several statistical methods to provide a comparison among them. It was discovered that the EFO-MLP and SCE-MLP enjoy much quicker training than SOS-MLP and ESDA-MLP. Further, relying on both accuracy and time criteria, the EFO-MLP was found to be the most efficient model (time = 1161 s, AUC = 0.879, MSE = 0.153, and R = 0.657). Hence, the landslide susceptibility map of this model is recommended to be used by authorities to provide real-world protective measures within Ardal County. For helping this, a random forest-based model showed that Elevation, Lithology, and Land Use are the most important factors within the studied area. Lastly, the solution discovered in this study is converted into an equation for convenient landslide susceptibility prediction.
Article
Full-text available
The present study entails an artificial intelligence-based framework for landslide risk analysis of a highway infrastructure in the Himalayan region. In total, 241 landslide polygons that were inventoried for the study area. The spatial component of landslide susceptibility map was prepared by incorporating drainage density, TWI, geology, elevation and slope gradient as major contributing factors, in the certainty factor–random forest (CF-RF) hybrid model with accuracy of 0.928. The landslide hazard analysis was carried out by multiplying landslide spatial and temporal probabilities. The landslide vulnerability analysis of the highway stretch was carried out by integrating the elements at risk. The built-up area was extracted by using U-Net deep learning algorithm with an accuracy of 0.964. The landslide risk map of the highway stretch prepared by the multiplication of landslide hazard and vulnerability maps depicts that 16.78% and 6.25% of the study area falls in high and very high-risk zones, respectively.
Article
Full-text available
Landslides have multidimensional effects on the socioeconomic as well as environmental conditions of the impacted areas. The aim of this study is the spatial prediction of landslide using hybrid machine learning models including bagging (BA), random subspace (RS) and rotation forest (RF) with alternating decision tree (ADTree) as base classifier in the northern part of the Pithoragarh district, Uttarakhand, Himalaya, India. To construct the database, ten conditioning factors and a total of 103 landslide locations with a ratio of 70/30 were used. The significant factors were determined by chi-square attribute evaluation (CSEA) technique. The validity of the hybrid models was assessed by true positive rate (TP Rate), false positive rate (FP Rate), recall (sensitivity), precision, F-measure and area under the receiver operatic characteristic curve (AUC). Results concluded that land cover was the most important factor while curvature had no effect on landslide occurrence in the study area and it was removed from the modelling process. Additionally, results indicated that although all ensemble models enhanced the power prediction of the ADTree classifier (AUCtraining = 0.859; AUCvalidation = 0.813); however, the RS ensemble model (AUCtraining = 0.883; AUCvalidation = 0.842) outperformed and outclassed the RF (AUCtraining = 0.871; AUCvalidation = 0.840), and the BA (AUCtraining = 0.865; AUCvalidation = 0.836) ensemble model. The obtained results would be helpful for recognizing the landslide prone areas in future to better manage and decrease the damage and negative impacts on the environment.
Article
Full-text available
We proposed an innovative hybrid intelligent approach, namely, the multiboost based naïve bayes trees (MBNBT) method for the spatial prediction of landslides in the Mu Cang Chai District of Yen Bai Province, Vietnam. The MBNBT, which is an ensemble of the multiboost (MB) and naïve bayes trees (NBT) base classifier, has rarely been applied for landslide susceptibility mapping around the world. For the modeling, we selected 248 landslide locations in the hilly terrain of the study area. Fifteen landslide conditioning factors were selected for the construction of the database based on the one-R attribute evaluation (ORAE) technique. Model validation was done using statistical metrics, namely, sensitivity, specificity, accuracy, mean absolute error (MAE), root mean square error (RMSE), and the area under the receiver operating characteristics curve (AUC). Performance of the hybrid model was evaluated and compared with popular soft computing benchmark models, namely, multiple perceptron neural network (MLPN), Support Vector Machines (SVM), and single NBT. Results indicated that the proposed MBNBT (AUC = 0.824) model outperformed the popular models, namely, the MLPN (AUC = 0.804), SVM (AUC = 0.804), and NBT (AUC = 0.800) models. Analysis of the model results also suggested that the MB meta classifier Appl. Sci. 2019, 9, 2824 2 of 26 ensemble model could enhance the prediction power of the NBT model. Therefore, the MBNBT is a suitable method for the assessment of landslide susceptibility in landslide prone areas.
Article
Full-text available
There is a growing demand for detailed and accurate landslide maps and inventories around the globe, but particularly in hazard-prone regions such as the Himalayas. Most standard mapping methods require expert knowledge, supervision and fieldwork. In this study, we use optical data from the Rapid Eye satellite and topographic factors to analyze the potential of machine learning methods, i.e., artificial neural network (ANN), support vector machines (SVM) and random forest (RF), and different deep-learning convolution neural networks (CNNs) for landslide detection. We use two training zones and one test zone to independently evaluate the performance of different methods in the highly landslide-prone Rasuwa district in Nepal. Twenty different maps are created using ANN, SVM and RF and different CNN instantiations and are compared against the results of extensive fieldwork through a mean intersection-over-union (mIOU) and other common metrics. This accuracy assessment yields the best result of 78.26% mIOU for a small window size CNN, which uses spectral information only. The additional information from a 5 m digital elevation model helps to discriminate between human settlements and landslides but does not improve the overall classification accuracy. CNNs do not automatically outperform ANN, SVM and RF, although this is sometimes claimed. Rather, the performance of CNNs strongly depends on their design, i.e., layer depth, input window sizes and training strategies. Here, we conclude that the CNN method is still in its infancy as most researchers will either use predefined parameters in solutions like Google TensorFlow or will apply different settings in a trial-and-error manner. Nevertheless, deep-learning can improve landslide mapping in the future if the effects of the different designs are better understood, enough training samples exist, and the effects of augmentation strategies to artificially increase the number of existing samples are better understood.
Article
Full-text available
Estimation of landslide susceptibility is still an ongoing requirement for land use management plans. Here, we proposed two novel intelligence hybrid models that rely on an adaptive neuro-fuzzy inference system (ANFIS) and two metaheuristic optimization algorithms, i.e., grey wolf optimizer (GWO) and biogeography-based optimization (BBO), for obtaining a reliable estimate of landslide susceptibility. Sixteen causative factors and 391 historical landslide events from a landslide-prone portion of the State of Uttarakhand, northern India, were used to generate a geospatial database. The ANFIS model was employed to develop an initial landslide susceptibility model that was then optimized using the GWO and BBO algorithms. This resulted in two novel models, i.e., ANFIS-BBO and ANFIS-GWO, that benefited from an intelligent approach to automatically and properly adjust the best parameters of the base ANFIS model for the prediction of landslide susceptibilities. The robustness of the models was verified through a large number of runs using different splits of training and validation datasets. Although few differences observed between the predictive capability of the models (AUC:ANFIS-BBO = 0.95; RMSE:ANFIS-BBO = 0.316 vs. ACU:ANFIS-GWO = 0.94; RMSE:ANFIS-GWO = 0.322), the Wilcoxon signed-rank test indicated a significant difference between the model performances in both training and validation datasets. Overall, our proposed models demonstrated an improved prediction of landslides compared to those achieved in previous studies with other methods. Therefore, these novel models can be recommended for modeling landslide susceptibility, and the modelers can easily tailor their use based on their individual circumstances.
Article
Landslides cause a considerable amount of damage around the world every year. Landslide susceptibility assessments are useful for mitigating the associated potential risks in local economic development, land use planning, and decision making. The main aim of this study was to present a novel hybrid approach of bagging (B)-based kernel logistic regression (KLR), named the BKLR model, for spatial prediction of landslides in Shangnan County, China. We first selected 15 conditioning factors for landslide susceptibility modeling. Then, the prediction capability of all conditioning factors was evaluated using the least-squares support vector machine method. Model validation and comparison were performed based on the area under the receiver operating characteristic curve and several statistical indexes, including positive predictive rate, negative predictive rate, sensitivity, specificity, kappa index, and root mean square error. Results indicated that the BKLR ensemble model outperformed both the KLR model and the benchmark support vector machine model. Overall, our findings confirmed that combining the meta model with a decision tree classifier based on a functional algorithm can reduce the over-fitting and variance problems of the data, which could enhance the predictive power of the landslide model. The resulting susceptibility maps could be useful for hazard mitigation in the study area and other similar landslide-prone areas.
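The bagging-of-kernel-logistic-regression idea can be sketched with scikit-learn by approximating the kernel with an RBF feature map (Nystroem) and bagging the resulting pipeline. The synthetic 15-factor dataset and all hyperparameters below are assumptions, not the BKLR configuration of that study.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Stand-in for 15 conditioning factors and landslide / non-landslide labels
X, y = make_classification(n_samples=600, n_features=15, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

# Kernel logistic regression approximated by an RBF feature map + logistic model
klr = make_pipeline(Nystroem(kernel="rbf", n_components=100, random_state=1),
                    LogisticRegression(max_iter=1000))

# Bagging ensemble of the kernel logistic regression base learner
bklr = BaggingClassifier(klr, n_estimators=10, random_state=1).fit(X_tr, y_tr)

auc = roc_auc_score(y_te, bklr.predict_proba(X_te)[:, 1])
print(f"AUROC on the held-out split: {auc:.3f}")
```

Bagging trains each base learner on a bootstrap resample of the training data, which is the mechanism credited above with reducing variance and over-fitting.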
Article
This study proposed and compared several novel hybrid models that combined swarm intelligence algorithms and a Deep Learning Neural Network for flood susceptibility mapping. Lai Chau, a province in the northwest mountainous region of Vietnam, was chosen as a case study because it experienced severe flash floods in 2018. For this purpose, numerical predictor variables such as topographically derived factors (Digital Elevation Model, Aspect, Slope, Curvature, Topographic Wetness Index), climatic variables (rainfall), hydrological variables (stream density, stream power index, distance to river) and multiple remote sensing indices (Normalized Difference Vegetation Index, Normalized Difference Built-up Index) were used. These predictor variables were selected because they are globally collectible and reproducible. The performances of these models were evaluated using common statistical indicators, namely Root Mean Square Error, Mean Absolute Error, Overall Accuracy and Area under the Receiver Operating Characteristic curve, together with statistical tests of differences. The results showed that the proposed swarm intelligence models outperformed the benchmark methods, namely Particle Swarm Optimization, Support Vector Machine and Random Forest, on almost all comparison indicators. The proposed models are therefore more robust than the benchmark classifiers and are good alternatives for flood susceptibility mapping, given the availability of the dataset.
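The four reported indicators can be computed from predicted susceptibility probabilities as in the minimal sketch below; the toy labels and probabilities are assumptions, and the 0.5 threshold used for overall accuracy is a common but not universal choice.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def susceptibility_scores(y_true, p_pred, threshold=0.5):
    """RMSE, MAE, overall accuracy and AUC for predicted susceptibility probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    rmse = np.sqrt(np.mean((p_pred - y_true) ** 2))
    mae = np.mean(np.abs(p_pred - y_true))
    oa = np.mean((p_pred >= threshold).astype(float) == y_true)   # hard labels at the threshold
    auc = roc_auc_score(y_true, p_pred)
    return {"RMSE": rmse, "MAE": mae, "OA": oa, "AUC": auc}

# Hypothetical flood / non-flood labels and model probabilities
y = np.array([1, 0, 1, 1, 0, 0, 1, 0])
p = np.array([0.91, 0.20, 0.65, 0.80, 0.35, 0.45, 0.55, 0.30])
print(susceptibility_scores(y, p))
```

RMSE and MAE are computed on the continuous probabilities, whereas overall accuracy requires thresholding them into hard flood/non-flood labels first, which is why the two kinds of indicators can rank models differently.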
Article
In remote sensing, Fuzzy C-Means clustering (FCM) is a robust method for determining the membership grades of a pixel belonging to one or more classes. This paper proposes a novel approach that uses the social spider optimization (SSO) algorithm to search for the optimal cluster centers in FCM. Hanoi, the capital of Vietnam, was chosen as a case study because of its spatial complexity. Multispectral satellite datasets from Landsat 8, Sentinel-2A and SPOT 7 were used. The experiment started with the segmentation process, followed by an examination of the model, and the results were then compared with several conventional clustering methods. For accuracy assessment, the minimized FCM objective function, user's and producer's accuracies, and overall accuracy were used. The results showed that SSO significantly improved the performance of FCM and outperformed the benchmark classifiers and other common optimization algorithms. It could be concluded that the model was successfully deployed in the study area and can be suggested as an alternative solution for urban pattern detection. In a broader sense, classification methods will be enriched by the active and fast-growing contribution of nature-inspired algorithms.
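The quantity that such an optimizer searches over can be illustrated with the standard FCM membership and objective-function formulas. The sketch below is a minimal NumPy version; the toy two-band pixels and candidate centers are assumptions, and the SSO search loop itself is omitted.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0, eps=1e-12):
    """Fuzzy memberships of each pixel (row of X) to each cluster center."""
    # d[p, c] = distance from pixel p to center c
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + eps
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)          # memberships sum to 1 per pixel

def fcm_objective(X, centers, m=2.0):
    """FCM objective J = sum over pixels and clusters of u^m * squared distance."""
    u = fcm_memberships(X, centers, m)
    d2 = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) ** 2
    return float(np.sum((u ** m) * d2))

# Toy spectral pixels (rows) and two candidate cluster centers proposed by the optimizer
pixels = np.array([[0.10, 0.20], [0.15, 0.22], [0.80, 0.90], [0.85, 0.88]])
centers = np.array([[0.10, 0.20], [0.80, 0.90]])
print(fcm_memberships(pixels, centers))
print(f"objective J = {fcm_objective(pixels, centers):.4f}")
```

In the hybrid approach described above, each candidate solution in the swarm encodes a set of cluster centers, and the FCM objective J serves as the fitness to be minimized.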
Article
Landslides are major hazards for human activities, often causing great damage to human lives and infrastructure. Therefore, the main aim of the present study was to evaluate and compare three machine learning algorithms (MLAs), namely Naïve Bayes (NB), the radial basis function (RBF) Classifier, and the RBF Network, for landslide susceptibility mapping (LSM) in the Longhai area of China. A total of 14 landslide conditioning factors were obtained from various data sources; the frequency ratio (FR) and support vector machine (SVM) methods were then used for correlation analysis and for selecting the most important factors for the modelling process, respectively. Subsequently, the three resulting models were validated and compared using statistical metrics, including the area under the receiver operating characteristic (AUROC) curve and the Friedman and Wilcoxon signed-rank tests. The results indicated that the RBF Classifier model had the highest goodness-of-fit and performance based on the training and validation datasets. The RBF Classifier model (AUROC = 0.881) outperformed the NB (AUROC = 0.872) and RBF Network (AUROC = 0.854) models. These results indicate that the RBF Classifier model is a promising method for the spatial prediction of landslides worldwide.
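As an illustration of the RBF Network idea (Gaussian hidden units with linear output weights), the following minimal sketch fits the output weights by least squares on toy two-dimensional data. The prototype selection, gamma value and data are assumptions, and this is not the RBF Classifier or RBF Network configuration used in that study.

```python
import numpy as np

def rbf_design(X, centers, gamma=1.0):
    """Gaussian radial-basis activations of samples X with respect to the centers."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

# Toy two-class data; in the study the inputs would be the 14 conditioning factors
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(1, 0.3, (30, 2))])
y = np.r_[np.zeros(30), np.ones(30)]

centers = X[rng.choice(len(X), 10, replace=False)]        # simple random prototypes
Phi = np.c_[rbf_design(X, centers, gamma=2.0), np.ones(len(X))]
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)               # least-squares output weights

scores = Phi @ w                                           # continuous RBF-network outputs
print(f"training accuracy = {np.mean((scores >= 0.5) == y):.2f}")
```

Thresholding the continuous outputs yields class labels, while the raw scores can be ranked to trace the ROC curve used in the AUROC comparison above.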
Article
The Adaptive Neuro-Fuzzy Inference System (ANFIS) is a robust method for solving non-linear classification in a human-readable, interpretable manner. This paper verified a hybrid model, named WANFIS, in which the Whale Optimization Algorithm (WOA) was used for feature selection and for tuning the parameters of the ANFIS for land-cover classification. Hanoi, the capital of Vietnam, was selected as a case study because of its complex surface morphology. The model was trained and validated with different data sets, which were subsets of the segmented objects derived from SPOT 7 satellite data (1.5 m panchromatic and 6 m multispectral bands). The image segmentation was carried out using PCI Geomatics software (evaluation version), and the output objects with associated spectral, shape, and metric information were selected as input data to train and validate the proposed model. For accuracy assessment, the performance of the model was compared to several benchmark classifiers using standard statistical indicators such as the Receiver Operating Characteristic (ROC), Area under the ROC curve, Root Mean Square Error, Mean Absolute Error, kappa index, and Overall Accuracy. The results showed that WANFIS outperformed the others in almost all training data sets for both operations. It could be concluded that examining the classification model with different training data sizes is important, and that the proper determination of predictor variables and training sizes would improve the quality of classification of remotely sensed data.
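The wrapper-style feature-selection fitness that a WOA-type search would optimize can be sketched as below. Here a k-nearest-neighbours classifier stands in for the ANFIS, a plain random search stands in for the WOA position updates, and the synthetic data and penalty weight are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Stand-in dataset for the segmented-object features (spectral, shape, metric)
X, y = make_classification(n_samples=300, n_features=12, n_informative=5, random_state=0)

def fitness(mask, alpha=0.02):
    """Wrapper fitness: cross-validated accuracy of the selected feature subset,
    lightly penalised by the fraction of features kept."""
    if mask.sum() == 0:
        return 0.0
    acc = cross_val_score(KNeighborsClassifier(), X[:, mask.astype(bool)], y, cv=5).mean()
    return acc - alpha * mask.mean()

# Random search stands in here for the WOA position updates of the actual study
rng = np.random.default_rng(0)
best_mask, best_fit = None, -np.inf
for _ in range(50):
    mask = (rng.random(X.shape[1]) > 0.5).astype(int)
    f = fitness(mask)
    if f > best_fit:
        best_mask, best_fit = mask, f
print("selected features:", np.flatnonzero(best_mask), f"fitness = {best_fit:.3f}")
```

A binary mask per candidate solution is the usual encoding for metaheuristic feature selection: the optimizer explores subsets, while the wrapped classifier's cross-validated score tells it which subsets are worth keeping.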