Landslide susceptibility map in the FR model: (a) Enlarged area of the valley; (b) Enlarged area along the river.

Source publication
Article
To compare the random forest (RF) model and the frequency ratio (FR) model for landslide susceptibility mapping (LSM), this research selected Yunyang County as the study area for its frequent natural disasters, especially landslides. A landslide inventory was built from historical records, satellite images, and extensive field surveys. Subsequently,...

Similar publications

Article
Discharge forecasting is a key component for early warning systems and extremely useful for decision makers. Forecasting models require accurate rainfall estimations of high spatial resolution and other geomorphological characteristics of the catchment, which are rarely available in remote mountain regions such as the Andean highlands. While radar...

Citations

... In comparative studies of various machine learning methods, RF has found widespread adoption in the field of geological hazard susceptibility assessment. This is due to its strengths in handling high-dimensional data, evaluating variable importance, and its accuracy and stability in model construction (Trigila et al. 2015; Hong et al. 2017; Wang et al. 2020). ...
... The number of factors influencing landslide susceptibility is extensive, and the primary factors vary by region (Carrara et al. 2008; Xie et al. 2017; Wang et al. 2020; Berhane et al. 2021; Sun et al. 2021). Although this study has considered 23 landslide factors, whether these factors are sufficient to meet the needs for landslide susceptibility prediction in all regions remains a question worth considering. ...
Article
This study proposed an interpretable model that combines Random Forest (RF), Optuna hyperparameter optimization, and SHapley Additive exPlanations (SHAP) to achieve optimal landslide susceptibility evaluation and provide explanations in the northwest region of Yunnan Province in China. First, an inventory of 4447 landslides and 23 related factors was considered for the landslide susceptibility assessment. Subsequently, a hyperparameter-optimized RF model was developed using the Optuna framework and the training dataset to generate landslide susceptibility maps. The performance of the models was evaluated using accuracy (ACC), precision (PPV), recall (TPR), F1-score (F1), and the Area Under the Curve (AUC) based on the Receiver Operating Characteristic. Furthermore, the interpretability of the model was enhanced through the implementation of SHAP. The proposed model demonstrated outstanding performance on the test set, achieving an ACC of 0.7792, PPV of 0.7448, TPR of 0.8769, F1 of 0.8055, and an AUC of 0.8387. The interpretability analysis revealed that elevation, population density, distance from roads, and normalized difference vegetation index were the primary factors influencing landslide occurrences in the study area. This study provides a comprehensive framework for evaluating landslide susceptibility in specific regions and offers invaluable insights for the prevention and management of landslide disasters.
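The scores quoted in this abstract can be cross-checked against one another, because F1 is the harmonic mean of precision (PPV) and recall (TPR). The following is a minimal sketch of how such metrics are usually derived from confusion-matrix counts; the function names are hypothetical and are not taken from the paper, and the abstract does not report the underlying counts.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute ACC, PPV, TPR and F1 from confusion-matrix counts.

    Assumes tp + fp > 0 and tp + fn > 0 (no division-by-zero handling).
    """
    acc = (tp + tn) / (tp + fp + tn + fn)  # share of correct predictions
    ppv = tp / (tp + fp)                   # precision
    tpr = tp / (tp + fn)                   # recall
    f1 = 2 * ppv * tpr / (ppv + tpr)       # harmonic mean of PPV and TPR
    return {"ACC": acc, "PPV": ppv, "TPR": tpr, "F1": f1}


def f1_from_precision_recall(ppv: float, tpr: float) -> float:
    """F1 computed directly from precision and recall."""
    return 2 * ppv * tpr / (ppv + tpr)


# Consistency check on the reported PPV (0.7448) and TPR (0.8769):
# the result agrees with the reported F1 of 0.8055 after rounding.
reported_f1 = f1_from_precision_recall(0.7448, 0.8769)
print(round(reported_f1, 4))
```

ACC cannot be checked the same way, since it depends on the class balance of the test set, which the abstract does not state.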
... where FR is the frequency ratio, defined from four pixel counts: the number of landslide pixels within the class of a parameter, the total number of landslide pixels of that parameter, the number of pixels within the class, and the total number of pixels of the parameter [24], [27], [30], [31]. After the frequency ratio is calculated, it is normalized into the range 0-1 using min-max normalization [27], [30]. The normalized value is the value of the pixel to be computed. ...
... The frequency ratio (FR) determines the correlation between landslide points and the factors that cause landslides. If a class has an FR value > 1, then the class has a high correlation with landslide events, whereas if a class has an FR value < 1, then the class has a low correlation with landslide events [30]. The frequency ratio values for each class of the thirteen factors that cause landslides are shown in Table 2. In the aspect factor, all classes have FR > 1, indicating a high probability of landslides occurring in these classes based on the aspect parameter. ...
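The frequency ratio described in these excerpts is commonly computed as the share of landslide pixels falling in a class divided by the share of all pixels falling in that class, followed by min-max normalization. The sketch below illustrates that reading; the function names, the slope classes, and the pixel counts are all made up for illustration and do not come from the cited study.

```python
def frequency_ratios(landslide_pixels: dict, class_pixels: dict) -> dict:
    """FR per class = (landslide share of the class) / (area share of the class).

    Assumes every class has at least one pixel and at least one
    landslide pixel exists overall.
    """
    total_landslide = sum(landslide_pixels.values())
    total_pixels = sum(class_pixels.values())
    return {
        cls: (landslide_pixels.get(cls, 0) / total_landslide)
        / (class_pixels[cls] / total_pixels)
        for cls in class_pixels
    }


def min_max_normalize(values: dict) -> dict:
    """Rescale values linearly into the range 0-1."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:  # all classes identical: no spread to rescale
        return {k: 0.0 for k in values}
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}


# Hypothetical slope classes with made-up pixel counts.
landslides = {"0-15": 10, "15-30": 60, ">30": 30}
pixels = {"0-15": 500, "15-30": 300, ">30": 200}

fr = frequency_ratios(landslides, pixels)  # roughly 0.2, 2.0 and 1.5
normalized = min_max_normalize(fr)         # roughly 0.0, 1.0 and 0.72
high_correlation = [cls for cls, v in fr.items() if v > 1]
print(fr, normalized, high_correlation)
```

In this toy example the two steeper classes have FR > 1 and would be read as strongly associated with landslides, while the flattest class has FR < 1.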
Article
Java Island holds the highest record of landslide events in Indonesia. In 2021, the Bogor area, consisting of the city and regency of Bogor, recorded the highest number of landslides. These events further impact the fatalities, damage, and loss to society. Landslide mitigation should be considered to reduce the risk caused by landslide hazards. In this regard, a landslide susceptibility analysis is one of the fundamental steps in mitigation measures that can support policymakers in response to landslide disaster risk reduction. The location of landslide possibilities can be identified by mapping landslide susceptibility. Therefore, this study aims to produce a landslide susceptibility map (LSM) using a statistical frequency ratio method and logistic regression. The number of landslide inventories used in the model is about 822 events. To apply the model, the present study evaluates 13 influencing factors consisting of elevation, slope angle, slope aspect, slope curvature, Topographic Wetness Index (TWI), distance to river, lithological, distance to fault, soil type, annual rainfall, Normalized Difference Vegetation Index (NDVI), land use land cover (LULC), and road distance. The model performance is further evaluated using Area Under the ROC Curve (AUC). The frequency ratio (FR) and logistic regression (LR) models produce satisfactory results and have high predictions of future landslide occurrences with a score of 0.8317 and 0.8817, respectively.
... But FR is deemed a conventional method for mapping flood susceptibility due to its simple formula and straightforward process, albeit time-consuming and laborious (Sahana et al. 2020). Due to the rapid progress in machine learning (ML) and computation power, the FR model has become less ideal over time (Wang et al. 2020b). To date, an increasing number of researchers use more sophisticated empirical ML models to assess natural hazard risks. ...
... In a study by Wang et al. (2020b), the performances of the FR and RF models for landslide susceptibility mapping were compared in Yunyang County, China, and it was found that RF performed better than FR. Elmahdy et al. (2020) used the RF and FR models to map land subsidence and sinkhole susceptibility in the Al Ain area, UAE. RF was found to be highly accurate and required less time to generate results. ...
... This finding is consistent with previous studies that assessed different natural hazards susceptibilities. For instance, many researchers observed that the RF model outperformed the statistical models in landslide susceptibility assessment (Akinci and Zeybek 2021;Wang et al. 2020b), land subsidence and sinkholes susceptibility mapping (Elmahdy et al. 2020), seismic vulnerability assessment (Han et al. 2020), wildfire susceptibility mapping (Oliveira et al. 2012), groundwater potentiality mapping (Thanh et al. 2022) and debris flow susceptibility mapping (Liang et al. 2020). ...
Article
Machine learning (ML) models, particularly decision tree (DT)-based algorithms, are being increasingly utilized for flood susceptibility mapping. To evaluate the advantages of DT-based ML models over traditional statistical models in flood susceptibility assessment, a comparative study is needed to systematically compare the performance of DT-based ML models with that of traditional statistical models. New Orleans, which has a long history of flooding and is highly susceptible to flooding, is selected as the test bed. The primary purpose of this study is to compare the performance of multiple DT-based ML models, namely DT, Adaptive Boosting (AdaBoost), Gradient Boosting (GdBoost), Extreme Gradient Boosting (XGBoost) and Random Forest (RF), with a traditional statistical model known as the Frequency Ratio (FR) model in New Orleans. This study also aims to identify the main drivers contributing to flooding in New Orleans using the best performing model. Based on the most recent Hurricane Ida-induced flood inventory map and nine crucial flood conditioning factors, the models’ accuracies are tested and compared using multiple evaluation metrics. The findings of this study indicate that all DT-based ML models perform better than FR. The RF model emerges as the best model (AUC = 0.85) among all DT-based ML models on every evaluation metric. This study then adopts the RF model to produce a flood susceptibility map (FSM) of New Orleans and compares it with the prediction of the FR model. The RF model also demonstrates that low elevation and higher precipitation are the main factors responsible for flooding in New Orleans. Therefore, this comparative approach offers a significant understanding of the advantages of advanced ML models over traditional statistical models in local flood susceptibility assessment.
... During the last decade, various techniques and approaches have been developed to study landslide susceptibility, including qualitative [6], statistical [7][8][9][10], numerical [1,11], and through the use of machine learning [12][13][14][15][16]. Machine learning is now becoming more widely used in landslide prevention, as it can provide optimal, accurate, efficient, and effective results with proper conditioning. ...
... Other research has discussed a hybrid intelligent strategy for landslide-susceptibility mapping in the Bijar area of Kurdistan Province (Iran) based on a naive Bayes trees (NBT) and random subspace (RS) ensemble [20]. Generally, the success of a method depends on the geographical nature of each area that is studied [15]. Therefore, developing and comparing existing methods in determining landslide susceptibility in a research area are not impossible. ...
Article
Landslides have produced several recurrent dangers, including losses of life and property, losses of agricultural land, erosion, population relocation, and others. Landslide mitigation is critical since population and economic expansion are rapidly followed by significant infrastructure development, increasing the risk of catastrophes. At an early stage in landslide-disaster mitigation, landslide-risk mapping must give critical information to help policies limit the potential for landslide damage. This study uses and compares the frequency ratio (FR) and random forest (RF) techniques to investigate the distribution of landslide vulnerability in the Sumedang area. This study has identified 12 criteria for developing a landslide-susceptibility model in the research region based on the features of past disasters in the research area. The FR and RF models achieved AUC values of 88% and 81%, respectively. Based on the McNemar test, the FR and RF models showed the same performance in determining landslide-vulnerability levels in Sumedang. They performed well in assessing landslides in the research region; therefore, they may be used as references in landslide prevention and in future regional development plans by the stakeholders.
... Further, DT classifiers take randomly selected predictor variables, instead of all of them, to grow trees. Relative to a single DT, the collection of such DT classifiers aids in effectively maintaining data diversity and provides stability in the model's learning process (Wang et al., 2020). Eventually, each DT casts a unit vote, and the final output is obtained by combining these votes. ...
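The two ideas in this excerpt, a random subset of predictors per tree and one vote per tree, can be sketched in a few lines. This is only an illustration of the voting scheme, not a full random forest: tree training is omitted, the factor names are examples, and the helper names are hypothetical.

```python
import random
from collections import Counter


def random_feature_subset(features: list, k: int, rng: random.Random) -> list:
    """Pick k predictor variables at random, without replacement."""
    return rng.sample(features, k)


def majority_vote(votes: list):
    """Combine one vote per tree into the final class label.

    Ties are resolved in favour of the label that was seen first.
    """
    return Counter(votes).most_common(1)[0][0]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
factors = ["elevation", "slope", "aspect", "curvature", "NDVI", "rainfall"]

# Each hypothetical tree would be grown on its own subset of factors.
subsets = [random_feature_subset(factors, 3, rng) for _ in range(5)]

# Stand-in votes from five trees (1 = landslide, 0 = non-landslide).
tree_votes = [1, 0, 1, 1, 0]
print(subsets[0], majority_vote(tree_votes))
```

For a susceptibility map, implementations often report the fraction of trees voting for the landslide class rather than only the winning label, which gives a continuous score per pixel.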
Article
Landslides are significant and recurring hazards in the Himalayan region, necessitating landslide susceptibility zonation (LSZ) to identify landslide-prone areas. This study applied three bi-variate models, namely certainty factor (CF), evidential belief function (EBF), and weight of evidence (WofE), together with two soft-computing models, namely artificial neural network (ANN) and random forest (RF), to predict landslides in parts of the Kalimpong region. Ensembles combining soft-computing and bi-variate models were examined and compared with traditional bi-variate models for LSZ prediction accuracy. To improve ANN and RF model performance, three non-landslide scenarios were also assessed. Nine model architectures (CF, EBF, WofE, ANN-CF, ANN-EBF, ANN-WofE, RF-CF, RF-EBF, RF-WofE) were designed to derive landslide susceptibility index (LSI) values. These LSI values were classified into five susceptibility zones using class boundaries derived from the success rate curve method. Model prediction performance was evaluated using the area under the curve (AUC) of the receiver operating characteristic curve, standard error (SE), 95% confidence interval (CI), and chi-square (χ2)-based measures. The results indicate that ensemble models achieved better prediction accuracy (average AUC of 0.834) than bi-variate models (average AUC of 0.815), with the RF-CF (AUC = 0.843, SE = 0.0092, 95% CI = 0.825 to 0.862, and χ2 = 1478.61) and ANN-CF (AUC = 0.842, SE = 0.0093, 95% CI = 0.824 to 0.860, and χ2 = 1435.89) models outperforming all others. Additionally, the assessment found that ~34% of the area is highly susceptible to landslides. It is envisaged that the present attempt will be helpful for better land use planning in the investigated area.
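The abstract does not say how the 95% confidence intervals were obtained. One common choice, assumed here purely for illustration, is the normal approximation AUC ± 1.96 × SE. Applied to the reported AUC and SE values, it reproduces the quoted interval bounds to within about 0.001, which is compatible with rounding of the published figures. The function name is hypothetical.

```python
def normal_approx_ci(auc: float, se: float, z: float = 1.96) -> tuple:
    """Two-sided confidence interval AUC +/- z * SE.

    The normal approximation is an assumption of this sketch,
    not a method stated in the abstract.
    """
    half_width = z * se
    return auc - half_width, auc + half_width


# Reported values: RF-CF (AUC 0.843, SE 0.0092), ANN-CF (AUC 0.842, SE 0.0093).
rf_cf = normal_approx_ci(0.843, 0.0092)
ann_cf = normal_approx_ci(0.842, 0.0093)
print([round(x, 3) for x in rf_cf], [round(x, 3) for x in ann_cf])
```

The two intervals overlap almost entirely, so on this evidence alone the difference between RF-CF and ANN-CF should not be read as significant.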
... Curvature can describe the variations, and reflect the complexity of the ground. Curvature affects the local water velocity, soil erosion, and deposition (Ohlmacher, 2007;Wang et al., 2020). Fig. 6b depicts the impact of curvature on landslide occurrence, exhibiting a rising and then falling trend. ...
Article
Boosting algorithms have been widely utilized in the development of landslide susceptibility mapping (LSM) studies. However, these algorithms possess distinct computational strategies and hyperparameters, making it challenging to propose an ideal LSM model. To investigate the impact of different boosting algorithms and hyperparameter optimization algorithms on LSM, this study constructed a geospatial database comprising 12 conditioning factors, such as elevation, stratum, and annual average rainfall. The XGBoost (XGB), LightGBM (LGBM), and CatBoost (CB) algorithms were employed to construct the LSM model. Furthermore, the Bayesian optimization (BO), particle swarm optimization (PSO), and Hyperband optimization (HO) algorithms were applied to optimizing the LSM model. The boosting algorithms exhibited varying performances, with CB demonstrating the highest precision, followed by LGBM, and XGB showing poorer precision. Additionally, the hyperparameter optimization algorithms displayed different performances, with HO outperforming PSO and BO showing poorer performance. The HO-CB model achieved the highest precision, boasting an accuracy of 0.764, an F1-score of 0.777, an area under the curve (AUC) value of 0.837 for the training set, and an AUC value of 0.863 for the test set. The model was interpreted using SHapley Additive exPlanations (SHAP), revealing that slope, curvature, topographic wetness index (TWI), degree of relief, and elevation significantly influenced landslides in the study area. This study offers a scientific reference for LSM and disaster prevention research. This study examines the utilization of various boosting algorithms and hyperparameter optimization algorithms in Wanzhou District. It proposes the HO-CB-SHAP framework as an effective approach to accurately forecast landslide disasters and interpret LSM models. 
However, limitations exist concerning the generalizability of the model and the data processing, which require further exploration in subsequent studies.
... Normalized Difference Vegetation Index (NDVI) refers to the active vegetation biomass or forest cover (Fig. 7a). Landslides usually occur on bare land and grasslands (Wang et al., 2020). Road construction is one of the factors controlling slope stability, with the hypothesis that landslides occur more frequently along the road. ...
Article
Landslides occur when masses of rock, debris or soil move due to various factors and processes that cause land movement. The Taba Penanjung-Kepahiang route is one of the areas in Bengkulu Province that is highly prone to landslides. This causeway is the only fastest land route connecting the Bengkulu-Kepahiang area. In recent years, the road has often been cut off by landslides and fallen trees, which have obstructed road access and claimed lives. This study uses a Machine Learning (ML) and GIS approach with Variable Frequency Ratio using 16 independent factors obtained from the spatial database and DEM, which correlate with landslide events. This research aims to gain an in-depth understanding of the factors that cause landslides. In addition, the research focuses on the development of a Disaster Mitigation Model to design and implement effective strategies to reduce the risk and impact of landslide disasters through in-depth analysis. The dependent factor is the location of the landslide from the historical landslide area for the last five years, with a distribution of 70/30%. Furthermore, frequency ratio is used to analyze the correlation between conditioning factors and historical landslides. Then, the independent and dependent factors were normalized to create a landslide susceptibility map. Frequency Ratio (FR) indicates the likelihood of an event occurring, with drainage density (FR = 0.69), shear wave velocity (Vs30) (FR = 0.66), slope (FR = 0.60), and rainfall (FR = 0.55). The output of the processed data is in the table below.
... The knowledge-based models follow qualitative approach to define the rank among causal factors using Analytical Hierarchy Process (AHP) for the weight formulation, based on experts' knowledge and experiences (Panchal and Shrivastava, 2022). Therefore, it is difficult to evaluate the outcome objectively (Castellanos Abella and Van Westen, 2008;Wang et al, 2020). On the contrary, statistical models as quantitative methods have been applied successfully for LSM assessment during the last two decades (Merghadi et al, 2020). ...
Article
Landslides are among the most frequent natural hazards and can cause serious casualties. One of the most landslide-prone regions in Indonesia is the Bogor area (the Regency and City of Bogor), which records the highest number of landslide events in the Province of West Java, Indonesia. An assessment of landslide susceptibility is one of the mitigation measures that can spatially model the zone of landslide hazard. Recently, Landslide Susceptibility Mapping (LSM) models have been developed using Machine Learning (ML) algorithms. However, there is still no agreement on which ML technique is the most appropriate for LSM. Accordingly, this paper aims to explore and compare 7 ML algorithms for generating the most promising LSM. The LSM uses the 13 available landslide causal factors and a dataset consisting of 822 authorized landslide records and 822 prepared non-landslide points. The resulting LSMs are classified into 5 susceptibility levels and evaluated through the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve and statistical indices (sensitivity, specificity, precision, F1-score, and accuracy). The resulting LSMs show that: (1) the very-high (VH) class has the largest area percentage in all LSM models; (2) in general, the 7 ML algorithms perform excellently in model classification, achieving >90% AUC, except for the Decision Tree (DT) (87.84%); and (3) the overall accuracy (ACC) indicates that Random Forest (RF) outperforms the other ML algorithms in model prediction. With this promising result, ML-based LSM models can be promoted as one of the mitigation measures for landslide disaster management.
... It also helps in assessing the relative importance of each class with respect to landslides. It is a widely used method for the evaluation and prediction of landslides [57]. The following steps were used for calculating frequency ratio (FR): ...
Article
Landslides are a significant natural disaster causing damage to many mountainous regions worldwide including the Indian Himalayan region. In the East Sikkim district of the Eastern Himalayas, the most used bivariate frequency ratio (FR) model was utilized with high-resolution satellite imagery to understand the susceptibility of the region to landslides. Conditioning factors such as slope aspect, slope angle, slope curvature, drainage density, land use and land cover (LULC), normalized difference vegetation index (NDVI), lithology, and geomorphology were considered in the analysis. LULC is the most crucial factor contributing to landslide susceptibility with a normalized FR value of 14.1. Slope and geomorphology followed closely with values of 12.5 and 11.8 respectively. In contrast, the least important factors were slope aspect and lithology with values of 8.7 and 9.3 respectively. These results can be used to prioritize landslide conditioning factors (LCF) and generate a final landslide susceptibility map (LSM). By adding the values of all LCFs, a landslide susceptibility index was obtained, and the LSM was zoned into high, medium, and low susceptibility classes covering 23.4%, 44.4%, and 32.2% of the study area respectively. The validity of the method used was confirmed using a receiver operating characteristic curve which yielded an accuracy of 78%. The findings highlight the importance of LULC, slope, and geomorphology as critical factors in landslide susceptibility in the East Sikkim district of the Eastern Himalayas.
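The abstract describes obtaining a landslide susceptibility index by adding the values of all conditioning factors and then zoning the map into three classes. A minimal sketch of that additive step follows; the per-factor values and the two zone thresholds are invented for illustration, since the abstract does not report how the class boundaries were chosen.

```python
def susceptibility_index(factor_values: list) -> float:
    """Landslide susceptibility index of one pixel: sum over all factors."""
    return sum(factor_values)


def susceptibility_zone(lsi: float, low_max: float, medium_max: float) -> str:
    """Assign a zone from two thresholds (hypothetical boundaries)."""
    if lsi <= low_max:
        return "low"
    if lsi <= medium_max:
        return "medium"
    return "high"


# Three hypothetical pixels, each with values for four factors.
pixel_factor_values = [
    [0.4, 0.6, 0.5, 0.3],
    [1.2, 0.9, 1.1, 1.0],
    [1.8, 2.1, 1.6, 1.9],
]

indices = [susceptibility_index(v) for v in pixel_factor_values]
zones = [susceptibility_zone(i, low_max=3.0, medium_max=6.0) for i in indices]
print(zones)
```

In practice the thresholds are usually derived from the distribution of index values, for example with natural breaks or quantiles, rather than fixed by hand as done here.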
... The performance comparison of various algorithms has been extensively investigated. For instance, Wang et al. (2020) compared the RF model with the frequency ratio (FR) model for LSM and determined that the RF model was more suitable for assessing karst landslide susceptibility in Yunyang County due to its higher reliability and stability. ...
Article
The aim of the present study was to assess differences in the conditioning factors and the performance of landslide susceptibility mapping (LSM), employing the SHapley Additive exPlanations (SHAP) model to gain profound insights into the intrinsic decision‐making mechanism of LSM in diverse landforms. Two typical karst erosion landforms were selected as the research areas. Based on 15 conditioning factors, LSMs for the two areas were developed using the Bayesian optimization random forest (RF) and eXtreme Gradient Boosting (XGBoost). The SHAP model was used to explore the landslide formation mechanisms from both global and local perspectives. The results show that the area under the curve (AUC) values of the XGBoost models were 0.791 and 0.761, and the AUC values of the RF models were 0.844 and 0.817, in the two different landform areas, respectively. The RF model's accuracy was higher than that of the XGBoost model in both regions. In the low‐elevation hills area, the primary three conditioning factors were identified as slope, topographic relief and distance from the river. Conversely, in the microrelief and mesorelief low mountain area, the predominant conditioning factors were elevation, distance from the river and distance from the road. Both karst landform areas exhibited a high sensitivity to the distance from the river, indicating its significant interaction with other factors contributing to landslide occurrences. Notably, the RF model demonstrated superior performance compared to the XGBoost model, rendering it a more suitable choice for conducting landslide susceptibility mapping research in karst erosion landform areas. In the present study, a comprehensive explanatory framework based on the RF‐SHAP model was proposed, which enables both global and local interpretation of landslides in various karst landscapes. 
Such an approach explores the intrinsic decision‐making mechanism of the model, enhancing the transparency and realism of landslide susceptibility prediction results.