The schematic diagram of Random Forest structure.

Source publication
Article
Full-text available
Dam behavior prediction is a classic problem in dam structural monitoring. To obtain accurate results, researchers have established various models. However, earlier models rarely addressed the nonlinear characteristics of dam displacement data or the abnormal values in monitoring data. It means that abnormal values will c...

Citations

... We had to adapt the concept of dynamics introduced by Bernard et al. (2012) to time evolution by adopting an approach derived from the available best practices (Brownlee, 2017). We used a sliding windows approach which is very similar to the one applied by Su et al. (2021). A graphical representation of the proposed model is presented in Fig. 1. ...
Article
Full-text available
In recent years, sustainable finance has undergone a very consistent expansion boosted by regulatory and cultural changes due to the greater awareness of the positive role of sustainable finance in mitigating the negative externalities of economic activities. In this context, the incorporation of Environmental, Social, and Corporate Governance (ESG) factors into companies’ business models represents an incentive to invest in innovation for growth, attract and cultivate young talents, and seek new strategies to reduce business risks. Hence, the adoption of good ESG practices can be the source of long-term competitive advantages. In the scientific literature, numerous studies try to assess whether the risk-return trade-off of sustainable financial products improves with increasing ESG ratings, but the absence of convergence in the empirical results leaves some open issues that need further investigation. The main goal of this work is to contribute to the debate on the assumption of an existing relationship between ESG compliance and the financial performance of companies. Accepting that ESG ratings are a reliable indicator of ESG practices, we aim to verify if ESG ratings do have informative content useful to predict a company’s future financial performance in terms of: (i) Return on Equity, (ii) Return on Investments, and (iii) systematic risk. We test our hypothesis by using a machine-learning approach. By analyzing the set of EuroStoXX600 Index companies for the period 2016–2021 via Random Forest and Sliding Windows Random Forest Models, we verify whether the integration of ESG criteria into corporate businesses helps to explain the financial performance of listed companies over time and, thus, to identify a strategic tool for all investors.
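The sliding-window idea mentioned above can be sketched as an index generator. This is only one common reading of a sliding-window scheme, not the procedure from the cited papers: the function name `sliding_windows` and the parameters `window`, `horizon`, and `step` are assumptions made for illustration. Each split trains on a fixed-length block of consecutive observations and tests on the observations that immediately follow it.

```python
def sliding_windows(n_samples, window, horizon=1, step=1):
    """Return (train_indices, test_indices) pairs for a fixed-length window.

    Hypothetical helper: the window length, forecast horizon, and step size
    are illustrative parameters, not values taken from the source.
    """
    if window <= 0 or horizon <= 0 or step <= 0:
        raise ValueError("window, horizon and step must be positive")
    splits = []
    start = 0
    # Slide forward while a full training window and test block still fit.
    while start + window + horizon <= n_samples:
        train_idx = list(range(start, start + window))
        test_idx = list(range(start + window, start + window + horizon))
        splits.append((train_idx, test_idx))
        start += step
    return splits


# Six time-ordered observations, three-observation training window.
splits = sliding_windows(n_samples=6, window=3)
for train_idx, test_idx in splits:
    print(train_idx, "->", test_idx)
```

A model such as a Random Forest would then be refitted on each training block, so that older observations drop out as the window moves forward.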
... Ensemble methods use more than one classifier to improve how well malware can be found [43]. Researchers have suggested using ensemble methods to find malware in IoT devices because these methods are more reliable and accurate than individual classifiers. ...
... ANNs and the Genetic Algorithm [38]: 99.47, 0.53; Hybrid SVM model [39]: 99.11, 1.2; Decision Tree based IDS [40]: 99.8, 1.2; Ensemble the Decision Tree model [41]: 99.2, 0.8; Random Forest [42]: 99.5, 0.5; Improved Random Forest model [43]: 99.6, 0.4 ...
Preprint
Full-text available
The Internet of Things (IoT) has experienced significant growth in recent years, with IoT connections surpassing those of traditional connected devices in the last few years. However, with this growth comes an increased risk of IoT security breaches as cybercriminals take advantage of lax security measures at the endpoint level. One of the main challenges in IoT security is the lack of proper protocols, policies, and procedures to protect these devices from malware and malicious software. This review article focuses on the recent advancements in machine learning-based malware detection techniques for IoT devices. We discuss the challenges of detecting malware in IoT devices, including the limited resources and processing power, as well as the diversity of device types and operating systems. We also review recent machine learning-based malware detection techniques for IoT devices, including deep learning, ensemble learning, and transfer learning, and evaluate their efficiency. This review aims to provide a comprehensive understanding of the current state-of-the-art machine learning-based malware detection techniques for IoT devices, highlighting the potential and limitations of these techniques and the role of analytics in future research directions.
... Bagging significantly reduces the variance and prediction bias of the output target for algorithms that have high variance. Moreover, RFs are ideal for nonlinear data [96] and they often perform outstandingly in prediction tasks. However, one drawback of RFs is that the ease of comprehension associated with individual DTs is lost when numerous DTs are aggregated in the RFs [97]. ...
Article
Full-text available
An efficient approach for improving the predictive understanding of dynamic mechanical system variability is developed in this work. The approach requires low model assessment time through the fitting of surrogate models. ML-based surrogate algorithms for finite element analysis (FEA) are developed in this study to accelerate FEA and prevent rerunning complex simulations. The research begins with an overview of the recent novelties in ML algorithms applied to finite element (FE) and other physics-based computational schemes. To predict the time-varying response variables, that is, the displacement of a two-dimensional truss structure, a surrogate FE technique based on ML algorithms is developed. In this work, several ML regression algorithms, including decision trees (DTs) and deep neural networks, are developed, and their efficacies are compared. In this study, the ML-based surrogate FE models are able to effectively predict the response of the truss structure in two dimensions over the entire structure. Extreme gradient-boosting DTs provide more precise outcomes and outperform other ML algorithms.
... In particular, trees that are grown very deep tend to learn highly irregular patterns: they overfit their training sets, i.e. they have low bias but very high variance. Random forests are a way of averaging multiple deep decision trees, trained on different parts of the same training set, with the goal of reducing the variance 6. This comes at the cost of a small increase in bias and some loss of interpretability, but it generally greatly boosts the performance of the final model. ...
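The variance argument in this excerpt can be illustrated with a small simulation. This is a hedged sketch, not a Random Forest: a 1-nearest-neighbour predictor stands in for a fully grown tree (both essentially memorize the training targets), only bootstrap averaging is shown, and the per-split feature subsampling used by Random Forests is left out. The signal, noise level, and all function names are assumptions chosen for illustration.

```python
import random
import statistics


def true_signal(x):
    # Assumed ground-truth relationship, used only to generate toy data.
    return 2.0 * x


def make_training_set(rng, n=30, noise=1.0):
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return [(x, true_signal(x) + rng.gauss(0.0, noise)) for x in xs]


def predict_1nn(train, x0):
    # Low-bias, high-variance learner: return the target of the closest point.
    return min(train, key=lambda p: abs(p[0] - x0))[1]


def predict_bagged(train, x0, rng, n_models=50):
    # Fit the same learner on bootstrap resamples and average the predictions.
    preds = []
    for _ in range(n_models):
        boot = [rng.choice(train) for _ in train]
        preds.append(predict_1nn(boot, x0))
    return sum(preds) / len(preds)


def compare_variance(n_trials=200, x0=0.5, seed=0):
    # Redraw the training set many times and track how much each
    # predictor's output at x0 fluctuates from one draw to the next.
    rng = random.Random(seed)
    single, bagged = [], []
    for _ in range(n_trials):
        train = make_training_set(rng)
        single.append(predict_1nn(train, x0))
        bagged.append(predict_bagged(train, x0, rng))
    return statistics.pvariance(single), statistics.pvariance(bagged)


single_var, bagged_var = compare_variance()
print("single learner variance:", single_var)
print("bagged ensemble variance:", bagged_var)
```

The bagged prediction is a weighted average over several nearby training targets rather than a copy of one noisy target, which is why its spread across training sets is smaller.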
... Other areas of earth monitoring: An RF algorithm has also been used in the prediction of dam displacement [49]. While in [50], SVR was used in monitoring the urban heat island (UHI) effect which has been widely studied because of its impacts on the environment and human well-being. ...
... In some of the selected studies, the performance of ML models and non-ML models was compared (as seen in Fig. 7). The ML models have been compared with several conventional non-ML models: regression model [14,80,101,159,160], brute force approach [143], traditional statistical approaches [60,94,161-164], classical KF [129], Bayes-optimal rule [118], least square (LS)-based approach [40], Saastamoinen model [110], autoregressive model and a traditional LEO propagation model (EKF-STAN) [146], conventional wind speed retrieval method [43], Maximum-Likelihood Power-Distortion (PD-ML) [165], BERNESE 5.2 [114], CYGNSS [44], Hydrostatic-seasonal-time (HST) model [49], Statistical Theta method [51-53,166], MAPGEO2004 geoid model [73], GNSS-IR soil moisture [58], Autoregressive (AR) and Autoregressive Moving Average (ARMA) [167], ERA-Interim global atmospheric reanalysis (now ERA5 reanalysis) [107], Empirical linear algorithms (LRM and LLM) [59], International Reference Ionosphere (IRI) 2016 model [168], NeQuick and IRI-2001 global TEC model [169-171], EKF-based integration scheme [172], CODE GIMs (Global Ionospheric Maps) [173], autoregressive integrated moving average (ARIMA), and quadratic polynomial (QP) models [174], least square regression algorithms (LSR) and bi-harmonic spline (BHS) [105], linear interpolation method (LIN) and inverse distance weighted interpolation method (IDW) [112], Kalman filter [138,139], polynomial model [93,175], IRI-2001 model [176], conventional systems (RAIM) [126,177], EGNOS [103], and IRI-2012 model [178]. ...
... Furthermore, 2.82% (6 of 213) of studies compared one or more non-ML models with one or more ML models [49,71,101,168,236,237], while 2.82% (6 of 213) of studies implemented a hybrid of ML and non-ML algorithms [82,118,149,238-240]. These hybrid implementation studies claim their performance is better than ML-only and non-ML-only implementations. ...
Article
Full-text available
In terms of the availability and accuracy of positioning, navigation, and timing (PNT), the traditional Global Navigation Satellite System (GNSS) algorithms and models perform well under good signal conditions. In order to improve their robustness and performance in less than optimal signal environments, many researchers have proposed machine learning (ML) based GNSS models (ML models) as early as the 1990s. However, no study has been done in a systematic way to analyze the extent of the research on the utilization of ML models in GNSS and their performance. The aim of this research is to perform a systematic review of the type of ML models utilized in GNSS use cases, their performance with respect to accuracy, their comparison with other models (ML and non-ML), and their GNSS application context. In this study, we perform a systematic review of studies from 2000 to 2021 in the literature that utilizes machine learning techniques in GNSS use cases. We assess the performance of the machine learning techniques in the existing literature on their application to GNSS. Furthermore, the strengths and weaknesses of machine learning techniques are summarized. In this paper, we have identified 213 selected studies and ten categories of machine learning techniques. The results prove the acceptable performance of machine learning techniques in several GNSS use cases. In most cases, the models using the machine learning techniques in these GNSS use cases outperform the traditional GNSS models. ML models are promising in their utilization in GNSS. However, the application of ML models in the industry is still limited. Thus, more effort and incentives are needed to facilitate the utilization of ML models in the PNT context. Therefore, based on the findings of this review, we provide recommendations for researchers and guidelines for practitioners.
... Belmokre et al. [27] applied the RF algorithm to predict dam displacement on the basis of temperature field calculated by the one-dimensional deterministic model, and compared the prediction results with the statistical model and artificial neural network model to verify the prediction performance of the algorithm. Su et al. [28] proposed an improved RF model based on the sliding time window strategy to predict the dam displacement. Gu et al. [29] established an evaluation model of influencing factors of concrete dam performance by using evidence theory and RF algorithm, and verified the mining ability of the model for influencing factors of dam deformation in practical engineering application. ...
... Gu et al. [29] established an evaluation model of influencing factors of concrete dam performance by using evidence theory and RF algorithm, and verified the mining ability of the model for influencing factors of dam deformation in practical engineering application. It can be seen from previous studies that the parameters of RF are the key factor to determine the prediction accuracy of the algorithm, but the common method is to determine the parameters by the trial-and-error (TAE) method in the parameter range set according to current experience [28]. The TAE method is highly subjective, and it is difficult to achieve satisfactory prediction accuracy in the actual application process. ...
Article
Full-text available
Deformation prediction is an important part of concrete dam safety monitoring. In recent years, the random forest (RF) algorithm has attracted increasing attention in the field of dam safety monitoring because of its speed and strong generalization ability. However, the performance of RF is easily affected by many factors, such as drift in the measured displacement values and inappropriate settings of the RF parameters. To solve these problems, the indicator variable model (IVM) is used in this paper to identify and eliminate the drift of measured values, and sand cat swarm optimization (SCSO) is applied to optimize RF for the first time. On this basis, a deformation prediction system for a concrete dam based on the IVM and the RF algorithm optimized by SCSO is proposed. The case study shows that IVM can correct the interference in monitoring data accurately, with a maximum error rate of less than 3%; in terms of RF parameter optimization, the results of the SCSO algorithm are clearly better than those of the TAE method and the PSO algorithm, and the corresponding OOB error is the lowest; in terms of prediction performance, compared with TAE-RF, PSO-RF, LSTM and SVM, SCSO-RF has higher accuracy and stronger stability: its SSE and MSE are reduced by at least 91%, its MAE and RMSE are reduced by at least 71%, and its R² is very close to 1. The results of this study provide a new method for the automatic online evaluation of dam safety performance.
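The evaluation indices named in this abstract (SSE, MSE, MAE, RMSE, and the coefficient of determination) have standard definitions, sketched below in plain Python. The function name `regression_metrics` and the toy numbers are illustrative assumptions and are unrelated to the paper's data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute SSE, MSE, MAE, RMSE and R^2 from paired observations."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    sse = sum(r * r for r in residuals)           # sum of squared errors
    mse = sse / n                                 # mean squared error
    mae = sum(abs(r) for r in residuals) / n      # mean absolute error
    rmse = math.sqrt(mse)                         # root mean squared error
    mean_true = sum(y_true) / n
    sst = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when the observations have no variance.
    r2 = 1.0 - sse / sst if sst > 0 else float("nan")
    return {"SSE": sse, "MSE": mse, "MAE": mae, "RMSE": rmse, "R2": r2}


# Toy example with made-up values.
metrics = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(metrics)
```

Because SSE and MSE square the residuals, a given improvement in prediction shows up as a larger percentage reduction in those two indices than in MAE or RMSE.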
... To overcome the shortcomings of local optimization methods, global optimization methods based on various bionic intelligent algorithms have developed rapidly in recent years, and different models, such as genetic algorithms (GA), ant colony optimization (ACO), honey bee (HB) algorithms, artificial neural networks (ANNs), support vector machines (SVMs), random forest (RF), have been successfully applied to solve various complex problems in real life [31][32][33][34][35][36][37][38]. Compared to traditional local optimization methods, the intelligent global optimization algorithm reduces the dependence on the selection of the initial value, overcomes the disadvantage that optimization results easily fall into local extreme values, has strong robustness to different types of problems, and is easy to implement with parallel computing [39]. ...
Article
Full-text available
The mechanical parameter inversion model is an essential part of ensuring dam health; it provides a parametric basis for assessing the safe operational behavior of dams using numerical simulation techniques. The mapping between the mechanical parameters of a roller compacted concrete (RCC) dam and various environmental quantities is complicated and nonlinear, and conventional statistical models, machine learning methods, and neural networks fail to consider the inputs of fuzzy uncertainty factors. As a result, the accuracy, efficiency, and stability of inversion models are usually affected by their modeling methods. In this paper, a novel hybrid model for mechanical parameter inversion of an RCC dam is proposed, which uses a radial basis function neural network (RBFNN) to establish the nonlinear mapping relationship between the dam mechanical parameters and the environmental quantities, and a modified particle swarm optimization (PSO) algorithm to find the optimal parameters of the model. The modified PSO algorithm adjusts the inertia weight ω dynamically with the number of iterations to improve the randomness and diversity of the particle population, and population crossover and mutation are introduced to improve the global search ability and convergence speed of the algorithm. The proposed hybrid model is verified and comparatively analyzed on four typical mathematical test functions, and the results show that the proposed model exhibits good performance in parameter inversion accuracy, convergence speed, stability and robustness. Finally, the model is applied to the mechanical parameter inversion analysis of an RCC gravity dam in Henan Province in China. The results show that the proposed model is feasible and reasonable for practical engineering applications, and the relative error between the results obtained by inputting the inverted parameters into the numerical model and monitoring data was within 10%.
The methodology derived from this study can provide technical support and a reference for the mechanical parameter inversion analysis of similar dam projects.
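To make the idea of an iteration-dependent inertia weight concrete, the following is a minimal PSO sketch. It is not the modified algorithm from the paper: the linear decrease of the inertia weight from `w_max` to `w_min` is an assumed schedule (the abstract only says that ω changes with the iteration count), the crossover and mutation steps are omitted, and the sphere test function, bounds, and coefficient values are illustrative choices.

```python
import random


def sphere(position):
    # Simple test function with its minimum value 0 at the origin.
    return sum(x * x for x in position)


def pso_minimize(objective, dim=2, n_particles=30, n_iters=200,
                 bounds=(-5.0, 5.0), w_max=0.9, w_min=0.4,
                 c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for it in range(n_iters):
        # Assumed schedule: inertia weight falls linearly from w_max to w_min,
        # favouring exploration early and fine local search late.
        w = w_max - (w_max - w_min) * it / (n_iters - 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


best_position, best_value = pso_minimize(sphere)
print(best_position, best_value)
```

In a parameter inversion setting, the objective would instead measure the mismatch between simulated and monitored responses, with each particle encoding a candidate set of parameters.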
... A series of tree-based methods, such as random forest [36], and XGBoost [37,38], have been used for dam behavior prediction. Compared with deep learning techniques, tree-based techniques have some significant advantages, such as higher interpretability for prediction results and strong processing capability for unbalanced data. ...
Article
Full-text available
Dam deformation is an intuitive and reliable monitoring indicator for dam structural response. With the increase in the service life of the project, the structural response and environmental quantity data collected by the structural health monitoring (SHM) system show a geometric growth trend. The traditional hydraulic-seasonal-time (HST) model shows poor performance in dealing with massive monitoring data due to the multidimensional data collinearity problem and the inaccurate temperature field simulations. To address these problems, this study proposes a data-driven dam deformation monitoring model for dealing with massive monitoring data based on the light gradient boosting tree (LGB) and Bayesian optimization (BO) algorithm. The proposed BO–LGB method can mine the underlying relationship between temperature changes and dam deformation instead of simple harmonic functions. Moreover, LGB is used to simulate the relationship between high-dimensional environmental quantity data and dam displacement changes, and the BO algorithm is used to determine the optimal hyperparameter selection of LGB based on massive monitoring data. A concrete dam in long-term service was used as the case study, and three typical dam displacement monitoring points were used for model training and validation. The experimental results have indicated that the method can properly consider the collinearity in variables, and has a good balance in modeling accuracy and efficiency when dealing with high-dimensional large-scale dam monitoring data. Moreover, the proposed method can explain the contribution difference between different input variables to select the factors with a more significant influence on modeling.
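For context, the harmonic functions mentioned in this abstract refer to the way an HST-type regression usually encodes seasonal temperature effects. The sketch below builds one commonly used set of HST input features; the polynomial degree of the water-level term, the number of harmonics, the time-effect terms, and the name `hst_features` are assumptions, since the abstract does not state which variant the authors compared against.

```python
import math


def hst_features(water_level, day, period=365.0):
    """Build an illustrative HST feature vector for one observation.

    water_level: upstream water level (or depth) at the observation time.
    day: days elapsed since the start of monitoring (must be > 0).
    """
    if day <= 0:
        raise ValueError("day must be positive for the log time term")
    # Hydraulic component: polynomial in the water level (degree 4 assumed).
    hydraulic = [water_level ** k for k in range(1, 5)]
    # Seasonal component: annual and semi-annual harmonics.
    phase = 2.0 * math.pi * day / period
    seasonal = [math.sin(phase), math.cos(phase),
                math.sin(2.0 * phase), math.cos(2.0 * phase)]
    # Time (irreversible) component: linear and logarithmic terms.
    theta = day / 100.0
    time_effect = [theta, math.log(theta)]
    return hydraulic + seasonal + time_effect


features = hst_features(water_level=2.0, day=365.0)
print(features)
```

A linear regression on these columns gives the statistical baseline; a tree-based learner can instead take measured temperatures and other environmental quantities directly as inputs.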
... Based on the evaluation criteria, the accuracy and effectiveness of the proposed model are verified and evaluated. Simulation results show that the proposed model can capture long-term features and provide better predictions based on short-term monitoring data [10]; Wang, L, Mao, Y, Cheng, Y, and Liu, Y propose a single-node evaluation model based on the multiple correlation sequence (SAM) to improve the accuracy of single-node evaluation. At the same time, LREA can evaluate the operational status of dams by considering changes in credibility and multi-node coordination. ...
Article
Full-text available
Because dam safety monitoring data are dynamic, nonlinear, and sensitive to time and space, this work proposes using the firefly algorithm (FA) to search for the delay order and the number of hidden-layer units of a nonlinear autoregressive (NAR) algorithm. Combining the two yields the FA-NAR dam deformation prediction model, which is compared with the traditional BP algorithm on deformation monitoring data from the Xiaolangdi dam. The dam deformation predicted by the dynamic neural network converges better and is more accurate. The work provides a reference for improving dam safety monitoring.
... Dam safety monitoring has been addressed using several machine learning architectures such as support vector machines (SVMs) (Gul et al. 2021), random forests (RFs) (Zaimes et al. 2019; Su et al. 2021), decision trees (Zounemat-Kermani et al. 2017), artificial neural networks (ANNs) (De Granrut et al. 2019; Hadiyan et al. 2020), multilayer perceptrons (MLPs) (Barkhordari and Entezari Zarch 2015; Rehamnia et al. 2021), and so on. One particular architecture is that of the long short-term memory (LSTM), which is an improved recurrent neural network (RNN) model (Hochreiter and Schmidhuber 1997). ...
Article
Deformations in dam structures can have a critical impact on dam safety and life. Accurate methods for dam deformation prediction and safety evaluation are thus highly needed. Dam deformations can be predicted based on many factors. The analysis of these influences on the deformation of the dam reveals a problem that deserves further attention: dam deformation lags behind environmental factors of the water level and temperature as well as the time lag of the temporal dam deformation data. In this paper, a hybrid deep learning model is proposed to enhance the accuracy of dam deformation forecasting based on lag indices of these factors. In particular, dam deformations are predicted using deep networks based on gated recurrent units (GRUs), which can effectively capture the temporal characteristics of dam deformation. In addition, an improved particle swarm optimization (IPSO) algorithm is used for optimizing the GRU hyper-parameters. Furthermore, the complete ensemble empirical mode decomposition with adaptive noise algorithm (CEEMDAN) and the partial autocorrelation function (PACF) are exploited to select the lag factor indices. The accuracy and effectiveness of the proposed CEEMDAN-PACF-IPSO-GRU hybrid model were evaluated and compared with those of other existing models in terms of four different evaluation indices (MAE, MSE, R², and RMSE) and using 9-year historical data for the case of a pulp-masonry arch dam in China. The experimental results show that our model outperforms other models in terms of the deformation prediction accuracy (R² increased by 0.16%-9.74%, while the other indices increased by 14.55% to reach 96.69%), and hence represents a promising framework for general analysis of dam deformations and other types of structural behavior.