Figure - available from: Water Resources Management
Illustration of SVM regression

Source publication
Article
Full-text available
This article addresses the determination of the velocity profile in small streams by employing powerful machine learning algorithms, including artificial neural networks (ANNs), support vector machines (SVMs), and k-nearest neighbor (k-NN) algorithms. Therefore, this study also aims to present a reliable and low-cost method for predicting velocity pro...

Similar publications

Conference Paper
Full-text available
The EEG of epileptic patients often contains sharp waveforms called "spikes", occurring between seizures. Detecting such spikes is crucial for diagnosing epilepsy. In this paper, we develop a convolutional neural network (CNN) for detecting spikes in EEG of epileptic patients in an automated fashion. The CNN has a convolutional architecture with fi...
Article
Full-text available
Evaluation of cognitive workload finds its application in many areas, from educational program assessment through professional driver health examination to monitoring the mental state of people carrying out jobs of high responsibility, such as pilots or airline traffic dispatchers. Estimation of multilevel cognitive workload is a task usually reali...
Preprint
Full-text available
Mammograms are commonly employed in the large scale screening of breast cancer which is primarily characterized by the presence of malignant masses. However, automated image-level detection of malignancy is a challenging task given the small size of the mass regions and difficulty in discriminating between malignant, benign mass and healthy dense f...
Preprint
Full-text available
In this paper, we describe our method for the ISIC 2019 Skin Lesion Classification Challenge. The challenge comes with two tasks. For task 1, skin lesions have to be classified based on dermoscopic images. For task 2, dermoscopic images and additional patient meta data have to be used. A diverse dataset of 25000 images was provided for training, co...
Preprint
Full-text available
Abstract. The number of personal weather stations (PWS) with data available online through the internet is increasing gradually in many parts of the world. The purpose of this study is to investigate the applicability of these data for the spatial interpolation of precipitation for high intensity events of different durations. Due to unknown errors...

Citations

... This can be seen as a quintessential form of knowledge discovery, as no assumptions are required to perform these algorithms on unknown datasets. Furthermore, this is strongly related to machine learning that has been applied successfully in the hydrological context (Genc and Dag, 2016; Patel and Ramachandran, 2015). Consequently, the resulting product has many similarities with a neuro-fuzzy system or Adaptive Neuro-Fuzzy Inference System (ANFIS) that has been applied in works such as Mousavi et al. (2007). ...
Chapter
Floods vary from one region to another, and their severity is determined by a variety of factors, including unpredictable weather patterns and heavy rainfall (Pham Van and Nguyen-Van, 2020; Soulard et al., 2020). Although floods are common in many parts of India during the monsoon season, the Ganga basin is particularly vulnerable (Bhatt et al., 2021; Meena et al., 2021). Many areas in the state of Bihar are flooded when rivers swell in neighboring Nepal (Lal et al., 2020; Soulard et al., 2020; Wagle et al., 2020). This drew the attention of the present research. The Ganga basin spans China, Nepal, India, and Bangladesh (Agnihotri et al., 2019; Ahmad and Goparaju, 2020; Prakash et al., 2017; Sinha and Tandon, 2014). The global emergence of COVID-19 halted all activities; it emerged as the deadliest disease and brought the longest nationwide lockdown. This caused enormous disruption in all aspects of people's livelihoods. In addition, the flooding event of July 2020 created further major obstacles. It added to the misery of people who were already trying to control the spread of COVID-19 and harmed their livelihoods. These result in disaster-risk mitigation to other sectors. The only way to have an effective and prompt response is to have real-time information provided by space-based sensors. Using a cloud-based platform, Google Earth Engine (GEE), an automated technique is employed to analyze flood inundation with Synthetic Aperture Radar (SAR) images. The study demonstrates the potential of automated techniques and algorithms applied to larger datasets on cloud-based platforms. The results present flood extent maps for the lower Ganga basin, comprising areas of the Indian subcontinent. Severe floods destroyed several parts of Bihar and West Bengal, affecting a large population.
This study offers a prompt and precise estimation of inundated areas to facilitate a quick response for risk assessment, particularly during the COVID-19 pandemic. The three states (Bihar, Jharkhand, and West Bengal), collectively known as the Lower Ganga Basin, are home to more than 30% of the population (Prakash et al., 2017). Rapid population growth and settlement resulted in changes in land use, increased soil erosion, increased siltation, and other related factors that augmented flood severity (Li et al., 2020; Pham Van and Nguyen-Van, 2020). Floods have become the most frequent disaster in recent times, and the COVID-19 pandemic compounded the problem (Krämer et al., 2021; Lal et al., 2020). As a result, new measures were needed to manage the spread of COVID-19 as well as flood mitigation (Wang et al., 2020; Zoabi et al., 2021). Although ground data and field measurements are considered more accurate, they are time-consuming and costly. Furthermore, field surveys were impossible to conduct during this period, since social distancing had become the norm and fieldwork carried significant health concerns and travel costs (Jian et al., 2020; Lattari et al., 2019). Ineffective flood mitigation strategies may result in more human deaths, more property damage, and further spread of COVID-19 (Cornara et al., 2019; Shen et al., 2019). The flooding had disastrous impacts in 149 districts across Bihar, Assam, and West Bengal. Since movement was halted by the sudden lockdown, the only way out was to employ robust flood control techniques based on real-time information (Das et al., 2018; Dong et al., 2020; Tang et al., 2016). The dramatic increase in flood occurrence in these locations prompted specialists to implement more structured and effective flood management to address the issues, while also adhering to all COVID-19 norms and regulations (Min et al., 2020; Wang et al., 2019).
... ANNs were applied to rainfall-runoff forecasting in the early 1990s (Daniell 1991; Halff et al. 1993). The ANN methodology is employed as a black-box model in many studies (Kisi et al. 2012; Gao et al. 2020; Genç and Dağ 2016; Sahraei et al. 2021) to forecast daily streamflow as a function of daily precipitation, snowmelt, and temperature. Lafdani et al. (2013b) compared the performance of an ANN, statistical regression, and a simple conceptual model, and demonstrated that the ANN model not only enables a more systematic approach but also shortens calibration data collection and reduces the computational time for model calibration. ...
... Then, a kernel is used to solve linear regression in the transformed feature space. More information on SVM can be found in (Genç and Dağ 2016; Sahraei et al. 2021). In this study, the SVM employs a polynomial kernel. ...
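The idea that a kernel stands in for an inner product in a transformed feature space can be checked numerically. The sketch below is only an illustration: the degree (2), the offset `coef0 = 1`, the two-dimensional inputs, and the function names are all assumptions, not settings taken from the cited study.

```python
import math

def poly_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel K(x, z) = (x . z + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, z))
    return (dot + coef0) ** degree

def feature_map_deg2(x):
    """Explicit feature map matching poly_kernel for 2-D input, degree=2, coef0=1."""
    x1, x2 = x
    r2 = math.sqrt(2.0)
    return [x1 * x1, x2 * x2, r2 * x1 * x2, r2 * x1, r2 * x2, 1.0]

# Two arbitrary sample points (hypothetical values).
x, z = (1.0, 2.0), (3.0, -1.0)

# Kernel evaluated directly in the original 2-D space.
k_implicit = poly_kernel(x, z)

# Same quantity as a plain inner product in the 6-D transformed space.
k_explicit = sum(a * b for a, b in zip(feature_map_deg2(x), feature_map_deg2(z)))
```

Both values agree, which is why a linear regression in the transformed space can be fitted without ever building the transformed features explicitly.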
Article
Full-text available
Streamflow forecasting plays a key role in improving water resource allocation, management and planning, flood warning and forecasting, and mitigation of flood damage. A considerable number of forecasting models and techniques have been employed in streamflow forecasting and have gained importance in hydrological studies in recent decades. In this study, the main objective was to compare the accuracy of four data-driven techniques, Linear Regression (LR), Multilayer Perceptron (MLP), Support Vector Machine (SVM), and Long Short-Term Memory (LSTM) network, in daily streamflow forecasting. For this purpose, three scenarios were defined based on historical precipitation and streamflow series for 26 years of the Kentucky River basin located in eastern Kentucky, US. Statistical criteria including the coefficient of correlation (R), Nash-Sutcliffe coefficient of efficiency (E), Nash-Sutcliffe for high flow (EH), Nash-Sutcliffe for low flow (EL), normalized root mean square error (NRMSE), relative error in estimating maximum flow (REmax), threshold statistics (TS), and average absolute relative error (AARE) were employed to compare the performance of these methods. The results show that the LSTM network outperforms the other models in forecasting daily streamflow, with the lowest values of NRMSE and the highest values of EH, EL, and R under all scenarios. These findings indicate that the LSTM is a robust data-driven technique for characterizing time series behavior in hydrological modeling applications.
... For confluence simulation and runoff forecasting, corresponding machine learning models have also been established. Onur Genç used artificial neural network, support vector machine, and k-nearest neighbor algorithms to predict river velocity distribution (Genç and Dağ 2016). However, in practical applications, selecting the parameters of these models remains difficult, and the selected parameters strongly influence prediction accuracy. ...
Article
Full-text available
Daily inflow forecasts provide important decision support for the operation and management of reservoirs. Accurate and reliable forecasting plays an important role in the optimal management of water resources. Numerous studies have shown that decomposition-integration models have good prediction capacity. Considering the nonlinearity and nonstationarity of daily inflow data, a hybrid model of adaptive variational mode decomposition (VMD) based on energy entropy and bidirectional long short-term memory (Bi-LSTM) was developed for daily inflow forecasting. The model was assessed using the mean absolute error (MAE), the root mean square error (RMSE), the Nash–Sutcliffe efficiency coefficient (NSE), and the correlation coefficient (r). The proposed VMD-BiLSTM hybrid model was applied to a historical daily inflow series from the Baozhusi Hydropower Station, China. For comparison, BP, GRNN, ELMAN, SVR, LSTM, Bi-LSTM, EMD-LSTM, and VMD-LSTM were also adopted and evaluated. We found that the proposed model, with MAE = 38.965, RMSE = 64.783, and NSE = 95.7%, was superior to the other models. Therefore, the hybrid model is robust and efficient for forecasting highly nonstationary and nonlinear streamflow. It can be used as the preferred data-driven tool to predict daily inflow, which can help ensure the safe operation of hydropower stations in reservoirs. As an interdisciplinary field spanning both machine learning and hydrology, daily inflow forecasting can become an important breakthrough in the application of deep learning to hydrology.
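For readers unfamiliar with the error measures named in this abstract, a minimal sketch of MAE, RMSE, and NSE using their standard textbook definitions follows. The function names and the sample series are hypothetical and are not data from the study; the correlation coefficient r is omitted for brevity.

```python
import math

def mae(obs, sim):
    """Mean absolute error between observed and simulated series."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread

# Made-up inflow values, for illustration only.
observed = [10.0, 12.0, 15.0, 11.0]
simulated = [11.0, 12.0, 14.0, 13.0]

scores = {
    "MAE": mae(observed, simulated),
    "RMSE": rmse(observed, simulated),
    "NSE": nse(observed, simulated),
}
```

An NSE of 1 means a perfect match, while a value of 0 means the model does no better than predicting the observed mean.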
... The number of neurons in the hidden layer is an important parameter of the ANN-BP, which should be determined. Thus, several numbers of hidden neurons (1-10) were tested, and the best number of hidden neurons was selected using a 10-fold cross validation on the training data set (Genç and Dağ 2016;Rezaie-Balf et al. 2017). The random initialization of the weights in ANNs can result in different outputs of the networks for identical numbers of neurons. ...
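The selection procedure in this snippet, trying candidate values 1 to 10 and keeping the one with the best 10-fold cross-validation score, can be sketched as follows. To stay dependency-free, a simple k-nearest-neighbor regressor stands in for the ANN, so the tuned quantity here is the number of neighbors rather than the number of hidden neurons. The data, function names, and seed are all hypothetical.

```python
import random

def kfold_indices(n, k, seed=0):
    """Shuffle range(n) reproducibly and deal the indices into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k nearest training points (1-D inputs)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return sum(train_y[i] for i in order[:k]) / k

def cv_error(xs, ys, k_neighbors, n_folds=10):
    """Mean squared error over all held-out points across the folds."""
    total = 0.0
    for fold in kfold_indices(len(xs), n_folds):
        held = set(fold)
        tr_x = [x for i, x in enumerate(xs) if i not in held]
        tr_y = [y for i, y in enumerate(ys) if i not in held]
        for i in fold:
            total += (knn_predict(tr_x, tr_y, xs[i], k_neighbors) - ys[i]) ** 2
    return total / len(xs)

# Toy training set (made up): y = x^2 sampled on a grid.
xs = [i / 10 for i in range(50)]
ys = [x * x for x in xs]

# Candidate hyperparameter values 1..10, mirroring the range in the snippet.
candidates = list(range(1, 11))
best = min(candidates, key=lambda c: cv_error(xs, ys, c))
```

Because only the training set is split, the test set stays untouched until the final evaluation. For an ANN, the random weight initialization mentioned in the snippet would add a further source of variation that this stand-in does not have.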
Article
Full-text available
Streamflow estimation plays a significant role in water resources management, especially for flood mitigation, drought warning, and reservoir operation. Hence, the current study examines the prediction capability of three well-known machine learning algorithms (Support Vector Regression (SVR), Artificial Neural Network with backpropagation (ANN-BP), and Extreme Learning Machine (ELM)) for the monthly and daily streamflows of four rivers in the United States. For model development, three main predictor variables (P, Tmax, and Tmin) and their antecedent values were considered. The SVM-RFE feature selection method was used to select the most appropriate predictor variables. The performance of the developed models was tested using four evaluation statistics. The results indicate that (1) with a few exceptions, the accuracy of all models decreases at the daily scale compared to the monthly scale; (2) the SVR has the best performance among the three models at the monthly and daily scales, while the ANN-BP model has the worst performance; (3) the ELM has better generalization performance than the ANN-BP for streamflow simulation at the monthly and daily scales; and (4) all models fail to predict the streamflow for the Carson River, a snowmelt-dominated basin. Overall, the findings of the current study indicate that the SVR model produces better results than the ELM and ANN-BP for streamflow simulation at the monthly and daily scales.
... A full understanding of the hydraulic properties of rivers, on the other hand, can be achieved by studying the distribution of the velocity (the speed and direction with which water flows) in the cross-section of the river (Cheng & Gartner, 2003). However, existing methods cannot measure the velocity profile across the river in a timely manner, which costs time and money (Cheng & Gartner, 2003; Genc & Dag, 2016a). Furthermore, acquiring velocity samples during flood events can be life-threatening. ...
... Machine learning algorithms have been successfully applied to extract novel information and nontrivial patterns in a variety of fields, such as medicine (Dag et al., 2016; Simsek, Kursuncu et al., 2020); finance (Sevim et al., 2014; Simsek et al., 2018); water resource management (Genc & Dag, 2016b, 2016a); and many other fields (Kasie et al., 2017; Rios-Morales et al., 2011; Zuccaro, 2010). However, such information-extraction processes require not only domain expertise but also expertise in machine learning, in addition to software and coding skills. ...
Article
This study aims to develop a decision support tool for identifying point velocity profiles in rivers. The tool enables managers to make timely and accurate decisions, thereby eliminating a substantial amount of the time, cost, and effort spent on measurement procedures. In the proposed study, three machine learning classification algorithms, Artificial Neural Networks (ANN), Classification & Regression Trees (C&RT), and Tree Augmented Naïve Bayes (TAN), along with Multinomial Logistic Regression (MLR), are employed to classify the point velocities in rivers. The results showed that ANN outperformed the other classification algorithms in predicting the outcome, which was converted into 10 ordinal classes, achieving an accuracy of 0.46. Accordingly, a decision support tool incorporating ANN has been developed. Such a tool can be used by end users (managers and practitioners) without any expertise in machine learning. This tool also helps financial investors and other relevant stakeholders achieve success.
... This can be seen as a quintessential form of knowledge discovery, as no assumptions are required to perform these algorithms on unknown datasets. Furthermore, this is strongly related to machine learning that has been applied successfully in the hydrological context in papers such as [99,100]. Consequently, the resulting product has many similarities with a neuro-fuzzy system or adaptive neuro-fuzzy inference system (ANFIS) that has been applied in works such as [101]. ...
Article
Full-text available
This research approaches the concept of sustainability from a temporal perspective. Rivers are natural systems with an inherent internal memory in their runoff and, by extension, in their hydrological behavior, which should be identified, characterized, and quantified. This memory is formally called temporal dependence, and it can be quantified for each river system. The ability to capture that temporal signature has been analyzed through different methods and techniques. However, the analytical capacities of those methods are highly heterogeneous. This research finds that the most advanced methods are those whose output provides a dynamic and quantitative assessment of the temporal dependence of each river system's runoff. Since the runoff can be split into temporally conditioned runoff fractions, advanced methods provide an important improvement over classic or alternative ones. Being able to characterize the basin by calculating those fractions is an important advance for water managers who need predictive tools to guide their water policies. For instance, rivers with large temporal dependence will need to be controlled and gauged by larger hydraulic infrastructure. Applying this approach may produce large investment savings on hydraulic infrastructure and minimize environmental impact through the resulting optimization of the cost-benefit trade-off.
... similarity to other cases (Akbari et al. 2011; Genc and Dag 2016; Karlsson and Yakowitz 1987). The GMDH neural network is known as a self-organized approach with the capability of solving extremely complex nonlinear problems (Amanifard et al. 2008; Najafzadeh and Azamathulla 2013; Najafzadeh and Lim 2015). ...
... Therefore, alternative soft computing methods that address these limitations are needed for the modeling. The support vector machine (SVM) is an approach for generalization using the structural risk minimization principle, which minimizes an upper bound of the generalization error (Genc and Dag 2016; Gunn 1998). K-Nearest Neighbor (KNN) is an approach that classifies cases based on their similarity to other cases (Akbari et al. 2011; Genc and Dag 2016; ...
... ANNs are inspired by their biological counterparts, whereby a highly interconnected network of simple neurons generates a learning model that can learn arbitrarily complex non-linear functions. As such, these algorithms have been widely used in machine learning contexts, such as classification (Cetinic, Lipic & Grgic, 2018; Melin, Miramontes & Prado-Arechiga, 2018), regression (Genc & Dag, 2016; Zhou, Zhou, Yang & Yang, 2019), optimization (Liu, Yuan & Liao, 2009; Yazdi, Khorasani & Faraji, 2011), and pattern recognition (El-Midany, El-Baz & Abd-Elwahed, 2010; Patterson, 1996). The artificial neurons mimic the signal integration and activation of their biological counterparts using mathematical functions, such as sigmoidal activation functions. ...
Article
Predicting breast cancer survival is crucial for practitioners to determine possible outcomes and make better treatment plans for patients. In this study, a hybrid data-mining-based methodology was constructed to identify the variables whose importance for survival changes over time. The importance of variables was therefore determined for three different time periods (one, five, and ten years). To conduct such an analysis, the most parsimonious models were constructed by employing one regression analysis method, the Least Absolute Shrinkage and Selection Operator (LASSO), and one metaheuristic optimization method, a Genetic Algorithm (GA). Because of the high imbalance between the number of survivals and deaths, two well-known resampling procedures, Random Under-sampling (RUS) and Synthetic Minority Over-sampling Technique (SMOTE), were applied to increase the performance of the classification models. In the final stage, two data mining models, Artificial Neural Networks (ANNs) and Logistic Regression (LR), were used along with 10-fold cross-validation. Sensitivity analysis (SA) was conducted for each model to identify the importance of each variable for a given model and time period. The results revealed that certain variables lose their importance over time, while others gain importance. This information can assist medical practitioners in identifying specific subsets of variables to focus on in different periods, which will in turn lead to more effective and efficient cancer care. Moreover, the study findings indicate that extremely parsimonious models can be developed by adopting a purely data-driven approach, rather than eliminating variables manually. Such a methodology can also be applied to other types of cancer.
... These models employ computational intelligence with a data set for a limited set of predictors and do not require any a priori knowledge of the mathematical relationships that interlink the predictors with the objective variable (Adamowski et al. 2012; Deo et al. 2016). SVM is an approach for generalization using the structural risk minimization principle that minimizes an upper bound of the generalization error (Genc and Dag 2016; Gunn 1998). KNN is an approach that classifies ...
... the properties of a particular output based on their similarity to other outputs (Akbari et al. 2011; Genc and Dag 2016; Karlsson and Yakowitz 1987). A GMDH neural network is a self-organized approach with the capability of solving extremely complex nonlinear problems (Amanifard et al. 2008; Najafzadeh and Azamathulla 2013; Najafzadeh and Lim 2015). ...
Article
Accurate prediction of shear stress distribution along the boundary in an open channel is the key to the solution of numerous critical engineering problems. This paper investigated the distribution of boundary shear force at the cross section of meandering compound channels. The research focused on developing a model for predicting shear force using a machine learning (ML) method, the multivariate adaptive regression spline (MARS). A nonparametric regression methodology was adopted in MARS for developing a model of shear force percentage in the floodplain of two-stage meandering channels. The width ratio, relative depth, sinuosity, bed slope, and meander belt width ratio of the channel were input variables to the model. The influence of each parameter on predicting the percentage of shear force in the floodplain was also analyzed through a sensitivity analysis. The performance of the MARS model was evaluated against three other machine learning techniques, the group method of data handling (GMDH), support vector machine (SVM), and k-nearest neighbor (KNN), using different statistical measures. The results indicated that the proposed MARS model predicted the shear force percentage in the floodplain satisfactorily, with a coefficient of determination (R²) of 0.94 and 0.93 and a scatter index (SI) of 0.053 and 0.044 for the training and testing phases, respectively. Moreover, the model was successfully applied for validating the two available overbank discharge values for the Baitarani River at Anandapur (drainage area of 8,570 km²), giving the minimum errors of the evaluated methods in terms of mean absolute scaled error (MASE) of 0.014 and 0.066 for flow depths of 7.5 and 8.63 m, respectively.