Comparison of classification accuracy of SVM and KNN at motor speeds of 2000 rpm, 2500 rpm and 3000 rpm using different feature selection methods.

Source publication
Article
Full-text available
Effective feature selection can help improve classification performance in bearing fault diagnosis. This paper proposes a novel feature selection method for bearing fault diagnosis called Feature-to-Feature and Feature-to-Category Maximum Information Coefficient (FF-FC-MIC), which considers the relevance among features and relevance betwe...

Contexts in source publication

Context 1
... most circumstances across the 3 motor speeds in Table 7, the proposed feature selection method reaches the highest diagnosis accuracy compared with the other 3 methods. In addition, it shows a degree of adaptability and stability on both SVM and KNN. ...

Citations

... The consequences of inaccuracies may arise if indicators fail to encapsulate essential machinery insights, and an excess of indicators can result in overwhelming data. Strategies such as sequential backward selection (SBS) [42][43][44][45], sequential forward selection (SFS) [46][47][48], and recursive feature elimination (RFE) [49][50][51] address this challenge by systematically eliminating redundant indicators. We advocate for SBS, as it excels in methodically removing the least relevant indicators, optimizing both model interpretability and performance, in contrast to SFS and RFE. ...
Article
Full-text available
Bearings represent crucial components within rotating machinery, and unexpected failures can lead to significant damage and unplanned breakdowns. This paper introduces a novel approach to diagnose bearing faults under variable working conditions, leveraging the complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and sequential backward selection (SBS). CEEMDAN automatically selects intrinsic mode functions (IMFs) from vibration and current signals to establish a comprehensive set of health indicators. Subsequently, the SBS algorithm identifies the most pertinent indicators for different bearing failure modes. The accuracy of the proposed method is evaluated on both vibration and electrical signals using data from a dedicated test bench at the Signal and Industrial Process Analysis Laboratory (LASPI). Results demonstrate the effectiveness of the proposed method in accurately identifying and classifying bearing faults across various working conditions, utilizing both types of signals. This approach holds promise for real-world industrial applications, offering a reliable method for condition monitoring and diagnostics in bearing systems.
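The sequential backward selection idea mentioned in the citing passage can be illustrated with a minimal Python sketch. It is not the cited authors' implementation: the function name, the toy scoring function, and the feature names ("rms", "kurtosis", "crest", "noise_a", "noise_b") are all hypothetical, and in practice the score would typically be a cross-validated classifier accuracy.

```python
from typing import Callable, List, Sequence


def sequential_backward_selection(
    features: Sequence[str],
    score: Callable[[Sequence[str]], float],
    n_keep: int,
) -> List[str]:
    """Greedy SBS: repeatedly drop the feature whose removal hurts the score least."""
    selected = list(features)
    while len(selected) > n_keep:
        # Score every subset obtained by removing exactly one feature.
        candidates = [
            (score([f for f in selected if f != drop]), drop) for drop in selected
        ]
        # Keep the best-scoring reduced subset, i.e. remove that feature.
        _, drop = max(candidates, key=lambda c: c[0])
        selected.remove(drop)
    return selected


# Hypothetical per-feature usefulness, standing in for a real evaluation metric.
USEFUL = {"rms": 0.5, "kurtosis": 0.3, "crest": 0.1}


def toy_score(subset: Sequence[str]) -> float:
    # Reward useful features and apply a small penalty per feature kept.
    return sum(USEFUL.get(f, 0.0) for f in subset) - 0.01 * len(subset)


kept = sequential_backward_selection(
    ["rms", "noise_a", "kurtosis", "crest", "noise_b"], toy_score, n_keep=3
)
print(kept)
```

With this toy score, the two uninformative features are removed first, because dropping them costs nothing. Each round evaluates one subset per remaining feature, so the cost grows roughly quadratically with the number of features.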
... Secondly, the adaptive algorithm is employed to determine the optimal zero crossing rate [21], and each signal component is divided into two parts of high frequency and low frequency. Thirdly, the Maximum Information Coefficient (MIC) [22] is used to filter the features of each signal component. Then, the high frequency signal components and their features are used to forecast with the Informer model, while the low frequency signal components and their features are used to forecast with the LSTM model [23]. ...
Article
Full-text available
Accurate and fast forecasting of short-term load is conducive to the safe and stable operation of the power system. A short-term power load combination forecasting method based on feature extraction is proposed. The Firefly Sparrow Algorithm (FSA) is applied to find the optimal combination of influencing parameters in Variational Mode Decomposition (VMD) to obtain the signal components with the best effect. Since the signal components contain different influence characteristics and timing information, the Maximum Information Coefficient (MIC) is used to screen the features of each signal component and establish the feature matrix, and the zero-crossing rate is used as an index to determine the demarcation point between high and low frequency signals. Based on the different characteristics of high and low frequency signals, the Informer model is used to forecast the high frequency signal components, and the LSTM is used to forecast the low frequency signal components. All the forecasting results are reconstructed to obtain the final forecasting value. Taking the Spanish power load data as an example, considering the actual seasonal factors, and comparing experimentally with other forecasting models, the results show that after feature screening the errors are significantly reduced and the coefficient of determination is significantly improved, which verifies the accuracy and universality of the model proposed in this paper.
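The split of signal components into high and low frequency groups by zero-crossing rate can be sketched as follows. This is only an illustration under stated assumptions: the threshold value of 0.1 is hypothetical (the cited work determines it adaptively), and the two sine waves stand in for decomposed components.

```python
import math
from typing import List, Sequence, Tuple


def zero_crossing_rate(signal: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(signal) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(signal) - 1)


def split_by_zcr(
    components: Sequence[Sequence[float]], threshold: float
) -> Tuple[List[Sequence[float]], List[Sequence[float]]]:
    """Return (high_frequency, low_frequency) components relative to the threshold."""
    high = [c for c in components if zero_crossing_rate(c) > threshold]
    low = [c for c in components if zero_crossing_rate(c) <= threshold]
    return high, low


# Synthetic stand-ins for decomposed components: 1 cycle vs. 20 cycles per 100 samples.
slow = [math.sin(2 * math.pi * 1 * t / 100) for t in range(100)]
fast = [math.sin(2 * math.pi * 20 * t / 100 + 0.1) for t in range(100)]

high, low = split_by_zcr([slow, fast], threshold=0.1)  # threshold is an assumed value
print(len(high), len(low))
```

In the workflow described above, the high-frequency group would go to one forecasting model and the low-frequency group to another.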
... The consequences of inaccuracies may arise if indicators fail to encapsulate essential machinery insights, and an excess of indicators can result in overwhelming data. Strategies such as Sequential Backward Selection (SBS) [40][41][42][43], Sequential Forward Selection (SFS) [44][45][46], and Recursive Feature Elimination (RFE) [47][48][49] address this challenge by systematically eliminating redundant indicators. We advocate for SBS, as it excels in methodically removing the least relevant indicators, optimizing both model interpretability and performance, in contrast to SFS and RFE. ...
Preprint
Full-text available
Bearings represent crucial components within rotating machinery, and unexpected failures can lead to significant damage and unplanned breakdowns. This paper introduces a novel approach to diagnose bearing faults under variable working conditions, leveraging the Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) and Sequential Backward Selection (SBS). CEEMDAN automatically selects intrinsic mode functions (IMFs) from vibration and current signals to establish a comprehensive set of health indicators. Subsequently, the SBS algorithm identifies the most pertinent indicators for different bearing failure modes. The accuracy of the proposed method is evaluated on both vibration and electrical signals using data from a dedicated test bench at the Signal and Industrial Process Analysis Laboratory (LASPI). Results demonstrate the effectiveness of the proposed method in accurately identifying and classifying bearing faults across various working conditions, utilizing both types of signals. This approach holds promise for real-world industrial applications, offering a reliable method for condition monitoring and diagnostics in bearing systems.
... Some researchers have explored the application of these algorithms in mangrove classification. Tang et al. [13] utilized the maximal information coefficient (MIC) to measure the nonlinear and non-functional relationships between features and eliminate redundant and irrelevant features, thereby improving diagnostic accuracy. Fei et al. [14] used random forest (RFS) to screen the extracted features and determine the optimal number of features and sensitive bands in classifying cotton. ...
Article
Full-text available
Mangrove forests, mostly found in the intertidal zone, are among the highest-productivity ecosystems and have great ecological and economic value. The accurate mapping of mangrove forests is essential for the scientific management and restoration of mangrove ecosystems. However, it is still challenging to perform the rapid and accurate information mapping of mangrove forests due to the complexity of mangrove forests themselves and their environments. Utilizing multi-source remote sensing data is an effective approach to address this challenge. Feature extraction and selection, as well as the selection of classification models, are crucial for accurate mangrove mapping using multi-source remote sensing data. This study constructs multi-source feature sets based on optical (Sentinel-2) and SAR (synthetic aperture radar) (C-band: Sentinel-1; L-band: ALOS-2) remote sensing data, aiming to compare the impact of three feature selection methods (RFS, random forest; ERT, extremely randomized tree; MIC, maximal information coefficient) and four machine learning algorithms (DT, decision tree; RF, random forest; XGBoost, extreme gradient boosting; LightGBM, light gradient-boosting machine) on classification accuracy, identify sensitive feature variables that contribute to mangrove mapping, and formulate a classification framework for accurately recognizing mangrove forests. The experimental results demonstrated that using the feature combination selected via the ERT method could obtain higher accuracy with fewer features compared to other methods. Among the feature combinations, the visible bands, shortwave infrared bands, and the vegetation indices constructed from these bands contributed the greatest to the classification accuracy. The classification performance of optical data was significantly better than SAR data in terms of data sources. 
The combination of optical and SAR data could improve the accuracy of mangrove mapping to a certain extent (0.33% to 4.67%), which is essential for the research of mangrove mapping in a larger area. The XGBoost classification model performed optimally in mangrove mapping, with the highest overall accuracy of 95.00% among all the classification models. The results of the study show that combining optical and SAR remote sensing data with the ERT feature selection method and XGBoost classification model has great potential for accurate mangrove mapping at a regional scale, which is important for mangrove restoration and protection and provides a reliable database for mangrove scientific management.
... MIC is a way to measure the degree of correlation between two variables [27]. When calculating the degree of correlation, this method is independent of the data distribution, is not restricted to a specific form of the correlation function, and is fairer and more extensive than other methods. ...
... . , x 2 n , then the MI of the feature state variables is as follows [27]: ...
Article
Full-text available
As a significant component of rotating machinery, the bearing plays a role in supporting and transmitting power. However, bearings are subject to complex operating conditions and are prone to failure. To avoid ineffectiveness and improve the reliability of bearings, a data-driven method is used to predict the remaining useful life (RUL). However, this method is less stable and can only forecast the RUL of bearings under training sample conditions. An ensemble deep, long-term, and short-term memory (EDLSTM) method is proposed to solve this problem. First, features for forecasting the bearing RUL were extracted, including time-domain features, frequency-domain energy features, and Shannon entropy. Then, a deep long- and short-term memory network prediction model of the bearing RUL was constructed. To resolve the instability of DLSTM predictions, multiple DLSTMs were ensembled using the maximum information component (MIC) criterion. The model is trained using bearing data with different failure modes under difficult operating conditions to improve the predictive stability of the model. Finally, an EDLSTM was constructed to achieve the bearing RUL prediction. In the prediction result of the training set, the cumulative relative accuracy (CRA) was above 0.9 for most of the bearings. According to the experimental results in the test set, the mean CRA was over 0.80. For some bearings' RUL, the CRA was more than 0.90. The above results show that the proposed approach can effectively predict the RUL of a bearing and has a more stable prediction ability than the bagging integration method.
... Zheng et al. [20] proposed a bearing failure diagnosis method using Laplacian scores for the selection of features. The method uses multi-scale fuzzy entropy to characterize the complexity and irregularity of rolling bearing vibration signals and sorts the feature ... Tang et al. [21] proposed a feature selection method based on the maximum information coefficient to improve bearing fault diagnosis. This method uses the maximum information coefficient to consider the correlation between features and the correlation between features and fault categories for feature selection. ...
Article
Full-text available
In order to solve the low accuracy in rolling bearing fault diagnosis caused by irrelevant and redundant features, a feature selection method based on a clustering hybrid binary cuckoo search is proposed. First, the measured motor signal is processed by Hilbert–Huang transform technology to extract fault features. Second, a clustering hybrid initialization technique is given for feature selection, combining the Louvain algorithm and the feature number. Third, a mutation strategy based on Levy flight is proposed, which effectively utilizes high-quality information to guide subsequent searches. In addition, a dynamic abandonment probability is proposed based on population sorting, which can effectively retain high-quality solutions and accelerate the convergence of the algorithm. Experimental results from nine UCI datasets show the effectiveness of the proposed improvement strategy. The open-source bearing dataset is used to compare the fault diagnosis accuracy of different algorithms. The experimental results show that the diagnostic error rate of this method is only 1.13%, which significantly improves classification accuracy and effectively realizes feature dimension reduction in fault datasets. Compared to similar methods, the proposed method has better comprehensive performance.
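The idea in the citing passage above, selecting features by weighing feature-to-category correlation against feature-to-feature correlation, can be sketched with a greedy relevance-minus-redundancy filter. This is not the FF-FC-MIC algorithm from the source publication: as a loudly stated assumption, a normalized mutual information on a fixed equal-width grid stands in for MIC, and all function names, the bin count, and the toy data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def binned(values: Sequence[float], bins: int) -> List[int]:
    """Discretize values into equal-width bins (a fixed grid, unlike true MIC)."""
    lo, hi = min(values), max(values)
    width = ((hi - lo) / bins) or 1.0
    return [min(int((v - lo) / width), bins - 1) for v in values]


def normalized_mi(x: Sequence[int], y: Sequence[int]) -> float:
    """Mutual information in bits, divided by log2 of the smaller alphabet size."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = sum(
        (c / n) * math.log2(c * n / (px[a] * py[b])) for (a, b), c in pxy.items()
    )
    denom = math.log2(min(len(px), len(py)))
    return mi / denom if denom > 0 else 0.0


def select_features(
    features: Dict[str, Sequence[float]], labels: Sequence[int], k: int, bins: int = 2
) -> List[str]:
    """Greedily pick k features maximizing relevance minus mean redundancy."""
    disc = {name: binned(vals, bins) for name, vals in features.items()}
    relevance = {name: normalized_mi(d, labels) for name, d in disc.items()}
    selected: List[str] = []

    def gain(name: str) -> float:
        if not selected:
            return relevance[name]
        redundancy = sum(normalized_mi(disc[name], disc[s]) for s in selected)
        return relevance[name] - redundancy / len(selected)

    while len(selected) < min(k, len(disc)):
        best = max((n for n in disc if n not in selected), key=gain)
        selected.append(best)
    return selected


# Toy data (hypothetical): "a_copy" duplicates "a", "b" is weaker but less redundant.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
a = [0.1, 0.2, 0.15, 0.9, 0.9, 0.95, 0.85, 0.8]
data = {
    "a": a,
    "a_copy": [2 * v for v in a],
    "b": [0.1, 0.9, 0.1, 0.1, 0.9, 0.9, 0.1, 0.9],
}
chosen = select_features(data, labels, k=2)
print(chosen)
```

In this toy run the duplicated feature is skipped in favor of the weaker but less redundant one, which is the behavior a relevance-and-redundancy criterion is meant to produce.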
... Based on the MIC feature selection method [34], a de-redundancy feature subset DE MIC FF can be obtained by de-redundancy operation. ...
Article
Full-text available
High-precision, multi-information prediction results for bearing remaining useful life (RUL) can effectively describe the uncertainty of bearing health state and operation state. Aiming at the problem of efficient feature extraction and RUL prediction during the operational degradation of rolling bearings, through data reduction and key feature mining analysis, a new feature vector based on a time-frequency domain joint feature is found to describe the bearing degradation process more comprehensively. In order to keep the effective information without increasing the scale of the neural network, a joint feature compression calculation method based on a redefined degradation indicator (DI) was proposed to determine the input data set. By combining the temporal convolution network with the quantile regression (TCNQR) algorithm, probability density forecasting at any time is achieved based on kernel density estimation (KDE) for the conditional distribution of predicted values. The experimental results show that the proposed method can obtain point prediction results with smaller errors. Compared with the existing quantile regression of long short-term memory network (LSTMQR), the proposed method can construct more accurate prediction intervals and probability density curves, which can effectively quantify the uncertainty of the bearing running state.
... The maximum information coefficient (MIC) is a statistic proposed by Reshef and co-authors in 2011 to measure the strength of association between two variables [15]. The MIC can be used not only to measure linear and nonlinear relationships between two variables but also, more broadly, to uncover non-functional dependencies between them [16]. Hence, in this work, we propose to study the association between the tangential and normal FIV by utilizing the MIC. ...
Article
The ship stern shaft-bearing system is a typical nonlinear system, and its dynamic behavior is complex. In order to investigate the evolution of the system's dynamic behavior, attractor theory is introduced. Based on the calculated dynamic responses of the established dynamic model, the system's attractors were reconstructed, and the evolution of dynamic behaviors under different rub-impact states was discussed. The results show that under the full annular rub-impact state, the attractor of the system is a limit cycle attractor, and its volume increases gradually; under the partial rub-impact state, the attractor goes through a process of breaking from a torus attractor into a chaotic attractor and then converging into a torus attractor, and its volume increases at first and then decreases; under the no rub-impact state, the attractor of the system is a limit cycle attractor, and its volume decreases gradually. These characteristics of the attractors demonstrate that under different rub-impact states, the system experiences periodic, quasi-periodic, and chaotic motion. The attractors also reveal the evolution of convergence and divergence.
... The maximum information coefficient (MIC) is a statistic proposed by Reshef and co-authors in 2011 to measure the strength of association between two variables [15]. The MIC can be used not only to measure linear and nonlinear relationships between two variables but also, more broadly, to uncover non-functional dependencies between them [16]. Hence, in this work, we propose to study the association between the tangential and normal FIV by utilizing the MIC. ...
Article
Full-text available
A reciprocating running-in experiment is carried out on a friction-abrasion testing machine with disk-pin friction pair. The friction-induced vibration (FIV) signals measured in the experiments are identified by the maximum information coefficient (MIC) method. Experimental investigation shows that the association strength between the identified tangential and normal FIV signal is in a positive correlation with the coefficient of friction. The two-directional FIV signals distribute in the same frequency range, and their root mean square (RMS) variations are in similar accord to the changing of the coefficient of friction and can indicate the wear state evolution of the disk-pin friction pair from the running-in wear to stable wear. Therefore, the FIV signals can be identified by the MIC method.
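For reference, the MIC statistic described in the citing passage is commonly written as follows. This is the standard definition from Reshef and co-authors (2011), not a formula taken from the abstract above; the symbols n_x, n_y, I^* and B(n) are notation introduced here for illustration.

```latex
% n: number of samples; n_x, n_y: number of grid bins along each axis
% I^*(X;Y;n_x,n_y): largest mutual information over all n_x-by-n_y grids
% B(n): upper bound on grid size, with B(n) = n^{0.6} as the usual default
\mathrm{MIC}(X, Y) \;=\; \max_{n_x n_y < B(n)}
  \frac{I^{*}(X; Y; n_x, n_y)}{\log_2 \min(n_x, n_y)}
```

The normalization by log2 min(n_x, n_y) keeps the value in [0, 1], which makes scores comparable across variable pairs, for example the tangential and normal FIV signals discussed above.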
... Therefore, the fault information of mechanical equipment can be reflected by global features from time-domain and frequency-domain. The global features from time-domain and frequency-domain used in this work are shown in Table 3. Table 3. Global features in the time-domain and frequency-domain [30,31]. In this experiment, m was set to be 25. ...
Article
Full-text available
A convolutional neural network (CNN) has been used to successfully realize end-to-end bearing fault diagnosis due to its powerful feature extraction ability. However, the CNN is prone to focus on local information, ignoring the relationship between the whole and the part of the signal due to its unique structure. In addition, it extracts some fault features with poor robustness under noisy environment. A novel diagnosis model based on feature fusion and feature selection, GL-mRMR-SVM, is proposed to address this problem in this paper. First, the model combines the global features in the time-domain and frequency-domain of the raw data with the local features extracted by CNN to make full use of the signal information and overcome the weakness of traditional CNNs neglecting the overall signal. Then, the max-relevance min-redundancy (mRMR) algorithm is used to automatically extract the discriminative features from the fused features without any prior knowledge. Finally, the extracted discriminative features are input into the SVM for training and output the fault recognition results. The proposed GL-mRMR-SVM model was evaluated through experiments on bearing data of Case Western Reserve University (CWRU) and CUT-2 platform. The experimental results show that the proposed method is more effective than other intelligent diagnosis methods.