
Fault diagnosis of key components in the rotating machinery based on Fourier transform multi-filter decomposition and optimized LightGBM

Measurement Science and Technology

October 2020
32(1)

DOI:10.1088/1361-6501/aba93b

Authors:

Changhe Zhang

Huazhong University of Science and Technology

Qi Xu

Huazhong University of Science and Technology

Kaibo Zhou

Huazhong University of Science and Technology



To cite this article: Changhe Zhang et al 2021 Meas. Sci. Technol. 32 015004


Meas. Sci. Technol. 32 (2021) 015004 (13pp) https://doi.org/10.1088/1361-6501/aba93b


Changhe Zhang, Li Kong, Qi Xu, Kaibo Zhouand Hao Pan

Key Laboratory of Image Processing and Intelligent Control of Education Ministry, School of Articial

Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, People’s

Republic of China

E-mail: xuqi@hust.edu.cn

Received 12 April 2020, revised 13 July 2020

Accepted for publication 24 July 2020

Published 23 October 2020

Abstract

Rotating machinery is a primary element of mechanical equipment, and thus fault diagnosis of its key components is very important for improving the reliability and safety of modern industrial systems. The key to diagnosing the faults of these components is to extract the hidden fault information effectively. However, the actual vibration signals of rotating machinery have nonlinear and non-stationary characteristics, so traditional signal decomposition methods are unable to extract the frequency components accurately, leading to spectrum overlap of the decomposed sub-signals. Therefore, a rotating machinery fault diagnosis approach based on Fourier transform multi-filter decomposition (FTMFD), fuzzy entropy (FE), joint mutual information maximization (JMIM) and a light gradient boosting machine (LightGBM) is proposed in this paper. FTMFD is used to extract the frequency domain information of the raw vibration signals, whereas FE is used to calculate and extract the fault information of the decomposed sub-signals. Then feature selection is carried out by using JMIM to reduce the influence of redundant features on data analysis and classification accuracy. Furthermore, LightGBM is used to rank the candidate features and output the fault diagnosis result. Experimental results from two real datasets show that the proposed method achieves higher accuracy with fewer features than some existing methods for fault recognition. Various working conditions are also considered and verified.

Keywords: Fourier transform multi-filter decomposition, fuzzy entropy, joint mutual information maximization, LightGBM classifier, rotating machinery, fault diagnosis

(Some figures may appear in colour only in the online journal)

1. Introduction

Rotating machinery is widely used in industrial production [1–3]. However, its primary components are likely to be damaged during use due to the complex and harsh working environment, which severely influences the production safety of modern industrial systems [4, 5]. Therefore, it is of great importance to investigate fault diagnosis for the key components of rotating machinery. Generally speaking, fault diagnosis of rotating machinery based on vibration signal analysis is composed of three main steps: vibration signal extraction, fault feature extraction and fault pattern recognition [1, 6]. Feature extraction is the most important step and often directly affects the final diagnosis result [7, 8]. Commonly used signal analysis or fault feature extraction methods include time-domain analysis, frequency-domain analysis and time-frequency domain analysis [9–11]. However, the vibration monitoring signal of rotating machinery is often nonlinear and non-stationary, and some investigators have shown that it is difficult to extract fault features effectively from non-stationary signals by using time-domain or frequency-domain methods [8, 11, 12].

In recent years, entropy-based feature extraction methods have been widely used in signal analysis, image processing, mechanical fault diagnosis and so on [13, 14]. Entropy is a measure of the randomness or disorder of a time series, and common variants include approximate entropy (AE) [15], sample entropy (SE) [16], fuzzy entropy (FE) [12, 17], permutation entropy (PE) [14, 18], dispersion entropy (DE) [19] and symbol dynamics entropy (SDE) [7]. These entropies measure the complexity of a time series on a single scale, so the information on other scales is ignored [20]. Costa et al combined a multiscale procedure with SE to obtain multiscale entropy (MSE) [21]. However, the multiscale analysis relies on a coarse-graining procedure in which the data length shrinks as the scale factor increases, leading to inaccurate estimation [7]. Even some improved multiscale analysis methods still coarse-grain the time series, which amounts to low-pass filtering of the vibration signal and may result in the loss of high-frequency information [12, 13, 22, 23].

In addition, some adaptive time-frequency decomposition methods are widely used in the field of fault diagnosis, such as empirical mode decomposition (EMD) [24], local mean decomposition (LMD) [25], variational mode decomposition (VMD) [26] and so on. However, EMD suffers from the end effect, mode mixing, and envelope overshoot and undershoot [27], whereas LMD has the defects of the endpoint effect, mode aliasing and low computational efficiency [27, 28]. VMD resists mode aliasing and is robust to noise, but its optimization requires a large amount of computational resources, and parameters such as the penalty factor α and the mode number K need to be defined in advance [14, 29]. Unlike these decomposition methods, which lose some frequency components of the original signal, wavelet packet decomposition (WPD) retains low-frequency and high-frequency information well, although its results depend largely on the selection of the wavelet basis function (WBF) and the number of decomposition layers [30, 31].

In this paper, the Fourier transform multi-filter decomposition (FTMFD) is combined with FE to obtain sufficient fault information from the time series for fault diagnosis. FTMFD is an adaptive decomposition method that completely retains the fault information of low and high frequencies [32]. FE replaces the Heaviside function with a Gaussian function in order to avoid the drawbacks of SE [17]; it extracts fault information from vibration signals robustly, and it is highly sensitive to dynamical change while being insensitive to background noise [6, 12, 20]. Therefore, FTMFD is used to decompose the vibration signal, and then the FE values of the decomposed sub-signals are calculated to form fault feature vectors, exploiting the advantage of the information entropy method in measuring the dynamic characteristics of the time series.

In order to reduce redundant features and improve the classification accuracy, feature selection is usually used to find the optimal feature subset based on the extracted fault features [33, 34]. For filter feature selection based on mutual information, Peng et al studied max-relevance and min-redundancy (mRMR) based on maximum-dependency, maximum-correlation and minimum-redundancy criteria [35]. mRMR is widely used in the field of fault diagnosis; its only disadvantage is that the size of the mutual information is considered after the addition of a single feature [7, 13]. In addition, feature selection methods based on joint mutual information are also widely used, such as joint mutual information (JMI) [36] and joint mutual information maximization (JMIM) [37]. JMI ignores the case of single feature correlation so that the correlation between two or more features is reduced, whereas JMIM considers the overall stability of JMI to ensure the stability of the selected features.

Since the fault recognition and diagnosis of rotating machinery are carried out based on the optimal feature subset, the classification algorithm used in the final stage directly affects the performance of diagnosis methods. The commonly used classifiers include support vector machine (SVM) [38], random forests (RF) [34], stacked auto-encoders (SAEs) [7, 39], convolutional neural network (CNN) [40], and gradient boosting decision tree (GBDT) frameworks such as XGBoost [41], light gradient boosting machine (LightGBM) [42] and CatBoost [43]. Compared with the traditional methods, LightGBM is a fast and efficient classification algorithm, exhibiting good performance in many machine learning tasks, e.g. regression, classification, ranking and so on [42, 44]. Based on the GBDT algorithm, it is possible to obtain the contribution of each feature in model training for the development of an embedded feature selection (EFS) method.

In this paper, a feature extraction method combining FTMFD with FE is proposed to solve the problems of frequency information loss and low computing efficiency that exist in some traditional methods: FTMFD decomposes the signal, and FE measures the dynamic characteristics of the resulting time series. Then the JMIM and LightGBM method is used to extract the effective features, which reduces redundant features and simplifies the classifier modeling. A Bayesian optimization algorithm is further introduced to optimize the hyperparameters of the LightGBM classifier and improve the classification accuracy of fault diagnosis; it is also considered to be applicable to other classification algorithms. This paper is organized as follows. Section 2 introduces the proposed approach and the methodology. Section 3 shows the experimental results of the proposed approach with an MFS dataset. Further experimental verification and comparison results using KAT bearing datasets are discussed in section 4. The conclusion and future research are presented in section 5.

2. The proposed approach

The owchart of the fault diagnosis approach for the key com-

ponents in the rotating machinery is shown in gure 1. Firstly,

FTMFD is applied to the original time series signal to extract

the frequency domain information of the vibration signal, and

Meas. Sci. Technol. 32 (2021) 015004 C Zhang et al

Figure 1. Flowchart of the proposed fault diagnosis approach.

FE is used to calculate and extract the fault information of

the decomposed sub-signals. Then, JMIM is used to select the

candidate feature set F1. On this basis, LightGBM is used to

rank the candidate features in F1; according to the ranking res-

ult these features are added in turn with labels to form a new

dataset, so that the curve of feature number and classication

accuracy can be obtained. When the classication accuracy

reaches the maximum, the features used are composed into the

nal selected feature set F2. Finally, the LightGBM classier

is used to train and classify these selected fault features. To

verify the effectiveness of the proposed approach, two kinds

of datasets are used in this paper.
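The step that turns the ranked candidate set F1 into the final set F2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `select_by_accumulation` and `toy_accuracy` are hypothetical names, and the toy scoring function stands in for training and validating a LightGBM classifier on each feature prefix.

```python
from typing import Callable, List, Sequence, Tuple


def select_by_accumulation(
    ranked_features: Sequence[int],
    evaluate: Callable[[List[int]], float],
) -> Tuple[List[int], List[float]]:
    """Add ranked features one at a time and keep the best-scoring prefix.

    `ranked_features` is the candidate set F1, most important first.
    `evaluate` returns the accuracy obtained with a given feature list
    (in the paper this role is played by a trained classifier; here it is
    an arbitrary callable supplied by the caller).
    """
    curve = []
    for count in range(1, len(ranked_features) + 1):
        curve.append(evaluate(list(ranked_features[:count])))
    # list.index returns the first maximum, i.e. the fewest features.
    best_count = curve.index(max(curve)) + 1
    return list(ranked_features[:best_count]), curve


def toy_accuracy(subset: List[int]) -> float:
    """Hypothetical stand-in score: three useful features, small cost per feature."""
    useful = {3, 0, 7}
    return 0.5 + 0.15 * len(useful.intersection(subset)) - 0.01 * len(subset)


f2, curve = select_by_accumulation([3, 0, 7, 5, 1], toy_accuracy)
```

In this toy run the curve peaks once the three useful features are included, so `f2` holds the first three ranked features; with a real classifier the evaluator would return a cross-validated accuracy instead.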

2.1. Signal preprocessing and feature extraction method

2.1.1. Fourier transform multi-filter decomposition. Given a time series {x(i)}, i = −∞, …, −1, 0, 1, …, +∞, the Fourier transform of {x(i)} is defined as

$$X(e^{j\omega}) = \sum_{i=-\infty}^{+\infty} x(i)\, e^{-j\omega i}, \tag{1}$$

where f is the frequency and ω = 2πf denotes the angular frequency. The inverse Fourier transform of $X(e^{j\omega})$ is

$$x(i) = \frac{1}{2\pi} \int_{2\pi} X(e^{j\omega})\, e^{j\omega i}\, d\omega. \tag{2}$$

In practical engineering applications, most of the non-stationary vibration signals collected are finite-length discrete signals. Assume the length of the time series {x(i)} is N; then equations (1) and (2) can be rewritten as

$$X(k) = \sum_{i=1}^{N} x(i)\, e^{-j2\pi ki/N}, \tag{3}$$

and

$$x(i) = \frac{1}{N} \sum_{k=1}^{N} X(k)\, e^{j2\pi ki/N}, \tag{4}$$

where k denotes the frequency components.

Figure 2. Fourier transform multi-filter decomposition.

An important application of the Fourier transform is signal filtering [45], which can be summarized in three steps. First, the Fourier transform is performed to transform the signal from the time domain to the frequency domain through equation (3). Second, with the filter H(k), the required frequency components are retained and the unnecessary frequency components are filtered out of the spectrum, that is

$$X^{*}(k) = X(k)H(k) = \begin{cases} X(k), & k_s \le k \le k_e \\ 0, & \text{otherwise} \end{cases}, \tag{5}$$

where $k_s$ and $k_e$ correspond to the start frequency and cutoff frequency of H(k), respectively. Finally, the inverse Fourier transform is performed on the filtered spectrum $X^{*}(k)$, and the filtered signal $x^{*}(i)$ is

$$x^{*}(i) = \frac{1}{N} \sum_{k=1}^{N} X^{*}(k)\, e^{j2\pi ki/N}. \tag{6}$$

The idea of FTMFD is to filter the signal with filters that have different passbands, and then carry out the inverse Fourier transform on the filtered spectra to obtain sub-signals with different frequency components, as shown in figure 2.

Let $\Delta f_i = [f_{is}, f_{ie}]$, i = 1, 2, …, n, where $\Delta f_i$ represents the passband of filter $H_i(f)$, and $f_{is}$ and $f_{ie}$ represent the start frequency and cutoff frequency of $H_i(f)$, respectively.

If FTMFD is designed to be adaptive and the number of filters is n, then according to a certain strategy the signal spectrum can be divided into n parts and n different sub-bands can be obtained. The sub-bands satisfy

$$\begin{cases} \Delta f_1 \cup \Delta f_2 \cup \cdots \cup \Delta f_n = (0, F_s/2) \\ \Delta f_i \cap \Delta f_j = \varnothing, \quad i \ne j,\ 1 \le i, j \le n \end{cases}, \tag{7}$$

where $F_s$ is the sampling frequency, and the original signal s(t) is equal to the sum of all sub-signals $s_i(t)$, i = 1, 2, …, n, namely

$$s(t) = \sum_{i=1}^{n} s_i(t). \tag{8}$$

In this paper, the spectrum is evenly divided according to logarithmic coordinates, so that $|\Delta f_i| = |\Delta f_j|$, $i \ne j$, $1 \le i, j \le n$. The spectrum can also be divided by other strategies, such as energy, which require certain prior knowledge such as the frequency characteristics of the vibration signals. In general, FTMFD is a flexible decomposition strategy whose details can be adjusted according to actual needs.
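The filter-bank idea of this subsection can be sketched with NumPy's real FFT. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `ftmfd`, the `edges` argument and the two-tone test signal are illustrative, the demonstration uses equal-width linear bands for simplicity (log-spaced edges could be generated with `np.geomspace` from a positive lower frequency), and the DC bin is assigned to the first band although equation (7) uses the open interval (0, Fs/2).

```python
import numpy as np


def ftmfd(x, fs, edges):
    """Split x into sub-signals, one per band [edges[b], edges[b+1]).

    Every FFT bin is assigned to exactly one band (bins outside the edge
    range are clipped into the first or last band), so the sub-signals
    sum back to the original signal.
    """
    x = np.asarray(x, dtype=float)
    edges = np.asarray(edges, dtype=float)
    n_bands = edges.size - 1
    spectrum = np.fft.rfft(x)                      # one-sided spectrum
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)    # bin frequencies in Hz
    band = np.clip(np.searchsorted(edges, freqs, side="right") - 1, 0, n_bands - 1)
    subs = np.empty((n_bands, x.size))
    for b in range(n_bands):
        # Ideal band-pass filter: keep the bins of band b, zero the rest.
        subs[b] = np.fft.irfft(np.where(band == b, spectrum, 0.0), n=x.size)
    return subs


# Illustrative signal: 1000 points at 6 kHz with tones at 600 Hz and 2400 Hz.
fs = 6000.0
t = np.arange(1000) / fs
x = np.sin(2 * np.pi * 600.0 * t) + 0.5 * np.sin(2 * np.pi * 2400.0 * t)
edges = np.linspace(0.0, fs / 2, 33)               # 32 equal-width bands
subs = ftmfd(x, fs, edges)
energy = (subs ** 2).sum(axis=1)                   # energy of each sub-signal
```

Because the bands are disjoint and cover all bins, `subs.sum(axis=0)` reproduces `x` up to floating-point error, which is the discrete counterpart of the sum property stated for the sub-signals; the two tones end up in separate sub-signals.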

2.1.2. Fuzzy entropy. FE is an improvement on SE, in which the Heaviside function is replaced by a Gaussian function to measure the similarity between two vectors, which effectively overcomes the shortcomings of SE in practical applications.

For a given time series x(i), i = 1, 2, …, N, the similarity of FE is defined as

$$D_{ij}^{m} = \mu(d_{ij}^{m}, n, r) = e^{-\ln 2\,(d_{ij}^{m}/r)^{n}}, \tag{9}$$

where r is the similarity tolerance, n is the gradient of the similarity tolerance, and $d_{ij}^{m}$ represents the distance between the m-dimensional vectors $X_i^{m}$ and $X_j^{m}$. Define the function $\varphi^{m}$ as

$$\varphi^{m}(n, r) = \frac{1}{N-m} \sum_{i=1}^{N-m} \left( \frac{1}{N-m-1} \sum_{j=1,\, j \ne i}^{N-m} D_{ij}^{m} \right). \tag{10}$$

Then, FE can be expressed as

$$\mathrm{FE}(m, n, r, N) = \ln \varphi^{m}(n, r) - \ln \varphi^{m+1}(n, r). \tag{11}$$
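The FE definition can be turned into a short NumPy routine. The sketch below is hedged: the text does not spell out how the vectors and the distance are built, so it assumes the common construction in which each embedded vector has its own mean removed and the distance is the maximum absolute component difference. The function name `fuzzy_entropy` and the `r_factor` argument (tolerance as a multiple of the standard deviation) are illustrative, and a unit time delay is assumed.

```python
import numpy as np


def fuzzy_entropy(x, m=2, n=2, r_factor=0.15):
    """Fuzzy entropy of a 1-D series (natural-log units)."""
    x = np.asarray(x, dtype=float)
    N = x.size
    r = r_factor * np.std(x)          # similarity tolerance

    def phi(dim):
        # Use N - m vectors for both dimensions so the two averages match.
        count = N - m
        idx = np.arange(count)[:, None] + np.arange(dim)[None, :]
        vecs = x[idx]
        vecs = vecs - vecs.mean(axis=1, keepdims=True)   # assumed baseline removal
        # Assumed Chebyshev distance between every pair of vectors.
        d = np.max(np.abs(vecs[:, None, :] - vecs[None, :, :]), axis=2)
        sim = np.exp(-np.log(2.0) * (d / r) ** n)        # similarity degree
        np.fill_diagonal(sim, 0.0)                       # exclude j == i
        return sim.sum() / (count * (count - 1))

    return float(np.log(phi(m)) - np.log(phi(m + 1)))


# A regular signal should score lower than white noise.
rng = np.random.default_rng(0)
t = np.arange(500)
fe_sine = fuzzy_entropy(np.sin(2 * np.pi * t / 50))
fe_noise = fuzzy_entropy(rng.standard_normal(500))
```

The pairwise distance matrix makes memory grow with the square of the series length, which is acceptable for samples of about 1000 points but would need a blocked computation for much longer series.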

The owchart of the FE method is shown in gure 3.

2.2. Feature selection method

2.2.1. Joint mutual information maximization. JMIM uses the

following iterative greedy search algorithm to nd the relevant

feature subset of size kin the feature space.

I(fi,fs;C)=I(fs;C)+I(fi;C|fs),(12)

(1) For a feature set F={f1,f2, …, fN}, the feature selec-

tion process identies a feature subset Swith dimension k,

where k≤N, and S⊆F. Theoretically, the selected feature

subset Sshould maximize the JMI between class label C

and feature subset Swith xed dimension k.

Figure 3. Flowchart of the FE method.

(2) Calculate the value of JMI between fi, fsand C:

where I(fs; C) represents the value of mutual information

between fsand C. The larger it is, the stronger the correlation

between fsand C. I (fi; C|fs) represents the value of mutual

information between fiand Cunder condition fs.

fJMIM =argmaxfi∈SI(fi,fs;C).(13)

(3) JMIM selects features according to the following criteria:

According to equation (13), JMIM considers the value of

each I(fi, fs; C). After the addition of feature fi, there is at

least feature fsin the subset, which makes the value of I(fi, fs;

C) larger than the condition of other features added.
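A greedy JMIM search can be sketched with plug-in mutual information estimates on discretized features. Several choices here are assumptions rather than details from the text: the equal-width binning in `discretize`, the number of bins, the helper names, and the max-min form of the selection rule (each candidate is scored by its smallest joint mutual information with the already selected features), which is the form in which JMIM is usually stated.

```python
import numpy as np


def mutual_info(a, b):
    """Plug-in estimate of I(A;B) in nats for two discrete 1-D arrays."""
    a_codes = np.unique(np.ravel(a), return_inverse=True)[1]
    b_codes = np.unique(np.ravel(b), return_inverse=True)[1]
    joint = np.zeros((a_codes.max() + 1, b_codes.max() + 1))
    np.add.at(joint, (a_codes, b_codes), 1.0)      # contingency table
    joint /= joint.sum()
    outer = joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / outer[nz])))


def joint_mi(fi, fs, c):
    """I(fi, fs; C): the pair of integer-coded features acts as one variable."""
    return mutual_info(fi * (fs.max() + 1) + fs, c)


def discretize(X, bins=5):
    """Equal-width binning of each column into integer codes 0..bins-1."""
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    width = np.where(hi > lo, hi - lo, 1.0)
    return np.clip(np.floor((X - lo) / width * bins).astype(int), 0, bins - 1)


def jmim_select(X, y, k, bins=5):
    """Greedy JMIM: start from the most relevant feature, then maximize the
    minimum joint mutual information with the features already selected."""
    Xd = discretize(X, bins)
    n_features = Xd.shape[1]
    relevance = [mutual_info(Xd[:, j], y) for j in range(n_features)]
    selected = [int(np.argmax(relevance))]
    while len(selected) < min(k, n_features):
        remaining = [j for j in range(n_features) if j not in selected]
        scores = [min(joint_mi(Xd[:, j], Xd[:, s], y) for s in selected)
                  for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected


# Toy data: only the middle column depends on the class label.
rng = np.random.default_rng(0)
y = rng.integers(0, 3, size=600)
X = np.column_stack([
    rng.standard_normal(600),
    y + 0.05 * rng.standard_normal(600),
    rng.standard_normal(600),
])
selected = jmim_select(X, y, k=2)
```

Plug-in estimates on binned data are biased for small samples, so the bin count should be kept modest relative to the number of samples per class.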

2.2.2. Embedded feature selection with LightGBM. EFS methods use the performance of the learning algorithm to evaluate the quality of the feature subset. Firstly, for a feature subset to be evaluated, the EFS method needs to train the classifier in advance. After that, the weight coefficient of each feature can be obtained according to certain indexes, such as the regularization term or loss function. Finally, the features are selected and ranked according to the weight coefficients.

Most GBDT algorithms, such as XGBoost, use an inefficient decision tree growth strategy called the level-wise method. LightGBM replaces it with the leaf-wise method to split the nodes of the weak learner. When the tree model is selected as the base learner of LightGBM, the sum of the information gain of each feature, or the number of times each feature is used during the splitting process, can be obtained after training the model, and the features used can be ranked accordingly.
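The two ranking criteria mentioned above (total gain and usage frequency) can be illustrated without any boosting library. In this sketch the list of `(feature, gain)` split records is made up for illustration and `rank_features` is a hypothetical helper; in LightGBM the analogous quantities are exposed as the "gain" and "split" importance types.

```python
from collections import defaultdict


def rank_features(splits, importance_type="gain"):
    """Rank feature indices from (feature, gain) split records, best first.

    importance_type="gain" sums the gain of every split that used a feature;
    any other value counts how many splits used the feature.
    """
    totals = defaultdict(float)
    for feature, gain in splits:
        totals[feature] += gain if importance_type == "gain" else 1.0
    return sorted(totals, key=lambda f: totals[f], reverse=True)


# Hypothetical split records collected from a trained tree ensemble.
splits = [(2, 8.0), (0, 3.5), (2, 1.5), (1, 0.4), (0, 0.3), (0, 0.2)]
by_gain = rank_features(splits, "gain")
by_split = rank_features(splits, "split")
```

The two criteria can disagree, as they do here: feature 2 contributes the most gain, while feature 0 is used most often, so the choice of criterion affects the order in which features are added.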

2.3. Bayesian optimization

In this paper, a Bayesian optimization [46] algorithm is used to optimize the hyperparameters of the classification model. The main idea of Bayesian optimization is that, for a given objective function, the posterior distribution of the objective function is updated by constantly adding sample points until the posterior distribution basically fits the real distribution, so that the current parameters can be adjusted better. There are two core components in Bayesian optimization: the prior function (PF) and the acquisition function (AC). To optimize the objective function, the balance between exploration and exploitation must be considered.

Suppose δ = δ1, δ2, …, δn represents the hyperparameters of classifier C, and $D_{\mathrm{train}}$ and $D_{\mathrm{valid}}$ are the training set and validation set, respectively. $A(C, \delta, D_{\mathrm{train}}, D_{\mathrm{valid}})$ and $L(C, \delta, D_{\mathrm{train}}, D_{\mathrm{valid}})$ denote the classification accuracy and validation loss of C, respectively. K-fold cross-validation is applied, and the objective function of optimization can be described as

$$f(\delta) = \arg\max \left( \frac{1}{K} \sum_{i=1}^{K} A(C, \delta, D_{\mathrm{train}}, D_{\mathrm{valid}}) \right), \tag{14}$$

$$f(\delta) = \arg\min \left( \frac{1}{K} \sum_{i=1}^{K} L(C, \delta, D_{\mathrm{train}}, D_{\mathrm{valid}}) \right), \tag{15}$$

where the sum runs over the K cross-validation folds.

During the parameter optimization the model is trained repeatedly, and the classification performance of each parameter combination is evaluated by calculating the objective function. Compared with grid search or random search [47], Bayesian optimization has the following advantages: firstly, a Gaussian process is adopted to continuously update the prior by considering the information of previously evaluated parameters; secondly, the number of iterations is small and the speed is fast; finally, it remains robust for non-convex problems and tends to find a global rather than a local optimum.
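The quantity that the optimizer evaluates for each hyperparameter setting, the mean accuracy over K folds, can be sketched as follows. The helper names and the nearest-centroid classifier are assumptions made for a self-contained example; in the paper's setting the `fit_predict` callable would wrap a classifier built with the hyperparameters δ under evaluation.

```python
import numpy as np


def kfold_mean_accuracy(fit_predict, X, y, n_folds=10, seed=0):
    """Mean validation accuracy over n_folds shuffled folds."""
    X, y = np.asarray(X), np.asarray(y)
    order = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(order, n_folds)
    accuracies = []
    for i in range(n_folds):
        valid_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        predictions = fit_predict(X[train_idx], y[train_idx], X[valid_idx])
        accuracies.append(np.mean(predictions == y[valid_idx]))
    return float(np.mean(accuracies))


def nearest_centroid(X_train, y_train, X_valid):
    """Stand-in classifier: assign each sample to the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(X_valid[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]


# Two well-separated clusters, so every fold is classified perfectly.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 50)
X = rng.standard_normal((100, 2)) * 0.1 + y[:, None] * 5.0
score = kfold_mean_accuracy(nearest_centroid, X, y, n_folds=10)
```

A Bayesian optimizer would call such a function once per proposed hyperparameter combination and use the returned score to update its surrogate model.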

3. Case I: MFS dataset verification

3.1. Data description and experimental setup

Firstly, a dataset of mixed rotor and bearing faults from the Machinery Fault Simulator (MFS) platform was used to verify the effectiveness of the proposed approach [48, 49]. As shown in figure 4, the experimental platform (model number: MFS2010-PK3) adopted here was developed by the Spectra Quest company in the United States [48], and is composed of an AC motor, coupling, acceleration sensor, rotor, rolling bearing, centering adjustment plate, data acquisition box and inverter. The data were collected under a single operational condition with a 6 kHz sampling frequency, and the motor speed was 2100 rpm. The dataset used includes 10 fault types; the details are shown in table 1.

Table 1. Details of 10 types of faults.

| Fault type | Label |
|---|---|
| Central bent | 1 |
| Cocked rotor | 2 |
| Couple bent | 3 |
| Eccentric rotor | 4 |
| Unbalanced rotor | 5 |
| Ball defect | 6 |
| Inner race defect | 7 |
| Outer race defect | 8 |
| Normal | 9 |
| Combination defect of inner and outer race | |

Figure 4. Experimental platform for rotor test [49].

These fault types have a total of 1600 samples, with 160 samples per type and 1000 data points per sample. Python 3.7.3 is used for algorithm design and development in this paper, and the experimental platform is configured with an Intel Core i5-6000hq CPU and 12 GB of RAM.

3.2. Parameter settings

Here, the number of filters of FTMFD is set to 32. That is to say, the frequency band of each sample is evenly divided into 32 parts, and the FE value of each sub-signal is calculated separately. Therefore, each sample corresponds to 32 fault features. The parameters of FE are set as follows: the embedding dimension m = 2, the time delay λ = 1, the similarity tolerance r = 0.15δ (δ is the standard deviation of the time series), and the gradient of the similarity tolerance n = 2. The dataset, composed of the extracted features and the category labels, is divided into a training set and a testing set. In order to eliminate the influence of contingency in sample division, the training set is further divided into a training set and a validation set according to 10-fold cross-validation, and then the trained model is used for classification prediction on the testing set. The main hyperparameters of LightGBM, which are determined by the Bayesian optimization algorithm, are shown in table 2. The parameters of Bayesian optimization are listed in table 3. The objective of optimization, i.e. the output of the algorithm, is to maximize f(δ) in equation (14).

Table 2. The main hyperparameters of LightGBM.

| Parameter | Value |
|---|---|
| Objective | Multiclass |
| Number of classes | 10 |
| Learning rate | 0.545 |
| Number of boosting iterations | 827 |
| L1 regularization | 0.001 |
| L2 regularization | 0.268 |
| Max number of leaves in one tree | 5 |
| Limit the max depth for tree model | 4 |
| The number of seeds used to generate other seeds | 50 |
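For readers who want to reproduce the configuration, the values of table 2 can be written as a LightGBM-style parameter dictionary. The mapping from the table's descriptions to parameter names is an assumption based on LightGBM's documented parameter names (in particular, the last row is read as the `seed` parameter); the paper itself lists only the descriptions and values.

```python
# Table 2 values expressed with LightGBM parameter names (assumed mapping).
lgbm_params = {
    "objective": "multiclass",
    "num_class": 10,
    "learning_rate": 0.545,
    "num_iterations": 827,
    "lambda_l1": 0.001,
    "lambda_l2": 0.268,
    "num_leaves": 5,
    "max_depth": 4,
    "seed": 50,  # assumed reading of "seeds used to generate other seeds"
}

# Sanity check: a tree of depth d can hold at most 2**d leaves.
max_leaves_for_depth = 2 ** lgbm_params["max_depth"]
leaves_within_depth_limit = lgbm_params["num_leaves"] <= max_leaves_for_depth
```

With a depth limit of 4 a tree could hold up to 16 leaves, so the leaf limit of 5 is the binding constraint on tree size in this configuration.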

Table 3. Details of main hyperparameters setting of Bayesian optimization.

| Parameter | Value |
|---|---|
| Prior Function | Gaussian process regression |
| Acquisition Function | Probability of Improvement |
| Random State | 30 |
| Init_Points (a) | 100 |
| N_iter (b) | 100 |

(a) Init_points is the number of random exploration steps to be performed.

(b) N_iter is the number of Bayesian optimization steps to be performed.
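Table 3 names the probability of improvement as the acquisition function without giving its form. For orientation, the textbook definition for a maximization problem with a Gaussian process surrogate is reproduced below; the symbols $\mu$, $\sigma$, $\delta^{+}$ and $\xi$ are notation introduced here, not taken from the paper.

$$\mathrm{PI}(\delta) = \Phi\!\left( \frac{\mu(\delta) - f(\delta^{+}) - \xi}{\sigma(\delta)} \right)$$

Here $\mu(\delta)$ and $\sigma(\delta)$ are the posterior mean and standard deviation of the surrogate at $\delta$, $f(\delta^{+})$ is the best objective value observed so far, $\xi \ge 0$ is an exploration margin, and $\Phi$ is the standard normal cumulative distribution function. A larger $\xi$ favours exploration, while $\xi = 0$ favours exploitation near the current best point.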

3.3. Discussion

3.3.1. Research on different proportions of training samples. The number of training samples affects the classification accuracy. In order to illustrate the advantages of the proposed feature extraction method and the LightGBM classifier, different proportions of training and testing samples are set in this section. The classification results and training time are shown in figure 5. The classification accuracy on the training set reaches 100% in all experiments. When the ratio of training to testing samples is set to 9:1, the accuracy on the testing set reaches 100%. However, a small testing set is subject to accidental influence, whereas with a ratio of 3:2 the trained model performs well on both the validation set and the testing set, so the results at this ratio are taken as the final classification result. There is a positive correlation between model training time and the ratio, but owing to the speed of LightGBM the training time varies only slightly, ranging from 1.0 s to 3.1 s.

3.3.2. Comparison with different decomposition methods. To highlight the advantages of FTMFD, four decomposition methods, EMD, LMD, VMD and WPD, are used for comparison. According to the actual decomposition, the FEs of the first six components of the decomposition sub-signals of EMD and the first four components of LMD are calculated as fault features. According to reference [29], the number of IMFs of VMD can be determined by referring to EMD, and it is set to 6 in this paper. Therefore, the FEs of the first six, four and six components corresponding to EMD, LMD and VMD, respectively, are calculated as features. In addition, the WBF of WPD is selected based on the principle of the ratio of maximum energy to Shannon entropy [30]. Considering the characteristics of vibration signals of rotating machinery and different WBFs, the 'db' wavelet, 'sym' wavelet and 'coif' wavelet are considered here, among which 'coif3' is selected as the optimal WBF by calculation. If the signal is decomposed by a WPD with level layers, the frequency resolution can be calculated as $df = f_s / 2^{level+1}$, where the sampling frequency $f_s$ is 6 kHz. The more sub-bands the signal is divided into, the greater the computation and the feature redundancy. Therefore, the number of decomposition layers is selected as 5, which gives df = 6000/2^6 = 93.75 Hz. That is to say, the number of sub-signals of WPD is 2^5 = 32, which is the same as for FTMFD.

Figure 5. Different proportions of training and testing samples.

Table 4. Time consumption on model training and feature extraction of different decomposition methods.

| Method | Feature extraction time (s)/sample | Training time (s) |
|---|---|---|
| FTMFD | 1.22 | 2.76 |
| EMD | 0.01 | 3.18 |
| LMD | 0.21 | 2.29 |
| VMD | 1.28 | 2.53 |
| WPD | 1.59 | 2.78 |

Table 5. Details of main parameters of the t-SNE setting.

| Parameter | Value |
|---|---|
| Algorithm | Exact |
| NumPCAComponents | 10 |
| Perplexity | 40 |
| NumDimensions | 2 |
| Standardize | False |
| LearnRate | 2000 |

The experimental results are shown in figure 6 and table 4. As can be seen from figure 6, except for EMD, the training accuracy of all the decomposition methods reaches 100%. The classification accuracies of EMD, LMD and VMD on the validation and testing sets are all lower than that of FTMFD, and the classification results of WPD and FTMFD are similar. As can be seen from table 4, EMD consumes the least time in feature extraction, and FTMFD is faster than WPD when the number of decomposed sub-signals is the same.

Table 6. Details of main parameters of the different settings of the entropy-based methods.

| Parameter | SE | PE | DE | SDE |
|---|---|---|---|---|
| Number of classes | / | / | 10 | / |
| Time delay factor | 1 | 1 | 1 | 1 |
| Embedding dimension | 2 | 3 | 3 | 3 |
| Similarity tolerance | 0.15*STD (a) | / | / | / |
| Symbol interval number | / | / | / | 8 |

(a) STD is the standard deviation of a signal.

Table 7. Time consumption of model training and feature extraction of different entropy-based methods.

| Entropy | Feature extraction time (s)/sample | Training time (s) |
|---|---|---|
| FE | 1.22 | 2.76 |
| SE | 0.85 | 2.80 |
| PE | 0.15 | 4.94 |
| DE | 3.73 | 3.57 |
| SDE | 0.19 | 3.92 |

t-SNE is a common method used in data simplification and feature visualization [50]. Visualizing the feature samples with t-SNE shows the advantages and disadvantages of the different feature extraction methods more intuitively. The main parameters of t-SNE are listed in table 5. The results are shown in figure 7, where it is easy to see that the features extracted by FTMFD and WPD are easier to distinguish than those extracted by EMD, LMD and VMD.

3.3.3. Comparison with different entropy-based methods. The purpose of this section is to discuss the advantages and disadvantages of different entropy-based methods in terms of their ability to extract fault features and their time consumption. In this part, the SEs, PEs, DEs and SDEs of the sub-signals decomposed by FTMFD are calculated, and their classification results are compared with those of the proposed approach using FE. The parameter settings of all entropy-based methods are shown in table 6.

The experimental results are shown in figure 8 and table 7. The training accuracy of all the methods reaches 100%. The classification accuracies of FTMFD-PE on the validation and testing sets are the lowest, at only 87.14% and 85.42%, respectively. The testing accuracy of FTMFD-SE is second only to that of FTMFD-FE. It can be seen from table 7 that FTMFD-PE is the fastest in feature extraction but the slowest in model training. Therefore, among these entropy-based methods, although the feature extraction time of FE is relatively long, its classification accuracy is the highest and its model training time is the shortest.

3.3.4. Comparison with different classification algorithms. To illustrate that the feature extraction method proposed in this paper has satisfactory fault diagnosis capability in combination with different classifiers, SVM, RF, SAE, XGBoost and CatBoost are compared with LightGBM in this section. The main hyperparameters of these classification algorithms are again determined by the Bayesian optimization algorithm. The features extracted by FTMFD-FE are input into these different classifiers, and the classification results are shown in figure 9. As can be seen in figure 9, except for SVM and SAE, the training accuracy of the classifiers reaches 100%. The average accuracy on the validation set is highest for RF (98.75%). The testing accuracy of LightGBM is the highest, and that of CatBoost is second. The time consumption of model training for the different classifiers is shown in table 8; the fastest is SVM, at only 0.51 s. CatBoost and SAE are slower, and SAE takes the longest time (178.82 s). Therefore, if the hyperparameters are selected suitably, there is little difference in the classification results of the different classifiers, while the time consumption of model training varies greatly, and an appropriate classifier should be selected according to actual needs. The experimental results show that the fault features extracted by the proposed method are easy to classify and recognize, and they reflect the classification advantages of LightGBM in the fault diagnosis of rotating machinery.

Table 8. Time consumption of model training of different classification algorithms.

| Classifier | Training time (s) |
|---|---|
| LightGBM | 2.76 |
| SVM | 0.51 |
| RF | 47.01 |
| XGBoost | 5.91 |
| CatBoost | 36.64 |
| SAE | 178.82 |

3.3.5. Comparison with different feature selection methods.

The number of features affects the model training time and the classification accuracy of classifiers. In order to illustrate the advantages of the proposed feature selection method, ReliefF [51], JMI and mRMR are compared in this section. The first-round candidate features are selected by the above methods. Then LightGBM is used to rank these features; the curves relating the number of features to the model training time and the classification accuracy can be obtained according to the ranking results.

The experimental results are shown in figures 10(a) and (b). It can be seen from figure 10(a) that the training accuracy of JMIM-LightGBM is the highest when two features are used (99.79%). As can be seen from figure 10(b), when the number of features is small, it is negatively correlated with the model training time; the reason may be that having fewer features is not conducive to model building. When the number of features is greater than 10, the relationship becomes positive, which is consistent with engineering experience. The proposed method is clearly superior to the other methods when fewer features are used: when 3 features are used, the classification accuracy is more than 90%, while the other methods require 4 or more features; when 12 features are used, the classification accuracy reaches the maximum (99.22%), and the model training time is only 1.75 s. Therefore, by using the feature selection method, fewer features can be selected in order to achieve a better classification result and a shorter model training time.

Meas. Sci. Technol. 32 (2021) 015004 C Zhang et al

Figure 6. Classification results of different decomposition methods.

Figure 7. Feature visualization of different decomposition methods using t-SNE. (a) Combination of FTMFD and FE; (b) combination of EMD and FE; (c) combination of LMD and FE; (d) combination of VMD and FE; (e) combination of WPD and FE.
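The first-round selection step of section 3.3.5 can be illustrated with a small sketch of the JMIM criterion as it is commonly described: the first feature maximizes I(f; C), and each later feature maximizes the minimum joint mutual information I(f, s; C) over the already-selected features s. The sketch assumes discrete-valued features and a plug-in (counting) estimate of mutual information; continuous FE features would first need discretization or a different estimator. The names `mutual_information` and `jmim_select` and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two sequences of hashable symbols."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(c / n * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def jmim_select(features, labels, k):
    """Greedy JMIM selection; `features` maps a name to a list of discrete values."""
    remaining = set(features)
    # First pick: the single most informative feature about the class label.
    first = max(sorted(remaining),
                key=lambda f: mutual_information(features[f], labels))
    selected = [first]
    remaining.remove(first)
    while remaining and len(selected) < k:
        def score(f):
            # Worst-case joint information of f paired with any selected feature.
            return min(
                mutual_information(list(zip(features[f], features[s])), labels)
                for s in selected
            )
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: three classes, four samples each.
labels = [0] * 4 + [1] * 4 + [2] * 4
features = {
    "a":      [0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1],  # isolates class 0
    "a_copy": [0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1],  # noisy duplicate of "a"
    "b":      [0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1],  # separates class 1 from class 2
    "noise":  [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],  # unrelated to the label
}
chosen = jmim_select(features, labels, 2)
```

On this toy data the near-duplicate of the first feature is passed over in favor of the complementary one, which is the redundancy-reducing behavior the section attributes to the selection stage.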

4. Case II: KAT bearing dataset verification

4.1. Data description and experimental setup

The KAT bearing damage dataset was provided by the KAT data center at Paderborn University [52]. The hardware configuration and settings of the experimental platform are described in [52]. There are 15 datasets, which can be categorized into three classes as shown in table 9. The K0-series (K001–K005) represents the healthy condition, the KA-series (KA04, KA15, KA16, KA22, KA30) represents outer bearing ring damage and the KI-series (KI04, KI14, KI16, KI18, KI21) represents inner bearing ring damage. The experiments are conducted with four different sets of operating parameters, and the details are shown in table 10. Each experiment is repeated 20 times, and the sampling frequency is 64 kHz. It should be noted that the damage in these datasets is real damage caused by accelerated lifetime tests [53].

Figure 8. Classification results of different entropy-based methods.

Figure 9. Classification results of different classification algorithms.

Figure 10. Comparison of different feature selection methods. (a) Classification results of the training set; (b) classification results of the testing set and model training time.

The details of the datasets used for experimental verification are shown in table 11. Datasets D1 to D4 each contain the different fault types under a single working condition, and dataset D5 contains all four working conditions. Each sample contains 2560 non-overlapping data points, with a total of 1200 samples (100 samples for each fault type in each working condition).

Table 9. Categorization of datasets.

Healthy (Class 1)    Outer ring damage (Class 2)    Inner ring damage (Class 3)
K001                 KA04                           KI04
K002                 KA15                           KI14
K003                 KA16                           KI16
K004                 KA22                           KI18
K005                 KA30                           KI21

Table 10. Four operation parameters.

No.    Rotational speed (rpm)    Load torque (Nm)    Radial force (N)    Condition number
0      1500                      0.7                 1000                C0
1      900                       0.7                 1000                C1
2      1500                      0.1                 1000                C2
3      1500                      0.7                 400                 C3

Table 11. Details of the datasets of different working conditions.

Dataset label    Working condition    Number of samples
D1               C0                   300
D2               C1                   300
D3               C2                   300
D4               C3                   300
D5               C0, C1, C2, C3       1200

4.2. Feature visualization

The parameters of FTMFD and FE are set as in section 3.2; the features extracted by FTMFD-FE are compressed into two dimensions by t-SNE. The main parameters of t-SNE are still those in table 5 in section 3.3.2 and the results are shown in figure 11. It can be seen that the three fault types are quite distinct, which indicates that the proposed feature extraction method can effectively extract fault features of the rolling bearing.

Figure 11. t-SNE visualization of features extracted by FTMFD-FE.

Table 12. The main hyperparameters of LightGBM.

Parameter                             Value
Objective                             Multiclass
Number of classes                     3
Learning rate                         0.071
Number of boosting iterations         816
L1 regularization                     0.957
L2 regularization                     0.583
Max number of leaves in one tree      4
Limit the max depth for tree model    2
Seed used to generate other seeds     50

4.3. Discussion

For the data after feature extraction, the ratio of training to testing samples is still set as 3:2. The main hyperparameters of LightGBM are still optimized by the Bayesian optimization algorithm, and the results are shown in table 12. The classification results when all 32 features are used are shown in figure 12. The training accuracy on all the datasets reaches 100%. For the single working condition datasets (D1 to D4), the testing accuracy reaches 100% except for D3, for which it is 97.78%; the testing accuracy on D5, which covers the most complex working conditions, is 99.72%. In addition, in reference [53], the prediction accuracy of the negative correlation ensemble transfer learning method (NCTE) on the KAT bearing dataset is 98.73%, which is slightly lower than the accuracy achieved by the method proposed in this paper.
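As a rough guide for reproducing the setup, the entries of table 12 can be written as a LightGBM parameter dictionary. The mapping from the table's row labels to the parameter names below is an assumption based on the LightGBM parameter documentation; the paper does not list the parameter names, and the helper `max_leaves_for_depth` is hypothetical.

```python
# Table 12 values expressed with LightGBM parameter names (assumed mapping).
lgbm_params = {
    "objective": "multiclass",   # Objective
    "num_class": 3,              # Number of classes
    "learning_rate": 0.071,      # Learning rate
    "num_iterations": 816,       # Number of boosting iterations
    "lambda_l1": 0.957,          # L1 regularization
    "lambda_l2": 0.583,          # L2 regularization
    "num_leaves": 4,             # Max number of leaves in one tree
    "max_depth": 2,              # Limit the max depth for tree model
    "seed": 50,                  # Seed used to generate other seeds
}


def max_leaves_for_depth(depth):
    """Largest number of leaves a binary tree of the given depth can have."""
    return 2 ** depth


# The depth limit and the leaf limit in table 12 agree with each other.
leaf_cap = max_leaves_for_depth(lgbm_params["max_depth"])
```

Such a dictionary would typically be passed to `lightgbm.train` together with a training `Dataset`. A depth limit of 2 already caps each tree at 4 leaves, so the two tree-size settings describe the same very shallow trees.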

4.3.1. Comparison with different feature selection methods.

Following the experiments in section 3.3.5, JMIM is again compared with ReliefF, JMI and mRMR, and the results are shown in figure 13. As can be seen from figure 13(a), the training accuracy of JMIM-LightGBM reaches 100% when two features are used, while the other methods need more features. As can be seen from figure 13(b), when only one feature is selected, the testing accuracy of JMIM-LightGBM is relatively low, only 77.78%, while that of mRMR-LightGBM is the highest (89.17%). However, when 2 features are selected, the testing accuracy of the proposed method reaches 97.50%, and it reaches its maximum (99.72%) when 14 features are used; meanwhile the model training time is only 0.36 s. The results further indicate the effectiveness of the feature extraction method.

4.3.2. Comparison with different classification algorithms.

In this section, LightGBM is compared with other classifiers including SVM, RF and CatBoost. JMIM is used to select candidate features, and the classification results are shown in figure 14. As can be seen from figure 14(a), the training accuracy of RF is the highest when one feature is used, at 99.52%, while the accuracy of LightGBM reaches 100% when two or more features are used. According to the classification accuracy curve in figure 14(b), when one feature is used, the testing accuracy of SVM is the highest (91.11%). When the number of features is small, there is little difference in the classification accuracy of the four classifiers. But when the number of features is greater than seven, the classification accuracy of LightGBM is slightly higher than that of the other classifiers. In terms of model training time, LightGBM is second only to SVM, while CatBoost consumes the longest time. The experimental results further illustrate the speed and superiority of the LightGBM classifier.

Figure 12. Classification accuracy on datasets of different working conditions.

Figure 13. Comparison of different feature selection methods. (a) Classification results of the training set; (b) classification results of the testing set and model training time.

Figure 14. Comparison of different classifiers. (a) Classification results of training set; (b) classification results of testing set and model training time.

5. Conclusion

In this paper, a new fault diagnosis approach for the key components of rotating machinery based on FTMFD-FE, JMIM and LightGBM is proposed. For non-linear and non-stationary mechanical vibration signals, the combination of FTMFD and FE for monitoring signal pretreatment and feature extraction can effectively extract hidden mechanical fault features. While retaining the advantages of information entropy, it overcomes the problem that traditional multiscale analysis cannot effectively extract high-frequency information and helps improve classification accuracy. On this basis, fault feature selection based on JMIM and LightGBM is used to effectively reduce redundant features and simplify classifier model construction, and thus the model training time can be reduced. Finally, the effectiveness of the proposed approach is experimentally verified on the MFS dataset and the KAT bearing dataset by comparative experiments on signal decomposition, feature extraction and feature selection, respectively. The experimental results show that the proposed approach can effectively identify the fault states of the key components of rotating machinery. Moreover, the effectiveness of the proposed approach under multiple working conditions is also verified on the KAT bearing dataset, where the classification accuracy reaches 99.72%.

The actual working environment of the key components of rotating machinery is more complex and changeable, so future research will focus on feature extraction and classification of vibration signals under more complex working conditions. In addition to the combination with information entropy, the combination of FTMFD with dimensionless time-domain indexes such as kurtosis for feature extraction can also be studied in future work.

Acknowledgments

The work here is supported by the National Key Research and Development Program of China (No. 2018YFB2003303), the Fundamental Research Funds for the Central Universities (No. 2019kfyXJJS137), the research fund (No. 61400020401), and the Nondestructive Detection and Monitoring Technology for High Speed Transportation Facilities, Key Laboratory of Ministry of Industry and Information Technology (Nos. KL2019W003 and KL2019W004).

ORCID iDs

Changhe Zhang https://orcid.org/0000-0001-7046-9240

Qi Xu https://orcid.org/0000-0002-9795-1616

Kaibo Zhou https://orcid.org/0000-0003-0055-3193

Hao Pan https://orcid.org/0000-0001-9324-0545

References

[1] Liu R, Yang B, Zio E and Chen X 2018 Artificial intelligence for fault diagnosis of rotating machinery: a review Mech. Syst. Signal Process. 108 33–47

[2] Liu J, Hu Y, Wang Y, Wu B, Fan J and Hu Z 2018 An integrated multi-sensor fusion-based deep feature learning approach for rotating machinery diagnosis Meas. Sci. Technol. 29 055103

[3] Lei Y, Lin J, He Z and Zuo M J 2013 A review on empirical mode decomposition in fault diagnosis of rotating machinery Mech. Syst. Signal Process. 35 108–26

[4] Wei Y et al 2019 A review of early fault diagnosis approaches and their applications in rotating machinery Entropy 21 409

[5] Li Y, Li G, Yang Y, Liang X and Xu M 2018 A fault diagnosis scheme for planetary gearboxes using adaptive multi-scale morphology filter and modified hierarchical permutation entropy Mech. Syst. Signal Process. 105 319–37

[6] Li Y, Xu M, Wang R and Huang W 2016 A fault diagnosis scheme for rolling bearing based on local mean decomposition and improved multiscale fuzzy entropy J. Sound Vib. 360 277–99

[7] Li Y, Yang Y, Li G, Xu M and Huang W 2017 A fault diagnosis scheme for planetary gearboxes using modified multi-scale symbolic dynamic entropy and mRMR feature selection Mech. Syst. Signal Process. 91 295–312

[8] Gao Y and Yu D 2020 Total variation on horizontal visibility graph and its application to rolling bearing fault diagnosis Mech. Mach. Theory 147 103768

[9] Wang L and Shao Y 2020 Fault feature extraction of rotating machinery using a reweighted complete ensemble empirical mode decomposition with adaptive noise and demodulation analysis Mech. Syst. Signal Process. 138 106545

[10] Medina R, Macancela J C, Lucero P, Cabrera D, Cerrada M, Sánchez R-V and Vásquez R E 2019 Vibration signal analysis using symbolic dynamics for gearbox fault diagnosis Int. J. Adv. Manuf. Technol. 104 2195–214

[11] Wen X et al 2020 Graph modeling of singular values for early fault detection and diagnosis of rolling element bearings Mech. Syst. Signal Process. 145 106956

[12] Liu Q, Pan H, Zheng J, Tong J and Bao J 2019 Composite interpolation-based multiscale fuzzy entropy and its application to fault diagnosis of rolling bearing Entropy 21 292

[13] Yan X and Jia M 2019 Intelligent fault diagnosis of rotating machinery using improved multiscale dispersion entropy and mRMR feature selection Knowl. Based Syst. 163 450–71

[14] Chen L and Wan S 2020 Mechanical fault diagnosis of high-voltage circuit breakers using multi-segment permutation entropy and a density-weighted one-class extreme learning machine Meas. Sci. Technol. 31 85107

[15] Pincus S 1995 Approximate entropy (ApEn) as a complexity measure Chaos 5 110–7

[16] Richman J S and Moorman J R 2000 Physiological time-series analysis using approximate entropy and sample entropy Am. J. Physiol. Heart Circ. Physiol. 278 H2039–49

[17] Chen W, Zhuang J, Yu W and Wang Z 2009 Measuring complexity using FuzzyEn, ApEn, and SampEn Med. Eng. Phys. 31 61–68

[18] Bandt C and Pompe B 2002 Permutation entropy: a natural complexity measure for time series Phys. Rev. Lett. 88 174102


[19] Rostaghi M and Azami H 2016 Dispersion entropy: A measure for time-series analysis IEEE Signal Process. Lett. 23 610–4

[20] Li Y, Wang X, Liu Z, Liang X and Si S 2018 The entropy algorithm and its variants in the fault diagnosis of rotating machinery: A review IEEE Access 6 66723–41

[21] Costa M, Goldberger A L and Peng C K 2005 Multiscale entropy analysis of biological signals Phys. Rev. E 71 021906

[22] Wu S D, Wu C W, Lin S G, Lee K-Y and Peng C-K 2014 Analysis of complex time series using refined composite multi-scale entropy Phys. Lett. A 378 1369–74

[23] Wu S D, Wu C W, Lee K Y and Lin S-G 2013 Modified multiscale entropy for short-term time series analysis Physica A 392 5865–73

[24] Huang N E, Shen Z, Long S R, Wu M C, Shih H H, Zheng Q, Yen N-C, Tung C C and Liu H H 1998 The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis Proc. R. Soc. Lond. A 454 903–95

[25] Smith J S 2005 The local mean decomposition and its application to EEG perception data J. R. Soc. Interface 2 443–54

[26] Dragomiretskiy K and Zosso D 2013 Variational mode decomposition IEEE Trans. Signal Process. 62 531–44

[27] Wang Y, He Z and Zi Y 2010 A comparative study on the local mean decomposition and empirical mode decomposition and their applications to rotating machinery health diagnosis J. Vib. Acoust. 132 2

[28] Liu W Y, Zhang W H, Han J G and Wang G F 2012 A new wind turbine fault diagnosis method based on the local mean decomposition Renew. Energy 48 411–5

[29] Li F, Li R, Tian L, Chen L and Liu J 2019 Data-driven time-frequency analysis method based on variational mode decomposition and its application to gear fault diagnosis in variable working conditions Mech. Syst. Signal Process. 116 462–79

[30] Kankar P K, Sharma S C and Harsha S P 2013 Fault diagnosis of rolling element bearing using cyclic autocorrelation and wavelet transform Neurocomputing 110 9–17

[31] Eren L and Devaney M J 2004 Bearing damage detection via wavelet packet decomposition of the stator current IEEE Trans. Instrum. Meas. 53 431–6

[32] Pan H, Zhou K B and Liu J 2019 A fault diagnosis method for rolling bearings based on Fourier transform multi-filter decomposition and permutation entropy Proc. 13th National Conf. Vibration Theory and Application (Chinese Society of Vibration Engineering) pp 259–64 (in Chinese)

[33] Rauber T W, de Assis Boldt F and Varejão F M 2014 Heterogeneous feature models and feature selection applied to bearing fault diagnosis IEEE Trans. Ind. Electron. 62 637–46

[34] Hu Q, Si X S, Zhang Q H and Qin A-S 2020 A rotating machinery fault diagnosis method based on multi-scale dimensionless indicators and random forests Mech. Syst. Signal Process. 139 106609

[35] Peng H, Long F and Ding C 2005 Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy IEEE Trans. Pattern Anal. Mach. Intell. 27 1226–38

[36] Yang H and Moody J 1999 Feature selection based on joint mutual information Proc. Int. ICSC Symp. Advances in Intelligent Data Analysis pp 22–25

[37] Bennasar M, Hicks Y and Setchi R 2015 Feature selection using joint mutual information maximisation Expert Syst. Appl. 42 8520–32

[38] Fu W et al 2020 Fault diagnosis for rolling bearings based on composite multiscale fine-sorted dispersion entropy and SVM with hybrid mutation SCA-HHO algorithm optimization IEEE Access 8 13086–104

[39] Zabalza J, Ren J, Zheng J, Zhao H, Qing C, Yang Z, Du P and Marshall S 2016 Novel segmented stacked autoencoder for effective dimensionality reduction and feature extraction in hyperspectral imaging Neurocomputing 185 1–10

[40] Zhou Q, Li Y, Tian Y and Jiang L 2020 A novel method based on nonlinear auto-regression neural network and convolutional neural network for imbalanced fault diagnosis of rotating machinery Measurement 161 107880

[41] Zhou K B, Zhang Z X, Liu J, Hu Z-X, Duan X-K and Xu Q 2018 Anode effect prediction based on a singular value thresholding and extreme gradient boosting approach Meas. Sci. Technol. 30 015104

[42] Ke G et al 2017 LightGBM: a highly efficient gradient boosting decision tree Adv. Neural Inf. Process. Syst. 30 3146–54

[43] Prokhorenkova L et al 2018 CatBoost: unbiased boosting with categorical features Adv. Neural Inf. Process. Syst. pp 6638–48

[44] Sun X, Liu M and Sima Z 2018 A novel cryptocurrency price trend forecasting model based on LightGBM Finance Res. Lett. 32 101084

[45] Zhang J, Wen H and Tang L 2019 Improved smoothing frequency shifting and filtering algorithm for harmonic analysis with systematic error compensation IEEE Trans. Ind. Electron. 66 9500–9

[46] Snoek J, Larochelle H and Adams R P 2012 Practical Bayesian optimization of machine learning algorithms Adv. Neural Inf. Process. Syst. pp 2951–9

[47] Bergstra J and Bengio Y 2012 Random search for hyper-parameter optimization J. Mach. Learn. Res. 13 281–305

[48] Shan Y, Zhou J, Jiang W, Liu J, Xu Y and Zhao Y 2019 A fault diagnosis method for rotating machinery based on improved variational mode decomposition and a hybrid artificial sheep algorithm Meas. Sci. Technol. 30 055002

[49] Ge M et al 2020 A deep condition feature learning approach for rotating machinery based on MMSDE and optimized SAEs Meas. Sci. Technol. (accepted) (https://doi.org/10.1088/1361-6501/ab89e3)

[50] Maaten L and Hinton G 2008 Visualizing data using t-SNE J. Mach. Learn. Res. 9 2579–605

[51] Robnik-Šikonja M and Kononenko I 2003 Theoretical and empirical analysis of ReliefF and RReliefF Mach. Learn. 53 23–69

[52] Lessmeier C et al 2016 Condition monitoring of bearing damage in electromechanical drive systems by using motor current signals of electric motors: a benchmark data set for data-driven classification Proc. Eur. Conf. Prognostics and Health Management Society pp 05–08

[53] Wen L, Gao L, Dong Y and Zhu Z 2019 A negative correlation ensemble transfer learning method for fault diagnosis based on convolutional neural network Math. Biosci. Eng. 16 3311–30


Improved Diagnostic Approach for BRB Detection and Classification in Inverter-Driven Induction Motors Employing Sparse Stacked Autoencoder (SSAE) and LightGBM

Article

Full-text available

Mar 2024

This study introduces an innovative approach to diagnostics, employing a unique combination of techniques including a stratified group K-fold cross-validation method and a sparse stacked autoencoder (SSAE) alongside LightGBM. By examining signatures derived from motor current, voltage, speed, and torque, the framework aims to effectively detect and classify broken rotor bars (BRBs) within inverter-fed induction machines. In this kind of cross-validation method, class labels and grouping factors are spread out across folds by distributing motor operational data attributes equally over target label stratification and extra grouping information. By integrating SSAE and LightGBM, a gradient-boosting framework, we elevate the precision and efficacy of defect diagnosis. The SSAE feature extraction algorithm proves to be particularly effective in identifying small BRB signatures within motor operational data. Our approach relies on comprehensive datasets collected from motor systems operating under diverse loading conditions, ranging from 0% to 100%. Using a sparse stacked autoencoder, the model lowers the dimensionality and noise of the motor fault data. It then sends the cleaned data to the LightGBM network for fault diagnosis. LightGBM leverages the attributes of the sparse stacked autoencoder to showcase the distinctive qualities associated with BRBs. This integration offers the potential to improve defect identification by furnishing input representations that are both more precise and more concise. The proposed model (SSAE with LightGBM) was trained using 80% of the data, while the remaining 20% was used for testing. To validate the proposed architecture, we evaluate the accuracy, precision, recall, and F1-scores of the results using motor global signals, with the help of confusion matrices with receiver operating characteristic (ROC) curves. 
Following the training of a new LightGBM model with refined hyperparameters through Bayesian optimization, we proceed to conduct the final classification utilizing the optimal feature subset. Evaluation of the test dataset indicates that the BRBs diagnostic framework facilitates the detection and classification of issues with induction motor BRBs, achieving accuracy rates of up to 99% across all loading conditions.

Classification of Aviation Incident Causes using LGBM with Improved Cross-Validation

Article

Apr 2024
J SYST ENG ELECTRON

Aviation accidents are currently one of the leading causes of significant injuries and deaths worldwide. This entices researchers to investigate aircraft safety using data analysis approaches based on an advanced machine learning algorithm. To assess aviation safety and identify the causes of incidents, a classification model with light gradient boosting machine (LGBM) based on the aviation safety reporting system (ASRS) has been developed. It is improved by k-fold cross-validation with hybrid sampling model (HSCV), which may boost classification performance and maintain data balance. The results show that employing the LGBM-HSCV model can significantly improve accuracy while alleviating data imbalance. Vertical comparison with other cross-validation (CV) methods and lateral comparison with different fold times comprise the comparative approach. Aside from the comparison, two further CV approaches based on the improved method in this study are discussed: one with a different sampling and folding order, and the other with more CV. According to the assessment indices with different methods, the LGBM-HSCV model proposed here is effective at detecting incident causes. The improved model for imbalanced data categorization proposed may serve as a point of reference for similar data processing, and the model's accurate identification of civil aviation incident causes can assist to improve civil aviation safety.

Enhanced Bearing Fault Diagnosis Through Trees Ensemble Method and Feature Importance Analysis

Article

Full-text available

May 2024

This research introduces a groundbreaking method for bearing defect detection. It leverages ensemble machine learning (ML) models and conducts comprehensive feature importance analysis. The key innovation is the training and benchmarking of three tree ensemble models—Decision Tree (DT), Random Forest (RF), and Extreme Gradient Boosting (XGBoost)—on an extensive experimental dataset (QU-DMBF) collected from bearing tests with seeded defects of varying sizes on the inner and outer raceways under different operating conditions. The dataset was meticulously prepared with categorical variable encoding and Min–Max data normalization to ensure consistent class distribution and model accuracy. Implementing the ML models involved a grid search method for hyperparameter tuning, focusing on reporting the models’ accuracy. The study also explores applying ensemble methods and using supervised and unsupervised learning algorithms for bearing fault detection. It underscores the value of feature importance analysis in understanding the contributions of specific inputs to the model’s performance. The research compares the ML models to traditional methods and discusses their potential for advanced fault diagnosis in bearing systems. The XGBoost model, trained on data from actual bearing tests, outperformed the others, achieving 92% accuracy in detecting bearing health and fault location. However, a deeper analysis of feature importance reveals that the models weigh certain experimental conditions differently—such as sensor location and motor speed. This research’s primary novelties and contributions are comparative evaluation, experimental validation, accuracy benchmarking, and interpretable feature importance analysis. This comprehensive methodology advances the bearing health monitoring field and has significant practical implications for condition-based maintenance, potentially leading to substantial cost savings and improved operational efficiency.

Quantitative condition assessment method for rotating machinery using fuzzy neural network

Article

Full-text available

May 2024
MEAS SCI TECHNOL

Health condition assessment of rotating machinery has been a persistent challenge. Traditional condition assessment methods often rely on single features, limiting their application to comprehensively measure the health condition of rotating machinery. This study introduced a quantitative condition assessment method for rotating machinery using fuzzy neural network (FNN). Initially, multi-domain features of signals from rotating machinery are extracted to achieve comprehensive representation of signals in the feature space. To eliminate redundant information of various features, a feature dimensionality reduction method is explored based on variance variation and stacked auto-encoder. Afterward, a normalized health indicator is constructed by integrating the optimized features through FNN, and it can indicate the current conditions of rotating machinery. Furthermore, an early anomaly alarm strategy based on 3σ criterion is designed for rotating machinery. The abnormal signal will be recognized automatically when it exceeds the predetermined thresholds. Last, the effectiveness of the proposed method is validated on IMS bearing dataset and XJTU-SY bearing dataset. The results show that the proposed method can effectively obtain the quantitative indicators that reflect the operation conditions of rotating machinery and can accurately detect the early abnormal signals.

Application of the CatBoost Model for Stirred Reactor State Monitoring Based on Vibration Signals

Article

Full-text available

Apr 2024
Comput Model Eng Sci

Stirred reactors are key equipment in production, and unpredictable failures will result in significant economic losses and safety issues. Therefore, it is necessary to monitor its health state. To achieve this goal, in this study, five states of the stirred reactor were firstly preset: normal, shaft bending, blade eccentricity, bearing wear, and bolt looseness. Vibration signals along x, y and z axes were collected and analyzed in both the time domain and frequency domain. Secondly, 93 statistical features were extracted and evaluated by ReliefF, Maximal Information Coefficient (MIC) and XGBoost. The above evaluation results were then fused by D-S evidence theory to extract the final 16 features that are most relevant to the state of the stirred reactor. Finally, the CatBoost algorithm was introduced to establish the stirred reactor health monitoring model. The validation results showed that the model achieves 100% accuracy in detecting the fault/normal state of the stirred reactor and 98% accuracy in diagnosing the type of fault.

Optimised LightGBM-based health condition evaluation method for the functional components in CNC machine tools under strong noise background

Article

Full-text available

Jan 2024
MEAS SCI TECHNOL

The accurate health condition evaluation of the functional components in computer numerical control (CNC) machine tools is an important prerequisite for predictive maintenance and fault warning. The vibration signals of the functional components in CNC machine tools often contain substantial noise, impeding the extraction of relevant health condition information from the vibration signals. This work presents an approach that leverages the variational mode decomposition (VMD) enhanced by the Artificial Hummingbird Algorithm (AHA) alongside the Light Gradient Boosting Machine (LightGBM) optimised through particle swarm optimisation (PSO) to evaluate the health condition of the functional components in CNC machine tools amidst pervasive noise. Initially, the AHA optimised the penalty factor (α) and the decomposition layer (K) within the VMD. This optimised VMD was subsequently applied to denoise the original vibration signals. After this denoising process, PSO was employed to optimise the learning rate and maximum tree depth within LightGBM. Health condition evaluation experiments were executed on the feed system and spindle of the CNC machine tool to validate the proposed methodology. Comparative analysis indicates that the proposed method attains paramount accuracy and computational efficiency, which are crucial for accurately evaluating the health condition of the functional components in CNC machine tools.

Application of the CatBoost Model for Stirred Reactor State Monitoring Based on Vibration Signals

Preprint

Full-text available

Jun 2023

Stirred reactor is a key equipment in the production process, and will result in large economic losses and safety issues when unpredictable failures occur. Therefore, it is necessary to monitor their health state. With this goal, firstly, this study presets five states of the stirred reactor: normal, shaft bending, blade eccentricity, bearing wear, and bolt looseness. x, y, z axes vibration signals are collected and analyzed in time and frequency domain. Secondly, 93 statistical features are extracted evaluated by Relieff, MIC and XGBoost. The above evaluation results are then fused by D-S evidence theory to obtain the final 16 features that are most relevant to the state of the stirred reactor. Finally, CatBoost algorithm is introduced to establish the health state monitoring model of the stirred reactor.The validation results show that accuracy of the proposed model is 100% for state recognition and 98% for fault diagnosis.

Investigations on Sample Entropy and Fuzzy Entropy for Machine Condition Monitoring: Revisited

Article

Full-text available

Aug 2023
MEAS SCI TECHNOL

Complexity measures, typically represented by entropy, are capable of detecting and characterizing underlying dynamic changes in a system, and they have been studied extensively for machine condition monitoring and fault diagnosis. Various entropies have been developed based on Shannon entropy to meet actual demands. Nevertheless, existing research on complexity measures mainly focuses on experimental studies, while their theoretical properties are still under investigation and not fully explored. In previous studies, it was theoretically and experimentally proved that two complexity measures, correlation dimension and approximate entropy, have a “bilateral reduction” effect. Since sample entropy and fuzzy entropy are two more advanced complexity measures that were developed based on the concepts of correlation dimension and approximate entropy, this paper continues the theoretical and experimental investigations with sample entropy and fuzzy entropy, exploring their theoretical properties to enrich the domain of complexity measure analysis and its applications to machine condition monitoring. Specifically, this paper theoretically proves and verifies that sample entropy and fuzzy entropy have a “bilateral reduction” effect similar to that of correlation dimension and approximate entropy, and that they are indeed complexity measures. The relationships between sample entropy, fuzzy entropy, and the key parameters used in their calculation are numerically and experimentally studied. Bearing and gear run-to-failure datasets are used to investigate the effectiveness of sample entropy and fuzzy entropy for bearing and gear condition monitoring, and the experimental results match the theoretical “bilateral reduction” effect well. Overall, this paper provides a guideline for the correct use of sample entropy and fuzzy entropy in engineering applications, especially machine condition monitoring.
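For readers unfamiliar with the measure discussed in the abstract above, the following is a minimal sketch of sample entropy, not the authors' code. It assumes the common textbook definition: Chebyshev distance between templates, self-matches excluded, and a tolerance r given as a fraction of the signal's standard deviation. The function name and the defaults `m=2` and `r_factor=0.2` are illustrative choices; fuzzy entropy is not covered here.

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy SampEn(m, r) of a 1-D signal (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    r = r_factor * x.std()  # tolerance as a fraction of the standard deviation

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        # Chebyshev distance between every pair of templates.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        # Count unordered pairs within tolerance, excluding self-matches.
        return (np.sum(dist <= r) - len(templates)) / 2

    b = count_matches(m)       # matches of length m
    a = count_matches(m + 1)   # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")    # undefined when no matches are found
    return -np.log(a / b)
```

A regular signal such as a sine wave should give a lower value than white noise of the same length, which is the sense in which the measure tracks complexity. The pairwise distance matrix makes this sketch quadratic in memory, so it suits short segments only.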

Research on fault diagnosis of industrial robots based on generative adversarial network

Article

Apr 2024

Fault Diagnosis of Bolt Loosening Based on LightGBM Recognition of Sound Signal Features

Article

Oct 2023

Under the strong vibration conditions of an inner curve radial plunger hydraulic motor, the bolt connection of the motor base loosens easily, and it is difficult to distinguish early loosening faults online in time by using vibration signals. In this article, we propose a bolt loosening fault diagnosis method that uses LightGBM to recognize sound signal features, and this method can achieve online monitoring of bolt loosening faults. On the vibration energy recovery test platform, the sound signals for four different bolt preload levels were collected during normal operation of the equipment, with the bolt preload increased from fully loosened (0 N·m) to 60 N·m in increments of 20 N·m, and the sound signals were denoised using the wavelet threshold denoising method. The LightGBM bolt loosening fault diagnosis model is constructed based on the gradient-based one-side sampling and exclusive feature bundling algorithms. By extracting the time- and frequency-domain features of the denoised sound signals, a dataset containing labels of normal and three faulty signals is generated for training and diagnosis. Finally, the diagnostic accuracy of this method is compared and verified. The results show that the LightGBM algorithm after wavelet threshold denoising improves the diagnostic accuracy by 2.17% over LightGBM without denoising and by 5.47% and 2.21% over the XGBoost algorithm before and after denoising, respectively.

A deep condition feature learning approach for rotating machinery based on MMSDE and optimized SAEs

Article

Full-text available

Dec 2020
MEAS SCI TECHNOL

The failure of rotating machinery affects product quality and the entire production process. However, fault diagnosis models usually suffer from the deficiency that their hyperparameters require constant debugging. This paper proposes a deep condition feature learning approach for rotating machinery based on modified multi-scale symbolic dynamic entropy (MMSDE) and optimized stacked auto-encoders (SAEs). Firstly, MMSDE is used to extract fault characteristics of the original vibration signal, because this method does not rely on prior knowledge and experience. MMSDE conducts multi-scale analysis on the original vibration signal and calculates the entropy value of the multi-scale signal, yielding the multiscale fault characteristics. Then, Bayesian optimization-based SAEs are applied to select feature samples and classify fault status in mechanical fault diagnosis without debugging. The effectiveness of the proposed method is verified by open source data and experimental data. Multiple working conditions are also considered and investigated.

Mechanical fault diagnosis of high-voltage circuit breakers using multi-segment permutation entropy and density-weighted one-class extreme learning machine

Article

Full-text available

May 2020
MEAS SCI TECHNOL

Condition monitoring for high-voltage circuit breakers (HVCBs) is of great significance for the safety of power grids. Based on machine-learning methods, most relevant studies have contributed significantly to improving the classification accuracy of known states. However, these studies have neglected the detection of unknown faults. In this study, a new one-class classifier, called a density-weighted one-class extreme learning machine (DW-OCELM), was proposed to detect unknown faults of HVCBs. The DW-OCELM determines the classification boundary considering data distribution by introducing the notion of density weight, such that samples located in low-density regions are more likely to be separated, improving detection performance. On this basis, a multi-class classifier was developed based on the homogeneous combination of multiple DW-OCELMs to classify known states. In addition, the proposed classifiers were trained based on multi-segment permutation entropy calculated from vibration signals. Experiments on a 35 kV HVCB demonstrated that the proposed methods outperformed other state-of-the-art techniques.
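The feature used in the abstract above builds on permutation entropy. The sketch below shows the standard Bandt-Pompe permutation entropy and one plausible reading of "multi-segment", namely computing it separately on equal-length segments of the signal. The segmentation scheme, the function names, and the defaults `order=3` and `delay=1` are assumptions for illustration and may differ from the paper's procedure.

```python
import math
from collections import Counter

import numpy as np

def permutation_entropy(x, order=3, delay=1, normalize=True):
    """Permutation entropy of a 1-D signal (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    n_patterns = len(x) - (order - 1) * delay
    # Each window is reduced to the rank order (ordinal pattern) of its samples.
    patterns = Counter(
        tuple(np.argsort(x[i:i + order * delay:delay], kind="stable"))
        for i in range(n_patterns)
    )
    probs = np.array(list(patterns.values()), dtype=float) / n_patterns
    h = -np.sum(probs * np.log(probs))
    # Normalizing by log(order!) maps the result into [0, 1].
    return h / math.log(math.factorial(order)) if normalize else h

def multi_segment_permutation_entropy(x, n_segments=4, order=3, delay=1):
    """Permutation entropy of each segment, returned as a feature vector.

    Splitting into near-equal segments is an assumed scheme, not the paper's.
    """
    segments = np.array_split(np.asarray(x, dtype=float), n_segments)
    return np.array([permutation_entropy(s, order, delay) for s in segments])
```

A monotonic ramp contains a single ordinal pattern and therefore scores zero, while white noise approaches the normalized maximum of one.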

Fault Diagnosis for Rolling Bearings Based on Composite Multiscale Fine-Sorted Dispersion Entropy and SVM with Hybrid Mutation SCA-HHO Algorithm Optimization

Article

Full-text available

Jan 2020

The health condition of rolling bearings has a significant impact on the safety and efficiency of rotating machinery. Accordingly, to diagnose faults in rolling bearings effectively and accurately, a novel hybrid approach coupling variational mode decomposition (VMD), composite multiscale fine-sorted dispersion entropy (CMFSDE), and a support vector machine (SVM) optimized by the mutation sine cosine algorithm and Harris hawks optimization (MSCAHHO) is proposed in the paper. Firstly, VMD is employed to decompose raw vibration signals with various fault types into different sets of intrinsic mode functions (IMFs) to weaken the non-stationarity of the signals; beforehand, the parameter K of VMD is determined through the central frequency observation method. Subsequently, CMFSDE is put forward in this paper to analyze the complexity of fault signals by fully considering the relationship between neighboring elements based on the composite multiscale technique, with which the representative features of different fault samples are extracted to construct feature vectors. Later, an enhanced hybrid optimization approach called MSCAHHO is proposed by integrating the sine cosine algorithm (SCA) and a periodic mutation strategy to improve Harris hawks optimization (HHO). Then, MSCAHHO is employed to optimize the parameters of the SVM, after which the optimal SVM model is utilized for fault classification. Finally, the performance of the proposed methodology is evaluated with four validity indices through comparative experiments. The experimental results reveal that the proposed VMD-CMFSDE-MSCAHHO-SVM method achieves favorable diagnosis results compared with other relevant methods.

Vibration signal analysis using symbolic dynamics for gearbox fault diagnosis

Article

Full-text available

Oct 2019
INT J ADV MANUF TECH

This paper addresses the use of two algorithms based on symbolic dynamics analysis of vibration signal for fault diagnosis in gearboxes. The symbolic dynamics algorithm (SDA) works by subdividing the phase space described by the Poincaré plot into several angular regions; then, a symbol is assigned to each region. The probability distributions generated by the set of symbols are considered as features for classification of faults in a gearbox. The peak symbolic dynamics algorithm (PSDA) is a variant that extracts a sequence of peaks from the vibration signals and then performs the phase-space subdivision and symbol coding. A gearbox vibration signal dataset is analyzed for classifying 10 types of faults. Fault classification is attained using a multi-class support vector machine. The highest accuracy attained using k-fold cross-validation is 100.0% for load L3 with SDA and 100% with load L2 with PSDA. The accuracy considering all signals in the gearbox dataset is 99.2% with SDA and 99.8% with PSDA. The algorithms proposed have the advantage of being simple, accurate, and fast, and they could be adapted for online condition monitoring.
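The core step described in the abstract above, assigning a symbol to each angular region of the Poincaré plot and using the symbol probabilities as features, can be sketched as follows. This is an illustration under stated assumptions, not the paper's SDA: the signal is assumed to be roughly zero-mean so that angles are measured about the origin, the regions are equal-width sectors, and the function name and `n_regions` default are hypothetical.

```python
import numpy as np

def angular_symbol_distribution(x, n_regions=8):
    """Probability of each angular sector visited on the Poincaré plot."""
    x = np.asarray(x, dtype=float)
    # Poincaré plot: each point pairs a sample with its successor.
    px, py = x[:-1], x[1:]
    # Angle of each point about the origin, mapped into [0, 2*pi).
    angles = np.mod(np.arctan2(py, px), 2.0 * np.pi)
    # Symbol = index of the equal-width sector containing the point.
    symbols = np.floor(angles / (2.0 * np.pi / n_regions)).astype(int)
    symbols = np.minimum(symbols, n_regions - 1)  # guard against rounding at 2*pi
    counts = np.bincount(symbols, minlength=n_regions)
    return counts / counts.sum()
```

The returned vector has one entry per region and sums to one, so it can be passed directly to a classifier such as a multi-class support vector machine.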

A fault diagnosis approach for rolling bearing based on Fourier transform multi-filter decomposition and symbolic dynamic entropy

Conference Paper

Aug 2019

Graph modeling of singular values for early fault detection and diagnosis of rolling element bearings

Article

May 2020

Early fault detection and diagnosis plays an important role in reducing maintenance cost and ensuring the reliability of rolling element bearings (REBs). Singular value decomposition (SVD) is considered a promising method to achieve this end, but it does not consider the inter-correlation between the resulting singular values, leading to the loss of weak fault information hidden in specific components. This paper, motivated by recent advances in graph modeling of highly noisy vibration signals, presents a novel method, called graph-modeled singular values (GMSVs), that integrates graph theory and SVD for the purpose of inspecting dynamic REB health conditions. The method uses the singular values as inputs to construct the graph; as such, it achieves a balance between sensitivity to early faults and robustness to noise, and it also brings a more powerful ability of fault discrimination. Taking advantage of GMSVs, a common null hypothesis test is performed to inspect whether a fault occurs during successive REB operations, and the KNN classifier is used to identify the fault type. Experiments are conducted on two publicly available data sets: the XJTU-SY data set and the CWRU data set. Comprehensive experimental results, along with comparisons against state-of-the-art methods, demonstrate the superiority and great potential of the method in real applications.

Total variation on horizontal visibility graph and its application to rolling bearing fault diagnosis

Article

May 2020
MECH MACH THEORY

The total variation on graph (TVG) is a powerful vertex domain index for measuring the smoothness of graph signals, but its performance is closely related to the underlying graph. Since the horizontal visibility graph can better reflect the dynamic characteristics of bearing vibration signals than the path graph, the underlying graph of TVG is designated as the horizontal visibility graph. The vertex domain index TVG defined on the horizontal visibility graph is referred to as TVHVG in this paper. To better distinguish the different states of rolling bearings, the bearing vibration signal is converted into the graph signal indexed by its horizontal visibility graph, and the vertex domain index TVHVG is extracted as the single fault feature. Based on TVHVG feature extraction and Mahalanobis distance classification, a novel fault diagnosis method for rolling bearings is proposed. The proposed method is applied to analyze two sets of experimental data containing normal and faulty rolling bearings. The results indicate that the proposed method can effectively diagnose bearing faults of different types and degrees, and that the vertex domain index TVHVG is superior to some classical time domain indexes in distinguishing the different states of rolling bearings.
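To make the construction in the abstract above concrete, here is a small sketch that builds the horizontal visibility graph of a signal and evaluates a total variation on it. The visibility rule is the standard one (two samples are linked when every sample between them is lower than both). The total variation is taken here as the sum of squared differences across edges, which is an assumption; the paper's exact definition and any normalization may differ. Function names are illustrative.

```python
import numpy as np

def horizontal_visibility_edges(x):
    """Edges (i, j), i < j, of the horizontal visibility graph of a signal."""
    x = np.asarray(x, dtype=float)
    edges = []
    for i in range(len(x) - 1):
        highest_between = -np.inf  # tallest sample strictly between i and j
        for j in range(i + 1, len(x)):
            # i and j see each other if everything between them is lower than both.
            if highest_between < min(x[i], x[j]):
                edges.append((i, j))
            highest_between = max(highest_between, x[j])
            if highest_between >= x[i]:
                break  # the view from i is blocked for all later samples
    return edges

def total_variation_on_hvg(x):
    """Sum of squared signal differences over the visibility edges (assumed form)."""
    x = np.asarray(x, dtype=float)
    return float(sum((x[i] - x[j]) ** 2 for i, j in horizontal_visibility_edges(x)))
```

For the signal `[3, 1, 2, 4]` the graph has edges (0,1), (0,2), (0,3), (1,2), and (2,3), and the squared-difference total variation is 11.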

A rotating machinery fault diagnosis method based on multi-scale dimensionless indicators and random forests

Article

May 2020

Fault diagnosis methods based on dimensionless indicators have long been studied for rotating machinery. However, traditional dimensionless indicators frequently suffer from low fault diagnosis accuracy for nonlinear and non-stationary dynamic signals of rotating machinery. In this paper, we propose an effective fault diagnosis method based on multi-scale dimensionless indicators (MSDIs) and random forests. In the proposed method, the real-time vibration signals are first processed by variational mode decomposition, and then six types of MSDI are constructed based on the decomposed signals. Using the Fisher criterion, several top-ranked MSDIs are selected as fault features. Based on the selected MSDIs, the random forests model is applied to determine fault types. To verify the superiority of the proposed method, several fault diagnosis experiments are conducted on a centrifugal multi-level impeller blower. The results demonstrate that the proposed method can successfully identify different fault types and that the average accuracy can reach 95.58%. Compared with methods based on traditional dimensionless indicators, the proposed method improves the fault diagnosis accuracy by 7.25% and outperforms other techniques such as the back propagation neural network, support vector machine, and extreme learning machine. These results indicate that the MSDI can effectively overcome the deficiency of the traditional dimensionless indicator and has a stronger ability to distinguish fault types.
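The abstract above does not list its six indicators, so the sketch below only illustrates the classical single-scale dimensionless indicators that such methods typically start from: waveform, peak, impulse, and margin factors plus kurtosis, using their common textbook definitions. The function name and the choice of indicators are assumptions; the multi-scale construction over decomposed modes is not reproduced.

```python
import numpy as np

def dimensionless_indicators(x):
    """Classical dimensionless indicators of a vibration signal (sketch)."""
    x = np.asarray(x, dtype=float)
    abs_x = np.abs(x)
    rms = np.sqrt(np.mean(x ** 2))
    peak = abs_x.max()
    mean_abs = abs_x.mean()
    centered = x - x.mean()
    return {
        "waveform_factor": rms / mean_abs,                     # RMS over mean absolute value
        "peak_factor": peak / rms,                             # also called crest factor
        "impulse_factor": peak / mean_abs,
        "margin_factor": peak / np.mean(np.sqrt(abs_x)) ** 2,  # also called clearance factor
        "kurtosis": np.mean(centered ** 4) / np.mean(centered ** 2) ** 2,
    }
```

Because every indicator is a ratio of quantities with the same units, scaling the signal by a constant leaves the values unchanged, which is what makes them attractive across operating conditions.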

A novel method based on nonlinear auto-regression neural network and convolutional neural network for imbalanced fault diagnosis of rotating machinery

Article

Apr 2020
MEASUREMENT

Although diagnosis methods for rotating machinery based on convolutional neural networks (CNNs) have achieved great success, they generally assume that the numbers of normal and fault samples are the same. However, it is difficult to obtain adequate fault samples, and CNNs cannot handle imbalanced fault diagnosis well. The nonlinear auto-regressive neural network (NARNN) has strong prediction ability and can expand a small number of fault samples. Thus, a novel fault diagnosis approach combining CNN with NARNN has been proposed. First, NARNN is applied to expand the small number of samples so that the sample sizes of the different health conditions are equal. Subsequently, the continuous wavelet transform is employed to convert the 1-dimensional vibration signals into 2-dimensional time-frequency images. Finally, a CNN is established to automatically learn the characteristics and achieve fault identification. Through comparative experiments, the superiority of the proposed method has been validated on two datasets with different imbalance levels.

Fault feature extraction of rotating machinery using a reweighted complete ensemble empirical mode decomposition with adaptive noise and demodulation analysis

Article

Apr 2020
MECH SYST SIGNAL PR

Fault feature extraction is crucial for detecting failures as early as possible in fault diagnosis of rotating machinery. Due to the influence of environmental noise and interference, the signal-to-noise ratio (SNR) of the fault feature is relatively low in the measured signal. Complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is an improved method based on EEMD, which has been extensively applied to signal de-noising. The key problem for CEEMDAN is to determine the fault-related degree of a decomposed intrinsic mode function (IMF), especially in the presence of both Gaussian and non-Gaussian noises or interferences. However, most traditional assessment criteria are developed to describe the statistical parameters of IMFs, e.g. correlation coefficient and kurtosis, which ignore the specific characteristics of the fault and are easily affected by noise components. Therefore, a new criterion is proposed to quantify the fault-related degree of a vibration signal, in which the ratio of the periodic modulation components caused by the fault to the generalized interferences is defined. Then, a reweighting and reconstruction strategy for the decomposed IMFs is presented to obtain the de-noised signal based on the new criterion. Furthermore, in order to detect the fault-related modulation features in multi-frequency scales, a time-frequency representation (TFR) based demodulation analysis is employed, which guarantees an accurate extraction of the fault feature at the early stage of the fault. The effectiveness of the proposed fault diagnosis method compared with traditional methods is demonstrated by both numerical simulation and experimental studies. The results show that the proposed method achieves better performance in terms of SNR improvement and fault feature detection, and that it can successfully detect the fault features in the presence of Gaussian and non-Gaussian noises.
