
An interpretable data augmentation scheme for machine fault diagnosis based on a sparsity-constrained generative adversarial network

May 2021
Expert Systems with Applications 182:115234


DOI:10.1016/j.eswa.2021.115234

Authors:

Liang Ma

Beihang University (BUAA)

Yu Ding

Beihang University (BUAA)

Chao Wang

Beihang University (BUAA)




An interpretable data augmentation scheme for machine fault diagnosis based on a sparsity-constrained generative adversarial network

Liang Ma1,2,3, # (buaaml@buaa.edu.cn),

Yu Ding2,4, *, # (dingyu@buaa.edu.cn),

Zili Wang1,2,3 (wzl@buaa.edu.cn),

Chao Wang1,2,3 (wangchaowork@buaa.edu.cn),

Jian Ma1,2,3 (09977@buaa.edu.cn),

Chen Lu1,2,3 (luchen@buaa.edu.cn)

1 Institute of Reliability Engineering, Beihang University, Beijing, 100191, China.

2 Science and Technology on Reliability and Environmental Engineering Laboratory, Beijing, 100191, China.

3 School of Reliability and Systems Engineering, Beihang University, Beijing, 100191, China.

4 School of Aeronautic Science and Engineering, Beihang University, Beijing, 100191, China.

* Corresponding Author: Yu Ding (dingyu@buaa.edu.cn).

# These authors contributed equally to this work and should be considered co-first authors.

Highlights

⚫ A novel SC-GAN data augmentation scheme for stable raw vibration signal generation.

⚫ An interpretation method to open the black box of the proposed neural network model.

⚫ The model can learn frequency domain patterns from time domain vibration signals.

⚫ Data augmentation improves the performance of the machine fault diagnosis model.

Abstract

Vibration signal-based methods have been widely utilized in machine fault diagnosis. Usually, a lack of sufficient training data can prevent these methods from achieving satisfactory performance. The generative adversarial network (GAN) is a feasible solution to this problem. However, existing GAN-based methods struggle to stably generate raw vibration signals. To achieve vibration signal generation, a novel sparsity-constrained GAN (SC-GAN) method containing a two-stage training process is developed, which can perform data augmentation for machine fault diagnosis with a simple structure. Autoencoder (AE)-based pretraining and sparsity regularization constraints are implemented in the proposed method. Furthermore, to understand the internal mechanisms of vibration signal generation, we propose a method for analyzing the network’s weight matrix to interpret the generation mechanism. In a case study on rolling element bearings, the SC-GAN is verified to be able to generate raw vibration signals under 10 different health conditions with a more stable training process than other models. In a fault diagnosis task, the data augmentation by SC-GAN significantly improves the diagnostic accuracy by 7.44%. An analysis of the well-trained SC-GAN shows that the model captures key frequency components, which provides a credible interpretation for the generation mechanism. Another case study on the gearbox illustrates the good generalization ability of SC-GAN to other machines and more complicated signals.

Keywords: Generative adversarial networks, Data augmentation, Mechanism interpretation, Machine fault diagnosis, Raw vibration signal

1 Introduction

Effective fault diagnosis techniques ensure the reliability and safety of industrial machines by avoiding losses of life and property caused by failure. Vibration signals contain important information about the operating status of machines. Therefore, vibration signal-based methods are widely applied in the fault diagnosis of industrial machines (Chen et al., 2016). Signal processing methods perform well in understanding the key information in vibration signals, which is effective for identifying the differences among fault modes. These signal processing methods include signal decomposition methods (represented by empirical mode decomposition (EMD) (Lei et al., 2011), variational mode decomposition (VMD) (Dibaj et al., 2020), wavelet packet transform (WPT) (Wu and Liu, 2009), etc.) and feature extraction methods (represented by time-domain features (Rauber et al., 2015), frequency-domain features (Z. Chen et al., 2019), etc.). Moreover, with the rapid development of deep learning (DL), and because DL-based methods can process high-dimensional nonlinear signals without relying on prior expert knowledge, the use of DL-based methods to process vibration signals and build fault diagnosis models is currently a popular research topic.

However, DL-based intelligent methods rely heavily on sufficient, balanced and labeled data, which are difficult to obtain in actual industrial scenarios (He et al., 2020). Notably, machines usually work under healthy conditions, and faulty conditions are infrequent. In addition, from an economic perspective, it is costly to obtain complete, high-quality fault data for industrial machines. Therefore, data augmentation is a direct and effective way to solve fault diagnosis problems when sufficient data are lacking. The synthetic minority oversampling technique (SMOTE) (Chawla et al., 2002) and its variants (Bunkhumpornpat et al., 2009; Zhang et al., 2018) are very popular algorithms in data augmentation. These methods utilize real samples to synthesize new samples by linear interpolation, and their mechanism remains at the instance level instead of capturing the intrinsic distribution of the data.

A generative adversarial network (GAN) is a framework that can learn a mapping between a prior noise distribution and the real data distribution via an adversarial learning mechanism and then generate synthetic samples that are similar to the real data. The most famous research on and application of GANs is fake image generation (Goodfellow et al., 2014; Radford et al., 2016). In addition, GANs have been applied to other fields. Chen et al. (J. Chen et al., 2019) proposed an SAE-GAN method that trains the GAN to generate the features extracted by the SAE to detect credit card fraud. Recently, GANs have also been investigated and applied in data augmentation for machine fault diagnosis. Most of these studies focused on generating the frequency spectra of vibration signals obtained by a fast Fourier transform (FFT). Wang et al. (Wang et al., 2018) proposed a scheme that combined a GAN and stacked denoising autoencoders (SDAEs) to generate the spectra of planetary gearbox vibration signals. The SDAE-GAN method performed well in anti-noise ability and fault diagnosis under the condition of small samples. Guo et al. (Guo et al., 2020) developed a multilabel 1D generation adversarial network (ML1D-GAN) to generate realistic FFT spectra. The ML1D-GAN was used to extend the training set, which improved the diagnostic accuracy of rolling bearings from 95% to 98%. Gao et al. (Gao et al., 2020) utilized a GAN to enlarge a dataset consisting of finite element method-based simulation signals and measured signals. Li et al. (Li et al., 2019) proposed a cross-domain fault diagnosis method with data augmentation using a GAN. Studies by Wang et al. (Wang et al., 2019), Mao et al. (Mao et al., 2019), Ding et al. (Ding et al., 2019) and Zheng et al. (Zheng et al., 2020) also involved the generation of the spectra of vibration signals, which were then utilized to improve the performance of machine fault diagnosis models under the conditions of insufficient or unbalanced data. In addition to using GANs to generate vibration signal spectra, existing studies have developed methods for generating the vibration signal features obtained by signal processing methods. Liu et al. (Liu et al., 2018) applied categorical adversarial autoencoders (CatAAEs) to generate 10 handcrafted features extracted from the vibration signal. Cabrera et al. (Cabrera et al., 2019) selected a WPT-based feature extraction result as the generation target for the GAN. In the study of Zhou et al. (Zhou et al., 2020), a GAN was designed to generate fault features extracted via an autoencoder (AE) instead of directly generating real fault data samples.

To a certain extent, the information in the original signals will inevitably be lost in the process of decomposing signals using the aforementioned methods. Thus, directly generating the raw vibration signals is a data augmentation method with no information loss. However, due to the high nonlinearity and noise disturbances in signals collected from industrial scenarios, the generation procedure is undeniably difficult to implement successfully, as stated by Mao et al. (Mao et al., 2019). Based on a literature survey, there have been only two successful attempts involving the generation of raw vibration data. Shao et al. (Shao et al., 2019) proposed an auxiliary classifier GAN (ACGAN)-based framework to generate realistic one-dimensional raw data. To improve the stability of the adversarial training process, multiple 1D convolution layers were stacked to form the generator and discriminator, and auxiliary labels were fed into the model to provide additional information. Zhang et al. (Zhang et al., 2020) developed a complex GAN structure for the generation of raw vibration signals under one specific health condition. The generator consisted of 6 layers, and the discriminator consisted of 5 layers, including convolutional layers, flattened layers and fully connected layers. However, the authors mentioned that further research on the optimization of the network structure is needed.

From the literature review, it can be concluded that the following knowledge gaps and key challenges are still not solved by existing GAN-based data augmentation methods for machine fault diagnosis:

(1) The stable generation of raw vibration signals for lossless data augmentation is difficult. Although a few methods have been successfully applied, they rely on complex network structures and some training tricks, which make the GAN hard to train stably.

(2) The mechanism of realistic data generation using GANs needs to be further studied and explained. Like other DL-based methods, a GAN is usually treated as a black box. Research on how the GAN works can help us to better understand it and utilize it with more confidence, which is also mentioned by Pan et al. (Pan et al., 2020).

To address these issues, the major efforts and main contributions of this paper are listed as follows:

(1) To achieve the stable generation of raw vibration signals for machine fault diagnosis, a novel sparsity-constrained GAN (SC-GAN) model, which includes a two-stage training process, is proposed. Inspired by the findings of Lei et al. (Lei et al., 2016) that the sparse filtering neural network model shares a physical interpretation, a sparsity constraint is imposed on the model during the processes of AE-based pretraining and GAN-based adversarial training, by which the model can learn useful and explainable representations of the signal. With a simple fully connected structure with one hidden layer, the generator and discriminator can stably converge and then realize the generation of raw time-domain vibration signals.

(2) To explain how the SC-GAN can stably generate raw vibration signals, a method for interpreting the vibration signal generation mechanism is proposed from the aspect of the feedforward mechanism of neural networks. The column vectors of the weight matrix of the hidden sparse layer are separated and regarded as the representations learned by each neuron. The output of the generator is expressed as a linear combination of several activated units. According to the results of two case studies, when the well-trained model is analyzed with the proposed interpretation method, the learned representations of the SC-GAN are found to contain the key frequency components of the vibration signal, even though only time-domain signals are used during the training process. These results provide a clear and direct explanation of the proposed data augmentation scheme.

The remainder of the paper is organized as follows: A brief introduction to the GAN and sparse autoencoders (SAEs) is given in Section 2. The proposed interpretable data augmentation scheme for machine fault diagnosis is described in Section 3, including the model structure and two-stage training process of the SC-GAN, as well as the method of mechanism interpretation. Experimental validations are given in Sections 4 and 5, followed by the discussion in Section 6 and conclusions in Section 7.

2 Generative adversarial networks and sparse autoencoders

2.1 Generative adversarial networks

Figure 1 Architecture of a GAN.

As shown in Figure 1, a GAN consists of two opposing parts, namely, the generator (G) and discriminator (D). Both G and D are neural networks. G receives random vectors sampled from a certain prior noise distribution, transforms them into a distribution as similar to the real data as possible, and attempts to confuse D into giving the wrong distinguishing results. In contrast, D tries to determine whether the input sample is a synthesized sample or a sample obtained from the real data. Mathematically, given the noise vector $z$, the generator outputs the synthesized sample $G(z)$. The discriminator receives the generated sample $G(z)$ and real sample $x$ and outputs the probability values $D(G(z))$ and $D(x)$ that reflect whether the input sample is derived from the real data. G tries to learn the real data distribution and outputs $G(z)$ to confuse D to predict $D(G(z))=1$, while D tries to predict $D(G(z))=0$ and $D(x)=1$.

Consequently, the loss function of G is given in Eq. (1).

$$L_G = \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right] \qquad (1)$$

The loss function of D is given in Eq. (2).

$$L_D = -\mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] - \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right] \qquad (2)$$

The generator and discriminator play a two-player zero-sum game. In the training process, the parameters of G and D are alternately updated by adversarial training, as shown in Eq. (3).

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right] \qquad (3)$$

At the end of the adversarial training process, G and D reach a Nash equilibrium, where G can generate realistic samples and D tends to predict equal probabilities for real and generated samples.
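As a rough illustration of Eqs. (1) and (2), the sketch below evaluates both losses from discriminator outputs. It assumes the standard minimax form of Goodfellow et al. (2014); the function names and the small constant `eps` are illustrative choices, not taken from the paper.

```python
import numpy as np

def generator_loss(d_fake, eps=1e-8):
    """Assumed form of Eq. (1): mean of log(1 - D(G(z))) over a batch."""
    return float(np.mean(np.log(1.0 - d_fake + eps)))

def discriminator_loss(d_real, d_fake, eps=1e-8):
    """Assumed form of Eq. (2): negative log-likelihood of the correct labels."""
    real_term = -np.mean(np.log(d_real + eps))
    fake_term = -np.mean(np.log(1.0 - d_fake + eps))
    return float(real_term + fake_term)

# At the equilibrium described in the text, D outputs 0.5 for every sample.
half = np.full(8, 0.5)
loss_d_equilibrium = discriminator_loss(half, half)   # about 2 * log(2)
loss_g_equilibrium = generator_loss(half)             # about -log(2)
```

Under this assumed form, a discriminator that separates real from generated samples well has a loss below the equilibrium value, which is one way to read the loss curves discussed later.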

2.2 Sparse autoencoders

An AE is a neural network that tries to learn a reconstruction mapping from the input data to itself. Given the input $x$, the AE learns the nonlinear function $\hat{x} = g_{\theta'}(f_{\theta}(x))$, where $\hat{x}$ represents reconstructed data and $\{W, W'\}$ and $\{b, b'\}$ are the weight matrices and bias vectors of the encoder and decoder, respectively.

Within the encoder, the input data are transformed to a hidden space by a linear transformation and nonlinear activation function:

$$h = f_{\theta}(x) = s_f(Wx + b) \qquad (4)$$

$h$ is the encoded vector in the hidden space. $\theta$ are the parameters of the encoder, including $W$ and $b$. $s_f$ is the nonlinear activation function.

Similar to the encoder, within the decoder, the encoded vector is transformed to the reconstructed vector:

$$\hat{x} = g_{\theta'}(h) = s_g(W'h + b') \qquad (5)$$

Here, $s_g$ is the activation function of the decoder.

The mean squared error (MSE) between $\hat{x}$ and $x$ is usually selected as the loss function, as shown in Eq. (6), where $N$ is the number of samples.

$$L_{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left\|\hat{x}^{(i)} - x^{(i)}\right\|^2 \qquad (6)$$

If a sparsity constraint is imposed on the encoded vector by the hidden layer, the features captured by the AE will be more implementable and robust. Limiting the activation of hidden units is an effective solution. We denote the activation value of the jth hidden unit corresponding to the input sample $x^{(i)}$ as $h_j(x^{(i)})$. Thus, the average activation level of this hidden unit within $N$ samples is $\hat{\rho}_j = \frac{1}{N}\sum_{i=1}^{N} h_j(x^{(i)})$. It is expected that the average activation of each hidden unit remains at the low level $\rho$, where $\rho$ is the sparsity parameter that is set to a positive number near 0. The Kullback-Leibler (KL) divergence between $\rho$ and $\hat{\rho}_j$ is considered a part of the loss function that needs to be minimized, as shown in Eq. (7), where $\hat{\rho}$ is the vector composed of the average activation values of all $m$ hidden units.

$$L_{sparse} = \mathrm{KL}(\rho \,\|\, \hat{\rho}) = \sum_{j=1}^{m}\left[\rho \log\frac{\rho}{\hat{\rho}_j} + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_j}\right] \qquad (7)$$

An AE with the sparsity constraint is referred to as an SAE. The final loss function of the SAE is a combination of Eq. (6) and Eq. (7), where $\beta$ controls the strength of the sparsity penalty.

$$L_{SAE} = L_{MSE} + \beta L_{sparse} \qquad (8)$$
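To make the sparsity penalty of Eq. (7) concrete, the following sketch evaluates it for two sets of hidden activations. The target value `rho=0.05`, the layer width of 160 and the clipping constant are illustrative assumptions; the paper's own settings are not reproduced here.

```python
import numpy as np

def kl_sparsity_penalty(hidden, rho=0.05, eps=1e-8):
    """Sum over hidden units of KL(rho || rho_hat_j), as in Eq. (7).

    hidden: array of shape (n_samples, n_hidden) with activations in (0, 1).
    """
    rho_hat = np.clip(np.mean(hidden, axis=0), eps, 1.0 - eps)
    kl = (rho * np.log(rho / rho_hat)
          + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))
    return float(np.sum(kl))

# Units whose average activation equals the target incur (almost) no penalty,
# whereas units that are active half of the time are penalized heavily.
on_target = kl_sparsity_penalty(np.full((64, 160), 0.05))
too_active = kl_sparsity_penalty(np.full((64, 160), 0.50))
```

The penalty depends only on the batch-averaged activation of each unit, so an individual sample may still activate a unit strongly as long as the unit is quiet on average.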

3 Sparsity-constrained generative adversarial networks for raw vibration signal generation

In this section, the two-stage training process of the proposed SC-GAN is described. The two-stage training method consists of an SAE-based pretraining stage and a GAN-based adversarial training stage. An effective method for revealing the interpretable patterns captured by the SC-GAN is then established. This method can also be adapted to analyze other neural networks with similar structures.

3.1 Two-stage training process of the SC-GAN

Figure 2 Schematic of SAE-based pretraining, GAN-based adversarial training and the connection between these processes.

As shown in Figure 2, in the pretraining stage, an SAE with one input layer, one hidden layer with sparsity-constrained units and one output layer is trained to accurately reconstruct the raw vibration signals. After pretraining, the decoder part of the SAE is separated and reconstructed to form the generator of the SC-GAN by adding a fully connected layer before it. The encoder is separated and reconstructed to act as the discriminator of the SC-GAN by adding an output unit after it.

Stage 1: SAE-based pretraining

In the first stage, an SAE is iteratively trained by the minibatch stochastic gradient descent (SGD) algorithm until convergence is achieved. In this stage, the encoder learns to transform vibration signals to sparse vectors, thereby compressing the critical information contained in the original data into a limited number of activated units. Correspondingly, the decoder acquires the ability to reconstruct the original vibration signals from the sparse vectors. The pseudocode is expressed as follows:

Algorithm 1 Minibatch SGD training for the SAE

for the number of training iterations do
    ⚫ Shuffle the dataset $X = \{x^{(1)}, \dots, x^{(N)}\}$ that contains $N$ vibration signal samples of a health condition.
    ⚫ Let $t = 0$.
    for $N/n$ iterations do
        ⚫ Sample a minibatch $\{x^{(tn+1)}, \dots, x^{(tn+n)}\}$ from $X$ that contains $n$ samples.
        ⚫ Calculate the reconstruction error loss $L_{MSE}$ based on Eq. (6).
        ⚫ Calculate the sparsity loss $L_{sparse}$ based on Eq. (7).
        ⚫ Update the parameters considering the SAE loss based on Eq. (8): $\nabla_{\theta, \theta'}\left(L_{MSE} + \beta L_{sparse}\right)$
        ⚫ Let $t = t + 1$.
    end for
end for
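The loop structure of Algorithm 1 can be sketched in NumPy as follows. This is a simplified illustration, not the authors' implementation: it uses plain SGD with a fixed learning rate (the case study uses Adam), hand-written gradients, and hypothetical values for `hidden`, `lr`, `rho` and `beta`.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sae_loss_and_grads(params, x, rho, beta):
    """SAE loss (reconstruction + beta * KL sparsity) and its gradients."""
    W1, b1, W2, b2 = params
    n = x.shape[0]
    h = sigmoid(x @ W1 + b1)            # sparse code (encoder)
    x_hat = np.tanh(h @ W2 + b2)        # reconstruction (decoder)
    rho_hat = np.clip(h.mean(axis=0), 1e-6, 1.0 - 1e-6)
    mse = np.sum((x_hat - x) ** 2) / n
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))
    # Backpropagate through the decoder, then through the encoder.
    d_a2 = 2.0 * (x_hat - x) / n * (1.0 - x_hat ** 2)
    d_h = d_a2 @ W2.T + beta * (-rho / rho_hat + (1.0 - rho) / (1.0 - rho_hat)) / n
    d_a1 = d_h * h * (1.0 - h)
    grads = [x.T @ d_a1, d_a1.sum(axis=0), h.T @ d_a2, d_a2.sum(axis=0)]
    return mse + beta * kl, grads

def train_sae(data, hidden=16, epochs=30, batch=8, lr=0.05,
              rho=0.05, beta=0.1, seed=0):
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    params = [rng.normal(0.0, 0.1, (dim, hidden)), np.zeros(hidden),
              rng.normal(0.0, 0.1, (hidden, dim)), np.zeros(dim)]
    for _ in range(epochs):
        order = rng.permutation(len(data))        # shuffle the dataset
        for t in range(len(data) // batch):       # N / n minibatches
            x = data[order[t * batch:(t + 1) * batch]]
            _, grads = sae_loss_and_grads(params, x, rho, beta)
            params = [p - lr * g for p, g in zip(params, grads)]
    return params
```

Calling `train_sae(signals)` on an array of shape `(n_samples, n_points)` returns the encoder and decoder parameters, which correspond to the parts reused in the second training stage.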

Stage 2: GAN-based adversarial training

In the second stage, the pretrained encoder and decoder of the SAE are utilized as part of the discriminator and generator, respectively, of the SC-GAN.

Specifically, a fully connected neural layer is added before the pretrained decoder. The function of this layer is to transform the random noise into a sparse vector, which can activate the embedded information in the decoder to generate a realistic vibration signal. To keep the outputs of the second layer of the generator sparse, a sparsity constraint is imposed on the generator’s second layer to introduce an additional sparsity loss. The final loss function for the sparsity-constrained generator is:

$$L_G^{SC} = L_G + \beta_G \sum_{j=1}^{m}\mathrm{KL}\left(\rho_G \,\|\, \hat{\rho}_{G,j}\right) \qquad (9)$$

where $L_G$ is the generator loss of Eq. (1), $\beta_G$ controls the strength of the penalty, $\rho_G$ is the sparsity parameter of the generator, and $\hat{\rho}_G = (\hat{\rho}_{G,1}, \dots, \hat{\rho}_{G,m})$ is the average activation vector of the second layer of the generator.

A sigmoid-activated unit is added after the pretrained encoder to form the discriminator. Similar to the function mentioned in the SAE, the first layer of the discriminator is expected to extract critical features from real or synthetic vibration signals and transform them into sparse vectors. The second layer issues discrimination results according to the sparse vectors that reflect whether the samples are real or fake. Consequently, the second layer of the discriminator has a sparsity constraint. The final loss function is:

$$L_D^{SC} = L_D + \beta_D \sum_{j=1}^{m}\mathrm{KL}\left(\rho_D \,\|\, \hat{\rho}_{D,j}\right) \qquad (10)$$

where $L_D$ is the discriminator loss of Eq. (2) and the other variables have meanings analogous to those in Eq. (9).

The pseudocode is expressed as follows:

Algorithm 2 Minibatch SGD training for the SC-GAN.

for the number of training iterations do
    for $k$ steps do
        ⚫ Sample a minibatch of $n$ random noise samples $\{z^{(1)}, \dots, z^{(n)}\}$.
        ⚫ Sample a minibatch of $n$ samples $\{x^{(1)}, \dots, x^{(n)}\}$ from the real dataset.
        ⚫ Calculate the loss of the sparsity-constrained discriminator based on Eq. (10).
        ⚫ Update the parameters of the discriminator by gradient descent: $\nabla_{\theta_D} L_D^{SC}$
    end for
    ⚫ Sample a minibatch of $n$ random noise samples $\{z^{(1)}, \dots, z^{(n)}\}$.
    ⚫ Calculate the loss of the sparsity-constrained generator based on Eq. (9).
    ⚫ Update the parameters of the generator by gradient descent: $\nabla_{\theta_G} L_G^{SC}$
end for

In Algorithm 2, $k$ controls the ratio of the training time of D to the training time of G. $\theta_D$ and $\theta_G$ are the parameter sets of the discriminator and generator, respectively.
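The forward pass and the two sparsity-constrained losses used inside Algorithm 2 might look like the sketch below. Several details are assumptions made only for illustration: the minimax form of the adversarial terms, the values of `rho` and `beta`, a shared `rho` and `beta` for both networks, and averaging the discriminator's hidden activations over real and generated samples together.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def kl_penalty(hidden, rho):
    rho_hat = np.clip(hidden.mean(axis=0), 1e-6, 1.0 - 1e-6)
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))

def generator_forward(z, Wg1, bg1, Wg2, bg2):
    h = sigmoid(z @ Wg1 + bg1)            # added layer: noise -> sparse code
    return np.tanh(h @ Wg2 + bg2), h      # pretrained decoder: code -> signal

def discriminator_forward(x, Wd1, bd1, wd2, bd2):
    h = sigmoid(x @ Wd1 + bd1)            # pretrained encoder: signal -> code
    return sigmoid(h @ wd2 + bd2), h      # added sigmoid output unit

def sc_gan_losses(z, x_real, g_params, d_params, rho=0.05, beta=0.1, eps=1e-8):
    x_fake, h_g = generator_forward(z, *g_params)
    d_real, h_d_real = discriminator_forward(x_real, *d_params)
    d_fake, h_d_fake = discriminator_forward(x_fake, *d_params)
    adv_g = np.mean(np.log(1.0 - d_fake + eps))
    adv_d = -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))
    loss_g = adv_g + beta * kl_penalty(h_g, rho)                       # Eq. (9)
    h_d = np.concatenate([h_d_real, h_d_fake])
    loss_d = adv_d + beta * kl_penalty(h_d, rho)                       # Eq. (10)
    return float(loss_g), float(loss_d)

# Shapes follow the 320-160-320 generator and 320-160-1 discriminator of the
# bearing case study; the random weights stand in for pretrained ones.
rng = np.random.default_rng(0)
g_params = [rng.normal(0, 0.1, (320, 160)), np.zeros(160),
            rng.normal(0, 0.1, (160, 320)), np.zeros(320)]
d_params = [rng.normal(0, 0.1, (320, 160)), np.zeros(160),
            rng.normal(0, 0.1, 160), 0.0]
z = rng.uniform(-1.0, 1.0, (64, 320))
x_real = np.tanh(rng.normal(size=(64, 320)))
loss_g, loss_d = sc_gan_losses(z, x_real, g_params, d_params)
```

In a full training loop, the discriminator loss would be minimized for `k` steps before each generator step, with gradients supplied by an automatic differentiation library.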

3.2 Method of interpreting the SC-GAN

Normally, DL methods, including GANs, are regarded as black boxes. This study proposes a method of interpreting the SC-GAN to reveal the captured patterns that enable it to generate realistic synthetic vibration signals.

After adversarial training, the three-layer generator of the SC-GAN can generate realistic samples from random noise. Given the noise vector $z$, the output of the first layer is a sparse vector:

$$h = f_1(W_{G1} z + b_{G1}) \qquad (11)$$

where $W_{G1}$ and $b_{G1}$ are the weight matrix and bias vector, respectively, of the first layer of the generator, and $f_1$ is its activation function.

The sparse vector is then transformed into the vibration signal, which is output by the generator:

$$G(z) = f_2(W_{G2} h + b_{G2}) \qquad (12)$$

Here, we rewrite the weight matrix $W_{G2}$ with its column vectors and the sparse vector $h$ with the elements that it contains:

$$G(z) = f_2\left([w_1, w_2, \dots, w_m]\,[h_1, h_2, \dots, h_m]^{T} + b_{G2}\right) = f_2\left(\sum_{j=1}^{m} h_j w_j + b_{G2}\right) \qquad (13)$$

Since the activation function is monotonically increasing, it only affects the amplitude of the signal, not the frequency, which is more critical to vibration-based fault diagnosis. In Eq. (13), regardless of the activation, the generated vibration signals can be denoted as a linear combination of $w_1, w_2, \dots, w_m$ plus a bias vector. The weights of the linear combination are the elements of the sparse vector. Due to the sparsity restrictions, there are many close-to-zero values in the vector. Hence, the generator utilizes a sparse vector to activate several columns that contain embedded discriminative information associated with the real vibration signals. The generation mechanism of the SC-GAN can be interpreted by analyzing the column vectors of the hidden layer’s weight matrix in the generator.
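The column-wise reading of Eq. (13) can be checked numerically. In the toy sketch below, the weight matrix is built by hand from sinusoids at hypothetical frequencies so that the expected result is known; in practice the columns would come from the trained generator. The 48 kHz sampling rate and 320-point length follow the bearing case study.

```python
import numpy as np

fs = 48_000                       # sampling frequency in Hz
n_points = 320                    # length of one generated sample
t = np.arange(n_points) / fs

# Hypothetical weight matrix: each column is a sinusoid at a known frequency.
column_freqs = [1500.0, 3000.0, 6000.0, 9000.0]
W = np.stack([np.sin(2 * np.pi * f * t) for f in column_freqs], axis=1)
b = np.zeros(n_points)
h = np.array([0.9, 0.0, 0.02, 0.0])       # sparse activation vector

# The matrix-vector product equals a weighted sum of the columns, as in Eq. (13).
pre_activation = W @ h + b
as_combination = sum(h[j] * W[:, j] for j in range(W.shape[1])) + b

def dominant_frequency(signal, fs):
    """Frequency (Hz) of the largest non-DC peak in the amplitude spectrum."""
    spectrum = np.abs(np.fft.rfft(signal))
    spectrum[0] = 0.0
    return float(np.fft.rfftfreq(len(signal), d=1.0 / fs)[np.argmax(spectrum)])

column_peaks = [dominant_frequency(W[:, j], fs) for j in range(W.shape[1])]
output_peak = dominant_frequency(np.tanh(pre_activation), fs)
```

Here the strongly activated first column determines the dominant frequency of the output, and applying tanh leaves that dominant frequency unchanged, which mirrors the argument about monotonic activations.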

4 Case study 1: Rolling element bearings

4.1 Data description

A rolling bearing dataset provided by Case Western Reserve University (CWRU) is employed in this study (Smith and Randall, 2015). The data under a 1-hp load were applied and sampled with a frequency of 48 kHz. Data for 4 different kinds of health conditions, including normal (N), inner race fault (IR), rolling ball fault (B), and outer race fault (OR) conditions, were collected. Additionally, the fault diameters were different: 0.007 inches (007), 0.014 inches (014) and 0.021 inches (021). Therefore, 10 health conditions, including different fault modes and fault locations, were considered, as shown in Table 1. The vibration signals were separated using a fixed-width window; as a result, each sample contained 320 data points.
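A minimal sketch of the fixed-width windowing is given below. It assumes non-overlapping windows and discards any incomplete tail; the paper does not state whether the windows overlap, so this is only one possible reading.

```python
import numpy as np

def segment_signal(signal, width=320):
    """Split a 1-D vibration record into non-overlapping windows of `width` points."""
    signal = np.asarray(signal)
    n_windows = len(signal) // width
    return signal[:n_windows * width].reshape(n_windows, width)

# A record of 1000 points yields three complete samples of 320 points each.
samples = segment_signal(np.arange(1000))
```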

Table 1 Health conditions and their abbreviations in Case study 1.

Health condition (fault mode) | Fault diameter | Abbreviation
normal                        |                |
inner race fault              | 0.007 inches   | IR007
inner race fault              | 0.014 inches   | IR014
inner race fault              | 0.021 inches   | IR021
rolling ball fault            | 0.007 inches   | B007
rolling ball fault            | 0.014 inches   | B014
rolling ball fault            | 0.021 inches   | B021
outer race fault              | 0.007 inches   | OR007
outer race fault              | 0.014 inches   | OR014
outer race fault              | 0.021 inches   | OR021

4.2 Parameter settings

Model parameters

In the SAE-based pretraining stage, the AE has a single-hidden-layer structure with 320-160-320 neurons in each layer. The activation functions in the hidden layer and output layer are sigmoid and tanh, respectively. The functional forms of the sigmoid activation function and tanh activation function are denoted as Eq. (14) and Eq. (15), respectively.

$$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}} \qquad (14)$$

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (15)$$

An L1 sparsity constraint with the parameter  is imposed on the hidden layer. In the GAN-

based adversarial training stage, the generator of the SC-GAN has a single-hidden-layer structure with 320-

160-320 neurons in each layer, and the discriminator has a structure of 320-160-1. The activation function in

the hidden layers of the generator and discriminator, as well as the output layer of the discriminator, is sigmoid.

The activation function in the output layer of the generator is tanh. An L1 sparsity constraint with the parameter

  is imposed on the hidden layers of both the generator and discriminator. The noise vector is

sampled from the uniform distribution between -1 and 1.

Training parameters

Both the SAE-based pretraining stage and GAN-based adversarial training stage use the Adam optimizer

with recommended hyperparameters and involve 2000 training epochs. The batch size is 64. The training ratio

between D and G is  . In addition, a soft-label training strategy is implemented, which means that the

label of a real sample is 0.95 instead of 1.
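The noise sampling and the soft-label strategy can be sketched as follows. The binary cross-entropy form and the function names are assumptions made for illustration; the only values taken from the text are the uniform noise range of -1 to 1 and the real-sample label of 0.95.

```python
import numpy as np

def sample_noise(batch_size, dim, rng):
    """Draw generator inputs from the uniform distribution on [-1, 1)."""
    return rng.uniform(-1.0, 1.0, size=(batch_size, dim))

def discriminator_bce(d_real, d_fake, real_label=0.95, eps=1e-8):
    """Binary cross-entropy with a softened target for real samples.

    Generated samples keep the target 0; real samples use `real_label`.
    """
    real_term = -(real_label * np.log(d_real + eps)
                  + (1.0 - real_label) * np.log(1.0 - d_real + eps))
    fake_term = -np.log(1.0 - d_fake + eps)
    return float(np.mean(real_term) + np.mean(fake_term))

rng = np.random.default_rng(0)
z = sample_noise(64, 320, rng)
```

With a target of 0.95, the real-sample term is smallest when the discriminator outputs 0.95 rather than values approaching 1, which discourages overconfident predictions.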

4.3 Model training

After the SAE-based pretraining process, the training loss during the adversarial training process is shown in Figure 3(d). To demonstrate the necessity of AE-based pretraining and the sparsity constraint, experiments with three other comparative models are performed. These models include the basic GAN model (GAN, Figure 3(a)), GAN model with AE-based pretraining (AE-GAN, Figure 3(b)) and GAN model with a sparsity constraint but no AE-based pretraining (SGAN, Figure 3(c)). In addition, to more clearly show the convergence of the model, the last 200 training epochs of the SGAN and SC-GAN are enlarged in Figure 3. The GAN and AE-GAN are far from Nash equilibrium. For SGAN, although the G-loss and D-loss values almost reach Nash equilibrium, there is a gap between them, indicating that equilibrium is still not reached. Only the two losses of SC-GAN converge to near equality, indicating that SC-GAN can be stably trained.

Figure 3 Loss of the generator and discriminator during the adversarial training stage: (a) GAN, (b) AE-GAN, (c) SGAN, and (d) SC-GAN.

Loss curves cannot reflect the variation in the generated sample quality at different training epochs. Thus, we introduce the maximum mean discrepancy (MMD) to measure the quality of the generation. The definition of MMD between the two probability distributions $p$ and $q$ in space $\mathcal{X}$ is:

$$\mathrm{MMD}(\mathcal{F}, p, q) = \sup_{f \in \mathcal{F}}\left(\mathbb{E}_{x \sim p}[f(x)] - \mathbb{E}_{x \sim q}[f(x)]\right) \qquad (16)$$

where $\mathcal{F}$ is the class of functions $f: \mathcal{X} \to \mathbb{R}$, and $\mathcal{H}$ is the reproducing kernel Hilbert space (RKHS). Given two datasets $X_p = \{x_p^i\}_{i=1,\dots,n_p}$ and $X_q = \{x_q^j\}_{j=1,\dots,n_q}$, the estimation of MMD can be denoted as:

$$\mathrm{MMD}(X_p, X_q) = \left\|\frac{1}{n_p}\sum_{i=1}^{n_p}\phi(x_p^i) - \frac{1}{n_q}\sum_{j=1}^{n_q}\phi(x_q^j)\right\|_{\mathcal{H}} \qquad (17)$$

where $\phi(\cdot): \mathcal{X} \to \mathcal{H}$ is a feature map. After employing the kernel method, Eq. (17) can be rewritten as:

$$\mathrm{MMD}(X_p, X_q) = \left[\frac{1}{n_p^2}\sum_{i=1}^{n_p}\sum_{j=1}^{n_p} k(x_p^i, x_p^j) + \frac{1}{n_q^2}\sum_{i=1}^{n_q}\sum_{j=1}^{n_q} k(x_q^i, x_q^j) - \frac{2}{n_p n_q}\sum_{i=1}^{n_p}\sum_{j=1}^{n_q} k(x_p^i, x_q^j)\right]^{1/2} \qquad (18)$$

where $k(\cdot, \cdot)$ is a kernel function. In this case, the Gaussian kernel is applied: $k(x, x') = \exp\left(-\|x - x'\|^2 / 2\sigma^2\right)$.

The MMD distance between real samples and synthetic samples is calculated at the end of each training epoch, as shown in Figure 4. The light noisy curve is the original MMD of each training epoch. The dark smooth curve is the result of smoothing the original curve by locally weighted scatterplot smoothing (LOWESS) (Cleveland, 1981) to better reflect the overall trend of MMD changes. In Figure 4, as adversarial training continues, the discrepancy between real samples and synthetic samples gradually decreases, demonstrating that the quality of the generated samples increases.
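The kernel form of the MMD estimate in Eq. (18) with a Gaussian kernel can be sketched as follows. The bandwidth `sigma=1.0` is an illustrative assumption, since the value used in the paper is not given here.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Pairwise Gaussian kernel values between the rows of a and the rows of b."""
    sq_dist = (np.sum(a ** 2, axis=1)[:, None]
               + np.sum(b ** 2, axis=1)[None, :]
               - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma ** 2))

def mmd(x_p, x_q, sigma=1.0):
    """Kernel estimate of the MMD between two sample sets, following Eq. (18)."""
    k_pp = gaussian_kernel(x_p, x_p, sigma).mean()
    k_qq = gaussian_kernel(x_q, x_q, sigma).mean()
    k_pq = gaussian_kernel(x_p, x_q, sigma).mean()
    return float(np.sqrt(max(k_pp + k_qq - 2.0 * k_pq, 0.0)))

rng = np.random.default_rng(0)
real = rng.normal(size=(200, 2))
similar = rng.normal(size=(200, 2))          # drawn from the same distribution
shifted = rng.normal(size=(200, 2)) + 3.0    # drawn from a displaced distribution
mmd_similar = mmd(real, similar)
mmd_shifted = mmd(real, shifted)
```

Evaluating `mmd(real_batch, generated_batch)` once per epoch would produce a curve of the kind described for Figure 4, with smaller values indicating generated samples that are closer to the real ones.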

Figure 4 Variation in the MMD distance between real samples and synthetic samples with the training epoch.

4.4 Generation quality evaluation

4.4.1 By a frequency spectra analysis

Figure 5 Raw vibration signals and frequency spectra of generated samples and the corresponding real samples: (a) NORMAL, (b) IR007, (c) IR014, (d) IR021, (e) B007, (f) B014, (g) B021, (h) OR007, (i) OR014, and (j) OR021.

For ten health conditions, the generated raw vibration signals and real signals, as well as the generated frequency spectra and real spectra, are shown in Figure 5. Beyond the time domain, the generated samples show high similarities with the real samples in the frequency domain. The key frequencies of the generated samples and real samples match, which indicates that the SC-GAN can capture useful patterns hidden in the raw signals in the time and frequency domains, although the signal spectrum is not directly applied during the training process.

4.4.2 By a fault diagnosis task

Data augmentation is usually performed to improve the performance of machine fault diagnosis algorithms. Because fault diagnosis is a critical issue in health monitoring, a fault diagnosis experiment is designed to evaluate the quality of the generated samples and their contribution to the diagnosis results. The diagnosis model is the stacked AE-based deep neural network (DNN) proposed by Jia et al. (2016). The model structure is 160-120-80-120-160, and the activation function in the hidden layers is a rectified linear unit (ReLU). Each AE is pretrained for 100 epochs, followed by 200 epochs of fine-tuning. The batch size is 32, and the Adam optimizer is applied. The diagnosis model is trained on the training set, which contains vibration signal samples collected from the bearings under the ten health conditions listed in Table 1. After training, the model diagnoses the health condition of the bearing from the vibration signal samples in the testing set. To reflect a case with insufficient training data, the DNN is trained on 10 real samples and tested on 50 samples for each health condition. Different numbers of generated samples are added to the training set to reflect different levels of data augmentation. Five independent trials were carried out for each experimental setting, and the mean diagnosis accuracy and standard deviation are reported in Table 2. According to the results, when the training data are insufficient, data augmentation by SC-GAN improves the mean diagnosis accuracy by 7.44 percentage points, from 80.44% to 87.88%. Furthermore, as the proportion of generated samples increases, the diagnosis accuracy also increases. It can be inferred that the generated samples are of sufficient quality to compensate for the information missing because of insufficient real data.
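The experimental protocol above (a fixed set of real samples per health condition, a varying number of generated samples, and statistics over five trials) can be sketched as follows. The helper names, the placeholder data and the per-trial accuracies are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
import statistics


def build_training_set(real, generated, n_generated):
    """Combine all real samples with the first n_generated synthetic ones per class."""
    return {label: list(real[label]) + list(generated[label][:n_generated])
            for label in real}


def summarize_trials(accuracies):
    """Mean and sample standard deviation of per-trial accuracies (in %)."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Placeholder data: strings stand in for vibration signal samples.
real = {"NORMAL": ["r"] * 10, "IR007": ["r"] * 10}
generated = {"NORMAL": ["g"] * 100, "IR007": ["g"] * 100}
train = build_training_set(real, generated, n_generated=40)

# Hypothetical accuracies from five trials (not values from the paper).
mean_acc, std_acc = summarize_trials([80.0, 82.0, 78.0, 81.0, 79.0])

# Gain between the first and last mean accuracies listed in Table 2.
gain = round(87.88 - 80.44, 2)
```

The last line reproduces the reported improvement as the difference between the lowest and highest mean accuracies in Table 2.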

Table 2 Diagnosis results with different numbers of real samples and generated samples for training.

Real training samples (each health condition)
Generated training samples (each health condition)
Testing samples (each health condition)
Mean accuracy (%): 80.44, 82.44, 84.96, 86.48, 87.12, 87.88
Standard deviation (%): 2.14, 2.04, 1.23, 1.75, 1.34, 0.74

4.5 Physical interpretation of the model and generation mechanism

Figure 6 Selected weight vectors of neurons in the hidden layer of the generator and the corresponding frequency spectrum

obtained by FFT: (a) #26, (b) #18, (c) #135, (d) #114, (e) #72, (f) #155, (g) #81, and (h) #148.

DNNs are typically regarded as black boxes, which limits their use since people are unclear about how

they work. By implementing the proposed interpretation method for the SC-GAN, the weight vectors of the

generator under the NORMAL condition are analyzed to reveal what the model learns. Eight neurons are

selected from all the 160 neurons in the hidden layer; the raw weight vectors and corresponding frequency

spectra are shown in Figure 6. As Figure 5 shows, the spectrum of the NORMAL condition mainly contains

three key frequency components. As shown in Figure 6, these neurons have learned useful but diverse patterns

hidden in the raw vibration signals. Some neurons mainly learned one frequency component, such as neurons

#26 and #18 (the lowest frequency), neuron #135 (the middle frequency), and neuron #114 (the highest

frequency). Some neurons, such as neurons #72 and #155, learned a combination of two frequency

components. Some neurons learned all the key components, and these neurons included #81 and #148.

Therefore, the generation mechanism of the proposed SC-GAN can be interpreted as follows: (1) the random noise fed into the generator is embedded into a sparse vector by the first layer; (2) the few nonzero values in the sparse vector activate the corresponding neurons, whose weight vectors contain different components of the vibration signal; (3) the activated weight vectors are linearly combined using the values of the sparse vector as coefficients, so that the result contains the complete set of frequency components found in real signals; and (4) the combined vector is passed through the nonlinear activation function, which adjusts the signal amplitude and yields the final synthetic sample output by the generator.
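The four-step interpretation can be illustrated with a toy two-layer generator. Everything in this sketch is hypothetical: the weight vectors are hand-made sinusoids rather than learned weights, the first-layer matrix is chosen by hand, and ReLU and tanh are assumed as the hidden and output activations.

```python
import math

SIGNAL_LEN = 64  # toy output length (hypothetical, far shorter than a real sample)

# Hypothetical "learned" weight vectors, one per hidden neuron; each holds a
# single sinusoidal component, loosely mimicking the neurons in Figure 6.
WEIGHT_VECTORS = [
    [math.sin(2 * math.pi * cycles * n / SIGNAL_LEN) for n in range(SIGNAL_LEN)]
    for cycles in (2, 5, 9, 14)
]

# Hypothetical first-layer weights (4 hidden neurons, 2 noise inputs).
FIRST_LAYER = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [-1.0, 0.0]]


def generate(noise):
    # (1) Embed the noise into a sparse hidden vector (ReLU zeroes negatives).
    hidden = [max(0.0, sum(w * z for w, z in zip(row, noise)))
              for row in FIRST_LAYER]
    # (2) Only neurons with nonzero activation contribute their weight vector.
    active = [j for j, h in enumerate(hidden) if h > 0.0]
    # (3) Linearly combine the active weight vectors, weighted by activation.
    combined = [sum(hidden[j] * WEIGHT_VECTORS[j][n] for j in active)
                for n in range(SIGNAL_LEN)]
    # (4) Apply the output nonlinearity (tanh assumed) to bound the amplitude.
    signal = [math.tanh(c) for c in combined]
    return hidden, active, signal


# The paper samples noise uniformly in [-1, 1]; a fixed vector is used here
# so that the result is reproducible.
hidden, active, signal = generate([0.5, -0.5])
```

With this noise vector only neurons 0 and 2 are active, so the output is a bounded mixture of two of the four stored components.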

4.6 Comparison with related methods

To understand the complexity of the data and overall performance of the method, the proposed SC-GAN

is compared with related methods from two aspects.

(1) To illustrate the improvement from the innovation of the proposed method, three baseline models are

selected as comparisons: the basic GAN (without pretraining and sparsity constraint), AE-GAN (without

sparsity constraint) and SGAN (without pretraining).

According to the experimental results, none of the three comparison methods achieved the generation of

vibration signals. As shown in Figure 3, the GAN and AE-GAN failed to converge to the Nash equilibrium

state during the adversarial training process. Although the SGAN has a tendency to converge, compared with

SC-GAN, it still has a large deviation from the Nash equilibrium. The SGAN also failed to stably generate

meaningful vibration signals.

(2) To illustrate the superiority of the proposed method compared to related methods in the literature, two

models proposed by studies cited in the introduction section, AC-GAN (Zhang et al., 2020) and Deep-GAN

(Zhang et al., 2020), are selected for comparison.

The AC-GAN was validated using an induction motor vibration dataset in the original paper. To eliminate

the interference caused by different datasets, we reproduced the AC-GAN and validated it on the CWRU

dataset with the same parameter settings. Since the validation of the Deep-GAN applied the CWRU dataset in

the original paper, the experiment was not repeated, and the results in the original paper were used for

comparative analysis.

Figure 7 Five randomly generated vibration signals of different health conditions: (a) IR014, (b) B014, and (c) OR014.

In the comparative experiment, AC-GAN converges near the Nash equilibrium state and can generate vibration signals. However, it suffers from a typical GAN problem: mode collapse. Using the trained model, five independent samples were generated for each of three health conditions: IR014, B014 and OR014. As shown in Figure 7, although the input noise vectors are sampled randomly and independently from a uniform distribution between -1 and 1, the vibration signals generated by AC-GAN are almost identical. Once a model falls into mode collapse, it cannot generate the diverse samples needed for data augmentation and for improving diagnostic capability.
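One simple way to quantify the near-identical outputs described above is the mean pairwise distance between samples generated from independent noise. The sketch below uses two toy generators invented for illustration (neither is AC-GAN nor SC-GAN); the paper itself assesses mode collapse visually in Figure 7.

```python
import math
import random

N_POINTS = 32  # toy signal length (hypothetical)


def collapsed_generator(z):
    """Toy generator that almost ignores its noise input z."""
    return [math.sin(2 * math.pi * 3 * n / N_POINTS) + 1e-3 * z
            for n in range(N_POINTS)]


def diverse_generator(z):
    """Toy generator whose output phase depends on the noise input z."""
    return [math.sin(2 * math.pi * 3 * n / N_POINTS + math.pi * z)
            for n in range(N_POINTS)]


def mean_pairwise_distance(samples):
    """Average Euclidean distance over all pairs of samples."""
    distances = [math.dist(a, b)
                 for i, a in enumerate(samples)
                 for b in samples[i + 1:]]
    return sum(distances) / len(distances)


rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(5)]  # five independent draws

collapsed_spread = mean_pairwise_distance([collapsed_generator(z) for z in noise])
diverse_spread = mean_pairwise_distance([diverse_generator(z) for z in noise])
```

A spread close to zero despite independent noise inputs is the signature of mode collapse; a healthy generator yields a clearly larger spread.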

For the Deep-GAN, although it has realized vibration signal generation on the CWRU dataset, the model

structure is more complicated. According to the original paper, the generator consists of 3 convolutional layers,

3 fully connected layers, 5 batch normalization layers and 1 flatten layer, and the discriminator consists of 2

convolutional layers, 2 fully connected layers and 1 flatten layer. Due to the complicated model structure, it

is difficult to tune the hyperparameters, and the required number of training epochs, which is set to 200,000

in the original paper, will also increase.

Table 3 Comparison of the proposed method and related methods (N-No, Y-Yes, H-High, L-Low).

Model | Is it able to generate vibration signals? | Mode collapse | Is it explainable? | Model complexity
GAN
AE-GAN
SGAN
AC-GAN
Deep-GAN
SC-GAN (proposed)

In summary, the five comparison methods are compared with the proposed method in different aspects,

as shown in Table 3. It can be concluded that compared to related methods, the proposed SC-GAN can realize

stable generation of vibration signals with no mode collapse. Simultaneously, the mechanism of vibration

signal generation can be explained, and the model complexity is relatively low.

5 Case study 2: Gearbox

5.1 Experiments and data description

To validate the effectiveness and performance of the proposed method, a dataset collected from a planetary gearbox is utilized. The dataset is available at https://drive.google.com/file/d/1NWPbfAq52Wb9M0X5OoZF17eezUhPDZT5/view?usp=sharing. The test bench employed for collecting the data in this case study is the power transmission fault prediction test bench manufactured by Spectra Quest, USA. This test bench consists of a driver, a lubrication system, a driver motor, four gearboxes, a load motor, torque sensors and force sensors, as shown in Figure 8. Four fault modes were injected into the planetary gear of the gearbox: one missing tooth, broken, wear and tooth root crack. The gear shape that corresponds to each of the four fault modes is shown in Figure 9. The vibration sensor was attached to the outer end cap of the input shaft of the tested planetary gearbox via a threaded connection.

The dataset contains 5 health conditions: normal condition, one missing tooth, wear, broken and tooth root crack. The rotating speed of the gear was 60 Hz, and the load was 1.2 N·m. For each health condition, 10 seconds of data were collected at a sampling frequency of 12.8 kHz, giving 128,000 points. The data were divided into samples of 640 points, so there were 200 samples for each health condition.
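The sample count follows directly from the recording length and the window size. A minimal sketch, assuming non-overlapping windows (which is consistent with the stated count of 200) and using a hypothetical helper name and a placeholder record:

```python
def segment_signal(signal, sample_len):
    """Split a 1-D record into consecutive, non-overlapping samples."""
    n_samples = len(signal) // sample_len  # any incomplete tail is dropped
    return [signal[i * sample_len:(i + 1) * sample_len]
            for i in range(n_samples)]


SAMPLING_RATE_HZ = 12_800  # 12.8 kHz
DURATION_S = 10
SAMPLE_LEN = 640

# Placeholder record of zeros standing in for one health condition.
record = [0.0] * (SAMPLING_RATE_HZ * DURATION_S)
samples = segment_signal(record, SAMPLE_LEN)
```

Each 640-point sample spans 50 ms at this sampling rate.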

Figure 8 Testbench used to collect the gearbox dataset.

Figure 9 Gear shape of the four fault modes: (a) one missing tooth, (b) broken, (c) wear, (d) tooth root crack.

5.2 Parameter settings

Model parameters

In the SAE-based pretraining stage, the AE has a structure of 640-320-640 neurons in each layer. In the

GAN-based adversarial training stage, the structure of the generator is 640-320-640, and the structure of the

discriminator is 640-320-1. Other parameters, including the activation functions and the sparsity constraint in

the pretraining and adversarial training stages, are the same as those of the previous case study. The noise

vector is sampled from the uniform distribution between -1 and 1.
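To give a sense of the model size implied by these layer structures, the number of trainable parameters can be counted. The sketch assumes plain fully connected layers with one bias per output unit, which the text does not state explicitly; the function name is hypothetical.

```python
def dense_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer sizes."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


generator_params = dense_parameter_count([640, 320, 640])    # 410,560
discriminator_params = dense_parameter_count([640, 320, 1])  # 205,441
```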

Training parameters

With the exception that the training involves 5000 epochs and the batch size is 32, all the other parameters are the same as those of the previous case study.

5.3 Vibration signal generation results and model physical interpretation

Figure 10 Raw vibration signals and frequency spectra of generated samples and the corresponding real samples: (a) Normal,

(b) One missing tooth, (c) Wear, (d) Broken, and (e) Tooth root crack.

The generated vibration signals and corresponding frequency spectra for the 5 health conditions of the gearbox are shown in Figure 10. According to the generation results, the SC-GAN can generate realistic vibration signals that are similar to real samples in the time and frequency domains, although the signals of the gearbox are much more complicated than those of the rolling element bearings. In particular, the vibration signals of the gearbox contain more high-frequency components than those of the rolling bearings. The good performance of the SC-GAN on the gearbox dataset illustrates its generalization ability for vibration signals that contain more noise and frequency components.

Table 4 Diagnosis results with different numbers of real samples and generated samples for training.

Real training samples (each health condition)
Generated training samples (each health condition): 100, 300, 600
Testing samples (each health condition): 600
Mean accuracy (%): 94.92, 96.51, 100
Standard deviation (%): 3.47, 5.08

Figure 11 Variation in the mean testing accuracy with fine-tuning epoch of the fault diagnosis task using different numbers of

generated training samples. The accuracy is the average of five independent trials.

To validate the effectiveness of the proposed data augmentation scheme, a DNN model is established to complete a fault diagnosis task using different numbers of generated training samples. Except for the number of neurons in each layer and the number of fine-tuning epochs, all the hyperparameters are the same as those in the case of rolling element bearings. Here, the model structure is 320-160-80-160-320, and the number of fine-tuning epochs is 100. In this case study, a more severe small-sample condition is considered, with only 5 real vibration signal samples available for training and 600 samples for testing. Five independent trials were carried out for each experimental setting. The diagnosis results are shown in Table 4. According to the results, the fault diagnosis performance with insufficient real training samples can be improved significantly. Although the diagnosis accuracy reaches 100% with 50 generated samples and cannot increase further, the number of fine-tuning epochs required to reach it decreases as the number of generated samples increases, as shown in Figure 11. The samples generated by SC-GAN are reasonably consistent with real samples, so they provide the DNN with more information during the AE-based pretraining stage, and the model parameters start the fine-tuning stage from better positions.

Figure 12 Selected weight vectors of neurons in the hidden layer of the generator and the corresponding frequency spectrum

obtained by FFT: (a) #315, (b) #100, (c) #0, (d) #184, (e) #83, (f) #84, (g) #272, and (h) #34.

The SC-GAN that is trained to generate vibration signals of the wear fault mode is selected to illustrate

the generation mechanism. By implementing the proposed model interpretation method, the weight vectors of

the generator and the corresponding FFT spectrum are shown in Figure 12. Similar to the case of rolling

element bearings, in this case study, the neurons also learned useful but diverse frequency domain patterns

contained in the time domain vibration signals. In addition, the frequency components learned by the model

cover all the key frequency bands of the real vibration signals. Therefore, the interpretation of the raw vibration

signal generation mechanism stated in section 4.5 still holds.
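The interpretation step, taking the spectrum of a hidden neuron's weight vector and reading off its dominant frequencies, can be sketched as below. The weight vector here is synthetic (two hand-placed sinusoids), the function names are hypothetical, and a naive DFT is used to keep the example dependency-free; in practice an FFT routine would be used. The sketch also assumes each weight vector has the same length as a sample (640 points at 12.8 kHz).

```python
import cmath
import math


def amplitude_spectrum(x):
    """Magnitude of the DFT of x for bins 0..N/2 (naive O(N^2) version)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def dominant_frequencies(x, fs, top_k):
    """Frequencies (Hz) of the top_k largest non-DC bins, in ascending order."""
    spectrum = amplitude_spectrum(x)
    bins = sorted(range(1, len(spectrum)), key=lambda k: spectrum[k],
                  reverse=True)[:top_k]
    return sorted(k * fs / len(x) for k in bins)


FS = 12_800   # sampling frequency of the gearbox data, in Hz
LENGTH = 640  # points per sample, so the bin spacing is FS / LENGTH = 20 Hz

# Hypothetical weight vector holding two components (1000 Hz and 3000 Hz).
weight_vector = [math.sin(2 * math.pi * 1000 * t / FS)
                 + 0.4 * math.sin(2 * math.pi * 3000 * t / FS)
                 for t in range(LENGTH)]

peaks = dominant_frequencies(weight_vector, FS, top_k=2)
```

Note that this picks the largest bins rather than true spectral peaks, so for real weight vectors with spectral leakage, neighboring bins of one strong component may be returned together; a peak-finding step would be needed in that case.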

6 Discussion

As stated in the introduction, the main contributions of this research can be summarized in the following

two points: the proposal of a data augmentation scheme that can stably generate raw vibration signals and an

interpretation of the generation mechanism of the proposed model.

For the first point, unlike existing GAN-based methods, the proposed SC-GAN takes advantage of AE-based pretraining and a sparsity constraint. In the pretraining stage, patterns of the vibration signals are prestored in the network, providing strong prior information for GAN-based adversarial training. This makes the GAN more stable and helps it reach the Nash equilibrium more easily, as shown in Figure 3. From a frequency domain perspective, the energy of the vibration signals of industrial machines in different health states is concentrated at several characteristic frequencies. Therefore, the pretraining stage before the game process of a GAN can guide the subsequent training process, instead of letting the entire game process start as a random exploration from scratch. The sparsity constraint forces the model to learn more useful patterns, as also stated by Lei et al. (2016), so that the model can reconstruct or generate vibration signals with fewer activated neurons.

For the second point, the physical interpretation of the model and the generation mechanism illustrated in Figure 6 and Figure 12 shows that the neurons have successfully extracted the characteristic frequencies hidden in the vibration signals instead of being confused by the noise. The trained SC-GAN thus captures the characteristic frequencies and generates new vibration signals by combining the learned components of each characteristic frequency rather than copying the time-domain signals.

The stable generation ability and clear generation mechanism interpretation make the proposed SC-GAN

a reliable data augmentation scheme for machine fault diagnosis.

The authors believe that this research provides new inspiration for vibration signal generation methods applied to data augmentation for machine fault diagnosis. As described in the Introduction, most existing methods focus on generating FFT spectra or handcrafted features of a raw vibration signal, which inevitably causes information loss. The few studies that attempt to generate a raw vibration signal rely on more complex GAN architectures to improve the learning capability of the model. This research proposes two other aspects, namely, the pretraining mechanism and the introduction of sparse regularization constraints. The pretraining mechanism has been widely employed in the field of deep learning, and it ensures that the GAN has a good initial state. During the AE-based pretraining process, the network learns key features of the vibration signal, thereby avoiding the instability caused by training from scratch. Sparse regularization constraints have been shown to improve model performance. In particular, in Lei et al. (2016), sparse filtering feature learning enables the network to learn meaningful frequency components in vibration signals. This research further illustrates the important role of such constraints in vibration signal generation tasks. The authors suggest that future research could employ other GAN architectures for vibration signal generation while fully utilizing these two ideas, or could add these two improvements to a GAN to generate other types of data that are currently difficult to generate.

7 Conclusions

In this paper, a novel data augmentation scheme for machine fault diagnosis based on an SC-GAN that

contains a two-stage training process is developed. The scheme can adaptively capture the distribution of

vibration signals and stably generate high-quality raw vibration data without a complicated model structure.

Furthermore, a method of interpreting the patterns learned by the SC-GAN is proposed. To the best of our knowledge, this is an early attempt to reveal the mechanism of a GAN-based data augmentation method. In a case study on rolling element bearings, vibration signals under ten health

conditions are generated, and the quality of the signals is evaluated by a frequency spectra analysis and a fault

diagnosis task. The generated samples show high consistency with real samples in the time and frequency

domains and significantly contribute to improvements in diagnostic performance when real data are lacking.

The learned patterns are obtained from the model to support the interpretation of the generation mechanism.

Additionally, another case study on the gearbox validates the generation ability of the SC-GAN for other types

of industrial machines, of which the vibration signals are noisier and more complicated.

In future research, the authors would like to focus on the combination of GAN-based data augmentation

and transfer learning methods. In the experiment performed to evaluate the generation quality by a fault

diagnosis task of rolling element bearings, the results show that although there is a significant increase in the

diagnosis accuracy, it is difficult to obtain an accuracy as high as that achieved using real data. This finding

may indicate that there is still a divergence between the distributions of real and generated data. Transfer

learning is probably a feasible way of narrowing this gap, so further investigation is warranted.

Acknowledgment

This study was supported by the National Natural Science Foundation of China (Grant Nos. 61973011 and 61803013), the Fundamental Research Funds for the Central Universities (Grant Nos. YWF-20-BJ-J-517 and YWF-20-BJ-J-723), the National Key Laboratory of Science and Technology on Reliability and Environmental Engineering (Grant Nos. 6142004180501 and 6142004190501), the Research Fund (Grant No. 61400020401), the Capital Science & Technology Leading Talent Program (Grant No. Z191100006119029), as well as the China Postdoctoral Science Foundation (Grant No. 2019M650438).

References

Bunkhumpornpat, C., Sinapiromsaran, K., Lursinsap, C., 2009. Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling TEchnique for Handling the Class Imbalanced Problem, in: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, T.-B. (Eds.), Advances in Knowledge Discovery and Data Mining. Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 475–482.

Cabrera, D., Sancho, F., Long, J., Sanchez, R.V., Zhang, S., Cerrada, M., Li, C., 2019. Generative

Adversarial Networks Selection Approach for Extremely Imbalanced Fault Diagnosis of Reciprocating

Machinery. IEEE Access 7, 70643–70653. https://doi.org/10.1109/ACCESS.2019.2917604

Chawla, N. V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P., 2002. SMOTE: Synthetic minority over-

sampling technique. J. Artif. Intell. Res. 16, 321–357. https://doi.org/10.1613/jair.953

Chen, J., Li, Z., Pan, J., Chen, G., Zi, Y., Yuan, J., Chen, B., He, Z., 2016. Wavelet transform based on inner

product in fault diagnosis of rotating machinery: A review. Mech. Syst. Signal Process.

https://doi.org/10.1016/j.ymssp.2015.08.023

Chen, J., Shen, Y., Ali, R., 2019. Credit Card Fraud Detection Using Sparse Autoencoder and Generative

Adversarial Network. 2018 IEEE 9th Annu. Inf. Technol. Electron. Mob. Commun. Conf. IEMCON

2018 1054–1059. https://doi.org/10.1109/IEMCON.2018.8614815

Chen, Z., Zhai, W., Wang, K., 2019. Vibration feature evolution of locomotive with tooth root crack

propagation of gear transmission system. Mech. Syst. Signal Process. 115, 29–44.

https://doi.org/10.1016/j.ymssp.2018.05.038

Cleveland, W.S., 1981. LOWESS: A program for smoothing scatterplots by robust locally weighted

regression. Am. Stat. 35, 54.

Dibaj, A., Ettefagh, M.M., Hassannejad, R., Ehghaghi, M.B., 2020. A hybrid fine-tuned VMD and CNN

scheme for untrained compound fault diagnosis of rotating machinery with unequal-severity faults.

Expert Syst. Appl. 114094. https://doi.org/10.1016/j.eswa.2020.114094

Ding, Y., Ma, L., Ma, J., Wang, C., Lu, C., 2019. A generative adversarial network-based intelligent fault

diagnosis method for rotating machinery under small sample size conditions. IEEE Access 7, 149736–

149749. https://doi.org/10.1109/ACCESS.2019.2947194

Gao, Y., Liu, X., Xiang, J., 2020. FEM Simulation-Based Generative Adversarial Networks to Detect

Bearing Faults. IEEE Trans. Ind. Informatics 16, 4961–4971. https://doi.org/10.1109/TII.2020.2968370

Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio,

Y., 2014. Generative Adversarial Nets, in: Proceedings of the 27th International Conference on Neural

Information Processing Systems - Volume 2, NIPS’14. MIT Press, Cambridge, MA, USA, pp. 2672–

2680.

Guo, Q., Li, Y., Song, Y., Wang, D., Chen, W., 2020. Intelligent Fault Diagnosis Method Based on Full 1-D

Convolutional Generative Adversarial Network. IEEE Trans. Ind. Informatics 16, 2044–2053.

https://doi.org/10.1109/TII.2019.2934901

He, Z., Shao, H., Cheng, J., Zhao, X., Yang, Y., 2020. Support tensor machine with dynamic penalty factors

and its application to the fault diagnosis of rotating machinery with unbalanced data. Mech. Syst.

Signal Process. 141, 106441. https://doi.org/10.1016/j.ymssp.2019.106441

Jia, F., Lei, Y., Lin, J., Zhou, X., Lu, N., 2016. Deep neural networks: A promising tool for fault

characteristic mining and intelligent diagnosis of rotating machinery with massive data. Mech. Syst.

Signal Process. 72–73, 303–315. https://doi.org/10.1016/j.ymssp.2015.10.025

Lei, Y., He, Z., Zi, Y., 2011. EEMD method and WNN for fault diagnosis of locomotive roller bearings.

Expert Syst. Appl. 38, 7334–7341. https://doi.org/10.1016/j.eswa.2010.12.095

Lei, Y., Jia, F., Lin, J., Xing, S., Ding, S.X., 2016. An Intelligent Fault Diagnosis Method Using

Unsupervised Feature Learning Towards Mechanical Big Data. IEEE Trans. Ind. Electron. 63, 3137–

3147. https://doi.org/10.1109/TIE.2016.2519325

Li, X., Zhang, W., Ding, Q., 2019. Cross-domain fault diagnosis of rolling element bearings using deep

generative neural networks. IEEE Trans. Ind. Electron. 66, 5525–5534.

https://doi.org/10.1109/TIE.2018.2868023

Liu, H., Zhou, J., Xu, Y., Zheng, Y., Peng, X., Jiang, W., 2018. Unsupervised fault diagnosis of rolling

bearings using a deep neural network based on generative adversarial networks. Neurocomputing 315,

412–424. https://doi.org/10.1016/j.neucom.2018.07.034

Mao, W., Liu, Y., Ding, L., Li, Y., 2019. Imbalanced fault diagnosis of rolling bearing based on generative

adversarial network: A comparative study. IEEE Access 7, 9515–9530.

https://doi.org/10.1109/ACCESS.2018.2890693

Pan, T., Chen, J., Xie, J., Chang, Y., Zhou, Z., 2020. Intelligent fault identification for industrial automation

system via multi-scale convolutional generative adversarial network with partially labeled samples. ISA

Trans. 101, 379–389. https://doi.org/10.1016/j.isatra.2020.01.014

Radford, A., Metz, L., Chintala, S., 2016. Unsupervised Representation Learning with Deep Convolutional

Generative Adversarial Networks.

Rauber, T.W., De Assis Boldt, F., Varejão, F.M., 2015. Heterogeneous feature models and feature selection

applied to bearing fault diagnosis. IEEE Trans. Ind. Electron. 62, 637–646.

https://doi.org/10.1109/TIE.2014.2327589

Shao, S., Wang, P., Yan, R., 2019. Generative adversarial networks for data augmentation in machine fault

diagnosis. Comput. Ind. 106, 85–93. https://doi.org/10.1016/j.compind.2019.01.001

Smith, W.A., Randall, R.B., 2015. Rolling element bearing diagnostics using the Case Western Reserve

University data: A benchmark study. Mech. Syst. Signal Process.

https://doi.org/10.1016/j.ymssp.2015.04.021

Wang, J., Li, S., Han, B., An, Z., Bao, H., Ji, S., 2019. Generalization of Deep Neural Networks for

Imbalanced Fault Classification of Machinery Using Generative Adversarial Networks. IEEE Access 7,

111168–111180. https://doi.org/10.1109/ACCESS.2019.2924003

Wang, Z., Wang, J., Wang, Y., 2018. An intelligent diagnosis scheme based on generative adversarial

learning deep neural networks and its application to planetary gearbox fault pattern recognition.

Neurocomputing 310, 213–222. https://doi.org/10.1016/j.neucom.2018.05.024

Wu, J. Da, Liu, C.H., 2009. An expert system for fault diagnosis in internal combustion engines using

wavelet packet transform and neural network. Expert Syst. Appl. 36, 4278–4286.

https://doi.org/10.1016/j.eswa.2008.03.008

Zhang, W., Li, Xiang, Jia, X.D., Ma, H., Luo, Z., Li, Xu, 2020. Machinery fault diagnosis with imbalanced

data using deep generative adversarial networks. Meas. J. Int. Meas. Confed. 152, 107377.

https://doi.org/10.1016/j.measurement.2019.107377

Zhang, Y., Li, X., Gao, L., Wang, L., Wen, L., 2018. Imbalanced data fault diagnosis of rotating machinery

using synthetic oversampling and feature learning. J. Manuf. Syst. 48, 34–50.

https://doi.org/10.1016/j.jmsy.2018.04.005

Zheng, T., Song, L., Wang, J., Teng, W., Xu, X., Ma, C., 2020. Data synthesis using dual discriminator

conditional generative adversarial networks for imbalanced fault diagnosis of rolling bearings. Meas. J.

Int. Meas. Confed. 158, 107741. https://doi.org/10.1016/j.measurement.2020.107741

Zhou, F., Yang, S., Fujita, H., Chen, D., Wen, C., 2020. Deep learning fault diagnosis method based on

global optimization GAN for unbalanced data. Knowledge-Based Syst. 187, 104837.

https://doi.org/10.1016/j.knosys.2019.07.008

A meta transfer learning method for gearbox fault diagnosis with limited data

Article

Full-text available

May 2024
MEAS SCI TECHNOL

Intelligent diagnosis of mechanical faults is an important means to guarantee the safe maintenance of equipment. Cross domain diagnosis may lack sufficient measurement data as support, and this bottleneck is particularly prominent in high-end manufacturing. This paper presents a few-shot fault diagnosis methodology based on meta transfer learning for gearbox. To be specific, firstly, the subtasks for transfer diagnosis are constructed, and then joint distribution adaptation is conducted to align the two domain distributions; secondly, through adaptive manifold regularization, the data of target working condition is further utilized to explore the potential geometric structure of the data distribution. Meta stochastic gradient descent is explored to dynamically adjust the model’s parameter based on the obtained task information to obtain better generalization performance, ultimately to achieve transfer diagnosis of gearbox faults with few samples. The effectiveness of the approach is supported by the experimental datasets of the gearbox.

Deep optimal feature extraction and selection-based motor fault diagnosis using vibration

Article

Full-text available

Apr 2024
ELECTR ENG

The rolling elements of the induction motor are highly susceptible to faults. The detection and diagnosis of rolling element faults are accurate and reliable only when the extracted features are accurate. The paper proposes an approach for bearing and rotor fault diagnosis using deep optimal feature extraction and selection based on vibration signal analysis. The deep feature extraction is done using an ensemble deep models features extraction approach in which features are extracted from seven pretrained models are fused serially using serial-based feature fusion technique. This leads to a solution for a higher efficacy model, but at the cost of high processing time as the feature data set gets large. A unique approach termed Ensemble Feature Selection has been developed to address this issue and limit the harmful impact of unwanted features in data-driven diagnostics. The processing time is further reduced using the shallow classifier at the fully connected layer. The proposed model is tested using the data acquired in the laboratory and validated using the available online benchmark data sets.

Effective time-series Data Augmentation with Analytic Wavelets for bearing fault diagnosis

Article

Full-text available

Feb 2024
EXPERT SYST APPL

Generative artificial intelligence and data augmentation for prognostic and health management: Taxonomy, progress, and prospects

Article

Jun 2024
EXPERT SYST APPL

Data-Driven Machinery Fault Detection: A Comprehensive Review

Preprint

Full-text available

May 2024

In this era of advanced manufacturing, it's now more crucial than ever to diagnose machine faults as early as possible to guarantee their safe and efficient operation. With the massive surge in industrial big data and advancement in sensing and computational technologies, data-driven Machinery Fault Diagnosis (MFD) solutions based on machine/deep learning approaches have been used ubiquitously in manufacturing. Timely and accurately identifying faulty machine signals is vital in industrial applications for which many relevant solutions have been proposed and are reviewed in many articles. Despite the availability of numerous solutions and reviews on MFD, existing works often lack several aspects. Most of the available literature has limited applicability in a wide range of manufacturing settings due to their concentration on a particular type of equipment or method of analysis. Additionally, discussions regarding the challenges associated with implementing data-driven approaches, such as dealing with noisy data, selecting appropriate features, and adapting models to accommodate new or unforeseen faults, are often superficial or completely overlooked. Thus, this survey provides a comprehensive review of the articles using different types of machine learning approaches for the detection and diagnosis of various types of machinery faults, highlights their strengths and limitations, provides a review of the methods used for condition-based analyses, comprehensively discusses the available machinery fault datasets, introduces future researchers to the possible challenges they have to encounter while using these approaches for MFD and recommends the probable solutions to mitigate those problems. The future research prospects are also pointed out for a better understanding of the field. We believe this article will help researchers and contribute to the further development of the field.

Sparse measure of bearing fault features based on Legendre wavelet multi-scale multi-mode Entropy

Article

May 2024
COMPUT ELECTR ENG

TFFNet: A Robust Approach with Anti-noise and Domain Shift Adaptation for Intelligent Fault Diagnosis of Rotating Machinery

Article

Feb 2024
J VIB CONTROL

Recently, deep learning has become a predominant technique for intelligent fault diagnosis of industrial machines and has achieved satisfactory performance. However, noise is present in real-life industrial working environments, and the operational load changes constantly. This work proposes a Time-Frequency Fusion Network (TFFNet) for intelligent fault diagnosis. It is a robust deep learning algorithm based on a convolutional neural network, and it eliminates the signal processing otherwise required for denoising. The developed model is verified under real-time noisy conditions and in a load-varying environment. The proposed model attained 99.98% accuracy in a noisy environment and 98.6% average accuracy under six cases of domain shift. Finally, the results are compared with past studies using accuracy as the performance indicator.

Advanced Acoustic Leak Detection in Water Distribution Networks Using Integrated Generative Model

Article

Mar 2024
WATER RES

A zero-sample intelligent fault diagnosis method for bearings based on category relationship model

Article

Apr 2024
ENG APPL ARTIF INTEL

Review of Interpretable Deep Learning in Fault Diagnosis and Its Application Prospectives in Condition Monitoring

Conference Paper

Aug 2023

Data Synthesis using Dual Discriminator Conditional Generative Adversarial Networks for Imbalanced Fault Diagnosis of Rolling Bearings

Article

Mar 2020
MEASUREMENT

Diagnosis of rolling bearings plays an important role in condition monitoring of industrial rotating machinery. In many practical applications, rolling bearings operate in a normal state most of the time, and faulty samples are difficult to collect. The resulting imbalanced datasets restrict the accuracy and stability of fault diagnosis. Generative adversarial networks (GANs) have proved effective at producing artificial data that resemble real data and have been widely used in image applications. Data synthesis using deep generative models provides a promising methodology for imbalanced fault diagnosis of machinery. In this paper, we propose a novel framework named dual discriminator conditional generative adversarial networks (D2CGANs) to learn from sensor signals on multimodal fault samples and automatically synthesize realistic one-dimensional signals of each fault. The framework is designed to produce realistic multimodal samples with fault labels, and the dual-discriminator structure helps to enhance the quality and diversity of the synthesized data without mode collapse. The synthesized data can then be used for data augmentation to improve the accuracy of imbalanced fault diagnosis. To evaluate the performance of the generative model, we introduce a set of assessments of the quality and diversity of the synthesized data, including quantitative statistical metrics and qualitative visualization. Finally, experiments on rolling bearing datasets from Case Western Reserve University (CWRU) are carried out to verify the effectiveness of the proposed approach for imbalanced fault diagnosis. The results demonstrate that our method outperforms other widely used synthesis techniques in terms of data synthesis quality and fault diagnosis accuracy, and a timeliness analysis indicates that our method can meet the requirements of online fault diagnosis.

Intelligent fault identification for industrial automation system via multi-scale convolutional generative adversarial network with partially labeled samples

Article

Jan 2020
ISA T

Rolling bearings are widely used parts in most industrial automation systems. As a result, intelligent fault identification of rolling bearings is important to ensure the stable operation of these systems. However, a major problem in intelligent fault identification is that it needs a large number of labeled samples to obtain a well-trained model. To address this problem, the paper proposes a semi-supervised multi-scale convolutional generative adversarial network for bearing fault identification, which uses partially labeled samples and sufficient unlabeled samples for training. The network adopts a one-dimensional multi-scale convolutional neural network as the discriminator and a multi-scale deconvolutional neural network as the generator, and the model is trained through an adversarial process. Because it makes full use of unlabeled samples, the proposed semi-supervised model can detect faults in bearings with limited labeled samples. The proposed method is tested on three datasets, and the average classification accuracy reaches 100%, 99.28% and 96.58%, respectively. The results indicate that the proposed semi-supervised convolutional generative adversarial network achieves satisfactory performance in bearing fault identification when the labeled data are insufficient.

A Generative Adversarial Network-Based Intelligent Fault Diagnosis Method for Rotating Machinery Under Small Sample Size Conditions

Article

Nov 2019

Rotating machinery plays a key role in mechanical equipment, and the fault diagnosis of rotating machinery is a popular research topic. To overcome the dependency on expert knowledge regarding conventional time-frequency analysis diagnosis methods, machine learning (ML) and artificial intelligence (AI)-based methods are commonly studied. Although these methods can achieve high-accuracy diagnosis results, they are based on a large number of training samples. A generative adversarial network (GAN) is an algorithm with the capability of generating realistic samples that are similar to the real samples, and it can be applied to solve fault diagnosis problems with insufficient training data, which is called the small sample size condition in this study. However, a single-GAN model cannot achieve a good diagnostic result. To achieve adaptive feature extraction and high diagnosis accuracy, this study proposes an intelligent fault diagnosis method for rotating machinery based on GANs under small sample size conditions. The effectiveness and performance of the proposed method are validated using rolling bearing and gearbox datasets. In these datasets, only 10% and 20% of the samples are selected as the training data. Samples associated with different health conditions and various working conditions are included in the datasets. Compared with those of other diagnosis methods, the high-accuracy and low-volatility diagnosis results indicate that the proposed method can stably distinguish fault modes under different working conditions in an adaptive way, even though few training samples are available.

Generalization of Deep Neural Networks for Imbalanced Fault Classification of Machinery Using Generative Adversarial Networks

Article

Jun 2019

Mechanical fault datasets are always highly imbalanced, with abundant common mechanical fault samples but a paucity of samples from rare fault conditions. To overcome this weakness, simulation of rare fault signals is proposed in this paper. Specifically, frequency spectra are employed as model signals, and a Wasserstein generative adversarial network (WGAN) is then implemented to generate simulated signals based on a labeled dataset. Finally, the real and artificial signals are combined to train stacked autoencoders (SAE) to detect mechanical health conditions. To validate the effectiveness of the proposed WGAN-SAE method, two specially designed experiments are carried out and some traditional methods are adopted for comparison. The diagnosis results show that the proposed method can deal with the imbalanced fault classification problem much more effectively. The improved performance is mainly due to the artificial fault signals generated by the WGAN to balance the dataset, whereby the signals that are lacking in the training dataset are effectively augmented. Furthermore, the learned features in each layer of the generator network are also analyzed via visualization, which may help us understand the working process of the WGAN.

A hybrid fine-tuned VMD and CNN scheme for untrained compound fault diagnosis of rotating machinery with unequal-severity faults

Article

Oct 2020
EXPERT SYST APPL

In compound fault diagnosis of rotating machinery, when two failures with unequal severity occur in distinct parts of the system, detecting the minor fault is a complicated and challenging task. In this case, the minor fault is overshadowed by the more severe one, and the characteristics of the compound fault tend toward those of the more severe one. Generally, the methods proposed in the literature treat a compound failure as an individual fault type, unrelated to the corresponding single faults, located either at different positions of a sensitive component or in two separate parts, such as the bearing and gear, with approximately the same fault severity. Considering these issues, this study proposes a novel end-to-end fault diagnosis method based on fine-tuned VMD and a convolutional neural network (CNN). The main idea is that the CNN is trained only on a healthy and single fault dataset, without the use of compound fault data in training. In the test stage of the CNN model, the intelligent method raises an alarm for an untrained compound fault state if the probabilities output by the CNN satisfy a set of probabilistic conditions. The performance of the fine-tuned VMD and the proposed hybrid method is evaluated by decomposing a simulated vibration signal and by analyzing a gearbox system with a compound fault scenario in which one fault is minor and the other severe. The results obtained show the high accuracy of the proposed method in compound fault diagnosis and in the feature extraction and classification of a minor fault in the presence of a more severe one.

Machinery fault diagnosis with imbalanced data using deep generative adversarial networks

Article

Dec 2019
MEASUREMENT

Despite the recent advances in intelligent data-driven fault diagnosis methods for rotating machines, most studies assume balanced training data across the different machine health conditions. However, signals from faulty machine states are usually difficult and expensive to collect, resulting in imbalanced training datasets in most cases. This significantly degrades the effectiveness of existing data-driven approaches. This paper proposes a deep learning-based fault diagnosis method that addresses the imbalanced data problem by explicitly creating additional training data. Generative adversarial networks are first used to learn the mapping between the noise distribution and the distribution of real machinery temporal vibration data, and additional realistic fake samples can then be generated to balance and further expand the available dataset. Experiments on two rotating machinery datasets validate that data-driven methods can benefit significantly from the data augmentation, and that the proposed method offers a promising tool for fault diagnosis with imbalanced training data.

FEM Simulation-Based Generative Adversarial Networks to Detect Bearing Faults

Article

Jan 2020

Complete fault samples are essential to activate artificial intelligence (AI) models. A novel fault detection scheme is proposed to build a bridge between AI and real-world running mechanical systems. Firstly, finite element method (FEM) simulation is used to simulate samples with different faults to overcome the shortcoming of missing fault samples. Secondly, to enlarge the datasets, new samples similar to the simulation and measurement fault samples are generated by generative adversarial networks (GANs) and further combined with the original simulation and measurement samples to obtain synthetic samples. Finally, the synthetic samples and the unknown fault samples serve as the training and test samples, respectively, for the classifiers of the AI models, and the unknown fault types are then determined. A public bearing dataset has been used to verify the effectiveness of the proposed scheme. It is expected that the proposed scheme can be extended to complex mechanical systems.

Support tensor machine with dynamic penalty factors and its application to the fault diagnosis of rotating machinery with unbalanced data

Article

Oct 2019

Fault diagnosis methods for rotating machinery based on machine learning, such as the support vector machine (SVM) and convolutional neural networks (CNN), have been developed in recent years. SVM can only be used for classification in a vector space, where the feature data extracted from raw signals are input in vector form, so SVM is not applicable when the input feature data are high-order tensors, which can contain rich feature information about rotating machinery. Moreover, CNN requires a large amount of data, but it is hard to obtain large numbers of fault samples of rotating machinery under different conditions. Recently, a kind of tensor classifier called the support tensor machine (STM) has been shown to address the problems of the above methods. However, when the input samples of STM are unbalanced, the hyper-plane obtained by training STM may not be the optimal hyper-plane, which may reduce the overall classification rate. Therefore, in this paper, a novel tensor classifier called the support tensor machine with dynamic penalty factors (DC-STM) is proposed and applied to the fault diagnosis of rotating machinery. In this method, for the linearly separable case, a linear support tensor model with dynamic penalty factors (DC-LSTM) is proposed, which does not ignore the impact on the structural risk of the rare support vectors of a class with fewer training samples. Subsequently, for the nonlinearly separable case, a tensor kernel function is introduced into DC-LSTM, and a nonlinear support tensor model with dynamic penalty factors (DC-NSTM) is proposed. To verify the performance of DC-STM in unbalanced data classification, it is applied to fault classification of rotating machinery with unbalanced data. The experimental results show that the proposed method can achieve better classification results when the training samples of rotating machinery are unbalanced.

Intelligent Fault Diagnosis Method Based on Full 1D Convolutional Generative Adversarial Network

Article

Aug 2019

Data-driven fault diagnosis is essential for the reliability and safety of industrial equipment. However, the lack of real labeled fault data makes machine learning based diagnosis methods difficult to carry out. To solve this problem, this paper proposes a new fault diagnosis framework called multi-label 1D generation adversarial network (ML1D-GAN). In our method, an Auxiliary Classifier GAN (AC-GAN) is first utilized for real damage data generation. Then the generated and real damage data are both used to train the fault classifier. Experimental results reveal that the generated data are applicable, and that ML1D-GAN improves the diagnostic accuracy for real bearing faults from 95% to 98% when trained with the generated data. The scalability of the learning model is also demonstrated in the experiment.

Deep Learning Fault Diagnosis Method Based on Global Optimization GAN for Unbalanced Data

Article

Jul 2019
KNOWL-BASED SYST

Deep learning can be applied to fault diagnosis because of its powerful feature representation capabilities. When the available fault samples of a certain class are very limited, the dataset is inevitably unbalanced. The fault features extracted from unbalanced data via deep learning are inaccurate, which can lead to a high misclassification rate. To solve this problem, a new generator and discriminator for the Generative Adversarial Network (GAN) are designed in this paper to generate more discriminative fault samples using a global optimization scheme. The generator is designed to generate the fault features extracted from a few fault samples via an Auto Encoder (AE) instead of fault data samples. The training of the generator is guided by the fault features and the fault diagnosis error instead of the statistical coincidence used in a traditional GAN. The discriminator is designed to filter out unqualified generated samples, in the sense that qualified samples are helpful for more accurate fault diagnosis. Experimental results on rolling bearings verify the effectiveness of the proposed algorithm.
