Deep adversarial transfer neural network for fault diagnosis of wind turbine gearbox

March 2023
International Journal of Green Energy 20(01)

DOI:10.1080/15435075.2023.2194375

Authors:

Liu Yongqian

North China Electric Power University

International Journal of Green Energy

ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/ljge20

Deep adversarial transfer neural network for fault

diagnosis of wind turbine gearbox

Yuanchi Ma, Yongqian Liu, Zhiling Yang, Ming Cheng & Hang Meng

To cite this article: Yuanchi Ma, Yongqian Liu, Zhiling Yang, Ming Cheng & Hang Meng (2023): Deep adversarial transfer neural network for fault diagnosis of wind turbine gearbox, International Journal of Green Energy, DOI: 10.1080/15435075.2023.2194375

To link to this article: https://doi.org/10.1080/15435075.2023.2194375

Published online: 31 Mar 2023.

Deep adversarial transfer neural network for fault diagnosis of wind turbine gearbox

Yuanchi Ma, Yongqian Liu, Zhiling Yang, Ming Cheng, and Hang Meng

State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources, North China Electric Power University, Beijing, China

ABSTRACT

Labeling fault data is time-consuming and expensive, so the monitoring data obtained from wind farms are rarely labeled accurately. This paper puts forward a deep adversarial transfer neural network for fault diagnosis of the wind turbine gearbox. The method uses an auxiliary data set and, with the help of the domain adversarial method, addresses the large difference in data distribution so that the features learned from the auxiliary data set can be transferred to the data from wind turbines. A fault diagnosis model is established under unsupervised conditions, which to a certain extent reduces the dependence of the deep learning model on labeled data obtained from wind turbines. The effectiveness of the proposed method was verified using vibration data from the bearing failure tests at Case Western Reserve University and measured vibration data from a wind turbine gearbox. The results show that the method is effective in realizing the cross-domain transfer of the fault diagnosis model between similar domains and provides a new direction for constructing data-driven fault diagnosis models.

ARTICLE HISTORY

Received 16 September 2022

Accepted 4 January 2023

KEYWORDS

Wind turbine gearbox; fault diagnosis; transfer learning; domain adversary; deep learning

1. Introduction

Operating in harsh natural environments for long periods, wind turbines have a higher fault rate than conventional generator systems (Pang et al. 2020; Pérez-Pérez et al. 2022). The gearbox is a crucial component of the wind turbine, connecting the drive system to the generation system (Nejad, Odgaard, and Moan 2018). Gearbox faults such as tooth breakage or tooth surface wear are therefore common under complex loads. Compared with other components of the wind turbine, the gearbox does not have the highest fault rate, but a gearbox that cannot work normally makes the turbine very inconvenient to maintain. As a result, the downtime and economic loss caused by the gearbox rank first among all parts of the wind turbine (Aafif et al. 2022; Dabrowski and Natarajan 2017).

The wind turbine gearbox is a complex piece of rotating machinery with high reliability (Feng et al. 2013). However, conventional fault diagnosis technologies are difficult to adapt to this kind of complex system with long life and high reliability (Tang et al. 2022; Zhu et al. 2022). In general, a gearbox fault is recorded only once or twice over several years, and collecting the monitoring data takes tens of thousands of hours. It is therefore difficult or even impossible to establish a gearbox fault diagnosis system based on supervised learning methods, because field monitoring data contain so little failure information.

The failure or fault of a wind turbine gearbox is caused by performance degradation and belongs to the class of degenerative failures: performance degrades gradually and is judged a failure or fault once the degradation reaches a certain extent. If wind turbines were manufactured and used under identical conditions, their failure levels, degradation trajectories, and failure states would be the same. In reality, however, even turbines of the same batch in the same wind farm do not work under the same conditions and environment. Owing to different terrain and the wakes of other turbines, the wind speed, wind direction, and turbulence intensity differ between turbine locations, sometimes widely. The loads borne by the gearboxes are therefore quite different. The degradation process of the gearbox is constantly disturbed by various fluctuations, which blurs the judgment of failure and fault (Dhiman et al. 2021; Rahimilarki et al. 2022) and makes it difficult to diagnose the failure and fault state with a clear criterion.

Fault diagnosis of wind turbines is a challenging problem. Domestic and overseas scholars have carried out a great deal of work in relevant directions and have made meaningful progress. By diagnosis method, the fault diagnosis methods for the wind turbine gearbox are divided into physical models and data-driven models. By signal type, they are based on vibration signals, acoustic signals, electrical signals, temperature, and oil composition. Vibration analysis is the most commonly applied condition monitoring technology for rotating machinery and is the most effective method for fault diagnosis of wind turbine drive trains (Isham et al. 2019). Time domain analysis, frequency domain analysis, and time-frequency domain analysis are the main methods of traditional vibration analysis. R. Uma Maheswari et al. (Maheswari and Umamaheswari 2017) reviewed feature extraction and fault classification for non-linear and non-stationary signals in variable-speed drives, such as the wind turbine drive chain, with the aim of improving fault diagnosis accuracy. However, given the limitations of existing methods in practice, fault identification for the wind turbine gearbox still depends on expert experience for the final judgment. Such judgment is subjective and difficult to describe in a formalized way, so these methods do not generalize well to wind farms with many turbines of different models and operating conditions.

CONTACT Yongqian Liu, yqliu@ncepu.edu.cn, State Key Laboratory of Alternate Electrical Power System with Renewable Energy Sources, North China Electric Power University, Beijing 102206, China

At present, data-driven fault diagnosis and health evaluation of wind turbines have become a research hotspot in this field. Ruonan Liu (Liu et al. 2015) studied and summarized different artificial intelligence techniques, such as k-nearest neighbor, naive Bayes, support vector machine, and artificial neural network, for fault diagnosis of rotating machinery. Although conventional machine learning is widely studied in fault diagnosis, its training process requires a large amount of fault data. Fault data for the large components of wind turbines are scarce, especially data with fault labels, and this lack of fault samples has impeded the application of supervised learning in condition monitoring. In addition, as the wind turbine gearbox ages, its performance gradually decreases, and the distribution of the signals that represent its condition changes accordingly. If the diagnostic model cannot adapt to this change, its generalization ability is seriously affected.

In view of this limitation, unsupervised learning is advantageous for fault identification of the wind turbine gearbox. Current research on unsupervised learning in pattern recognition mainly involves denoising autoencoders and clustering algorithms. Guoqian Jiang et al. (Jiang et al. 2017) proposed a feature representation learning approach based on stacked multilevel-denoising autoencoders (SMLDAEs), which learns more general fault features simultaneously at different scales from the complex frequency spectra of raw vibration data and improves the accuracy of gearbox fault identification. Cameron Sobie (Sobie, Freitas, and Nicolai 2018) addressed the challenges of diagnosing roller bearings with race faults by generating training data from high-resolution simulations of roller bearing dynamics. K.J. Vamvoudakis-Stefanou (Vamvoudakis-Stefanou, Sakellariou, and Fassois 2018) considered vibration-based damage detection for a population of nominally identical structures via unsupervised statistical time series methods and reported a significant improvement in detection performance. Zhuang Li (Li 2016) optimized a fuzzy kernel clustering neural network with the particle swarm algorithm; the unlabeled input data are classified in a high-dimensional mapping space, gearbox fault features with high universality are identified, and different operational states of the gearbox are distinguished successfully. However, the effect of clustering-based unsupervised learning depends on the degree of similarity between data samples, so it is difficult to distinguish specific fault types, and unsupervised learning remains difficult to apply in actual fault diagnosis. In addition, current unsupervised methods rely on known fault samples to train the network, so the problem of the lack of labeled data is not solved.

Domain adaptation in transfer learning provides a new way to transfer a task between different domains (Pan and Yang 2010; Zhuang et al. 2015): a task learned from data in the source environment can be transferred to the target data. Inspired by this idea, it is worth introducing bearing data with fault labels to guide feature identification and fault classification in wind turbine gearbox fault diagnosis, so as to avoid the increased operating cost of collecting a large number of gearbox fault samples. At present, there is little research on transfer learning in the fault diagnosis field. Chao Chen et al. (Chen, Shen, and Yan 2017) developed an enhanced LSSVM transfer learning strategy based on supplemental data and applied it to bearing fault diagnosis when the data volume is insufficient. For motor diagnosis under variable speeds and variable loads, Fei Shen (Shen, Chen, and Yan 2017) proposed a diagnostic method based on singular value decomposition of the autocorrelation matrix, which combines feature extraction with a transfer learning classifier. Zgraggen et al. explored the main challenges of domain adaptation for fault detection based on wind turbine SCADA data, focusing on fault detection algorithms for newly installed turbines or turbines with little historical data under diverse operating conditions (Zgraggen et al. 2021). Jamil et al. proposed an instance-based deep transfer learning method that updates the weights of the source and target machine training samples separately; their results show that the method ignores negative transfer and achieves higher accuracy than standard deep learning and deep transfer learning methods (Jamil et al. 2022). With the popularity of deep learning, more and more researchers apply transfer learning with deep neural networks in this area. Compared with traditional non-deep transfer learning, deep transfer learning directly improves the learning effect on different tasks (Long et al. 2015, 2017; Tzeng et al. 2014; Yosinski et al. 2014).

In this article, inspired by Wasserstein GAN (Arjovsky, Chintala, and Bottou 2017; Gulrajani et al. 2017) and domain adversarial neural network approaches (Ganin et al. 2016; Shen et al. 2018; Tzeng et al. 2017), we propose a deep adversarial transfer neural network for fault diagnosis of the wind turbine gearbox. The method uses an auxiliary data set, transfers the features learned from the auxiliary data to the actual monitoring data, and establishes the fault diagnosis model under unsupervised conditions, which to some extent reduces the dependence of the deep neural network model on labeled monitoring data from wind turbines. The paper consists of four sections. The first section introduces the background and development status of fault diagnosis technology for the wind turbine drive chain. The second section introduces the modeling process of the deep adversarial transfer neural network. The third section presents the effect of the method on the fault diagnosis task of the wind turbine gearbox. The conclusion is drawn in the final section.

2. Deep adversarial neural network model

2.1. Formal description of the fault transfer diagnosis problem

To address the above problems, this paper proposes a transfer diagnosis method for wind turbine gearbox faults. The method uses labeled test data from the laboratory and unlabeled field data from the wind farm to carry out feature transfer, and eventually implements fault diagnosis of the wind turbine gearbox in an unsupervised learning mode. This task comes down to the problem of domain adaptation in the transfer learning field.

To be consistent with the notation of transfer learning, we use the subscripts $s$ and $t$ to refer to the source domain and target domain respectively. The auxiliary data set is denoted as the source domain $D_s$, and the monitoring data set of the wind turbine as the target domain $D_t$. The source domain contains the auxiliary data set obtained under controlled laboratory conditions; the target domain is the data set of the actual wind turbine, which usually contains little or no annotated data. $x_s$ and $x_t$ denote the data samples in the source and target domains, and $X_s$ and $X_t$ denote the corresponding data sample sets. $y_s$ and $y_t$ denote the actual categories of the source domain and target domain samples, and $Y_s$ and $Y_t$ denote the category spaces of the source domain and the target domain.

With this notation, the transfer diagnosis problem can be described formally as follows. Given an annotated source domain $X_s = \{(x_s^{(i)}, y_s^{(i)})\}_{i=1}^{n}$ and an unannotated target domain $X_t = \{x_t^{(j)}\}_{j=1}^{m}$, the two domains have the same feature space, i.e. $\mathcal{X}_s = \mathcal{X}_t$, and the same category space, i.e. $Y_s = Y_t$. However, the two domains have different marginal distributions, i.e. $P_s(x_s) \neq P_t(x_t)$, and their conditional probability distributions also differ, i.e. $P_s(y_s \mid x_s) \neq P_t(y_t \mid x_t)$. Because the target category $y_t$ cannot be observed in advance, supervised learning cannot be carried out in the target domain. In adversarial transfer learning, the annotated source domain is first used to train a source mapping $f_s: \mathbb{R}^n \to \mathbb{R}^d$ and a classifier $f_c: \mathbb{R}^d \to \mathbb{R}$ that classify the source domain. The unannotated target data set $X_t$ is then used to learn the target feature mapping $f_t: \mathbb{R}^n \to \mathbb{R}^d$ so as to minimize the distance between the feature distributions of the source and target domains, i.e. between $f_s(x_s)$ and $f_t(x_t)$. Finally, the target domain features are fed into the trained classifier $f_c$ to predict the target labels $y_t \in Y_t$. Unsupervised domain adaptation is achieved in this way.

2.2. Adversarial transfer model

The challenge of unsupervised domain adaptation comes from the different data distributions of the source domain and target domain. The literature [26] drew on Generative Adversarial Networks (GANs) to propose Adversarial Discriminative Domain Adaptation (ADDA), a domain adversarial framework that applies the GAN technique. On this basis, we propose a new approach to address the distribution difference between the source and target domains: minimizing the Wasserstein distance between their feature distributions. This brings the distribution of the features produced by the target domain feature extractor close to that produced by the source domain feature extractor, so that the labels of the target domain can be predicted with the source domain classifier.

2.2.1. Wasserstein distance

Before turning to the domain adversarial model, it is necessary to take a brief look at the Wasserstein distance. The Wasserstein distance measures the distance between two distributions and is defined as:

$$W(P_1, P_2) = \inf_{\gamma \in \Pi(P_1, P_2)} \mathbb{E}_{(x,y)\sim\gamma}\left[\lVert x - y \rVert\right] \tag{1}$$

In the formula, $\Pi(P_1, P_2)$ is the set of all possible joint probability distributions that combine $P_1$ and $P_2$. For each possible joint distribution $\gamma$, a pair $(x, y)$ can be sampled and the distance $\lVert x - y \rVert$ computed, so the expected distance between samples under $\gamma$ can be calculated. The lower bound of this expected value over all possible joint distributions is the Wasserstein distance.
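For one-dimensional samples of equal size, the infimum in Equation (1) has a simple closed form: sort both samples and pair them in order. The sketch below uses that special case only to make the definition concrete. The function name `wasserstein_1d` and the sample values are illustrative assumptions and are not part of the proposed method, which operates on learned feature distributions.

```python
def wasserstein_1d(xs, ys):
    """Empirical 1-Wasserstein distance between two equal-size 1-D samples.

    Assumption: both samples carry uniform weights and have the same size,
    in which case the optimal coupling pairs the sorted values in order.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs_sorted, ys_sorted)) / len(xs)


sample_1 = [0.0, 1.0, 2.0]
sample_2 = [3.0, 5.0, 4.0]  # sample_1 shifted by 3, listed in a different order
print(wasserstein_1d(sample_1, sample_2))  # 3.0
```

Every point has to travel a distance of 3, so the expected transport cost under the best coupling is 3.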

We can visualize the probability distributions $P_1$ and $P_2$ as two piles of soil. $\mathbb{E}_{(x,y)\sim\gamma}\left[\lVert x - y \rVert\right]$ can then be seen as the cost of moving pile $P_1$ to pile $P_2$ along the path $\gamma$, and the Wasserstein distance is the minimum cost under the optimal path. For this reason, the Wasserstein distance is also called the Earth-Mover distance. Using the Wasserstein distance as the loss function can avoid the vanishing gradient problem. The Kantorovich-Rubinstein duality principle is applied in Wasserstein GAN [25], and the Wasserstein distance is approximated as the solution of the following optimization problem:

$$W(P_1, P_2) \approx \max_{f \in D} \; \mathbb{E}_{x \sim P_1}\left[f(x)\right] - \mathbb{E}_{x \sim P_2}\left[f(x)\right] \tag{2}$$

In the formula, $D$ is the set of functions that satisfy the 1-Lipschitz constraint, i.e. the set of all functions for which $\lVert f(x_1) - f(x_2) \rVert \le \lVert x_1 - x_2 \rVert$.
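The dual form in Equation (2) can be checked by hand on a small example: every 1-Lipschitz candidate gives a value no larger than the Wasserstein distance, and the best candidate attains it. The sketch below evaluates a few hand-picked scalar functions on two samples that differ by a shift of 3. The candidate functions, their names, and the sample values are illustrative assumptions; in the proposed method this role is played by a trained discriminator network rather than a fixed list.

```python
def dual_objective(f, sample_1, sample_2):
    """Empirical value of E_{x~P1}[f(x)] - E_{x~P2}[f(x)] for a candidate f."""
    mean_1 = sum(f(x) for x in sample_1) / len(sample_1)
    mean_2 = sum(f(x) for x in sample_2) / len(sample_2)
    return mean_1 - mean_2


# Two samples that differ by a shift of 3, so their Wasserstein distance is 3.
sample_1 = [3.0, 4.0, 5.0]
sample_2 = [0.0, 1.0, 2.0]

# Hand-picked candidates; each has slope at most 1 in absolute value.
candidates = {
    "identity": lambda x: x,
    "half_slope": lambda x: 0.5 * x,
    "negated": lambda x: -x,
    "clipped": lambda x: max(-1.0, min(1.0, x)),
}

values = {name: dual_objective(f, sample_1, sample_2)
          for name, f in candidates.items()}
best_name = max(values, key=values.get)
print(best_name, values[best_name])  # identity 3.0
```

None of the candidates exceeds 3, and the identity function reaches it, which is the behavior the maximization over 1-Lipschitz functions relies on.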

2.2.2. Adversarial transfer loss

According to the formal description of the fault transfer diagnosis problem, the adversarial transfer model has two objectives: one is to minimize the Wasserstein distance between the feature distributions of the source domain and target domain, and the other is to minimize the classification error of the source domain classifier. The loss function of the adversarial transfer model is therefore expressed as Equation (3).

$$L = L_c(x_s, y_s) + \lambda L_d(x_s, x_t) \tag{3}$$

$L$ represents the overall loss of the domain adversarial model; $L_c(x_s, y_s)$ represents the classification loss of the source domain; $L_d(x_s, x_t)$ represents the distribution discrepancy loss between the source and target domain features; $\lambda$ is the parameter that weights the two parts.

First, the model of $L_c(x_s, y_s)$ is established. $L_c$ needs to reflect the classification loss on the annotated data set, which is fully consistent with the conventional model, so the cross-entropy loss function can be adopted, as shown in Equation (4).

$$L_c(x_s, y_s) = -\frac{1}{m_1}\sum_{i=1}^{m_1}\sum_{k=1}^{C} 1\left(y_s^{(i)} = k\right) \cdot \log f_c\left(f_s\left(x_s^{(i)}\right)\right)_k \tag{4}$$

In the formula, $1\left(y_s^{(i)} = k\right)$ is the indicator function, $f_s\left(x_s^{(i)}\right)$ represents the feature of the source domain, $f_c\left(f_s\left(x_s^{(i)}\right)\right)_k$ corresponds to the $k$th dimension of the output category distribution $f_c\left(f_s\left(x_s^{(i)}\right)\right)$, $m_1$ represents the sample size of the source domain, and $C$ is the number of categories.
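Equation (4) can be written out directly: the indicator keeps only the log-probability of the true class for each sample. The following minimal sketch assumes the classifier output is already a list of class probabilities per sample; the function name `cross_entropy` and the example numbers are illustrative assumptions.

```python
import math


def cross_entropy(probabilities, labels):
    """Mean cross-entropy over a batch, following the double sum of Eq. (4).

    probabilities[i][k] plays the role of f_c(f_s(x_s^(i)))_k and labels[i]
    is the true class index y_s^(i).
    """
    total = 0.0
    for probs, label in zip(probabilities, labels):
        for k, p_k in enumerate(probs):
            if label == k:  # indicator 1(y = k): only the true class contributes
                total += math.log(p_k)
    return -total / len(labels)


# Two source samples with binary labels (0 = normal, 1 = fault).
batch_probs = [[0.9, 0.1], [0.2, 0.8]]
batch_labels = [0, 1]
loss = cross_entropy(batch_probs, batch_labels)
print(round(loss, 4))  # 0.1643
```

The loss is zero only when the classifier assigns probability 1 to the true class of every sample.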

Next, the model of $L_d(x_s, x_t)$ is established. According to Equation (2), the Wasserstein distance calculation needs to satisfy the 1-Lipschitz constraint. If the function $f_d$ with parameters $\theta_d$ satisfies the 1-Lipschitz constraint, then the Wasserstein distance between the source domain feature $f_s(x_s)$ and the target domain feature $f_t(x_t)$ can be expressed as:

$$L_{wd}(x_s, x_t) = \frac{1}{m_1}\sum_{x_s \in X_s} f_d\left(f_s(x_s)\right) - \frac{1}{m_2}\sum_{x_t \in X_t} f_d\left(f_t(x_t)\right) \tag{5}$$

In the formula, $f_s(x_s)$ and $f_t(x_t)$ represent the source domain feature and target domain feature respectively, the source domain feature $f_s(x_s)$ is fixed, and $m_1$ and $m_2$ represent the sample sizes of the source domain and target domain respectively.

To enforce the 1-Lipschitz constraint, the authors of Wasserstein GAN suggest clipping the weights. The method proposed here instead adds a gradient norm constraint to the objective function, which approximately satisfies the 1-Lipschitz constraint. The gradient penalty term can be expressed as:

$$L_{penalty}(\hat{h}) = \left(\left\lVert \nabla_{\hat{h}} f_d(\hat{h}) \right\rVert_2 - 1\right)^2 \tag{6}$$

In the formula, the feature representation $\hat{h}$ stands not only for the features of the source domain and target domain, but also for the features in the region between the source domain and target domain features. The specific form of $L_d(x_s, x_t)$ can therefore be described as:

$$L_d(x_s, x_t) = \max_{\theta_d}\left[L_{wd} - \gamma L_{penalty}\right] \tag{7}$$

In the formula, $\gamma$ is the balancing weight coefficient.
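The quantity inside the maximization of Equation (7) can be evaluated for a fixed discriminator as follows. This is a minimal sketch under stated assumptions: features are short lists of floats, the gradient in Equation (6) is approximated with central finite differences (a practical implementation would rely on automatic differentiation), and one interpolation coefficient is shared across the batch. All function names are hypothetical.

```python
import math


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def numerical_gradient(critic, h, step=1e-6):
    """Central finite-difference gradient of critic at the feature vector h."""
    grad = []
    for i in range(len(h)):
        up, down = list(h), list(h)
        up[i] += step
        down[i] -= step
        grad.append((critic(up) - critic(down)) / (2.0 * step))
    return grad


def gradient_penalty(critic, h_hat):
    """Eq. (6): squared deviation of the gradient norm from 1."""
    norm = math.sqrt(sum(g * g for g in numerical_gradient(critic, h_hat)))
    return (norm - 1.0) ** 2


def interpolate(h_s, h_t, eps):
    """Feature on the segment between a source and a target feature."""
    return [eps * a + (1.0 - eps) * b for a, b in zip(h_s, h_t)]


def critic_objective(critic, source_feats, target_feats, eps, gamma):
    """L_wd - gamma * L_penalty for equally sized feature batches."""
    l_wd = mean(critic(h) for h in source_feats) - mean(critic(h) for h in target_feats)
    h_hats = [interpolate(hs, ht, eps) for hs, ht in zip(source_feats, target_feats)]
    penalty = mean(gradient_penalty(critic, h) for h in h_hats)
    return l_wd - gamma * penalty


source_feats = [[1.0, 0.0], [3.0, 2.0]]
target_feats = [[0.0, 0.0], [0.0, 0.0]]
unit_critic = lambda h: 0.6 * h[0] + 0.8 * h[1]   # gradient norm 1, no penalty
steep_critic = lambda h: 1.2 * h[0] + 1.6 * h[1]  # gradient norm 2, penalized

print(round(critic_objective(unit_critic, source_feats, target_feats, 0.5, 10.0), 3))
print(round(critic_objective(steep_critic, source_feats, target_feats, 0.5, 10.0), 3))
```

The steeper discriminator doubles $L_{wd}$ but violates the unit gradient norm, so with $\gamma = 10$ its objective ends up lower than that of the unit-norm discriminator. This is how the penalty discourages the discriminator from inflating the distance estimate by scaling its output.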

So far, the domain adversarial model has been transformed into the following optimization problem:

$$\min_{\theta_s, \theta_t, \theta_c}\left\{L_c(x_s, y_s) + \lambda \max_{\theta_d}\left[L_{wd} - \gamma L_{penalty}\right]\right\} \tag{8}$$

In the formula, $\theta_s, \theta_t, \theta_c, \theta_d$ are the parameters of the functions $f_s, f_t, f_c, f_d$ respectively.

2.3. Deep adversarial transfer neural network

Inspired by Generative Adversarial Networks (GANs), the four parameterized functions $f_s, f_t, f_c, f_d$ above can each be implemented by a neural network. Figure 1 shows the forward propagation and back propagation process of the network, where $f_s, f_t, f_c, f_d$ correspond to the source domain feature extractor, target domain feature extractor, classifier, and discriminator respectively. Unlike in GANs, the generator no longer generates new samples but instead plays the role of feature extraction: it continually learns the features of the domain data until the discriminator cannot distinguish between the two domains.

The main function of the feature extractor is to map the given data to the feature distribution space. It is realized through a series of simple transformations that map the input data to the features. Compared with the other network structures, the feature extractor has more layers and a more complex layer structure, which ensures that distinguishable features can be generated from complex original data; the specific layer structure of the feature extractor depends on the form of the input data. The classifier carries out the final classification task. It consists of fully connected layers with shallow depth, usually only one or two layers. The discriminator mainly evaluates the distribution differences of the input features so as to determine whether an input feature comes from the source domain or the target domain. The structure of the discriminator is also relatively simple, usually two fully connected layers. The detailed configurations of each neural network's structure are given in the following experimental cases.

Figure 1. The forward and back propagation process of deep adversarial transfer neural network.

Note that the source and target domain feature extractors have the same network structure. The target feature extractor is trained under unsupervised conditions, which makes it much harder to train than the source domain feature extractor. However, the data distributions of the source domain and target domain have a certain similarity. The parameters trained in the source domain are therefore taken as initial values for fine-tuning [22] the parameters of the target domain feature extractor, which not only transfers the feature structure information learned from the source domain but also makes the target domain feature extractor converge faster to a reasonable result.

Since the left and right terms in (8) depend on different optimization parameters, the problem can be split into two optimization sub-problems, (9) and (10), which are solved separately. Sub-problem (9) corresponds to a supervised learning process in the source domain, and sub-problem (10) corresponds to the WGAN. The solving process of the adversarial transfer model can therefore be divided into two stages: the pre-training stage and the domain adversarial training stage. Figure 2 shows the training and inferencing procedures of the deep adversarial transfer neural network.

$$\min_{\theta_s, \theta_c} L_c(x_s, y_s) \tag{9}$$

$$\min_{\theta_t} \max_{\theta_d}\left[L_{wd} - \gamma L_{penalty}\right] \tag{10}$$

Figure 2. The training and inferencing procedures of deep adversarial transfer neural network.

Figure 3. The flow chart of the algorithm.

Based on the above modeling process, the training algorithm of the deep adversarial transfer neural network is as follows. Figure 3 shows the flow chart of the algorithm.

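Algorithm 1 Deep Adversarial Transfer Neural Network

Require: source data $X_s$ and labels $Y_s$; target data $X_t$; minibatch size $m$; discriminator training steps $n_d$; coefficients $\gamma$, $\lambda$; learning rates $\alpha_1$ for the source domain feature extractor, $\alpha_2$ for the classifier, $\alpha_3$ for the target domain feature extractor, and $\alpha_4$ for the discriminator.

1. Initialize the source domain feature extractor parameters $\theta_s$ randomly; initialize the classifier parameters $\theta_c$ randomly.
2. while $\theta_s, \theta_c$ have not converged do
3.     Sample a minibatch of pairs $\{x_s^{(i)}, y_s^{(i)}\}_{i=1}^{m}$ from $X_s$ and $Y_s$.
4.     $\theta_s \leftarrow \theta_s - \alpha_1 \nabla_{\theta_s} L_c(x_s, y_s)$
5.     $\theta_c \leftarrow \theta_c - \alpha_2 \nabla_{\theta_c} L_c(x_s, y_s)$
6. end while
7. Initialize the target domain feature extractor parameters $\theta_t \leftarrow \theta_s$; initialize the discriminator parameters $\theta_d$ randomly.
8. while $\theta_t, \theta_d$ have not converged do
9.     for $t = 1, \ldots, n_d$ do
10.         Sample minibatches $\{x_s^{(i)}\}_{i=1}^{m}$ and $\{x_t^{(i)}\}_{i=1}^{m}$ from $X_s$ and $X_t$.
11.         Sample a random number $\epsilon \sim U[0, 1]$.
12.         $h_s \leftarrow f_s(x_s)$, $h_t \leftarrow f_t(x_t)$
13.         $\hat{h} \leftarrow \epsilon h_s + (1 - \epsilon) h_t$
14.         $\theta_d \leftarrow \theta_d + \alpha_4 \nabla_{\theta_d}\left[L_{wd}(x_s, x_t) - \gamma L_{penalty}(\hat{h})\right]$
15.     end for
16.     $\theta_t \leftarrow \theta_t - \alpha_3 \nabla_{\theta_t}\left[-\frac{1}{m}\sum_{i=1}^{m} f_d\left(f_t\left(x_t^{(i)}\right)\right)\right]$
17. end while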
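The adversarial stage of Algorithm 1 (steps 7 to 17) can be traced on a deliberately tiny scalar toy. The sketch below is not the authors' network: it assumes one-dimensional inputs, an already pre-trained source extractor $f_s(x) = x + b_s$, a target extractor $f_t(x) = x + b_t$, and a linear discriminator $f_d(h) = w h$, so that every gradient is available in closed form. Full batches replace minibatches, and because the gradient of a linear discriminator does not depend on $\hat{h}$, the interpolation of steps 11 to 13 is omitted. All function names and numeric values are hypothetical.

```python
def sign(v):
    return (v > 0) - (v < 0)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def critic_step(w, h_s, h_t, gamma, lr):
    """Step 14: gradient ascent on L_wd - gamma * L_penalty for f_d(h) = w * h."""
    grad_wd = mean(h_s) - mean(h_t)                 # d L_wd / d w
    grad_penalty = 2.0 * (abs(w) - 1.0) * sign(w)   # d (|w| - 1)^2 / d w
    return w + lr * (grad_wd - gamma * grad_penalty)


def extractor_step(b_t, w, lr):
    """Step 16: gradient descent on -mean(f_d(f_t(x_t))) for f_t(x) = x + b_t."""
    grad = -w                                        # d/d b_t of -mean(w * (x + b_t))
    return b_t - lr * grad


def adapt(x_s, x_t, b_s, w, n_outer, n_d, gamma, lr_t, lr_d):
    b_t = b_s                                        # step 7: theta_t <- theta_s
    h_s = [x + b_s for x in x_s]                     # source features stay fixed
    for _ in range(n_outer):
        for _ in range(n_d):                         # steps 9-15: train the critic
            h_t = [x + b_t for x in x_t]
            w = critic_step(w, h_s, h_t, gamma, lr_d)
        b_t = extractor_step(b_t, w, lr_t)           # step 16: move target features
    return b_t, w


x_s = [0.0, 1.0, 2.0]   # toy source inputs, feature mean 1
x_t = [3.0, 4.0, 5.0]   # toy target inputs, feature mean 4 before adaptation
b_t, w = adapt(x_s, x_t, b_s=0.0, w=0.5, n_outer=1, n_d=5,
               gamma=1.0, lr_t=0.1, lr_d=0.1)
print(b_t < 0.0, w < 0.0)  # True True
```

After one outer iteration the discriminator weight has turned negative, because the target features lie above the source features, and the extractor update shifts the target features downward, toward the source features. The toy makes no claim about convergence over many iterations; it only shows the direction of the two alternating updates.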

3. Case analysis of wind turbine gearbox fault diagnosis

In this paper, two kinds of data sets are used for case analysis. One is the auxiliary data set with fault labels (referred to as source domain data in this paper), and the other is the wind turbine gearbox data without fault labels (referred to as target domain data in this paper). The purpose is to realize fault diagnosis of the gearbox in the target domain under unsupervised learning by applying the proposed fault diagnosis model based on the deep adversarial transfer neural network.

The source data used in this paper are the bearing fault data from the Bearing Data Center of Case Western Reserve University in America [29]. The test bearings support the motor shaft. Single point faults were introduced to the test bearings using electro-discharge machining with fault diameters of 7 mils, 14 mils, 21 mils, 28 mils, and 40 mils. SKF bearings were used for the 7, 14, and 21 mil diameter faults, and NTN equivalent bearings were used for the 28 mil and 40 mil faults. The experiments were conducted for drive end bearings with inner and outer raceway faults, and the outer raceway faults were located at 3 o'clock, 6 o'clock, and 12 o'clock. Accelerometers were placed at the 12 o'clock position on the motor housing to collect the normal and fault vibration signals with a 16-channel DAT recorder. Digital data were collected at 12,000 samples per second, and data were also collected at 48,000 samples per second for drive end bearing faults. Speed and horsepower data were collected using the torque transducer and were recorded in real time. Since the fault category space of the bearing differs from that of the gearbox, the bearing fault types are merged into one class to meet the requirement of the model. The source domain is thus divided into two categories, normal and fault, labeled 0 and 1 respectively. The size and percentage of the various fault samples in the source domain are shown in Table 1.

The target domain data used in this study are monitored gearbox vibration data from a 1.5 MW wind turbine in north China. Considering the complexity and variability of the actual working conditions of wind turbines, 406 sets of gearbox vibration velocity signals in the radial and axial directions at the high-speed shaft and low-speed shaft ends were selected at rotational speeds of 908, 914, 929, 949, 1013, 1069, and 1498 rpm. The gearbox data set contains four states: normal, gear wearing, broken tooth, and mechanical looseness. The sampling frequency of the selected data is 5120 Hz, and each sample contains 8192 points. To correspond to the category space of the source domain, the target domain data set is also divided into two categories, normal and fault, labeled 0 and 1 respectively. The sample size and percentage of the various fault samples in the target domain are shown in Table 2.

Table 1. The sample size and percentage of various fault samples in the source domain.

| Fault location | Normal | Inner race | Ball | Outer race (3:00) | Outer race (6:00) | Outer race (12:00) |
| --- | --- | --- | --- | --- | --- | --- |
| Sample size | 1696 | 3390 | 3389 | 2298 | 2181 | 1456 |
| Percentage | 11.8% | 23.5% | 23.5% | 15.9% | 15.1% | 10.1% |
| Fault label | 0 | 1 | 1 | 1 | 1 | 1 |

Table 2. The sample size and percentage of various fault samples in the target domain.

| Fault location | Normal | Gear wearing | Broken tooth | Mechanical looseness |
|---|---|---|---|---|
| Sample size | 232 | 56 | 72 | 46 |
| Percentage | 57.1% | 13.8% | 17.7% | 11.3% |
| Fault label | 0 | 1 | 1 | 1 |
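The label merging behind Tables 1 and 2 can be sketched as follows. This is a minimal illustration that assumes only the sample counts printed in the two tables; the dictionary names and the helpers `merge_to_binary` and `fractions` are hypothetical and do not come from the paper.

```python
# Sample counts per original class, copied from Tables 1 and 2.
SOURCE_COUNTS = {
    "Normal": 1696,
    "Inner race": 3390,
    "Ball": 3389,
    "Outer race (3:00)": 2298,
    "Outer race (6:00)": 2181,
    "Outer race (12:00)": 1456,
}
TARGET_COUNTS = {
    "Normal": 232,
    "Gear wearing": 56,
    "Broken tooth": 72,
    "Mechanical looseness": 46,
}


def merge_to_binary(counts):
    """Collapse every non-normal class into label 1; keep normal as label 0."""
    merged = {0: 0, 1: 0}
    for name, n in counts.items():
        merged[0 if name == "Normal" else 1] += n
    return merged


def fractions(merged):
    """Share of each binary label, rounded to three decimals."""
    total = sum(merged.values())
    return {label: round(n / total, 3) for label, n in merged.items()}


source_binary = merge_to_binary(SOURCE_COUNTS)
target_binary = merge_to_binary(TARGET_COUNTS)
print(source_binary, fractions(source_binary))
print(target_binary, fractions(target_binary))
```

After merging, the source domain is dominated by fault samples while the target domain is dominated by normal samples, so the class proportions of the two domains differ in addition to the signal characteristics.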


Although the laboratory bearing fault simulation data and the actual wind turbine gearbox data differ in many respects, they are still similar to a certain extent. First, the bearing fault experiment of Case Western Reserve University mainly simulates surface defects of the bearing. In rotating machinery, surface defects are usually caused by fatigue spalling, which shares its mechanism with gear wearing and tooth fracturing. Second, an early surface defect on a bearing produces an impact each time a bearing contact passes over the defect, which excites the corresponding characteristic frequency. Likewise, harmonic components appear in the vibration signal when faults such as gear wearing, tooth fracturing or mechanical looseness occur in the gearbox. It is therefore reasonable to expect that these faults also excite corresponding characteristic frequencies, even though the frequencies themselves are quite different. Based on these two reasons and on empirical analysis, we consider the source domain and target domain data to be similar to a certain degree.

3.1. Data pre-processing

Through the observation of the source domain data and

target domain data, it is found that the two types of data

have great differences. The specific differences are as follows:

(1) Sampling frequency; (2) Data quality (SNR); (3) Data

size; (4) Data dimension; (5) Data distribution. To facilitate

modeling, the following data pre-processing steps are

performed:

(1) Down-sample the source domain data. Because the sampling frequencies of the two domains differ, the source domain data is down-sampled to 5120 Hz to match the sampling rate of the target data.

(2) Randomly divide training sets and testing sets. The source domain and target domain are each divided into a training set (75%) and a testing set (25%), giving four subsets: source domain training data, source domain testing data, target domain training data and target domain testing data. The source domain training data and target domain training data are used to train the model, the source domain testing data is used to test the effect of pre-training, and the target domain testing data is used to test the final diagnosis effect on the target domain.

(3) Split each record with a fixed time window to obtain vibration fragments of the same dimension. Since the source domain data is large, it is divided into fragments of 2048 points each to facilitate the analysis, giving a total of 3077 vibration acceleration fragments. The original data in the target domain is likewise divided into fragments of 2048 points each, giving a total of 406 fragments.

(4) Apply the short-time Fourier transform (STFT) to the vibration acceleration fragments. Although the vibration acceleration signals of the bearing and gearbox are time-varying, their frequency components vary little over time, so the STFT gives a good result. Moreover, the time-frequency spectrum can serve as input to more advanced neural network structures, such as the two-dimensional convolutional neural network.
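Steps (1) to (3) above can be sketched with the standard library alone. This is a rough illustration, not the authors' code: the function names, the random seed, the truncating rounding of the 75% split and the 10000-point dummy record are all assumptions. The sketch only derives the exact resampling ratio; an actual down-sampling step would also need an anti-aliasing low-pass filter, which the text does not describe.

```python
import random
from fractions import Fraction

SOURCE_RATE_HZ = 48000   # source sampling rate quoted in the text
TARGET_RATE_HZ = 5120    # target sampling rate quoted in the text
FRAGMENT_LEN = 2048      # points per fragment, step (3)


def resample_ratio(src_hz, dst_hz):
    """Exact rational factor dst/src as (up, down) for rational resampling."""
    r = Fraction(dst_hz, src_hz)
    return r.numerator, r.denominator


def split_fragments(signal, length=FRAGMENT_LEN):
    """Cut a 1-D sequence into non-overlapping windows, dropping the remainder."""
    n = len(signal) // length
    return [signal[i * length:(i + 1) * length] for i in range(n)]


def split_train_test(items, train_fraction=0.75, seed=0):
    """Shuffle a copy of items and split it into training and testing parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)   # assumed rounding: truncate
    return shuffled[:cut], shuffled[cut:]


up, down = resample_ratio(SOURCE_RATE_HZ, TARGET_RATE_HZ)
fragments = split_fragments(list(range(10000)))   # hypothetical 10000-point record
train, test = split_train_test(range(406))        # 406 items, as in the target domain
print(up, down, len(fragments), len(train), len(test))
```

The ratio 5120/48000 reduces to 8/75, so the source signal would be up-sampled by 8 and decimated by 75. The exact split sizes depend on the rounding convention and on whether the split is made before or after fragmenting, neither of which is stated in the text.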

Figure 4 and Figure 5 present the vibration acceleration fragments of the source domain and target domain, respectively, after pre-processing, drawn in the time domain and the frequency domain. Figure 6 and Figure 7 present the time-frequency spectra of the source domain and target domain, respectively.
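A time-frequency spectrum of this kind can be sketched as below. The paper does not state its STFT parameters, so the window length of 100 points, the hop of 36 points and the Hann window are assumptions, chosen only because they turn a 2048-point fragment into a 51 × 55 matrix, the network input size listed in Table 3. The function names and the 512 Hz test tone are hypothetical.

```python
import cmath
import math

FS_HZ = 5120      # sampling rate after down-sampling (from the text)
WIN = 100         # assumed window length -> WIN // 2 + 1 = 51 frequency bins
HOP = 36          # assumed hop size -> 55 frames for a 2048-point fragment


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitude(x, win=WIN, hop=HOP):
    """Return |STFT| as a list of rows, one row per frequency bin."""
    w = hann(win)
    n_bins = win // 2 + 1
    n_frames = (len(x) - win) // hop + 1
    # Precompute the DFT twiddle factors once for all frames.
    tw = [[cmath.exp(-2j * math.pi * k * n / win) for n in range(win)]
          for k in range(n_bins)]
    spec = [[0.0] * n_frames for _ in range(n_bins)]
    for t in range(n_frames):
        frame = [x[t * hop + n] * w[n] for n in range(win)]
        for k in range(n_bins):
            spec[k][t] = abs(sum(f * c for f, c in zip(frame, tw[k])))
    return spec


# Synthetic 512 Hz tone; the bin spacing is FS_HZ / WIN = 51.2 Hz.
tone = [math.sin(2 * math.pi * 512 * n / FS_HZ) for n in range(2048)]
spec = stft_magnitude(tone)
peak_bin = max(range(len(spec)), key=lambda k: spec[k][0])
print(len(spec), len(spec[0]), peak_bin * FS_HZ / WIN)
```

With these assumed settings the frequency resolution is 51.2 Hz per bin and each fragment yields 55 time frames, so the resulting matrix can be fed to a two-dimensional convolutional network as a single-channel image.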

3.2. The configuration of neural network

Since the input of the feature extractor is the time-frequency spectrum of a vibration acceleration fragment, which has the form of a two-dimensional matrix, a convolutional neural network structure is adopted for the feature extractor. The classifier and discriminator are relatively simple, so fully connected neural networks meet their requirements. The configuration of each network is shown in Table 3.

Table 3. The configuration of network structure.

| Row | Feature extractor network structure (same in source domain and target domain) | Classifier network structure | Discriminator network structure |
|---|---|---|---|
| Network input | 51×55 vibration spectrum | 1×18 feature vector | 1×18 feature vector |
| 1 | 3×3 conv, 8 | FC, 18 | FC, 18 |
| 2 | Batch Normalization, 8 | ReLU (activation function) | ReLU (activation function) |
| 3 | Max pooling /4 | FC, 2 | FC, 18 |
| 4 | ReLU (activation function) | Softmax (activation function) | ReLU (activation function) |
| 5 | 3×3 conv, 16 | | FC, 2 |
| 6 | Batch Normalization, 16 | | |
| 7 | Max pooling /4 | | |
| 8 | ReLU (activation function) | | |
| Network output | 18-dimensional feature vector | 2-dimensional feature vector | 2-dimensional feature vector |

According to Table 3, the feature extractor consists of 7 layers. The first layer (convolution layer) is composed of 8 feature maps (channels); each neuron has a receptive field of size 3 × 3, and the neurons share the 3 × 3 weight parameters. The second layer, batch normalization, operates on the 8 channels. The third layer, a pooling layer, reduces the size by a factor of 4. The fourth layer (convolution layer) is composed of 16 feature maps (channels), again with 3 × 3 receptive fields and shared 3 × 3 weight parameters. The fifth layer, batch normalization, operates on the 16 channels. The sixth layer, a pooling layer, reduces the size by a factor of 4. The seventh layer, the fully connected layer before the output, consists of 18 output neurons, which together form the output feature vector.

The classifier is composed of two layers. The first layer (fully connected layer) consists of 18 neurons with the ReLU activation function; the second layer consists of 2 neurons with the Softmax activation function and outputs the final classification result.

The discriminator is composed of three layers. As listed in Table 3, the first and second layers (fully connected layers) each consist of 18 neurons with the ReLU activation function; the third layer consists of 2 neurons and outputs the domain discrimination result.
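To give a feel for the size of these networks, the sketch below counts trainable parameters for the layers listed in Table 3. It is an illustration under stated assumptions rather than the authors' implementation: a single-channel input, a bias term in every convolution and fully connected layer, and two parameters per channel for batch normalization are all assumed, and the tuple layout and function names are hypothetical. The fully connected layer that ends the feature extractor is left out because its input size depends on padding and pooling details that the text does not give.

```python
# Layer lists transcribed from Table 3.
# ("conv", in_channels, out_channels, kernel), ("bn", channels), ("fc", in, out)
FEATURE_CONVS = [("conv", 1, 8, 3), ("bn", 8), ("conv", 8, 16, 3), ("bn", 16)]
CLASSIFIER = [("fc", 18, 18), ("fc", 18, 2)]
DISCRIMINATOR = [("fc", 18, 18), ("fc", 18, 18), ("fc", 18, 2)]


def n_params(layer):
    """Trainable parameters of one layer, with biases included (assumed)."""
    kind = layer[0]
    if kind == "conv":
        _, c_in, c_out, k = layer
        return c_in * c_out * k * k + c_out
    if kind == "bn":
        return 2 * layer[1]            # one scale and one shift per channel
    if kind == "fc":
        _, d_in, d_out = layer
        return d_in * d_out + d_out
    raise ValueError("unknown layer kind: " + kind)


def total(layers):
    """Sum of trainable parameters over a list of layers."""
    return sum(n_params(layer) for layer in layers)


print(total(FEATURE_CONVS), total(CLASSIFIER), total(DISCRIMINATOR))
```

Under these assumptions the convolutional part of the feature extractor and both heads each hold on the order of a thousand parameters or fewer, which is consistent with the description of the classifier and discriminator as relatively simple.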

3.3. Case analysis

3.3.1. Case design

In order to fully validate the applicability and advantages of the model proposed in this paper, four cases (C1 to C4) with different testing conditions are designed and described as follows:

Figure 4. The images of source domain samples in time domain and frequency domain.


Figure 5. The images of target domain samples in time domain and frequency domain.

Figure 6. The time-frequency spectrum of source domain samples.


C1: combine the source domain feature extractor and clas-

sifier, implement supervised learning on the source domain,

and validate the diagnostic effect on source domain.

C2: replace the target domain feature extractor with the

trained feature extractor on the source domain and validate

the diagnostic effect on target domain.

C3: train the network with the fault diagnosis method of

deep adversarial transfer neural network proposed in this

paper and validate the diagnostic effect on target domain.

C4: combine the target domain feature extractor and classi-

fier, implement supervised learning on the target domain, and

validate the diagnostic effect on target domain.

The four scenarios C1 to C4 form a transition from supervised learning on the source domain to supervised learning on the target domain. Scenario C1 is in essence the pre-training process. Since there are numerous labeled data on the source domain, high diagnostic performance is easily achieved in C1. Scenario C2 serves as a comparison case whose purpose is to reflect the data distribution difference between the source domain and the target domain: if the distributions of the two domains are close, the diagnostic performance of C2 will be good; otherwise it will be poor. Scenario C3 evaluates the deep adversarial transfer neural network proposed in this paper, which transfers the diagnostic information from the source domain to the target domain to achieve unsupervised learning on the target domain. Scenario C4 is supervised learning on the target domain. It serves as a reference for scenario C3, and in theory it achieves the best result on the target domain.

3.3.2. Evaluation criterion

To facilitate the evaluation of model performance, a single-number evaluation metric is needed to reflect the diagnostic performance of the model. The combination of precision and recall cannot serve as a single-number evaluation metric, because it uses two values to assess the classifier, and a multi-number evaluation metric makes it harder to compare diagnostic performance.

Accuracy is a single-number evaluation metric that is commonly used to evaluate the diagnostic performance of a classifier. However, accuracy is not suitable as the performance evaluation criterion for a fault diagnostic model, since the number of fault samples in the testing set is far smaller than the number of normal samples. If accuracy is adopted as the standard, it cannot accurately reflect the performance on the minority category (i.e. the fault category). In contrast, F1 is the harmonic mean of precision and recall, which reflects the average level of the diagnostic system on an imbalanced dataset. Therefore, the F1 value is adopted as the single-number evaluation metric in this paper. It is calculated as follows:

$$P = \frac{TP}{TP + FP} \tag{11}$$

$$R = \frac{TP}{TP + FN} \tag{12}$$

$$F_1 = \frac{2PR}{P + R} \tag{13}$$

In the formulas, TP denotes the number of positive samples predicted as positive; FN denotes the number of positive samples predicted as negative; FP denotes the number of negative samples predicted as positive; TN denotes the number of negative samples predicted as negative; P represents the precision rate; R represents the recall rate.

Figure 7. The time-frequency spectrum of target domain samples.
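Equations (11) to (13) can be checked with a few lines of code. The sketch below is illustrative only: the function name is hypothetical, and the confusion counts are hypothetical values for a 101-sample test set with 45 fault samples, chosen so that precision and recall round to the 1.00 and 0.78 reported for the fault state of case C3 in Table 4. Accuracy is computed alongside for comparison.

```python
def precision_recall_f1(tp, fp, fn):
    """Equations (11)-(13); returns 0.0 where a denominator would be zero."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1


# Hypothetical counts, fault = positive class:
# 35 faults detected, 10 faults missed, no false alarms, 56 normals kept.
tp, fp, fn, tn = 35, 0, 10, 56
p, r, f1 = precision_recall_f1(tp, fp, fn)
accuracy = (tp + tn) / (tp + fp + fn + tn)
print(p, r, f1, accuracy)
```

In this example the accuracy is about 0.90 while the fault recall is only about 0.78; the F1 value of the fault class reflects the missed faults more directly than accuracy does.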

3.3.3. Result analysis

The software environment of the case implementation is Ubuntu 16.04, Python 3.5 and PyTorch 0.3, and the hardware environment is two Intel Xeon E5-2680 v4 server processors @ 2.4 GHz, 128 GB of memory and two NVIDIA GTX 1080 graphics processors. The results of the four cases, including the F1 score, precision and recall of each state in each scenario, are given in Table 4. The result of scenario C1 shows that, when supervised training is implemented with labeled bearing data, the normal and fault states are classified perfectly, as expected.

The result of scenario C2 shows that applying the source domain feature extractor directly to the target domain degrades diagnostic performance significantly. The classifier tends to label the samples as fault in this case. On the one hand, this indicates that the difference between the distributions of the source domain and target domain data is relatively large. On the other hand, it demonstrates that the transfer performance of conventional supervised learning is poor. Therefore, more advanced techniques must be adopted to address the distribution difference between the source domain and the target domain.

The result of scenario C3 shows that the average F1 score of the deep adversarial transfer neural network model reached 90% under unsupervised learning, and the precision and recall for the fault state are 100% and 78% respectively, which means that this method has an extremely low false alarm rate and an acceptable missed alarm rate. This result demonstrates the feature extraction ability and transfer learning ability of the method. Compared with the other cases, the proposed method requires only a labeled dataset from a similar field and still achieves a satisfactory performance, which greatly reduces the time and cost of developing the fault diagnosis system and demonstrates its feasibility to some extent.

In scenario C4, the F1 value also reached 100%, which indicates that sufficient labeled fault samples can significantly improve the performance of the diagnosis system. However, labeling real gearbox fault data is time-consuming and costly, so the labeled fault data collected in real operation are not enough to drive the supervised learning of a deep neural network. For this practical reason, scenario C4 cannot be applied to real gearbox fault diagnosis problems. The comparison of C3 and C4 shows that the method proposed in this paper offers a new idea and direction for the development of data-driven fault diagnosis systems.

Table 4. Case result.

| Case | Actual state | Precision rate | Recall rate | Harmonic mean F1 | Sample size |
|---|---|---|---|---|---|
| C1 | Normal | 1.00 | 1.00 | 1.00 | 114 |
| C1 | Fault | 1.00 | 1.00 | 1.00 | 656 |
| C1 | Average/Total | 1.00 | 1.00 | 1.00 | 770 |
| C2 | Normal | 0.00 | 0.00 | 0.00 | 56 |
| C2 | Fault | 0.45 | 1.00 | 0.62 | 45 |
| C2 | Average/Total | 0.20 | 0.45 | 0.27 | 101 |
| C3 | Normal | 0.85 | 1.00 | 0.92 | 56 |
| C3 | Fault | 1.00 | 0.78 | 0.88 | 45 |
| C3 | Average/Total | 0.92 | 0.90 | 0.90 | 101 |
| C4 | Normal | 1.00 | 1.00 | 1.00 | 56 |
| C4 | Fault | 1.00 | 1.00 | 1.00 | 45 |
| C4 | Average/Total | 1.00 | 1.00 | 1.00 | 101 |

Figure 8. Visualized distribution of dimensionality-reduced features on the source and target domains: (a) conventional supervised learning method (C1 and C2); (b) deep adversarial transfer neural network (C3).
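The "Average/Total" rows of Table 4 appear to be means of the per-class scores weighted by class sample size. The sketch below checks this reading for case C2, where a fault recall of 1.00 together with a normal recall of 0.00 implies that every test sample was predicted as fault. The weighting scheme is an inference from the numbers, not something the text states, and the function names are hypothetical.

```python
def f1_score(p, r):
    """Harmonic mean of precision and recall, 0.0 if both are zero."""
    return 2 * p * r / (p + r) if p + r else 0.0


def weighted_average(values, supports):
    """Average of per-class scores weighted by class sample size."""
    return sum(v * n for v, n in zip(values, supports)) / sum(supports)


# Case C2 test set sizes from Table 4; all samples predicted as fault.
n_normal, n_fault = 56, 45
p_fault = n_fault / (n_normal + n_fault)   # 45 true faults among 101 fault calls
r_fault = 1.0
p_normal = r_normal = 0.0                  # no sample was predicted as normal

supports = [n_normal, n_fault]
avg_p = weighted_average([p_normal, p_fault], supports)
avg_r = weighted_average([r_normal, r_fault], supports)
avg_f1 = weighted_average(
    [f1_score(p_normal, r_normal), f1_score(p_fault, r_fault)], supports)
print(round(avg_p, 2), round(avg_r, 2), round(avg_f1, 2))
```

The rounded results agree with the 0.20, 0.45 and 0.27 listed for C2 in Table 4, which supports the weighted-average reading.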

3.3.4. Feature visualization

To further illustrate the advantage of the proposed method in handling data distribution differences, the t-SNE dimensionality reduction method (van der Maaten and Hinton 2008) is adopted to visualize the distribution of the features output by the feature extractor. Red and blue represent the positive and negative samples (i.e., fault and normal) of the source domain, respectively, and purple and green represent the positive and negative samples of the target domain, respectively.

As can be observed from Figure 8a, the features of the source domain are clearly separated in scenario C1. In scenario C2, by contrast, the features of the target domain cannot be completely separated, and some purple and green points remain mixed in the upper left corner, which indicates a large difference in the marginal distribution of the features. In addition, Figure 8a shows that the distance between the positive samples of the source domain and those of the target domain, and likewise between the negative samples of the two domains, is extremely large, and their overlap is quite low, which reflects a large difference in the conditional probability distribution between the source and target domains in feature space.

As can be seen from the feature distribution in Figure 8b, in scenario C3 the agreement between the distributions of the target domain data and the source domain data is clearly improved in feature space. The distances between positive and negative samples increase greatly, and the distances between samples of the same type become much smaller, which indicates that the deep adversarial transfer neural network matches not only the marginal distributions of the source domain and target domain but also their conditional probability distributions. Comparing the left and right panels of Figure 8 shows that the deep adversarial transfer neural network makes considerable progress in addressing the problem of large data distribution differences in cross-domain transfer.

3.4. Results and discussion

The analysis of the four cases shows that the deep adversarial transfer neural network diagnostic model proposed in this paper has two advantages for wind turbine gearbox fault diagnosis. One is its good performance under unsupervised learning; the other is its improvement on the problem of large data distribution differences in cross-domain transfer. It is noteworthy that the category space is divided only into a normal type and a fault type in the case analysis, whereas the proposed method is also suitable for multi-class fault diagnosis problems. In addition, the case analysis is mainly carried out in the unsupervised learning scenario, whereas the method is also applicable to semi-supervised scenarios.

Besides, by taking advantage of the similarity in structure and in some operating parameters among wind turbines of the same type, this method can transfer the diagnostic experience of one wind turbine to the condition diagnosis tasks of similar wind turbines, so that knowledge is shared and extended.

Apart from the above advantages, the method inevitably inherits the training difficulty of adversarial training, since it adopts that technology. Although the Wasserstein distance, which performs better, is applied in the proposed model to measure the difference between domains, it is still necessary to tune the hyper-parameters during training, and training takes a long time. As the theory of GANs and the associated training techniques mature, these problems are expected to be solved in the near future.
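For intuition about what the Wasserstein distance measures, the sketch below computes it for two one-dimensional samples of equal size, where it reduces to the mean absolute difference between the sorted values. This is a toy illustration with made-up numbers and a hypothetical function name; it is not the estimator used in the proposed model, whose details lie outside this section.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D samples."""
    if len(xs) != len(ys):
        raise ValueError("this shortcut needs samples of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Made-up feature values: one target sample far from the source, one close to it.
source_feature = [0.1, 0.4, 0.2, 0.3]
shifted_target = [2.1, 2.4, 2.2, 2.3]
aligned_target = [0.15, 0.35, 0.25, 0.3]

print(wasserstein_1d(source_feature, shifted_target))
print(wasserstein_1d(source_feature, aligned_target))
```

The shifted sample gives a distance of about 2, the aligned one a distance close to zero. Domain-adversarial training aims to move learned features from the first situation toward the second.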

4. Conclusion

In this paper, we put forward the method of deep adversarial

transfer neural network and apply it to the fault diagnosis of

wind turbine gearbox. The conclusions are as follows:

(1) Thanks to the powerful feature representation ability of the deep learning model, this method can discover previously ignored mechanisms, laws and knowledge of wind turbine performance degradation from the monitoring data and the auxiliary data. Better fault representations can also be extracted automatically from the vibration data, which avoids the limitations of manual feature engineering and makes fault feature extraction more general.

(2) This approach is inspired by transfer learning. It uses auxiliary data from the laboratory, transfers the features learned from the auxiliary data to the actual monitoring data and establishes the fault diagnosis model under unsupervised conditions, which to some extent reduces the dependence of the deep learning model on labeled monitoring data from the wind turbine.

(3) To some extent, the method addresses the problem of large data distribution differences in cross-domain transfer, realizes the transfer of the fault diagnosis model and of diagnostic experience between similar domains, and provides a new direction for establishing data-driven fault diagnosis models.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Funding

The work was supported by the National Key Research and Development Program of China [No. 2019YFE0104800].

References

Aafif, Y., A. Chelbi, L. Mifdal, S. Dellagi, and I. Majdouline. 2022. Optimal

preventive maintenance strategies for a wind turbine gearbox. Energy

Reports 8:803–14. doi:10.1016/j.egyr.2022.07.084.


Arjovsky, M., S. Chintala, and L. Bottou. 2017. Wasserstein GAN. arXiv.

doi:10.48550/arXiv.1701.07875.

Chen, C., F. Shen, and R. Q. Yan. 2017. Enhanced least squares support vector machine-based transfer learning strategy for bearing fault diagnosis. Chinese Journal of Scientific Instrument 38 (01):33–40. doi:10.19650/j.cnki.cjsi.2017.01.005.

Dabrowski, D., and A. Natarajan. 2017. Identification of loading condi-

tions resulting in roller slippage in gearbox bearings of large wind

turbines. Wind Energy 20 (8):1365–87. doi:10.1002/we.2098.

Dhiman, H. S., D. Deb, S. M. Muyeen, and I. Kamwa. 2021. Wind turbine

gearbox anomaly detection based on adaptive threshold and twin

support vector machines. IEEE Transactions on Energy Conversion

36 (4):3462–69. doi:10.1109/TEC.2021.3075897.

Feng, Y., Y. Qiu, C. J. Crabtree, H. Long, and P. J. Tavner. 2013.

Monitoring wind turbine gearboxes. Wind Energy 16 (5):728–40.

doi:10.1002/we.1521.

Ganin, Y., E. Ustinova, H. Ajakan, P. Germain, H. Larochelle,

F. Laviolette, M. Marchand, and V. Lempitsky. 2016. Domain-

adversarial training of neural networks. arXiv. http://arxiv.org/abs/

1505.07818 .

Gulrajani, I., F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville.

2017. Improved training of wasserstein GANs. In Advances in

neural information processing systems 30 (Nips 2017), ed. I. Guyon,

U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan,

and R. Garnett. Vol. 30. La Jolla: Neural Information Processing

Systems (nips). https://www.webofscience.com/wos/alldb/sum

mary/eb1a14a1-2b45-4785-9ddd-4c974b661fbe-4e83ebd9/rele

vance/1 .

Isham, M. F., M. S. Leong, M. H. Lim, and Z. A. Bin Ahmad. 2019.

Intelligent wind turbine gearbox diagnosis using VMDEA and ELM.

Wind Energy 22 (6):813–33. doi:10.1002/we.2323.

Jamil, F., T. Verstraeten, A. Nowé, C. Peeters, and J. Helsen. 2022. A deep

boosted transfer learning method for wind turbine gearbox fault

detection. Renewable Energy 197:331–41. doi:10.1016/j.renene.2022.

07.117.

Jiang, G., H. He, P. Xie, and Y. Tang. 2017. Stacked multilevel-denoising

autoencoders: A new representation learning approach for wind tur-

bine gearbox fault diagnosis. IEEE Transactions on Instrumentation

and Measurement 66 (9):2391–402. doi:10.1109/TIM.2017.2698738.

Li, Z. 2016. Research on methods of intelligent fault diagnosis for wind

turbine drive train based on unsupervised learning. Doctor, North

China Electric Power University. https://kns.cnki.net/kcms/detail/

detail.aspx?dbcode=CDFD&dbname=CDFDLAST2017&filename=

1016271653.nh&uniplatform=NZKPT&v=

0MMvOoFCNWmGd2Z5aFTxyAPVdUOABbPYBc9SeiSCX_

4ZLWD703ok5wDkpxYmUqYX.Liu .

Liu, R., B. Yang, E. Zio, and X. Chen. 2015. Artificial intelligence for fault

diagnosis of rotating machinery: a review. Mechanical Systems and Signal

Processing 108 (August):33–47. doi:10.1016/j.ymssp.2018.02.016.

Long, M., Y. Cao, J. Wang, and M. I. Jordan. 2015. Learning transferable

features with deep adaptation networks. In International Conference on

Machine Learning, ed. F. Bach and D. Blei, vol. 37, 97–105. San Diego:

Jmlr-Journal Machine Learning Research. https://www.webofscience.

com/wos/alldb/summary/2a9f5ee2-f030-45e6-9172-b4487f3c11ad

-4e83e6f9/relevance/1 .

Long, M., H. Zhu, J. Wang, and M. Jordan. 2017. Deep transfer learning

with joint adaptation networks. In International Conference on

Machine Learning, ed. D. Precup and Y. W. Teh, Vol. 70, San Diego:

Jmlr-Journal Machine Learning Research. https://www.webofscience.

com/wos/alldb/summary/24c008a7-291c-4548-bab9-c2e275e5f9be

-4e83dbdb/relevance/1 .

Maheswari, R. U., and R. Umamaheswari. 2017. Trends in non-stationary

signal processing techniques applied to vibration analysis of wind turbine

drive train - a contemporary survey. Mechanical Systems and Signal

Processing 85 (February 15):296–311. doi:10.1016/j.ymssp.2016.07.046.

Nejad, A. R., P. F. Odgaard, and T. Moan. 2018. Conceptual Study of

a gearbox fault detection method applied on a 5-MW spar-type floating

wind turbine. Wind Energy 21 (11):1064–75. doi:10.1002/we.2213.

Pan, S. J., and Q. Yang. 2010. A survey on transfer learning. IEEE

Transactions on Knowledge and Data Engineering 22 (10):1345–59.

doi:10.1109/TKDE.2009.191.

Pang, Y., L. Jia, X. Zhang, Z. Liu, and D. Li. 2020. Design and implemen-

tation of automatic fault diagnosis system for wind turbine. Computers

& Electrical Engineering 87 (October 1):106754. doi:10.1016/j.comp

eleceng.2020.106754.

Pérez-Pérez, E. -J., F. -R. López-Estrada, V. Puig, G. Valencia-Palomo,

and I. Santos-Ruiz. 2022. Fault diagnosis in wind turbines based on

ANFIS and takagi–sugeno interval observers. Expert Systems with

Applications 206 (November 15):117698. doi:10.1016/j.eswa.2022.

117698.

Rahimilarki, R., Z. Gao, N. Jin, and A. Zhang. 2022. Convolutional neural

network fault classification based on time-series analysis for bench-

mark wind turbine machine. Renewable Energy

185 (February 1):916–31. doi:10.1016/j.renene.2021.12.056.

Shen, F., C. Chen, and R. Q. Yan. 2017. Application of SVD and transfer learning strategy on motor fault diagnosis. Journal of Vibration Engineering 30 (01):118–26. (in Chinese).

Shen, J., Y. Qu, W. Zhang, and Y. Yu. 2018. Wasserstein distance guided

representation learning for domain adaptation. Thirty-Second Aaai

Conference on Artificial Intelligence/Thirtieth Innovative Applications

of Artificial Intelligence Conference/Eighth Aaai Symposium on

Educational Advances in Artificial Intelligence, 4058–65, Palo Alto,

Assoc Advancement Artificial Intelligence. https://www.webofscience.

com/wos/alldb/summary/e2d6ebb2-a58e-4ff9-8d82-4fe3363e9e75-

4e841b3a/relevance/1 .

Sobie, C., C. Freitas, and M. Nicolai. 2018. Simulation-driven machine

learning: bearing fault classification. Mechanical Systems and Signal

Processing. 99 (15):403–19. January. doi:10.1016/j.ymssp.2017.06.025.

Tang, X., Y. Xu, X. Sun, Y. Liu, Y. Jia, F. Gu, and A. D. Ball. 2022. Intelligent

fault diagnosis of helical gearboxes with compressive sensing based

non-contact measurements. ISA Transactions. (July 21). https://www.

sciencedirect.com/science/article/pii/S0019057822003779 .

Tzeng, E., J. Hoffman, K. Saenko, and T. Darrell. 2017. Adversarial

discriminative domain adaptation. 30th Ieee Conference on Computer

Vision and Pattern Recognition (Cvpr 2017), 2962–71. New York, Ieee.

doi:10.1109/CVPR.2017.316

Tzeng, E., J. Hoffman, N. Zhang, K. Saenko, and T. Darrell. 2014. Deep

domain confusion: maximizing for domain invariance (version 1).

arXiv. doi:10.48550/arXiv.1412.3474.

Vamvoudakis-Stefanou, K. J., J. S. Sakellariou, and S. D. Fassois. 2018.

Vibration-based damage detection for a population of nominally iden-

tical structures: unsupervised multiple model (MM) statistical time

series type methods. Mechanical Systems and Signal Processing

111 (October 1):149–71. doi:10.1016/j.ymssp.2018.03.054.

van der Maaten, L., and G. Hinton. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research 9 (November):2579–605.

Yosinski, J., J. Clune, Y. Bengio, and H. Lipson. 2014. How transferable

are features in deep neural networks? In Advances in neural informa-

tion processing systems 27 (Nips 2014), ed. Z. Ghahramani, M. Welling,

C. Cortes, N. D. Lawrence, and K. Q. Weinberger. Vol. 27. La Jolla:

Neural Information Processing Systems (nips). https://www.

webofscience.com/wos/alldb/summary/b8582f5b-cc4d-430e-b974

-96ca305ff333-4e83e1da/relevance/1 .

Zgraggen, J., M. Ulmer, E. Jarlskog, G. Pizza, and L. Goren Huber. 2021.

Transfer learning approaches for wind turbine fault detection using

deep learning. PHM Society European Conference 6 (1):12. doi:10.

36001/phme.2021.v6i1.2835.

Zhu, X., R. Wang, Z. Fan, D. Xia, Z. Liu, and Z. Li. 2022. Gearbox fault

identification based on lightweight multivariate multidirectional

induction network. Measurement 193 (April 1):110977. doi:10.1016/j.

measurement.2022.110977.

Zhuang, F. Z., P. Luo, Q. He, and Z. Z. Shi. 2015. Survey on transfer

learning research. Journal of Software 26 (01):26–39. doi:10.13328/j.

cnki.jos.004631.

INTERNATIONAL JOURNAL OF GREEN ENERGY 13

Unknown-class recognition adversarial network for open set domain adaptation fault diagnosis of rotating machinery

Article

Full-text available

May 2024
J INTELL MANUF

Transfer learning methods have received abundant attention and extensively utilized in cross-domain fault diagnosis, which suppose that the label sets in the source and target domains are coincident. However, the open set domain adaptation problem which include new fault modes in the target domain is not well solved. To address the problem, an unknown-class recognition adversarial network (UCRAN) is proposed for the cross-domain fault diagnosis. Specifically, a three-dimensional discriminator is designed to conduct domain-invariant learning on the source domain, target known domain and target unknown domain. Then, an entropy minimization is introduced to determine the decision boundaries. Finally, a posteriori inference method is developed to calculate the open set recognition weight, which are used to adaptively weigh the importance between known class and unknown class. The effectiveness and practicability of the proposed UCRAN is validated by a series of experiments. The experimental results show that compared to other existing methods, the proposed UCRAN realizes better diagnosis performance in different domain transfer task.

Wind turbine generator early fault diagnosis using LSTM-based stacked denoising autoencoder network and stacking algorithm Wind turbine generator early fault diagnosis using LSTM-based stacked denoising autoencoder network and stacking algorithm

Article

Full-text available

Feb 2024
INT J GREEN ENERGY

To reduce the significant economic losses caused by the fault deterioration of wind turbine generators, it is urgent to detect and diagnose the early faults of generators. The existing condition monitoring and fault diagnosis (CMFD) methods have disadvantages of less considering data temporal characteristic, acquiring early faults with difficulty, and having lower diagnostic accuracy. To address those limitations, a novel LSDAE-stacking CMFD method of generators was proposed. Specifically, a multivariate spatio-temporal condition monitoring model (LSDAE) was established by combining the LSTM and SDAE networks, which can detect generator early anomalies through real-time monitoring the reconstruction residual. Then, based on the stacking ensemble algorithm, a multi-classification fault diagnosis model (Stacking) was constructed to identify early fault types, which can integrate advantages of different base-classifiers to achieve a better diagnostic accuracy. Case studies on three actual generator failures were employed to validate the effectiveness and accuracy of the proposed LSDAE-stacking method. The results illustrated that, compared with conventional SDAE model, the proposed LSDAE model had higher reconstruction precision and superior early-fault-warning capacities. And compared with traditional algorithms such as SVM, RF, AdaBoost, GBDT and XGBoost, the constructed Stacking model can effectively identify the fault types of generators and had higher diagnostic accuracy.

Transfer Learning Approaches for Wind Turbine Fault Detection using Deep Learning

Article

Full-text available

Jun 2021

Wind park operators start to recognize the cost-effectiveness of intelligent maintenance solutions for wind turbines based on the readily available 10-minute SCADA data. In particular, recent advances have shown that deep learning algorithms can enhance the performance and robustness of fault detection algorithms which are fed with such SCADA data. In order to deploy deep learning fault detection algorithms, a large amount of historical data is needed. In case the data is not available for a certain turbine, training the algorithms becomes challenging. The common approaches in this case are referred to as transfer learning or domain adaptation methods, which attempt to allow the transfer of knowledge between different machines. In this paper we explore the main challenges of domain adaptation for fault detection based on wind turbine SCADA data. We focus on practical use cases, stemming from the commercial need to deploy fault detection algorithms for newly installed turbines, or turbines with little historical data under diverse operating conditions. We analyze different reasons for domain shifts between turbines, which require the development of new domain adaptation approaches beyond the ones familiar for other PHM applications, and present results for several of these challenging cases.

Optimal preventive maintenance strategies for a wind turbine gearbox

Article

Nov 2022

This paper investigates two maintenance strategies for wind turbine gearboxes. The first one is frequently adopted in practice. It consists of monitoring the state of the gearbox through its temperature. As soon as the temperature reaches a predefined threshold level, the production rate is drastically reduced by slowing down the wind turbine while the gearbox cools for a certain period, after which the desired output rate is restored. As these events become more frequent with time, the wind turbine operators will decide to renew the gearbox. It is either replaced by a new identical one or submitted to an overhaul, based only on the judgement of the maintenance agents. For this first strategy, an analytical model is developed to optimize the renewal period of the gearbox, balancing the cost of production loss and cooling each time the threshold temperature is reached against the cost of renewal. The second strategy is a new one proposed in this paper. It suggests performing an imperfect preventive maintenance (PM) action each time the temperature threshold is reached, thereby reducing the failure rate of the gearbox to a value between the current one and that of a new gearbox. The imperfect preventive action is performed N times before the gearbox must be renewed. A mathematical model is also developed to simultaneously find the optimal number of PM actions to be performed before renewing the gearbox, and the optimal period for the maintenance crew to start the PM or renewal action after the instant at which the temperature threshold level is exceeded. This period is longer or shorter depending on the logistics in place to move the maintenance crew to the site and prepare for the intervention. Numerical examples are presented, a sensitivity analysis is performed, and the two strategies are compared. Optimal solutions are obtained for each strategy. The comparison also shows that either strategy can be the more economical one, depending on the reliability of the gearbox and the different costs incurred, particularly the PM and renewal-related logistics costs.

A deep boosted transfer learning method for wind turbine gearbox fault detection

Article

Jul 2022
RENEW ENERG

Deep learning methods have become popular among researchers in the field of fault detection. However, their performance depends on the availability of big datasets. To overcome this problem, researchers started applying transfer learning to achieve good performance from small available datasets, by leveraging multiple prediction models over similar machines and working conditions. However, the influence of negative transfer limits their application. Negative transfer among prediction models increases when the environment and working conditions are changing continuously. To overcome the effect of negative transfer, we propose a novel deep transfer learning method, coined deep boosted transfer learning, for wind turbine gearbox fault detection that prevents negative transfer and focuses only on relevant information from the source machine. The proposed method is an instance-based deep transfer learning method that updates the weights of the source and the target machine training samples separately. The weights of different source training samples are gradually decreased to reduce their impact on the final model. The proposed method is verified on the Case Western Reserve University bearing dataset and real field wind farm datasets. The results show that the proposed method avoids negative transfer and achieves higher accuracy compared to standard deep learning and deep transfer learning methods.
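The abstract does not give the weight-update rule, so the sketch below substitutes the classic TrAdaBoost-style update as a stand-in for instance-based reweighting. Misclassified source samples are down-weighted by a fixed factor and misclassified target samples are up-weighted according to the target error. The function name, the clamping of the error, and the use of boolean error flags are assumptions made for illustration, not the paper's method.

```python
import math

def update_instance_weights(src_w, tgt_w, src_wrong, tgt_wrong, n_rounds):
    """One TrAdaBoost-style reweighting round (assumed stand-in rule)."""
    n_src = len(src_w)
    # Weighted error rate measured on the target samples only.
    eps = sum(w for w, wrong in zip(tgt_w, tgt_wrong) if wrong) / sum(tgt_w)
    eps = min(max(eps, 1e-6), 0.499)  # keep beta_tgt strictly inside (0, 1)
    # Fixed shrink factor for source samples the current model gets wrong.
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_src) / n_rounds))
    # Error-dependent growth factor for misclassified target samples.
    beta_tgt = eps / (1.0 - eps)
    new_src = [w * beta_src if wrong else w for w, wrong in zip(src_w, src_wrong)]
    new_tgt = [w / beta_tgt if wrong else w for w, wrong in zip(tgt_w, tgt_wrong)]
    return new_src, new_tgt
```

Repeating this round after each retraining step shrinks the influence of source samples that keep disagreeing with the model, which is the general mechanism the abstract describes for limiting negative transfer.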

Wind Turbine Gearbox Anomaly Detection Based on Adaptive Threshold and Twin Support Vector Machines

Article

Dec 2021

Data-driven condition monitoring reduces downtime of wind turbines and increases reliability. Wind turbine operation and maintenance (O&M) cost is a significant factor that calls for automated fault detection systems in wind turbines. In this manuscript, the anomaly detection problem for the wind turbine gearbox is formulated based on an adaptive threshold and a twin support vector machine (TWSVM). SCADA data from wind farms located in the UK is considered, with samples from twelve months before failure and from one month before failure. Gearbox oil and bearing temperatures are used as two univariate time series for analyzing the adaptive threshold. The effectiveness of the proposed method is compared with standard classifiers such as support vector machines (SVM), k-nearest neighbors (KNN), multi-layer perceptron neural network (MLPNN), and decision tree (DT). Anomaly detection of the wind turbine gearbox using TWSVM and the adaptive threshold yields accurate performance, thus increasing reliability. The missed-failure and false-positive rates, which indicate the proposed methodology's ability to discriminate false alarms, are also investigated, and comparison with previous studies shows superior performance.

Wasserstein Distance Guided Representation Learning for Domain Adaptation

Article

Apr 2018

Domain adaptation aims at generalizing a high-performance learner on a target domain via utilizing the knowledge distilled from a source domain which has a different but related data distribution. One solution to domain adaptation is to learn domain invariant feature representations while the learned representations should also be discriminative in prediction. To learn such representations, domain adaptation frameworks usually include a domain invariant representation learning approach to measure and reduce the domain discrepancy, as well as a discriminator for classification. Inspired by Wasserstein GAN, in this paper we propose a novel approach to learn domain invariant feature representations, namely Wasserstein Distance Guided Representation Learning (WDGRL). WDGRL utilizes a neural network, denoted by the domain critic, to estimate empirical Wasserstein distance between the source and target samples and optimizes the feature extractor network to minimize the estimated Wasserstein distance in an adversarial manner. The theoretical advantages of Wasserstein distance for domain adaptation lie in its gradient property and promising generalization bound. Empirical studies on common sentiment and image classification adaptation datasets demonstrate that our proposed WDGRL outperforms the state-of-the-art domain invariant representation learning approaches.
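WDGRL estimates the Wasserstein distance with a neural domain critic because the features are high-dimensional. To make concrete the quantity being minimized, the sketch below computes the empirical Wasserstein-1 distance in the special case of one-dimensional features with equal sample sizes, where it reduces to the mean absolute difference between sorted samples. This is an illustration of the distance itself, not of the WDGRL training procedure, and the function and variable names are hypothetical.

```python
def wasserstein_1d(source, target):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples."""
    if len(source) != len(target):
        raise ValueError("this sketch assumes equal sample sizes")
    # In 1-D the optimal transport plan matches samples in sorted order.
    pairs = zip(sorted(source), sorted(target))
    return sum(abs(s - t) for s, t in pairs) / len(source)

source_feats = [0.0, 1.0, 2.0, 3.0]
target_feats = [2.5, 0.5, 1.5, 3.5]  # source shifted by 0.5, then shuffled
print(wasserstein_1d(source_feats, target_feats))  # prints 0.5
```

A feature extractor that removed the 0.5 shift between the two domains would drive this value to zero, which is the sense in which minimizing the distance encourages domain-invariant features.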

Intelligent fault diagnosis of helical gearboxes with compressive sensing based non-contact measurements

Article

Jul 2022
ISA T

Helical gearboxes play a critical role in power transmission of industrial applications. They are vulnerable to various faults due to long-term and heavy-duty operating conditions. To improve the safety and reliability of helical gearboxes, it is necessary to monitor their health conditions and diagnose various types of faults. The conventional measurements for gearbox fault diagnosis mainly include lubricant analysis, vibration, airborne acoustics, thermal images, electrical signals, etc. However, a single domain measurement may lead to unreliable fault diagnosis and the contact installation of transducers is not always accessible, especially in harsh and dangerous environments. In this article, a Compressive Sensing (CS)-based Dual-Channel Convolutional Neural Network (CNN) method was proposed to accurately and intelligently diagnose common gearbox faults based on two complementary non-contact measurements (thermal images and acoustic signals) from a mobile phone. The raw acoustic signals were analysed by the Modulation Signal Bispectrum (MSB) to highlight the coupled modulation components relating to gear faults and suppress the irrelevant components and random noise, which generates a series of two-dimensional matrices as sparse MSB magnitude images. Then, CS was used to reduce the image redundancy but retain key information owing to the high sparsity of thermal images and acoustic MSB images, which significantly accelerates the CNN training speed. The experimental results convincingly demonstrate that the proposed CS-based Dual-Channel CNN method significantly improves the diagnostic accuracy (99.39% on average) of industrial helical gearbox faults compared to the single-channel ones.

Fault diagnosis in wind turbines based on ANFIS and Takagi–Sugeno interval observers

Article

Nov 2022
EXPERT SYST APPL

Wind turbine power generation is becoming one of the most critical renewable energy sources. As wind power grows, there is a need for better monitoring and diagnostic strategies to maximize energy production and increase its security. In this paper, a fault diagnosis approach based on a data-driven technique, which represents the system behavior employing a Takagi–Sugeno (TS) model, is developed. An adaptive neuro-fuzzy inference system (ANFIS) method is used to obtain a set of polytopic-based linear representations and a set of membership functions to interpolate the linear models of the convex TS model. Then, considering the TS model, a fault diagnosis strategy based on convex state observers generates residuals to detect and isolate sensor faults. Unlike other methods, this proposal only needs to be trained with fault-free data. The proposed methodology is tested under different fault scenarios on a well-known wind turbine benchmark built upon fatigue, aerodynamics, structures, and turbulence (FAST). The results demonstrate the method’s effectiveness in detecting and isolating different sensor faults.

Gearbox fault identification based on lightweight multivariate multi-directional induction network

Article

Mar 2022
MEASUREMENT

Fault diagnosis of the wind turbine gearbox is of great significance for improving the safety of unit operation and reducing downtime. To address the trade-off between diagnostic accuracy and model complexity in a noisy environment, this paper proposes the lightweight multivariate and multi-directional induction network (LM-MDINet). The method uses dense separable blocks (DS-Blocks) to enhance deep feature extraction. At the same time, decoupling the mapping relationship between space and channel reduces the number of parameters. In addition, a multivariate and multi-directional induction (M-MDI) layer guides the network towards expressing effective fault information, enhancing its ability to learn that information. The experimental results show that the proposed method has outstanding overall performance in noisy environments compared with other methods.

Convolutional neural network fault classification based on time-series analysis for benchmark wind turbine machine

Article

Dec 2021
RENEW ENERG

Fault detection and classification are considered among the most essential techniques in modern industrial monitoring. Fault monitoring is necessary because early detection can prevent high-cost maintenance. Due to the complexity of wind turbines and the considerable amount of data available via SCADA systems, machine learning methods, and specifically deep learning approaches, seem to be powerful means to solve the problem of fault detection in wind turbines. In this article, a novel deep learning fault detection and classification method is presented, based on the time-series analysis technique and convolutional neural networks (CNN), in order to deal with some classes of faults in wind turbine machines. To validate this approach, challenging scenarios, which consist of less than 5% performance reduction (which is hard to identify) in the two actuators or four sensors of the wind turbine along with sensor noise, are investigated, and appropriate CNN structures are suggested. Finally, these algorithms are evaluated in simulation based on the data of a 4.8 MW wind turbine benchmark, and their accuracy confirms the convincing performance of the proposed methods. The proposed algorithms are applicable to both on-shore and off-shore wind turbine machines.

Design and implementation of automatic fault diagnosis system for wind turbine

Article

Oct 2020
COMPUT ELECTR ENG

Operating wind turbines in a fault state directly affects the power output efficiency of wind farms. This paper proposes a new automatic fault diagnosis method for wind turbines. A fault diagnosis system framework is constructed, and vibration data collected from wind turbines is processed and used for fault diagnosis. First, wavelet coefficients are obtained by applying a discrete wavelet transform (DWT) to vibration acceleration signals collected from wind turbines. Then, the wavelet coefficients are sequentially subjected to phase space reconstruction (PSR) and singular value decomposition (SVD) to extract the fault features. Finally, an extreme learning machine (ELM) is used to classify the faults. Experimental results show that the proposed method is more effective and accurate than other fault diagnosis methods for wind turbines, such as support vector machine (SVM) and multiscale convolutional neural network (MSCNN).
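The first two stages of this pipeline can be sketched in a few lines. The example assumes a one-level Haar wavelet and a plain time-delay embedding for the phase space reconstruction. The abstract does not state which wavelet, decomposition depth, embedding dimension, or delay were used, so those choices and the function names are illustrative only. The SVD and ELM stages are omitted.

```python
import math

def haar_dwt(signal):
    """One-level Haar DWT; returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def delay_embed(series, dim, tau):
    """Phase space reconstruction: row i is [x[i], x[i+tau], ..., x[i+(dim-1)*tau]]."""
    n_rows = len(series) - (dim - 1) * tau
    if n_rows <= 0:
        raise ValueError("series too short for this dim and tau")
    return [[series[i + k * tau] for k in range(dim)] for i in range(n_rows)]
```

The matrix returned by `delay_embed` is what an SVD step would then decompose, with the singular values serving as candidate fault features for the classifier.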
