Evolutionary Neural Network Applied to Induction
Motors Stator Fault Detection
Leite, D. F.+; Attux, R.+; Von Zuben, F.+; Costa Jr., P.‡; Gomide, F.+
+Faculty of Electrical and Computer Engineering, University of Campinas, Brazil
‡Graduate Program in Electrical Engineering, Catholic University of Minas Gerais, Brazil
danfl7@dca.fee.unicamp.br; attux@dca.fee.unicamp.br; vonzuben@dca.fee.unicamp.br;
pyramo@pucminas.br; gomide@dca.fee.unicamp.br
Abstract—This paper addresses an induction motor fault detection and diagnosis system. The system is based on the monitoring of key electrical signals combined with an evolutionary artificial neural network (EANN) model. Stator winding inter-turn short-circuits, one of the most frequent faults in induction machines, have been successfully detected by the system. The paper also discusses the automatic design of artificial neural networks. A real-encoded genetic algorithm is suggested to evolve the architectures and weights of neural networks. The genetic operators, mutation and recombination, have been evaluated to obtain a consistent and automatic learning algorithm. Performance comparisons of correct fault detection and memory requirements between the EANN and alternative nonlinear modeling techniques show that the EANN outperforms its counterparts.
I. INTRODUCTION
Electrical motors are among the most widely used equipment in industry and are often critical components of industrial processes. Therefore, their protection against failures has received considerable attention from maintenance engineers and experts.
The condition monitoring and fault diagnosis of induction motors may promote significant improvements in the availability, quality, and productivity of production lines. The key to reaching this objective is the ability to detect faults at an early stage, because functional failures may quickly develop from faulty behavior.
A major share of premature induction motor faults occurs in the stator windings. The inter-turn short-circuit is a primary fault that follows insulation breakdown. Among the main causes of insulation failure are [1]: high stator core or winding temperatures; slack core laminations, slot wedges, and joints; loose bracing of the end windings; contamination due to chemical reactions, moisture, or dirt; electrical discharges due to aging of the insulating material; and leakage in cooling systems. After a primary fault, the degradation process of the motor accelerates, and more serious, damaging failures, such as phase-to-phase and phase-to-ground short-circuits, tend to appear. Usually, these types of faults result in irreversible motor damage. However, if the turn-to-turn fault is detected by a diagnosis system at an incipient stage, the faulty phase winding may be replaced, which significantly reduces financial losses and increases operational safety. The benefits that such systems can bring to industry include motor life extension, shorter idle periods, a reduction in the rate of unnecessary disconnections, manpower scheduling at the moment of the fault, repair cost minimization, improved human safety, reduction of direct and indirect overtime payments, smaller component inventories, and minimization of losses (particularly, losses in production).
During the last two decades, soft computing has been investigated by experts in electrical machine fault diagnosis. Artificial Neural Networks (ANN) and Hybrid Systems (HS) have been successful in detecting different types of faults [2] - [6]. The major difficulty of applying ANN in predictive maintenance systems, as in the vast majority of real-world engineering applications, is the selection of the best network characteristics and parameters. A suitable selection of the ANN parameters is essential to achieve a high rate of correct detections. Examples of such characteristics and parameters are the number of hidden layers and the number of neurons in each layer. Currently they are chosen through a trial-and-error process performed by a human designer, an exhaustive task that can last days or weeks [7]. Efforts to make the ANN design process more systematic and less human-dependent are underway.
Another concern with feedforward ANN in fault detection scenarios is the use of the backpropagation (BP) algorithm for training. After the rediscovery of BP by Rumelhart et al. [8], researchers have used it intensively to train ANN. Since BP has a local convergence nature, its solution is highly dependent on the initial weights, and the algorithm may converge to a low-quality local optimum.
Several variations of BP were compared with Quasi-Newton, Non-Derivative Quasi-Newton, Gauss-Newton, and Secant optimization methods in [9]; Evolutionary Strategies in [10]; Genetic Algorithms (GA) in [11] - [13]; Ant Colony Optimization in [14]; and Particle Swarm Optimization in [15] - [16]. All these training techniques can outperform BP in certain scenarios in terms of learning efficiency, training time, ease of use, or application performance.
Considering the current limitations of ANN training techniques, and given the challenges of the induction motor fault detection and diagnosis task, this paper suggests evolving the architectures and weights of ANN using a genetic algorithm specially designed for that purpose. The evolved features include the number of hidden layers and the number of hidden neurons. In addition to the weights, the structure of layered networks affects the generalization ability of the ANN: overly complex models may overfit the training data and exhibit poor generalization, while simpler models may be insufficient to approximate the potentially nonlinear relationship expressed by the training data.
The reasons for choosing a GA as the learning method, instead of conventional nonlinear optimization methods, include the following:
- GA operates on codified parameters, which results in a search for improved candidate solutions that does not depend on the continuity of the function or the existence of derivatives;
- The search toward the best solution starts from a set of points, or individuals, spread over the search space (global search, populational strategy), which reduces the probability that the solution gets stuck in local minima;
- GA automates the trial-and-error approach to setting appropriate ANN parameters. The advantage of automatic over manual design becomes clearer as the complexity of the ANN architecture increases.
To simulate the true conditions of actual industrial practice, we consider different operating points of the motors in the experiments. Moreover, random effects of the environment are modeled by adding white noise (measurement noise) to the state variables of the machines. This noise is bounded by values within the accuracy range of transducers currently available in industry.
The paper is organized as follows. The next section presents a general view of the fault diagnosis system and discusses the EANN. Section III addresses the methodology to implement
the GA and details the genetic operators used. Section IV
compares the EANN system with alternative fault detection
and diagnosis approaches. The paper concludes with Section
V summarizing its main contributions and suggesting issues
for further investigation.
II. THE DIAGNOSIS SYSTEM - A GENERAL VIEW
Figure 1 gives an overview of the diagnosis system addressed in this paper. At startup, voltage, current, and rotor speed signals of induction motors are acquired. The data are pre-processed and vectors whose variables are related to the healthy state of the machines are obtained. The Acquisition and Data Treatment block provides the machines' state variables to the Parameters Estimation, Optimization, and Faults Simulator modules. The latter inserts vectors whose entries are related to the machines' faulty states into the database. This way, the database is composed of healthy and faulty data vectors. The faulty ones contain data related to different stator fault severities ([1%, 5%] of shorted turns) and different fault locations (phases {a, b, c}). Both healthy and faulty data are obtained under different machine operating points and different levels of noise. A fair percentage of randomly selected data vectors is used for the design of the EANN, while the remainder is used for testing. At the end, a diagnosis report for each monitored machine is generated and displayed on the microcomputer screen at any desired instant of time.
Fig. 1. Overview of the fault diagnosis system
Before we proceed, we briefly describe the modules of the
system.
- Acquisition and data treatment: When requested, this module acquires voltage, current, and rotor speed signals from induction motors operating in the field. The number of motors connected to the system is limited by the number of channels of the acquisition board. The acquired data are pre-processed: the signal offsets are removed and a magnitude correction factor is applied. Low-pass filters minimize noise and slot effects. Other quantities, such as active power, reactive power, power factor, slip, and sequence components, are computed. The processed data are then displayed on the microcomputer screen before being saved in a user-defined file.
- Parameters estimation module: Parameter estimation algorithms based on Least Squares and the Extended Kalman Filter operate in parallel, aiming to provide all parameters required by the faults simulator model.
- Optimization module: This module promotes fine adjustment of the state variables generated by the faults simulator model for a particular real motor. The Conjugate Gradient method is employed to optimize certain parameters so that the system can better track the state variables of a specific real machine. If the conditions of the real process change, this module helps the fault simulator model track the evolution.
- Faults simulator module: This module consists of the motors' models. It allows simulation of shorted turns in the stator windings, changeable loads, voltage unbalance, noise, and winding asymmetries. Refer to [17] for more detailed information.
- Database: The database consists of input-output pairs. The input variables used to generate the diagnosis report include the abc stator currents, the voltage-current displacement angles, the rotor slip, the absolute value of the negative-sequence current, and the angle of the positive-sequence current; the output variables comprise the stator shorted-turn percentages in phases a, b, and c.
- EANN-GA: This block consists of feedforward ANN with one or two hidden layers evolved via GA. The EANN have the function of mapping the input space into the fault space, M: X → Y. The GA-based learning strategy aims at evolving the ANN architectures and weights. After the learning process, the best ANN architecture and its corresponding best weight vector are chosen.
- Diagnosis report: The diagnosis report indicates the following: the ANN evolving time, the number of training and testing vectors used, the architectures and weights of the networks for each monitored motor, and performance indices. The diagnosis report for each machine is displayed on the microcomputer screen at any desired instant of time.
It is worth noting that the diagnosis system is able to simultaneously monitor i induction motors in field applications. This is done through a routing scheme. First, EANN 1 is evolved to model motor 1. Next, EANN 2 is evolved to model motor 2, and so on. After the evolution of EANN i, the monitoring mode starts. In monitoring mode, the diagnosis system only considers new data to perform fault classification. Therefore, the monitoring response time is short and the microcomputer availability is high.
III. EANN FOR FAULT DIAGNOSIS
This section describes extensions and adaptations of the
basic GA for evolving architectures and weights of neural
networks. The GA comprises the following stages:
- Genetic representation (codification) of potential solutions;
- Parameter definition, e.g., population size, relative elitism, penalty factors, mutation rate, and stopping criteria;
- Initialization of the populations with a priori knowledge about the expected behavior;
- Application of the genetic operators of mutation, recombination, and selection;
- Fitness evaluation;
- Post-processing of the fittest individual.
The flowchart in Fig. 2 shows the procedure for the evolution of ANNs. In accordance with the flowchart, the internal loop evolves Ω generations of weight individuals while the external loop evolves α generations of architecture individuals. The fittest architecture individual and its corresponding fittest weight vector are established and post-processed at the end of the evolution. The post-processed network is then presented as the solution of the optimization problem. In the figure, W and A refer to the weights and architectures populations, respectively.
Fig. 2. Evolving Flowchart
It is worth mentioning that the GA accomplishes a global search for solutions while the post-processing stage executes a local search around the best individual found so far. This motivated the development of the learning algorithm, intended to outperform alternative nonlinear modeling techniques and thus provide overall improvements in the diagnosis system.
A. Initialization and Parameterization
A schematic representation of the basic processing unit of the network considered here is illustrated in Fig. 3. In the figure, x_j refers to the j-th input of a neuron, w_j is the j-th connection weight, b is the bias level, net refers to the weighted sum of the inputs of the neuron, and ϕ(·) is a sigmoidal logistic function giving rise to the output y.
Fig. 3. Schematic representation of a neuron of the suggested neural network
From prior knowledge about the problem at hand, we wish the nonlinear function ϕ(·) of every neuron to be triggered in its unsaturated region, that is, net ≈ 0 for the sigmoid function. Otherwise, the network might be unable to differentiate input vectors at the very beginning of the evolution, which could lead to harder and slower learning. When, for example, BP is used as the learning algorithm, a common mechanism to accelerate convergence is to set small random initial weights. Analogously, in real-coded genetic programming, the allele (range of possible values) of the genes of the chromosomes representing weight vectors can be initially restricted to small values around 0, e.g. [−0.01, 0.01], guaranteeing the same effect.
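To make this initialization concrete, the following sketch (our illustration, not the authors' code; the input dimension and sampling ranges are assumptions) draws weights from [−0.01, 0.01] so that net stays near zero and the logistic sigmoid operates in its unsaturated, high-gradient region:

```python
import numpy as np

def sigmoid(net):
    # Logistic activation assumed for the paper's neurons.
    return 1.0 / (1.0 + np.exp(-net))

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=9)    # hypothetical 9-dimensional input vector
w = rng.uniform(-0.01, 0.01, size=9)  # small initial weights around zero
b = rng.uniform(-0.01, 0.01)          # small initial bias

net = w @ x + b                       # weighted sum stays near 0
y = sigmoid(net)                      # output near 0.5, far from saturation
print(f"net = {net:+.5f}, y = {y:.5f}")
```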
Some remarks about the suggested learning approach: i) We set the initial architecture and weight population sizes to 20 individuals each. We adopt the Pittsburgh approach, in which all architecture and weight vectors compete with each other to be the best solution of the problem; ii) The genetic learning algorithm is elitist, which means the fittest solution of previous generations is always preserved in subsequent generations. This ensures that the overall best individual remains in the population regardless of the application of the genetic operators; iii) The numbers of individuals in the architecture and weight populations are kept constant: although the recombination operator doubles the size of the weight-vector population, the selection operator reduces it back by half.
Notice that there are several situations in which the GA global search does not achieve good performance. For this reason, several hybrid techniques have been proposed in the literature [18] - [19] that post-process the solution found using local search techniques. In this paper, a GA-based local search has been employed, carried out by a special mutation operator presented below.
B. Phenotype Representation
The mapping of different phenotypes into genotypes is a
crucial stage in the development of GA-based models, most
specially when tackling constrained optimization problems.
For instance, mutation and recombination operators may give
rise to unfeasible solutions. In this type of problems special
care must be taken in both representation of individuals
and definition of genetic operators. Undesirable effects as,
for example, the requirement of extra manipulations of the
chromosomes, more complex objective function and prema-
ture convergence are immediate consequences of inadequate
representations.
Encoding in GA is the form in which chromosomes and genes are expressed. There are basically two types of encoding, binary and real. The former was introduced first and has been much discussed, while the latter seems to fit continuous optimization problems better and has therefore been adopted in this study. Several successful applications of real codification may be found in the literature, e.g. [13], [20] - [21].
Let P and G be a phenotypic (behavioral) space and a genotypic (informational) space, respectively [22] - [23]. Phenotypes representing neural network architectures and weight vectors are encoded into genotypes by a direct mapping M: P → G and assumed to be haploid chromosomes, as illustrated in Fig. 4. The chromosomes associated with the architectures are composed of genes A and B, which represent the numbers of neurons in the first and second hidden layers of the network, respectively. If either gene A or gene B assumes a zero value, the EANN employs a single hidden layer. The range of values, namely the allele, that each gene of the architecture chromosome may assume is [00, 99], while that of the weight chromosome is [−1, 1].
Fig. 4. Phenotype-genotype codification of architectures and weights
Note that the size S of the weight chromosome depends on the values of the genes of the architecture chromosome, as well as on the numbers of inputs I and outputs O of the network, according to S = IA + AB + BO.
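As a worked check (our illustration), the architecture [9; 21; 5; 3] reported later in Section IV yields S = 9·21 + 21·5 + 5·3 = 309 connection weights:

```python
def weight_chromosome_size(I, A, B, O):
    # S = IA + AB + BO for a network with I inputs, A and B hidden
    # neurons in the first and second hidden layers, and O outputs.
    return I * A + A * B + B * O

# Architecture [9; 21; 5; 3] from Section IV.
print(weight_chromosome_size(I=9, A=21, B=5, O=3))  # -> 309
```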
C. Recombination Operators
The definition of the most appropriate recombination op-
erator for tackling different sorts of problems is a difficult
and challenging problem whose solution is still open. Hence,
we examine commonly used genetic operators in the scenario
of fault diagnosis viz. Arithmetic, Multi-Point and Local
Intermediate crossovers. These operators can be applied in the
sexual form where only parents are involved in the creation of
offspring, and global form where up to the whole population
contributes to generate new offspring. In this study, we opt for
the former approach that is 20 parents generate 20 children in
each algorithm iteration.
1) Arithmetic Crossover: This operator is particularly well adapted to constrained numerical optimization problems with a convex feasible region Ξ_C. Let c_n, n = 1, ..., N, represent the n-th individual of a population. If two individuals c_{n1} and c_{n2} belong to Ξ_C, convex combinations of c_{n1} and c_{n2} also belong to Ξ_C. This ensures that the Arithmetic crossover produces only valid offspring.

As an example, consider a random list of parents to be crossed over, as shown in Fig. 5. Each crossover produces two children whose genes inherit a linear combination of the parents' genes. In the figure, the variable a may assume values in the range [0, 1].
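A minimal sketch of this operator follows (ours, not the authors' implementation; drawing one value of a per parent pair is an assumption consistent with the text):

```python
import numpy as np

def arithmetic_crossover(p1, p2, rng):
    # Children are convex combinations of the parents' genes, so they
    # remain inside a convex feasible region such as [-1, 1]^S.
    a = rng.uniform(0.0, 1.0)
    c1 = a * p1 + (1.0 - a) * p2
    c2 = (1.0 - a) * p1 + a * p2
    return c1, c2

rng = np.random.default_rng(1)
p1 = rng.uniform(-1.0, 1.0, size=6)   # two parent weight chromosomes
p2 = rng.uniform(-1.0, 1.0, size=6)
c1, c2 = arithmetic_crossover(p1, p2, rng)
```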
Fig. 5. Arithmetic crossover operator

2) Multi-Point Crossover: In Multi-Point crossover, children inherit sets of successive genes from randomly selected parents. p randomly selected points along the chromosomes of the chosen parents divide them into p + 1 parts.
Sequences of genes from each parent are then combined, generating the offspring. An intuitive example of a p-point crossover operator applied to randomly chosen parents is shown in Fig. 6.
Fig. 6. Multi-Point crossover operator
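The sketch below illustrates the idea (our code; the choice of p and the alternating assignment of segments are assumptions based on the description):

```python
import numpy as np

def multipoint_crossover(p1, p2, p, rng):
    # p random cut points divide each chromosome into p + 1 segments;
    # the two children take alternating segments from the two parents.
    n = len(p1)
    cuts = sorted(rng.choice(np.arange(1, n), size=p, replace=False))
    c1, c2 = p1.copy(), p2.copy()
    swap, prev = False, 0
    for cut in list(cuts) + [n]:
        if swap:
            c1[prev:cut], c2[prev:cut] = p2[prev:cut].copy(), p1[prev:cut].copy()
        swap, prev = not swap, cut
    return c1, c2

rng = np.random.default_rng(2)
p1 = rng.uniform(-1.0, 1.0, size=8)
p2 = rng.uniform(-1.0, 1.0, size=8)
c1, c2 = multipoint_crossover(p1, p2, p=2, rng=rng)
```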
3) Local Intermediate Crossover: The Local Intermediate operator is particularly useful when the learning process is expected to converge to a unique solution. In this operator, the average value of the genes of randomly selected parents is inherited by a single offspring:

c'_{n1} = (c_{n1} + c_{n2}) / 2.

Local Intermediate crossover implies that the child inherits from both parents equally in each gene. Clearly, if intermediate recombination is applied too often, the chromosomes become more and more similar. This may lead to premature convergence of the chromosomes of the population, especially when no other operator, such as mutation, is used to keep evolutionary adaptation active.
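In code, the operator is simply a gene-wise average (a minimal sketch of the equation above):

```python
import numpy as np

def local_intermediate_crossover(p1, p2):
    # Single child: each gene is the average of the parents' genes.
    return (p1 + p2) / 2.0

p1 = np.array([0.4, -0.2, 0.9])
p2 = np.array([0.0, 0.6, -0.1])
child = local_intermediate_crossover(p1, p2)  # -> [0.2, 0.2, 0.4]
```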
D. Mutation Operators
Mutation is an operator that involves only a single individual
in contrast to recombination. This operator assures there will
always be a probability of reaching any point within the search
domain. Usually, when the current solution of the problem is
far from being the best according to a fitness measure, a higher
mutation rate is employed in order to find better solutions
farther from the current one. By the other side, when the
current solution of the problem is close from being the best,
a low mutation rate is adopted. This procedure gives an idea
about how to search for solutions only in promising regions.
Mutation operators commonly do not produce offspring; the mutated individuals remain in the population for later breeding. A given individual c_n has its corresponding mutated value c'_n given by c'_n = m(c_n), where m(·) is a mutation function. A Gaussian mutation operator and a Random mutation operator have been considered here for the sake of analysis.
1) Gaussian Mutation: Gaussian mutation is an operator very often applied in real-coded evolutionary algorithms, mainly because it supports fine tuning of solutions. A given individual c_n has its corresponding mutated value c'_n given by

c'_n = c_n ± M,

where M is drawn from a normal density function N(mean, Γ) with mean = 0 and standard deviation Γ.

The Gaussian mutation has been applied gene by gene with a gene mutation probability rate between 1% and 10%, directly proportional to the fitness of a chromosome.
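A sketch of the gene-by-gene Gaussian mutation follows (our code; the fixed 5% rate, the standard deviation Γ = 0.1, and the clipping to the allele range are illustrative assumptions):

```python
import numpy as np

def gaussian_mutation(c, rate, gamma_std, bounds, rng):
    # Gene by gene: with probability `rate`, add a zero-mean Gaussian
    # perturbation with standard deviation `gamma_std` (the paper's Γ).
    mutated = c.copy()
    mask = rng.uniform(size=c.shape) < rate
    mutated[mask] += rng.normal(0.0, gamma_std, size=mask.sum())
    # Clipping keeps genes inside the allele range (our safeguard).
    return np.clip(mutated, *bounds)

rng = np.random.default_rng(3)
w = rng.uniform(-1.0, 1.0, size=309)  # a weight chromosome
w_mut = gaussian_mutation(w, rate=0.05, gamma_std=0.1, bounds=(-1.0, 1.0), rng=rng)
```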
2) Random Mutation: Random mutation is a member of the family of random search optimization methods. The main features that make this operator attractive are its computational speed and its immunity to local minima. The difficulty with random methods is that the randomness must be controlled to ensure convergence, while remaining free enough to allow complete coverage of the search space.

Random mutation creates a random solution c'_n in the vicinity of the current solution c_n using a uniform probability distribution, such that all genes of the new individual c'_n remain within [−1, 1] and [0, 99] for the weight and architecture chromosomes, respectively; in other words, the mutated genes remain feasible with respect to these bounds. The free change of the mutated genes may or may not give rise to better solutions; better solutions are kept for the next generations while worse solutions are discarded. Similarly to the Gaussian mutation, the Random mutation has been applied gene by gene with a probability rate from 1% up to 10%.
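A sketch, assuming a uniform perturbation of half-width delta around each selected gene (the neighborhood size is not specified in the paper):

```python
import numpy as np

def random_mutation(c, rate, delta, bounds, rng):
    # Gene by gene: with probability `rate`, perturb a gene uniformly
    # within a neighborhood of half-width `delta`, then clip so the
    # genes stay within the allele range ([-1, 1] or [0, 99]).
    mutated = c.copy()
    mask = rng.uniform(size=c.shape) < rate
    mutated[mask] += rng.uniform(-delta, delta, size=mask.sum())
    return np.clip(mutated, *bounds)

rng = np.random.default_rng(4)
w = rng.uniform(-1.0, 1.0, size=309)
w_mut = random_mutation(w, rate=0.05, delta=0.2, bounds=(-1.0, 1.0), rng=rng)
```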
3) Local Mutation: After a solution for the evolutionary optimization problem has been obtained, it can be post-processed by several different local search approaches. Local mutation is a simple and fast mutation approach we suggest in order to improve the solution found so far. The basic principle is to change some genes of the current solution, generating a new solution that may or may not be better than the previous one.

In fact, the local mutation we employ is the Gaussian mutation with random mutation rates varying from 0.1% to 1%, applied to a single chromosome. A fitter solution replaces the previous one, while worse solutions are discarded. The post-processing by local mutation thus promotes a local search around the best solution found by the evolutionary process. In our experiments, a better solution has usually been found.
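A hill-climbing sketch of this post-processing step (our code; the perturbation scale sigma and the toy fitness are assumptions):

```python
import numpy as np

def local_search(best, fitness, steps, sigma, rng):
    # Repeatedly apply low-rate Gaussian mutation (0.1% to 1% per gene)
    # to the single best chromosome; keep a mutant only if it is fitter.
    best_fit = fitness(best)
    for _ in range(steps):
        rate = rng.uniform(0.001, 0.01)
        mask = rng.uniform(size=best.shape) < rate
        cand = best.copy()
        cand[mask] += rng.normal(0.0, sigma, size=mask.sum())
        cand = np.clip(cand, -1.0, 1.0)
        cand_fit = fitness(cand)
        if cand_fit > best_fit:
            best, best_fit = cand, cand_fit
    return best

# Toy usage: maximize a simple quadratic fitness.
rng = np.random.default_rng(5)
w0 = rng.uniform(-1.0, 1.0, size=10)
w_star = local_search(w0, lambda w: -float(np.sum(w ** 2)), steps=200, sigma=0.05, rng=rng)
```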
E. Fitness Function
Because genetic algorithms mimic the principle of natural selection, the fitness measure is used to choose the relatively fitter individuals of the population to evolve. The higher the fitness value of an individual, the higher its survival probability [24]. To determine the fitness of the chromosomes representing weight vectors we use

F(c_n) = γ (τ_train^ξ + τ_test^ζ),

where F(c_n) is the fitness of the chromosome c_n, and ξ and ζ are control variables defined according to one's interest in emphasizing the modeling of training or testing data; moreover,

τ_train = C_train / (C_train + F_train),
τ_test = C_test / (C_test + F_test),
γ = k e^{f(S)},

where C and F refer to correct and wrong/false stator fault detections, respectively; τ_train and τ_test are related to the neural network learning and generalization abilities, respectively; f(S) is a function of the size S of the weight chromosome, and γ is then a penalty factor for large architectures.

The fitness of the chromosomes representing architectures is given by the maximum individual fitness over the whole population of weight chromosomes.
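A sketch of the fitness computation (our code; the paper does not specify f(S), k, ξ, or ζ, so the values below, including a simple decreasing f(S), are assumptions):

```python
import math

def fitness(C_train, F_train, C_test, F_test, S, xi=1.0, zeta=1.0, k=1.0):
    # F(c_n) = gamma * (tau_train^xi + tau_test^zeta), where tau is the
    # fraction of correct detections and gamma = k * exp(f(S)) penalizes
    # large architectures.
    tau_train = C_train / (C_train + F_train)
    tau_test = C_test / (C_test + F_test)
    f_S = -S / 1000.0           # assumed form: decreases with chromosome size
    gamma = k * math.exp(f_S)
    return gamma * (tau_train ** xi + tau_test ** zeta)

print(fitness(C_train=950, F_train=50, C_test=280, F_test=20, S=309))
```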
F. Selection Operator
Selection is the stage of a genetic algorithm in which individuals are chosen for later breeding. The selection operator involves randomly choosing individuals of the population to enter a mating pool. Generally, the operator is formulated in such a way that individuals with higher fitness have a greater probability of being selected for mating, while individuals with lower fitness still have a small probability of being chosen. Having some probability of choosing worse individuals is important to ensure that the search process is global and does not simply converge to the nearest local minimum.

The classical GA uses selection proportional to fitness, usually implemented with the Roulette Wheel [22]. In order to better control the selective pressure and to avoid premature convergence of the population, the Tournament selection operator [25] - [26] has been considered.
1) Tournament Selection and Elitism: The tournament selection strategy is used as a replacement for the commonly used fitness-proportional selection strategy. Empirical evidence suggests that Tournament selection often performs better than Roulette selection. Moreover, Tournament selection is one of the fastest selection methods and offers good control over the selection pressure.

The selection operator we have adopted considers the number of victories of each individual in H one-against-one matches against H randomly selected opponents from the population. The winner of a match is the individual with the better fitness compared to its direct opponent.

In our experiments, we have found interesting solutions with H = 5, such that individuals winning no fewer than 4 of these matches remain in the population. The winning individuals are inserted into a mating pool, and the process continues until the mating pool is full. The selective pressure provided by this operator is regarded as weak, since a good diversity of individuals remains for the next generations. Notice that the next generations may be composed of either parents or children.
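A sketch of this H-match tournament (our code; the strict fitness comparison and the handling of ties are assumptions):

```python
import numpy as np

def fill_mating_pool(pop, fits, pool_size, H=5, wins_needed=4, rng=None):
    # Each sampled individual plays H one-against-one matches versus
    # randomly chosen opponents; it enters the mating pool only if it
    # wins at least `wins_needed` of them. Repeats until the pool is full.
    rng = rng or np.random.default_rng()
    n, pool = len(pop), []
    while len(pool) < pool_size:
        i = int(rng.integers(n))
        opponents = rng.choice([j for j in range(n) if j != i], size=H, replace=False)
        wins = sum(fits[i] > fits[j] for j in opponents)
        if wins >= wins_needed:
            pool.append(pop[i])
    return pool

rng = np.random.default_rng(6)
pop = [rng.uniform(-1.0, 1.0, size=10) for _ in range(20)]
fits = [float(-np.sum(c ** 2)) for c in pop]   # toy fitness values
pool = fill_mating_pool(pop, fits, pool_size=20, rng=rng)
```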
Elitism involves maintaining the best individual of the parents' population. This technique may increase the speed of convergence of a GA because it ensures that the best solution found so far is retained for the next generation. While this operator could be applied more broadly, e.g. retaining the 2 or 3 best solutions, its overuse can lead to premature convergence to a sub-optimal solution.
G. Stopping Criteria
The definition of the stopping criteria is an important task in evolutionary models. Early termination of the evolution may generate poor solutions. On the other hand, late termination may cause overfitting (memorizing the data rather than learning the underlying distribution).

The evolutionary process is repeated until a stop condition is reached. The stop conditions used are:
- a fixed number of architecture generations is reached - 5;
- a fixed number of weight generations is reached - 50;
- the highest fitness is reached, or the solution has attained a plateau such that successive iterations no longer produce better results - 20 weight generations.

Recall that the architectures population includes 20 chromosomes and the weights population is composed of 20/40 chromosomes (20 parents, 40 after recombination). This means the maximum number of analyzed points can be calculated as (5 × 20) × (50 × 40), which results in 2 × 10^5 feasible solutions within the search space.
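The search budget can be checked directly (our arithmetic, mirroring the figures above):

```python
# 5 architecture generations x 20 architecture individuals, each paired
# with 50 weight generations x 40 weight individuals (20 parents + 20 children).
points = (5 * 20) * (50 * 40)
print(points)  # -> 200000, i.e., 2 x 10^5 candidate solutions
```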
IV. RESULTS
In this section we compare the mutation and recombination operators of the EANN model. We have used the EANN and alternative nonlinear modeling techniques for the same purpose of detecting stator faults in induction motors.

The effect of applying the Arithmetic, Multi-Point, and Local Intermediate crossover operators was evaluated while keeping the other GA operators fixed. Figure 7 illustrates the evolution of the population along the generations using these operators. We noticed that, over 500 generations of weight vectors, the Arithmetic crossover provided the overall fittest individual and the best average fitness.
Likewise, fixing all other GA operators, the performance of the Gaussian (Fig. 8(a)) and Random (Fig. 8(b)) mutation operators under different mutation rates was compared. With the Gaussian operator, the fittest individual was reached under an 8% mutation rate and the best population average fitness was obtained under a 5% rate. This reaffirms the idea that broad studies of genetic operators are very relevant. The Random mutation under a 5% mutation rate produced both the fittest individual and the best average fitness.
Fig. 7. Comparison between crossover operators
(a) Gaussian mutation
(b) Random mutation
Fig. 8. Comparison between mutation operators under different probability rates
After selecting the appropriate genetic operators from the previous studies and applying them together, we could infer the best network architectures. Figure 9 shows the simultaneous evolution of several network architectures as a function of the weight generations. Note that within a small number of weight generations, an 88.8% accuracy in correctly detecting stator faults was reached by the architecture [9; 21; 5; 3]. This notation indicates the number of neurons in the input, first hidden, second hidden, and output layers, respectively.
Fig. 9. Process of simultaneously evolving several neural network architectures
Our final experiment compares the performance of the diagnosis system using the proposed EANN evolved via GA with alternative techniques, including MLP and Elman neural networks, both trained via BP, and the Fuzzy C-Means clustering technique. The models were fed with predominantly healthy, but also faulty, data from induction motors. Incipient inter-turn short-circuits in the stator windings, from 1% up to 5% of shorted turns in each of the stator phases, were considered. The database consisted of vectors acquired under different machine operating points. Moreover, a stochastic environment was simulated to evaluate the models' ability to deal with noise. For this purpose, measurement noise (zero-mean white noise) derived from the current, voltage, and speed transducers was considered, assuming values within the accuracy range of transducers currently available in industry. Figure 10 shows the performance of the MLP and EANN models in correctly detecting motor conditions. The EANN outperformed the MLP network at all points of the domain. Note that, using 'very good' quality transducers, the diagnosis system correctly detected 94.7% of stator faults using the EANN against 91.5% using the MLP. In addition, the training time for 500,000 epochs of the MLP was approximately 19 hours (500,000 points analyzed in the search space), while the evolving time for 5 architecture generations with 20 individuals each, 50 weight generations with 40 individuals each, and post-processing of the solution using the proposed GA was approximately 2 hours and 40 minutes (210,000 points evaluated in the search space).
Fig. 10. Correct fault detection under different noise conditions using MLP-BP and EANN-GA
Results using the Elman neural network and Fuzzy C-Means to detect stator faults presented curves similar to the ones depicted in Fig. 10; they were omitted from the figure for the sake of visual clarity. We noticed that the Elman neural network attained a 92.9% accuracy with a training time of about 25 hours. In contrast, the Fuzzy C-Means technique achieved the worst performance, 91.0%, although it proved to be the fastest algorithm, requiring only 29 minutes of training.
V. CONCLUSION
An evolutionary approach has been suggested to evolve the architectures and weights of neural networks, with emphasis on induction motor fault detection. The learning algorithm developed in this paper automatically provides the neural network architecture and parameters within feasible time frames and computational resources. Experiments have confirmed the effectiveness of the approach when compared to conventional fault detection techniques. Further work shall consider the extension of the diagnosis system to capture the occurrence of broken rotor bars and air-gap eccentricity.
ACKNOWLEDGMENT
The first author acknowledges CAPES, the Brazilian Ministry of Education, for a fellowship. The fourth author thanks the Energy Company of Minas Gerais - CEMIG, Brazil, for grant P&D114. The third and fifth authors are grateful to CNPq, the Brazilian National Research Council, for grants 303215/2007-0 and 304857/2006-8, respectively.
REFERENCES
[1] Nandi, S.; Toliyat, A. “Condition Monitoring and Fault Diagnosis of Electrical Motors - A Review”. IEEE Transactions on Energy Conversion, vol. 20, no. 4, 2005. pp: 719-729.
[2] Chow, M.; Yee, S. O. “Methodology for on-line incipient fault detection
in single-phase squirrel-cage induction motors using artificial neural
networks”. IEEE Transactions on Energy Conversion, vol. 6-3, Sep. 1991.
pp: 536-545.
[3] Schoen, R. R.; Lin, B. K.; Habetler, T. G.; Schlag, J. H.; Farag, S. “An
unsupervised, on-line system for induction motor fault detection using
stator current monitoring”. IEEE Industry Applications Society Annual
Meeting, Oct. 1994. pp: 103-109.
[4] Tallam, R.; Habetler, T.; Harley, R.; Gritter, D.; Burton, B. “Neural
network based on-line stator winding turn fault detection for induction
motors”. IEEE Industry Applications Conf., Oct. 2000. pp: 375-380.
[5] Premrudeepreechacharn, S.; Utthiyoung, T.; Kruepengkul, K.;
Puongkaew, P. “Induction motor fault detection and diagnosis using
supervised and unsupervised neural networks”. IEEE International
Conference on Industrial Technology, Dec. 2002. pp: 93-96.
[6] Fuente, M.; Moya, E.; Alvarez, C.; Sainz, G. “Fault detection and
isolation based on hybrid modelling in an AC motor”. Proceedings of
the IEEE International Joint Conference on Neural Networks, vol. 3, Jul.
2004. pp: 1869-1874.
[7] Chow, M.-Y. “Methodologies of Using Neural Network and Fuzzy Logic
Technologies for Motor Incipient Fault Detection”. World Scientific,
Danvers, USA, 1997, 140p.
[8] Rumelhart, D. E.; Hinton, G. E.; Williams, R. J. “Learning Internal
Representations by Error Propagation”. Parallel Distributed Processing,
vol. 1, MIT Press, 1986.
[9] Chen, O. T.-C.; Sheu, B. J. “Optimization schemes for neural network
training”. 1994 IEEE Int. Joint Conference on Neural Networks, vol. 2,
Jun. 1994. pp: 817-822.
[10] Yao, X.; Liu, Y. “Towards designing artificial neural networks by
evolution”. Applied Mathematics and Computation, vol. 91, issue 1, Apr.
1998. pp: 83-90.
[11] Jenkins, W. M. “A neural network trained by genetic algorithm”.
International Conference on Computational Structures - Budapest, Aug.
1996. CIVIL-COMP Press, Edinburgh.
[12] Jenkins, W. M. “Neural network weight training by mutation”. Computers & Structures, vol. 84, issues 31-32, Dec. 2006. pp: 2107-2112.
[13] Sexton, R. S.; Gupta, J. N. D. “Comparative evaluation of genetic al-
gorithm and backpropagation for training neural networks”. Information
Sciences, vol. 129, issues 1-4, Nov. 2000. pp: 45-59.
[14] Blum, C.; Socha, K. “Training feed-forward neural networks with ant colony optimization: an application to pattern classification”. Int. Conference on Hybrid Intelligent Systems, Nov. 2005, 6p.
[15] Mendes, R.; Cortez, P.; Rocha, M.; Neves, J. “Particle swarms for
feedforward neural network training”. Proceedings of the Int. Joint Conf.
on Neural Networks, vol. 2, May 2002. pp: 1895-1899.
[16] Gudise, V. G.; Venayagamoorthy, G. K. “Comparison of particle swarm
optimization and backpropagation as training algorithms for neural
networks”. Proceedings of the IEEE Swarm Intelligence Symposium, Apr.
2003. pp: 110-117.
[17] Leite, D. F.; Araújo, M. V.; Secco, L. A.; Costa Jr., P. P. “Induction Motors Modeling and Fuzzy Logic Based Turn-To-Turn Fault Detection and Localization”. Proceedings of the IEEE International Conference on Power Engineering, Energy and Electrical Drives, Apr. 2007. pp: 90-95.
[18] Chen, S.; Wu, Y.; Luk, L. “Combined genetic algorithm optimization
and regularized orthogonal least squares learning for radial basis func-
tion networks”. IEEE Transactions on Neural Networks, vol. 10 - 5, 1999.
pp: 1239-1243.
[19] Brown, A. D.; Card, H. C. “Cooperative coevolution of neural repre-
sentations”. Int. Journal of Neural Systems, vol. 10-4, 2000. pp: 311-320.
[20] Sexton, R. S.; Dorsey, R. E.; Johnson, J. D. “Toward global optimization
of neural networks: a comparison of the genetic algorithm and backprop-
agation”. Decision Support Systems, vol. 22, 1998. pp: 171-186.
[21] Montana, D. J.; Davis, L. “Training feedforward neural networks using
genetic algorithms”. Proceedings of the Int. Conference on Genetic
Algorithms, Morgan Kaufmann, 1989. pp. 379-384.
[22] Fogel, D. B. “Evolutionary Computation: Toward a New Philosophy of Machine Intelligence”. 2nd Edition. The IEEE Press, 1999, 290p.
[23] Atmar, W. “Notes on the Simulation of Evolution”. IEEE Transactions
on Neural Networks, vol. 5, no. 1, 1994. pp: 130-148.
[24] Chen, Z.; He, Y.; Chu, F.; Huang, J. “Evolutionary strategy for clas-
sification problems and its application in fault diagnostics”. Elsevier,
Engineering Appl. of Artificial Intelligence, vol. 16, 2003. pp: 31-38.
[25] Back, T. “Selective Pressure in Evolutionary Algorithms: A Character-
ization of Selection Mechanisms”. Proceedings Of the IEEE Conference
on Evolutionary Computation, Orlando, 1994. pp: 57-62.
[26] Goldberg, D. E.; Deb, K. “A Comparison of Selection Schemes Used
in Genetic Algorithms”. Foundations of Genetic Algorithms, G. J. E.
Rawlins (ed.), Morgan Kaufmann, 1991. pp: 69-93.