ArticlePDF Available

Abstract and Figures

Implementation of model-based fault diagnosis systems can be a difficult task due to the complex dynamics of most systems, an appealing alternative to avoiding modeling is to use machine learning-based techniques for which the implementation is more affordable nowadays. However, the latter approach often requires extensive data processing. In this paper, a hybrid approach using recent developments in neural ordinary differential equations is proposed. This approach enables us to combine a natural deep learning technique with an estimated model of the system, making the training simpler and more efficient. For evaluation of this methodology, a nonlinear benchmark system is used by simulation of faults in actuators, sensors, and process. Simulation results show that the proposed methodology requires less processing for the training in comparison with conventional machine learning approaches since the data-set is directly taken from the measurements and inputs. Furthermore, since the model used in the essay is only a structural approximation of the plant; no advanced modeling is required. This approach can also alleviate some pitfalls of training data-series, such as complicated data augmentation methodologies and the necessity for big amounts of data.
Content may be subject to copyright.
applied
sciences
Article
Fault Diagnosis via Neural Ordinary Differential Equations
Luis Enciso-Salas * , Gustavo Pérez-Zuñiga and Javier Sotomayor-Moriano


Citation: Enciso-Salas, L.; Pérez-
Zuñiga, G.; Sotomayor-Moriano, J.
Fault Diagnosis via Neural Ordinary
Differential Equations. Appl. Sci. 2021,
11, 3776. https://doi.org/10.3390/
app11093776
Academic Editor: Mohamed
Benbouzid
Received: 8 January 2021
Accepted: 3 March 2021
Published: 22 April 2021
Publisher’s Note: MDPI stays neutral
with regard to jurisdictional claims in
published maps and institutional affil-
iations.
Copyright: c
2021 by the authors.
Licensee MDPI, Basel, Switzerland.
This article is an open access article
distributed under the terms and
conditions of the Creative Commons
Attribution (CC BY) license (https://
creativecommons.org/licenses/by/
4.0/).
Departamento de Ingeniería, Pontificia Universidad Católica del Perú, Avenida Universitaria 1801, San Miguel,
Lima 15088, Peru; gustavo.perez@pucp.pe (G.P.-Z.); jsotom@pucp.edu.pe (J.S.-M.)
*Correspondence: lenciso@pucp.pe
Abstract:
Implementation of model-based fault diagnosis systems can be a difficult task due to the
complex dynamics of most systems, an appealing alternative to avoiding modeling is to use machine
learning-based techniques for which the implementation is more affordable nowadays. However,
the latter approach often requires extensive data processing. In this paper, a hybrid approach using
recent developments in neural ordinary differential equations is proposed. This approach enables
us to combine a natural deep learning technique with an estimated model of the system, making
the training simpler and more efficient. For evaluation of this methodology, a nonlinear benchmark
system is used by simulation of faults in actuators, sensors, and process. Simulation results show that
the proposed methodology requires less processing for the training in comparison with conventional
machine learning approaches since the data-set is directly taken from the measurements and inputs.
Furthermore, since the model used in the essay is only a structural approximation of the plant; no
advanced modeling is required. This approach can also alleviate some pitfalls of training data-series,
such as complicated data augmentation methodologies and the necessity for big amounts of data.
Keywords: fault diagnosis; deep learning; neural ordinary differential equations
1. Introduction
Fault diagnosis systems are of great importance in industrial processes, especially in
those with high stakes systems, where it is important to perform continuous diagnostics
and monitoring of the main variables. Furthermore, it could also be used to perform fault
prediction for improving the maintenance plan and to avoid undesired malfunctions. In
the fault detection literature, a great number of methods have been dedicated to model-
based techniques [
1
,
2
] with great success in many cases, although a known drawback
of such methods is their limitation to processes with relatively simple dynamics and
sensiblity to model approximations. Examples of model-based methods are: the structural
approach, which searches for irreducible relations that attains sensibility to the selected
faults [
3
,
4
], and eigenstructure-based fault diagnosis, which deals with a minimization
problem for the sensibility of perturbations against faults [
5
,
6
]. The same minimization
problem is approached using Linear Matrix Inequalities (LMI) [
7
], and another common
approach is to use observers, such as sliding mode observers [
8
], Kalman-based fault
detection [
9
], and also residuals generation using parity space relations [
10
,
11
]. Likewise,
for complex processes with functional constraints, methods have been developed that
consider decentralized and distributed diagnostic architectures [12,13].
Another important group of fault detection techniques, regarded as data-based ap-
proaches, are used when the model of the plant under evaluation is not available. They are
dedicated to the analysis of collected data to detect unusual trends in it, and thus identify
faults in the system. Principal Component Analysis (PCA) performs a dimensionality re-
duction of the data map for finding faults in the system, although it requires high amounts
of data [
14
,
15
]. Support Vector Machines (SVM) is a technique for classification that per-
forms high generalization even with small data-sets but can suffer from data imbalance,
although learning ability can be added such as in the multikernel SVM [
16
]. Artificial
Appl. Sci. 2021,11, 3776. https://doi.org/10.3390/app11093776 https://www.mdpi.com/journal/applsci
Appl. Sci. 2021,11, 3776 2 of 16
neural networks (ANN) are employed using supervised training and can detect faults after
learning hidden relations in the data [
17
]; however, they lack visual representation and can
suffer from overfitting. Use of Deep Learning (DL) techniques have become increasingly
popular in the latter years due to an increase in power computation, higher availability of
information, and high accuracy, many DL models have been tested such as Dense Neural
Networks [
18
], Deep Belief Networks [
19
21
], Convolutional Neural Networks [
22
,
23
],
Recurrent Neural Networks [
24
,
25
], and Generative Neural Networks [
26
], however, in
most of the examples the DL model is adapted only to the plant in the study, therefore a
change of plant will require a change of the methodology.
Some works have also been devoted to the elaboration of hybrid techniques, model-
based plus data-based systems for fault diagnosis. These techniques aim to have combined
advantages from both previously mentioned approaches, they use mainly machine learning
techniques for fault identification coupled with model-based fault detectors. In [
27
], a
hybrid model-based and data-driven system is elaborated for fault detection and diagnosis,
although both methods are used independently. In [
28
], Jung proposed the use of SVM to
improve fault identification, and in [
29
], the merging mechanism between residuals and
SVM is improved for fault location. In [
30
], the hybrid mechanism is used for the selection
of residual generators. Meanwhile, in [
31
], a Modified PCA was combined with filters to
detect faults. On the other hand, the use of DL techniques in hybrid fault diagnosis systems
has not been extensively developed, although the use of DL is recognized for improving
accuracy in many other recognition tasks [32].
Deep neural networks are composed of finite transitions, in [
33
] the Neural Ordinary
Differential Equations (Neural ODE) was presented using an idea inspired in ResNets [
34
]
to give continuity to the network layers using ODE solvers, this development has been
resembled and employed in many fields such as [
35
] for continuous graph flows that can be
applied in generative tasks, moreover, it can be used to represent dynamics such as in [
36
]
for Hamiltonian processes, in [
37
] for an extension of stochastic processes, and
in [38]
for
inclusion in Graph Neural Networks as a tool to have a continuous flow of node features.
These groundbreaking developments make the application of DL networks for hybrid fault
detection systems very natural.
In this research, we embedded a preliminary plant model in the Neural ODE network;
this network receives the measurements and inputs of the plant, given in state-space
representation, and mirrors its corresponding dynamics. The training is performed in two
phases; the Neural ODE is trained in the first stage in order to mimic the plant dynamics,
and later it is conditioned to be used for the fault diagnosis system, which is trained in the
second stage. The preliminary model is employed on behalf of two desired aspects: to have
a more robust network behavior and reduce the training complexity. Since the Neural ODE
allows us to represent the dynamics of a process, then the first aspect is demonstrated using
an approximated (preliminary) plant model. We show that this is beneficial in comparison
to model-less neural networks. Additionally, the Neural ODE’s internal dynamics permits
us to have a visual representation of the network state.
The article is divided as follows, in Section 2we present the theory for Neural ODE.
Section 3is dedicated to model representation and how Neural ODE is used for the
implementation. In Section 4, a nonlinear benchmark model is used to illustrate the
application of the approach. In Section 5, a visual representation of the network is proposed.
In Section 6the characteristics of the methodology are discussed. Finally in Section 7, some
conclusions are given.
2. Graph Neural Ordinary Differential Equations
Resnet networks are shown to be one inspiration for the key concept of continuum
layers in the neural ODE network. However, the neural ODE training poses the problem of
finding an efficient training procedure; in this regard, the adjoint method has been proposed
in the literature as a manner to calculate the gradients for the backward propagation;
besides, we establish the procedure to include external inputs into the system.
Appl. Sci. 2021,11, 3776 3 of 16
2.1. ResNets Inspiration
ResNet [
34
] was a breakthrough in the training of deeper neural networks, the concept
was simple and based on the use of residual functions. These residual functions are
employed to obtain referenced signals that improve the training of networks with many
layers, this is illustrated in Figure 1, a ResNet block is described by (1)
hk+1=hk+f(hk,θk), (1)
where
hk
represents the state at layer kand
θ
represents the parameter of this layer in
the network.
weight layer
weight layer
+
Figure 1. ResNet block.
Moreover, considering a network with
D
layers, stacking the blocks and representing
the discrete transitions at step
t
in an iterative process, then we have the relation given
in (2)
Ht+1=Ht+F(Ht,Θ), (2)
where
t{0, · · · ,T}
and
HTRD
, transitions can be seen as an Euler discretization of
the continuous transformation [39].
Neural ODE [
33
] represents continuous transitions of the network, therefore
(2)
can
be expressed as a dynamical transition and define the solution by the following Cauchy
problem for an initial state X0:
(˙
H(t) = F(t,H(t),Θ)
H(0) = X0
(3)
A discussion for the well-posedness of (3) is presented in [33,38].
2.2. Training Using ODE Solvers
Solution to
(3)
is performed through training using ODE solvers. The loss
L
is calcu-
lated using the forward propagation and a first ODE solver as (4).
L(z(t1)) = Lz(t0) + Zt1
t0
F(z(t0),t0,t1,Θ)dt=L(ODESolve(z(t0),F,t0,t1,Θ)) , (4)
where
z(t)
is the hidden state at each instant,
t0
and
t1
are the initial and final time
respectively.
For the backward propagation, Chen et al. [
33
] proposes the use of adjoint state,
a(t) = L/z(t)
, for the calculation of the propagation gradients. The adjoint sensitivity
method is applied and the adjoint dynamics are calculated using a backward solver for the
relation given in (5)
Appl. Sci. 2021,11, 3776 4 of 16
a(t)
t=a(t)>F(z(t),t,Θ)
t. (5)
Computation of the gradient change with respect to the parameters requires to solve
relation (6) also backward
L
Θ=Zt0
t1
a(t)>F(z(t),t,Θ)
t. (6)
Finally, Equations (4)–(6) are implemented as an augmented dynamics that permit to
evaluate the respective integrals for
z
,
a
and
L
Θ
jointly in a single call to the ODE solver.
More elaborated details can be found in the original article [33].
2.3. External Inputs
In a system process, some external inputs had to be considered in the system. They
are added as dynamical variable into the derivatives, thus modifying (3)
(˙
H(t) = F(t,H(t),Θ) + u(t)
H(0) = Xe
(7)
Note that this does not modify the ODE implementation but an additional term has
to be included for relation
(4)
, therefore for the integral calculation, the total loss in the
system is calculated using (8).
L(z(t1)) = Lz(t0) + Zt1
t0
F(z(t0),F,t0,t1,Θ)dt +Zt1
t0
u(t)dt. (8)
Meanwhile, Equations
(5)
and
(6)
remain the same. Hence, in the ODE layer imple-
mentation, as employed in Section 4, Equations (5)–(8) are used.
3. Model Representation
The system representation used is a state-space for nonlinear systems; such repre-
sentation is employed to have a defined number of system variables that will coincide
with the neural ODE outputs; moreover, we describe the signals used in the training of the
neural ODE.
3.1. State-Space
We consider a nonlinear state-space representation in the following form:
˙
x(t) = f(x) + g(u), (9)
y(t) = h(x), (10)
where
x(t)Rn
is the state vector,
f(
.
)
is a function
RnRn
,
u(t)Rm
the input vector,
g(.)is a function RmRn,y(t)Rpthe output vector, and h(.)is a function RnRp.
Relation
(9)
is termed the process equation and relation
(10)
is termed the measurement
equation.
3.2. Neural ODE Networks
It is clear that any state-space system in the form given by
(9)
is composed of a set
of ODE systems, here each state
xi
, where
i={1, 2, · · · ,n}N
, can be modeled as an
output of the neural ODE. Training is challenging since states are not always available, one
reasonable solution is to generate an embedding using the measurement equation given by
(10), this is the part of the model that we include in the network.
The neural ODE thus has
n
outputs for which loss calculation is done using the
measurements fed through
h(x)
. Meanwhile, the inputs
u(t)
are incorporated as dynamic
external inputs in the network. Also, note that knowledge of function
f(x)
is not required
Appl. Sci. 2021,11, 3776 5 of 16
for the training, since weights and transition functions are learned by the neural ODE.
The network can be then especially sensible to faults in the process functions and in the
measurement functions as will be proved in the following sections.
4. Fault Diagnosis via Neural ODE Applied to Nonlinear Benchmark System
The fault diagnosis system for a non-linear plant based on the 4-tank model is pre-
sented. Hence, the model is first presented, and thereafter the generation and processing
of the data-sets are described; since this is a solution for actuators, sensors, and process
faults, training and results for each case are presented separately. The schemes used for the
neural ODE training and the fault diagnosis training are also presented.
4.1. System Model
A non-linear four-tank plant (Figure 2) is a MIMO benchmark model to evaluate FDI
algorithms, first proposed in [
40
], studied for control [
41
] and fault diagnosis systems [
42
].
The system has 2 pumps which are controlled to input water flow into the system, the
valve formation distributes the flow into the tanks in a proportion given by the constants
γi
and
ki
. The states of the plant are the levels in each tank (
hi
) and they are related by the
following relations:
˙
h1(t) = a1p2gh1(t)
A1+a4p2gh4(t)
A1+γ1k1u1
A1(t),
˙
h2(t) = a2p2gh2(t)
A2+a3p2gh3(t)
A2+γ2k2u2(t)
A2,
˙
h3(t) = a3p2gh3(t)
A3+(1γ1)k1u1(t)
A3,
˙
h4(t) = a4p2gh4(t)
A4+(1γ2)k2u2(t)
A4,
where the coefficients
Ai
,
ai
,
γi
and
ki
are described in Table 1. Only parameters related to
the inputs are used for the training.
T3T4
T1 T2
LT1 LT2
LT3 LT4
Figure 2. Four tank plant scheme.
Appl. Sci. 2021,11, 3776 6 of 16
Table 1. Process Parameters.
Parameter Units
Bottom area, Ai, for i=1, 2, 3, 4 cm2
Outlet pipe cross section, ai, for i=1, 2 cm
Outlet pipe cross section, ai, for i=3, 4 cm
Gravity constant, gcm2/s
Pump constants, ki, for i=1, 2 cm3/(Vs)
Main valve positions, γifor i=1, 2
Heights 1 and 2, h1o,h2ocm
Heights 3 and 4, h3o,h4ocm
Inputs 1 and 2, u1o,u2ocm3/s
Hence, for this system, the measurements and state vectors in (11) are considered.
y(t) =
h1(t)
h2(t)
h3(t)
h4(t)
+ν(t),x(t) =
h1(t)
h2(t)
h3(t)
h4(t)
. (11)
The faults considered are two actuators faults, four sensor faults, and four process faults,
we will treat them separately for the training but a complete system can be easily composed
by joining the neural models.
4.2. Data-Set
The data-set for the Four Tank Process are generated using the exact model of the sys-
tem, including typical dynamic possibilities present in the real system, such as controllers,
set-point modifications, measurement noise, perturbations, and also actuators, sensors and
process faults. A total of 1,000,000 samples are obtained for each case (actuator, sensor, and
process fault scenarios), considering only the known signals (measurements and inputs)
and the labels for each fault. An extract of the whole data-set is shown in Figure 3, and its
corresponding actuator faults are shown in Figure 4.
Figure 3.
An extract from the data-set of the four tank system. Above: Inputs, Below: Measurements.
Appl. Sci. 2021,11, 3776 7 of 16
Figure 4.
Actuator faults in an extract of the data-set, shaded areas indicate that a fault has occurred,
and signals are normalized.
4.3. Actuator Faults
4.3.1. Training
Training is performed in two stages, in the first stage the Neural ODE [
33
] is trained
using the block representation presented in Figure 5, the measurements (
y(t)
) pass through
4 linear layers, where the first three layers are followed by activation functions of
Tanh
type, meanwhile, the inputs (
u(t)
) are considered using a linear layer. For this stage, the
training is done in 2000 epochs, with a learning rate that starts at 10
3
and is varied at
epoch 1000 to 10
4
. The ODE tolerance is kept at 10
7
during the whole process. Inputs
(u(t)) are added sequentially using a linear layer that is kept constant.
For the second stage of training, one additional linear layer with a
Tanh
activation
function and a softmax classifier for four classes (two actuator faults) are added. The model
considered is presented in Figure 6. The input for this stage is the propagated derivatives
of the internal states (
˙
ˆ
y
) and the derivatives for the measurements (
˙
y
). This training is
performed in 20,000 epochs starting with a learning rate of 10
3
that is reduced at epoch
15,000 to 10
4
. For simplicity, we kept the model of the first stage, the neural ODE, constant
during this second stage training. However, is possible to continue adapting the neural
ODE in this stage or even during evaluation for possible model changes.
Linear + Tanh
Linear + Tanh
Linear + Tanh
Linear
Linear
+
ODE
Figure 5. Neural network model for ODE training.
Appl. Sci. 2021,11, 3776 8 of 16
Linear + Tanh
Linear + Tanh
Linear + Tanh
Linear
Linear
+
Linear + tanh
Softmax
+
Figure 6. Neural network model for fault diagnosis training.
4.3.2. Results
In order to evaluate the results, we first show in Figure 7the generated dynamics
using the neural network for 2000 points with initial conditions
x= [
0, 0, 0, 0
]
, that is
compared in the same plot with the real system responses, this demonstrates that the
network is representing essential information from the model.
For the diagnosis, a total of 20,000 samples from the data-set are used, results are
shown in Figure 8, where a comparative of the real faults and detected faults is presented,
the network reached 98.73% of accuracy for this test. In the total data-set with 1,000,000
samples, the accuracy was 98.5%.
Figure 7. First stage. Continuous line: true dynamics, dotted line: learned dynamics.
Appl. Sci. 2021,11, 3776 9 of 16
Figure 8.
Actuator faults detection, the faults are represented in the shaded areas, states are given in
black for reference. Above: faults labeled in the data-set. Below: faults detected by the system.
4.4. Sensor Faults
4.4.1. Training
The first stage of this training is not required since the model is already embedded in
the neural ODE from training in Section 4.3.1 using the same scheme presented in Figure 7.
For the second stage of training, the model considered is the same as presented in Figure 6
but using a softmax classifier with 16 classes (4 sensor faults). This training is performed in
10,000 epochs starting with a learning rate 10
3
that is reduced at epoch 8000 to 10
4
, for
simplicity we also kept the model of the first stage constant during this training.
4.4.2. Results
The results for a group of 20,000 samples of the data-set are shown in Figure 9, where
a comparative of the real faults and detected faults are given. In this case, the network
reached 98.36% of accuracy for this test. On the total data-set, with 1,000,000 samples, the
accuracy was 98.1%.
Figure 9.
Sensor faults detection, the faults are represented in the shaded areas, states are given in
black for reference. Above: faults labeled in the data-set. Below: faults detected by the system.
Appl. Sci. 2021,11, 3776 10 of 16
4.5. Process Faults
4.5.1. Training
In this case, the first stage of training was also not required. For the second stage
of training, the model considered was is based as well in the scheme of Figure 6, and
the softmax classifier used 16 classes (4 process faults). This training was performed in
30,000 epochs starting with a learning rate 10
3
that is reduced at epoch 20,000 to 10
4
, the
model of the first stage is kept constant during this training.
4.5.2. Results
The results for a group of 20,000 samples of the data-set are shown in Figure 10, where
a comparative of the real faults and detected faults are given. In this case, the network
reached 98.75% of accuracy for the test. On the total data-set, with 1,000,000 samples, the
accuracy was 98.11%.
Figure 10.
Process faults detection, the faults are represented in the shaded areas, states are given in
black for reference. Above: faults labeled in the data-set. Below: faults detected by the system.
5. Visual Representation of the Network
A common pitfall of using DL networks is the lack of visual representation of the
neural network which is often seen as a black box. With the proposed approach we
furthermore could establish an additional representation for the network, since dynamics
in the network follows closely that of the plant, an interesting representation is the phase
diagrams generated by the Neural ODE. To illustrate this concept, we present the phase
diagram for normal states of the system in Figures 11 and 12 against faulty conditions for
actuators and sensors in Figures 13 and 14 respectively. These diagrams are generated
using the ODE projection for 200 samples. It is noticeable that continuous predictions are
obtained when normal conditions are happening. On the other hand, the predictions are
not continuous in the case of faults.
Appl. Sci. 2021,11, 3776 11 of 16
Figure 11. Phase diagram for normal prediction in actuators data-set.
Figure 12. Phase diagram for normal prediction in sensors data-set.
Appl. Sci. 2021,11, 3776 12 of 16
Figure 13. Phase diagram during fault in actuator.
Figure 14. Phase diagram during a fault in sensor.
6. Discussion
Training methodology of DL structures is vital to reach models with high performance.
A review of the literature demonstrates that training can be very demanding for DL models.
In [
43
], a common procedure for fault detection using DL models is proposed, feature
extraction and neural network type are regarded as two key elements of the process. In [
44
]
an autoencoder is used with 4 different and selectable behaviors. In [
21
], the training is
divided into many steps in accordance with the aging of the system. In [
20
], the training
is performed using ensemble learning, which uses multiple learners. In [
25
], a feature
extraction and a dimensionality reduction are performed previously over the data-set.
Appl. Sci. 2021,11, 3776 13 of 16
In [
24
], a group of 14 features is extracted and then selected using a criterion before the
training. All these approaches shown the complexity that features selection requires.
Our approach aims to alleviate these complicated procedures with the inclusion
of some characteristics of the model dynamics to be learned by the Neural ODE. We
incorporated in the training an approximation of the system using features in form of the
measurement equations and inputs of the process. Those characteristics allow us to reduce
the epochs and complexity and furthermore to work with data-series in a natural manner,
as was demonstrated in Section 4.
Some final tests were performed using classical SVM, ANN, and DL models with the
data-set of actuator faults, Table 2shows the results for each model tested, we include
weighted accuracy as a measure for accuracy considering weights for each class, class 0 for
no fault is weighted 0.08 and other classes are weighted 0.95, 0.96 and 0.99, respectively.
Additionally, in all the deep learning methods, except for the Neural ODE method, data
augmentation is previously performed using noise injection, dropout, pooling, and time-
warping [
45
]; and batch normalization is included for the fully connected layers. It is
noticeable from the results that the Neural ODE model allows improving the efficiency of
the network for this case.
From Table 2, the SVM method reaches a high accuracy for this case, but it is clearly
due to false negatives, as is appreciated in its corresponding weighted accuracy. The
ANN and DL models had a lower accuracy; hence, it is likely required a more advanced
pre-processing of the data-set to reach better results. Meanwhile, the use of neural ODE
and dense network, proposed in this work, shows that, even excluding pre-processing and
data augmentation, the model can achieve higher accuracies.
Table 2. Accuracy for different Machine learning Models.
Model Accuracy Weighted Accuracy
Suport Vector Machine
(Polynomial Kernel) 93.3% 62.657%
Artificial Neural Network
(4 Fully connected Layers) 80.9% 74.8%
Convolutional Neural Network
(2 Convolutional Layers + 1 Fully connected Layer) 77.5% 55.8%
Recurrent Neural Network
(4 LSTM Layers + 1 Fully connected Layer) 80.53% 73.8%
Neural ODE & Dense Network
(Neural ODE + 2 Fully connected Layers) 98.5% 90%
7. Conclusions
Over the past few years, the development of new techniques for deeper networks such
as ResNet has inspired the training of neural networks using ODE solvers as a manner
to generate a continuous layer representation and even further allowed to improve the
representation of systems dynamics by neural ODE networks. Therefore, a hybrid approach
was presented in this article for fault diagnosis of sensors, actuators, and process faults.
Results of simulations show that high performances of fault diagnosis can be obtained using
neural ODE networks not requiring extensive data-processing and advanced modeling.
In a nonlinear benchmark system, the training efficiency was demonstrated for each fault
type, and shows an improvement when compared with classical Deep Learning models.
Here, the accuracy of the network is demonstrated to be higher than 90%. In addition, a
visual representation is proposed by using phase diagram representations. Likewise, the
data needed to train effectively the network is reduced which permits to avoid elaborated
procedures such as feature extraction, pre-processing and data augmentation.
Appl. Sci. 2021,11, 3776 14 of 16
Author Contributions:
All the authors contributed to the design of the model, the result analysis,
and the writing and review of the paper. Specifically, L.E.-S. was in charge of doing the simulation
and training of the network, and also of the evaluation of the FDI system in the benchmark system,
G.P.-Z. of the introduction and state of the art and G.P.-Z. and J.S.-M. of the overall ideas of the
exposed research and the general conception of the paper. All authors have read and agreed to the
published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement:
The data presented in this study is available on request from the
corresponding author.
Acknowledgments:
The authors are grateful for the finantial support from Proyecto Concytec—
Banco Mundial “Mejoramiento y Ampliación de los Servicios del Sistema Nacional de Ciencia
Tecnología e Innovación Tecnológica” 8682-PE, a través de su unidad ejecutora Fondecyt, and from
Contrato de Adjudicación de fondos N
10-2018-FONDECYT/BM-Programas de Doctorados en
Áreas Estratégicas y Generales.
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
ODE Ordinary Differential Equations
LMI Linear Matrix Inequalities
PCA Principal Component Analysis
SVM Support Vector Machine
ANN Artifical Neural Network
ResNet Residual Networks
MIMO Multiple Inputs Multiple Outputs
FDI Fault Detection and Idetification
References
1. Ding, S.X. Model-Based Fault Diagnosis Techniques, 2nd ed.; Springer: London, UK, 2013.
2.
Pérez-Zuñiga, G.; Rivas-Perez, R.; Sotomayor-Moriano, J.; Sánchez-Zurita, V. Fault Detection and Isolation System Based on
Structural Analysis of an Industrial Seawater Reverse Osmosis Desalination Plant. Processes 2020,8, 1100. [CrossRef]
3.
Krysander, M.; Åslund, J.; Nyberg, M. An efficient algorithm for finding minimal overconstrained subsystems for model-based
diagnosis. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2008,38, 197–206. [CrossRef]
4.
Pérez, C.G.; Travé-Massuyès, L.; Chanthery, E.; Sotomayor, J. Decentralized diagnosis in a spacecraft attitude determination and
control system. J. Phys. Conf. Ser. 2015,659, 1–12. [CrossRef]
5.
Dai, X.; Gao, Z.; Breikin, T.; Wang, H. Disturbance attenuation in fault detection of gas turbine engines: A discrete robust observer
design. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2009,39, 234–239. [CrossRef]
6.
Gao, Z.; Cecati, C.; Ding, S.X. A survey of fault diagnosis and fault-tolerant techniques-part II: Fault diagnosis with knowledge-
based and hybrid/active approaches. IEEE Trans. Ind. Electron. 2015,62, 3768–3774. [CrossRef]
7.
Karimi, H.R.; Zapateiro, M.; Luo, N. A linear matrix inequality approach to robust fault detection filter design of linear systems
with mixed time-varying delays and nonlinear perturbations. J. Frankl. Inst. 2010,347, 957–973. [CrossRef]
8.
Liu, C.; Jiang, B.; Zhang, K. Sliding mode observer-based actuator fault detection for a class of linear uncertain systems. In
Proceedings of the 2014 IEEE Chinese Guidance, Navigation and Control Conference, CGNCC 2014, Yantai, China, 8–10 August
2014; pp. 230–234. [CrossRef]
9.
Guo, F.; Ren, X.; Li, Z.; Han, C. Kalman filter based fault detection of dual motor systems. In Proceedings of the Chinese Control
Conference, CCC 2017, Dalian, China, 26–28 July 2017; pp. 7133–7137. [CrossRef]
10.
Nguang, S.K.; Zhang, P.; Ding, S. Parity relation based fault estimation for nonlinear systems: An LMI approach. IFAC Proc. Vol.
(IFAC-PapersOnline) 2006,6, 366–371. [CrossRef]
11.
Basri, H.M.; Lias, K.; Abidin, W.A.W.Z.; Tay, K.M.; Zen, H. Fault Detection Using Dynamic Parity Space Approach H. In
Proceedings of the 2012 IEEE International PEOCO2012, Melaka, Malaysia, 6–7 June 2012.
Appl. Sci. 2021,11, 3776 15 of 16
12.
Pérez, C.G.; Chanthery, E.; Travé-massuyès, L.; Sotomayor, J. Fault-Driven Minimal Structurally Overdetermined Set in a
Distributed Context To cite this version: HAL Id: Hal-01392572, In Proceedings of the 27th International Workshop on Principles
of Diagnosis: DX-2016, Denver, CO, USA, 4–7 October 2016.
13. Pérez-Zuniga, C.; Chantery, E.; Travé-Massuyes, L.; Sotomayor, J.; Artigues, C. Decentralized Diagnosis via Structural Analysis
and Integer Programming. IFAC-PapersOnLine 2018,51, 168–175. [CrossRef]
14.
Garcia-Alvarez, D. Fault detection using Principal Component Analysis (PCA) in a Wastewater Treatment Plant (WWTP). In
Proceedings of the 62-th International Student’s Scientific Conference, San Diego, CA, USA, 17–19 November 2009; p. 6.
15.
Wang, X.; Kruger, U.; Irwin, G.W.; McCullough, G.; McDowell, N. Nonlinear PCA with the local approach for diesel engine fault
detection and diagnosis. IEEE Trans. Control Syst. Technol. 2008,16, 122–129. [CrossRef]
16.
Wang, X.; Liu, X.; Japkowicz, N.; Matwin, S. Ensemble of multiple kernel SVM classifiers. In Lecture Notes in Computer Science
(Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer International Publishing:
Cham, Switzerland, 2014; Volume 8436, pp. 239–250. [CrossRef]
17.
Mohd Amiruddin, A.A.A.; Zabiri, H.; Taqvi, S.A.A.; Tufa, L.D. Neural network applications in fault diagnosis and detection: An
overview of implementations in engineering-related systems. Neural Comput. Appl. 2020,32, 447–472. [CrossRef]
18.
Hoang, D.T.; Kang, H.J. A survey on Deep Learning based bearing fault diagnosis. Neurocomputing
2019
,335, 327–335. [CrossRef]
19.
Mandal, S.; Santhi, B.; Sridhar, S.; Vinolia, K.; Swaminathan, P. Nuclear Power Plant Thermocouple Sensor-Fault Detection
and Classification Using Deep Learning and Generalized Likelihood Ratio Test. IEEE Trans. Nuclear Sci.
2017
,64, 1526–1534.
[CrossRef]
20.
Liang, T.; Wu, S.; Duan, W.; Zhang, R. Bearing fault diagnosis based on improved convolutional deep belief network. IOP Conf.
Ser. J. Phys. 2018. [CrossRef]
21.
Jia, F.; Lei, Y.; Lin, J.; Zhou, X.; Lu, N. Deep neural networks: A promising tool for fault characteristic mining and intelligent
diagnosis of rotating machinery with massive data. Mech. Syst. Signal Process. 2016,72–73, 303–315. [CrossRef]
22.
Wei, Z.; Gaoliang, P.; Chuanhao, L. Bearings Fault Diagnosis Based on Convolutional Neural Networks with 2- D Representation
of Vibration Signals as Input. MATEC Web Conf. 2017,13001, 1–5.
23.
Zhang, B.; Li, W.; Hao, J.; Li, X.L.; Zhang, M. Adversarial adaptive 1-D convolutional neural networks for bearing fault diagnosis
under varying working condition. arXiv 2018, arXiv:1805.00778.
24.
Guo, L.; Li, N.; Jia, F.; Lei, Y.; Lin, J. A recurrent neural network based health indicator for remaining useful life prediction of
bearings. Neurocomputing 2017,240, 98–109. [CrossRef]
25.
Abed, W.; Sharma, S.; Sutton, R.; Motwani, A. A Robust Bearing Fault Detection and Diagnosis Technique for Brushless DC
Motors Under Non-stationary Operating Conditions. J. Control Autom. Electr. Syst. 2015,26, 241–254. [CrossRef]
26.
Zhang, C.; Xu, L.; Li, X.; Wang, H. A Method of Fault Diagnosis for Rotary Equipment Based on Deep Learning. In Proceedings
of the 2018 Prognostics and System Health Management Conference, PHM-Chongqing 2018, Chongqing, China, 26–28 October
2018; pp. 958–962. [CrossRef]
27.
Frank, S.; Heaney, M.; Jin, X.; Robertson, J.; Cheung, H.; Elmore, R.; Henze, G.P. Hybrid Model-Based and Data-Driven Fault
Detection and Diagnostics for Commercial Buildings. In Proceedings of the ACEEE Summer Study on Energy Efficiency in
Buildings, Pacific Grove, CA, USA, 21–26 August 2016; pp. 12.1–12.14.
28.
Jung, D.; Ng, K.Y.; Frisk, E.; Krysander, M. A combined diagnosis system design using model-based and data-driven methods. In
Proceedings of the Conference on Control and Fault-Tolerant Systems, SysTol, Barcelona, Spain, 7–9 September 2016; pp. 177–182.
[CrossRef]
29.
Jung, D.; Ng, K.Y.; Frisk, E.; Krysander, M. Combining model-based diagnosis and data-driven anomaly classifiers for fault
isolation. Control Eng. Pract. 2018,80, 146–156. [CrossRef]
30.
Jung, D.; Sundstrom, C. A Combined Data-Driven and Model-Based Residual Selection Algorithm for Fault Detection and
Isolation. IEEE Trans. Control Syst. Technol. 2019,27, 616–630. [CrossRef]
31.
Jiang, N.; Hu, X.; Li, N. Graphical temporal semi-supervised deep learning–based principal fault localization in wind turbine
systems. Proc. Inst. Mech. Eng. Part I J. Syst. Control Eng. 2020,234, 985–999. [CrossRef]
32.
Li, D.; Wang, Y.; Wang, J.; Wang, C.; Duan, Y. Recent advances in sensor fault diagnosis: A review. Sens. Actuators A Phys.
2020
,
309, 111990. [CrossRef]
33.
Chen, R.T.Q.; Rubanova, Y.; Bettencourt, J.; Duvenaud, D. Neural Ordinary Differential Equations. arXiv
2018
, arXiv:1806.07366.
34.
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Computer Society
Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [CrossRef]
35. Deng, Z.; Nawhal, M.; Meng, L.; Mori, G. Continuous Graph Flow. arXiv 2019, arXiv:908.02436.
36. Greydanus, S.; Dzamba, M.; Yosinski, J. Hamiltonian Neural Networks. arXiv 2019, arXiv:1906.01563v3.
37.
Tzen, B.; Raginsky, M. Neural Stochastic Differential Equations: Deep Latent Gaussian Models in the Diffusion Limit. arXiv
2019
,
arXiv:1905.09883v2.
38.
Poli, M.; Massaroli, S.; Park, J.; Yamashita, A.; Asama, H.; Park, J. Neural ordinary differential equations. In Proceedings of the
Advances in Neural Information Processing Systems, Virtual, 6–12 December 2020; pp. 6571–6583.
39.
Ruthotto, L.; Haber, E. Deep Neural Networks Motivated by Partial Differential Equations. J. Math. Imaging Vis.
2020
,62, 352–364.
[CrossRef]
Appl. Sci. 2021,11, 3776 16 of 16
40.
Johansson, K.H. The quadruple-tank process: A multivariable laboratory process with an adjustable zero. IEEE Trans. Control
Syst. Technol. 2000,8, 456–465. [CrossRef]
41.
Sotomayor-Moriano, J.; Pérez-Zúñiga, G.; Soto, M. A Virtual Laboratory Environment for Control Design of a Multivariable
Process. IFAC PapersOnLine 2019,52, 218–223. [CrossRef]
42.
Sánchez-Zurita, V.; Pérez-Zúñiga, G.; Sotomayor-Moriano, J. Reconfigurable Model Predictive Control applied to the Quadruple
Tank Process. In Proceedings of the 15th European Workshop on Advanced Control and Diagnosis, Palazzo Grassi, Bologna,
21–22 November 2019.
43.
Jager, G.; Zug, S.; Brade, T.; Dietrich, A.; Steup, C.; Moewes, C.; Cretu, A.M. Assessing neural networks for sensor fault detection.
In Proceedings of the CIVEMSA 2014—2014 IEEE Conference on Computational Intelligence and Virtual Environments for
Measurement Systems and Applications, Ottawa, ON, Canada, 5–7 May 2014; pp. 70–75. [CrossRef]
44.
Shao, H.; Jiang, H.; Zhao, H.; Wang, F. A novel deep autoencoder feature learning method for rotating machinery fault diagnosis.
Mech. Syst. Signal Process. 2017,95, 187–204. [CrossRef]
45.
Wen, Q.; Sun, L.; Song, X.; Gao, J.; Wang, X.; Xu, H. Time Series Data Augmentation for Deep Learning: A Survey. arXiv
2020
,
arXiv:2002.12478.
... On the other hand, FDI of plant components could use data-based methods that exploit available experimental (historical) data [22]; model-based methods that are based on comparing the actual outputs of the monitored system with the outputs obtained from an analytical mathematical model [23]; and knowledge-based methods which generally use the relationships between faults and symptoms in the form of rules [24]. Particularly, FDI in large scale distributed plants is a difficult task due not only to the large amount of components, but also to its interconnected subsystems with coupled variables between them [25], here by using model-based approaches, a centralized or a distributed architecture in the design of the FDI system may be considered. ...
Article
Full-text available
Fault detection and isolation (FDI) in oil pipeline systems (OPS) is a very critical issue because faults in these systems such as leaks or equipment malfunctions may cause significant safety accidents and economic losses. These are the challenging factors, along with the environmental regulations for developing efficient FDI approaches for OPS. This paper proposes a model-based distributed FDI approach, which uses a structural model of the system in conjunction with algorithms to generate diagnostic tests that may be implemented in local diagnosers along the OPS. The proposed approach allows detection and isolation of faults in pipeline sections (pipeline segments), pump stations, as well as process control equipment. In this way, simulation of the obtained diagnostic tests in a benchmark application shows that all faults of interest (pipeline segment faults and sensor faults) are detected and isolated.
Article
In recent years, the increasing complexity and diversity of data-based fault detection and isolation (FDI) methods usually require high computational efforts in the pre-processing stage, large amounts of data, and, most of the time, some feature extraction to obtain relevant information for the data-based algorithms. This paper proposes using the Neural Ordinary Differential Equations (NODE) framework to represent the dynamics of the studied plant and later employ such representation in FDI system design. Such an approach enables loss optimization to be performed jointly in the plant dynamics and external inputs without previous use of complex pre-processing and is useful for working with nonlinear systems. The approach is first validated using a simulated Unmanned Aerial Vehicle (UAV) and later applied to a data-set that contains actuators and sensors faults. Ultimately, the proposed approach is compared with other usual machine learning techniques, showing better performance metrics.
Conference Paper
Full-text available
Deep learning performs remarkably well on many time series analysis tasks recently. The superior performance of deep neural networks relies heavily on a large number of training data to avoid overfitting. However, the labeled data of many real-world time series applications may be limited such as classification in medical time series and anomaly detection in AIOps. As an effective way to enhance the size and quality of the training data, data augmentation is crucial to the successful application of deep learning models on time series data. In this paper, we systematically review different data augmentation methods for time series. We propose a taxonomy for the reviewed methods, and then provide a structured review for these methods by highlighting their strengths and limitations. We also empirically compare different data augmentation methods for different tasks including time series classification, anomaly detection, and forecasting. Finally, we discuss and highlight five future directions to provide useful research guidance.
Article
Full-text available
Currently, the use of industrial seawater reverse osmosis desalination (ISROD) plants has increased in popularity in light of the growing global demand for freshwater. In ISROD plants, any fault in the components of their control systems can lead to a plant malfunction, and this condition can originate safety risks, energy waste, as well as affect the quality of freshwater. This paper addresses the design of a fault detection and isolation (FDI) system based on a structural analysis approach for an ISROD plant located in Lima (Peru). Structural analysis allows obtaining a plant model, which is useful to generate diagnostic tests. Here, diagnostic tests via fault-driven minimal structurally overdetermined (FMSO) sets are computed, and then, binary integer linear programming (BILP) is used to select the FMSO sets that guarantee isolation. Simulations shows that all the faults of interest (sensors and actuators faults) are detected and isolated according to the proposed design.
Article
Full-text available
This paper describes the development of a virtual laboratory environment (VLE) that allows students to perform control design practice in a virtual plant from remote locations through a web browser. The proposed VLE facilitates to learn concepts; such as, design of controllers and system identification of multivariable processes using a simulation environment, and an industrial device with a reliable model of a benchmark plant. Architecture of the VLE is explained and evidence of its use is showed. The proposed VLE represents an education tool that is user friendly, wide availability, with graphical interface capabilities and low cost maintenance, that allows to improve student skills by connecting the theory and practice.
Article
Full-text available
The use of artificial neural networks (ANN) in fault detection analysis is widespread. This paper aims to provide an overview on its application in the field of fault identification and diagnosis (FID), as well as the guiding elements behind their successful implementations in engineering-related applications. In most of the reviewed studies, the ANN architecture of choice for FID problem-solving is the multilayer perceptron (MLP). This is likely due to its simplicity, flexibility, and established usage. Its use managed to find footing in a variety of fields in engineering very early on, even before the technology was as polished as it is today. Recurrent neural networks, while having overall stronger potential for solving dynamic problems, are only suggested for use after a simpler implementation in MLP was attempted. Across various ANN applications in FID, it is observed that preprocessing of the inputs is extremely important in obtaining the proper features for use in training the network, particularly when signal analysis is involved. Normalization is practically a standard for ANN use, and likely many other decision-based learning methods due to its ease of use and high impact on speed of convergence. A simple demonstration of ANN’s ease of use in solving a unique FID problem was also shown.
Article
Full-text available
Traditional intelligent fault diagnosis works well when the labeled training data (source domain) and unlabeled testing data (target domain) are drawn from the same distribution. However, in many real-world applications, the working conditions can vary between training and testing time. In this paper, we address the issues of intelligent fault diagnosis when the data at training and testing time do not come from the same distribution as a domain adaptation problem using Domain Adaptive Convolutional Neural Networks or DACNN. Our proposed DACNN consists of three parts: a source feature extractor, a target feature extractor, and a label classifier. We adopt a two-stage training process to obtain strong faultdiscriminative and domain-invariant capacity. First, we obtain fault-discriminative features by pre-training the source feature extractor with labeled source training examples to minimize the label classifier error. Then, in the domain adaptive fine-tuning stage, we train the target feature extractor to minimize the squared Maximum Mean Discrepancy (MMD) between the output of the source and target feature extractor, such that the instances sampled from the source and target domains have similar distributions after the mapping. Furthermore, to enable training efficiency in domain adaptation, the layers between the source and target feature extractors in our DACNN are partially untied during the training stage. Experiments on the bearing and gearbox fault data showed that DACNN can achieve high fault diagnosis precision and recall under different working conditions, outperforming other intelligent fault diagnosis methods. We also demonstrate the ability to visualize the learned features and the networks to better understand the reasons behind the remarkable performance of our proposed model.
Article
Full-text available
Centralized fault diagnosis architectures are sometimes prohibitive for large-scale interconnected systems such as distribution systems, telecommunication networks, water distribution networks, fluid power systems. This paper presents a decentralized fault diagnosis method for continuous systems that only requires the knowledge of local models and limited knowledge of their neighboring subsystems. The notion of Fault-Driven Minimal Structurally Overdetermined (FMSO) set is used as the corner stone of the design of residual generators for the design of decentralized fault diagnosis for systems that have constraints of confidentiality, distance or limited access to some information. Binary integer linear programming (BILP) is used to optimize the choice of FMSO sets in each local subsystem.
Article
As an essential component of data acquisition systems, sensors have been widely used, especially in industrial and agricultural sectors. However, sensors are also prone to faults due to their harsh working environment. Therefore, the early identification of sensor faults is critical for making corrective actions to mitigate the impact. This paper provides a comprehensive review on the contemporary fault diagnosis techniques and helps researchers and practitioners to understand the current state of the art development in this emerging field. The paper introduces the common fault types and causes in sensors, and different types’ methods for fault diagnosis used in industry and agriculture sectors. It discusses the advantages and disadvantages of these methods, highlights the current challenges, and offers recommendations for future research directions.
Article
Principal fault localization of the faults chain, as a branch of fault diagnosis in wind turbine system, has been an essential problem to ensure the reliability and security in the real wind farms recently. It can be solved by machine learning techniques with historical data labeled with principal faults. However, most real data are unlabeled, since the labeled is expensive to obtain, which increases the difficulty to localize the principal fault if just using unlabeled data and few labeled data. So, in this article, a novel approach using unlabeled data is proposed for principal fault localization of the faults chain in wind turbine systems. First, a deep learning model, stacked sparse autoencoders, is introduced to learn and extract high-level features from data. Then, we present a graphical temporal semi-supervised learning algorithm to develop the pseudo-labeled data set with an unlabeled data set. Considering the temporal correlation of wind power data, we add a time weight vector and apply the cosine-similarity in the proposed algorithm. Finally, based on the pseudo-labeled data set, a classifier model is built and trained for the principal fault localization of the faults chain. The proposed approach is verified by the real buffer data set collected from two wind farms in China, and the experimental results show its effectiveness in practice.
Article
Nowadays, Deep Learning is the most attractive research trend in the area of Machine Learning. With the ability of learning features from raw data by deep architectures with many layers of non-linear data processing units, Deep Learning has become a promising tool for intelligent bearing fault diagnosis. This survey paper intends to provide a systematic review of Deep Learning based bearing fault diagnosis. The three popular Deep Learning algorithms for bearing fault diagnosis including Autoencoder, Restricted Boltzmann Machine, and Convolutional Neural Network are briefly introduced. And their applications are reviewed through publications and research works on the area of bearing fault diagnosis. Further applications and challenges in this research area are also discussed.