International Journal of the Computer, the Internet and Management, Vol. 16, No. 2 (May-August 2008), pp. 1-8
Improved Neural Network Performance Using Principal Component Analysis on Matlab

Junita Mohamad-Saleh
Senior Lecturer, School of Electrical and Electronic Engineering, Universiti Sains Malaysia, Engineering Campus, 14300 Nibong Tebal, Seberang Perai Selatan, Malaysia.
E-mail: Hjms@eng.usm.my

Brian S. Hoyle
Professor in Vision Systems, School of Electronic & Electrical Engineering, University of Leeds, LS2 9JT, United Kingdom.
E-mail: b.s.hoyle@leeds.ac.uk
Abstract

Most real-world data samples used to train artificial neural networks (ANNs) contain correlated information caused by overlapping input instances. Correlation in sampled data normally confuses an ANN during the learning process and thus degrades its generalization capability. This paper proposes the Principal Component Analysis (PCA) method for eliminating correlated information in data. Since it is well known that Electrical Capacitance Tomography (ECT) data are highly correlated due to overlapping sensing areas, the PCA technique has been examined on ECT data for oil fraction estimation from gas-oil flows. After application of PCA, the uncorrelated ECT data were used to train a Multi-Layer Perceptron (MLP) ANN system. The trained MLP was then tested on unseen ECT data. The results demonstrate that eliminating correlated information in the sample data by way of the PCA method improved the MLP's estimation performance and reduced the training time.
Keywords: Principal component analysis, multi-layer perceptron, tomography, fraction estimation, Matlab.
1. Introduction

An ANN is a system consisting of processing elements (PEs) with links between them. A particular arrangement of PEs and links produces a particular ANN model, suited to particular tasks. A Multi-Layer Perceptron (MLP) is a kind of feed-forward ANN model (i.e. its links run in the forward direction only), consisting of three adjacent layers: the input, hidden and output layers [1]. Each layer has several PEs. Figure 1 illustrates the structure of an MLP.
MLPs learn from input-output samples to become 'clever', i.e. capable of giving outputs for inputs they have not seen before. The learning process employs a learning algorithm, during which the MLP develops a mapping function between the inputs and outputs. Basically, in a learning process, the input PEs receive data from the external environment (denoted by x1, x2, ..., xn in Figure 1) and pass them to the hidden PEs, which are responsible for simple yet useful mathematical computations involving the weights of the links (denoted by w11, w21, ... in the figure) and the input values. The results from the hidden PEs are mapped onto the appropriate threshold function of each PE and the outputs are produced. These output values then become inputs to all PEs in the adjacent layer (either a second hidden layer or the output layer), and the computation is repeated throughout the layers until, finally, output values are produced at the output PEs (denoted by y1, y2, ... in Figure 1). At this stage, an output error value is calculated as the difference between the MLP's outputs and the actual outputs. The entire training process is iterative in nature, and stops when an acceptably small error is achieved. On completion of a learning process, the MLP should be able to give output solution(s) for any given set of input data based on the generalized mapping it has developed.
Figure 1 - A schematic diagram of a Multi-Layer Perceptron (MLP) neural network.
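To make the forward computation concrete, the following minimal Matlab sketch (ours, not from the paper; all sizes and values are illustrative) propagates one input vector through a single-hidden-layer MLP with a sigmoid threshold function:

    % Illustrative forward pass through a one-hidden-layer MLP.
    x  = [0.2; 0.7; 0.1];               % inputs x1, x2, x3
    W1 = rand(4, 3);  b1 = rand(4, 1);  % hidden-layer weights and biases
    W2 = rand(1, 4);  b2 = rand(1, 1);  % output-layer weights and bias
    sigmoid = @(a) 1 ./ (1 + exp(-a));  % threshold function
    h = sigmoid(W1*x + b1);             % hidden PE outputs
    y = W2*h + b2;                      % network output y1
    t = 0.5;                            % example target output
    err = t - y;                        % output error used during training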
The performance of an MLP very much depends on its generalization capability, which in turn depends on the data representation. One important characteristic of a good data representation is that it is uncorrelated; in other words, a set of data presented to an MLP should not contain correlated information. Correlated data reduce the distinctiveness of the data representation and thus introduce confusion to the MLP model during the learning process, producing a model with low generalization capability on unseen data [2]. This suggests a need to eliminate correlation in the sample data before they are presented to an MLP. This can be achieved by applying the Principal Component Analysis (PCA) technique [3] to the input data sets prior to the MLP training process as well as at the interpretation stage. This is the technique examined in this research.
The PCA technique was first introduced by Karl Pearson in 1901, but he did not propose a practical calculation method for two or more variables. It was not until the 1930s that calculation methods involving two or more variables were described. Basically, the PCA technique consists of finding linear transformations y1, y2, y3, ..., yp of the original components x1, x2, x3, ..., xp that have the property of being uncorrelated. In other words, the y components are chosen in such a way that y1 has maximum variance, y2 has maximum variance subject to being uncorrelated with y1, and so forth. The first step in the PCA algorithm is to normalize the components so that they have zero mean and unit variance. Then, an orthogonalization method is used to compute the principal components of the normalized components. The PCA method has also been widely applied in other published work involving ANNs as a means of reducing the dimensionality of the input space [4-5].
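As a concrete illustration of these two steps, the sketch below (ours, using only core Matlab functions; variable names anticipate those of Section 3) normalizes a data matrix and obtains uncorrelated components through the singular value decomposition:

    % X: p-by-q data matrix, one variable per row, one sample per column.
    X = rand(5, 100);                        % placeholder data
    q = size(X, 2);
    N = (X - repmat(mean(X,2), 1, q)) ./ repmat(std(X,0,2), 1, q);
    [U, S, V] = svd(N*N'/q);                 % orthogonalization step
    TransMat = U';                           % rows = principal directions
    Ntrans   = TransMat * N;                 % uncorrelated components
    pcShare  = diag(S) / sum(diag(S));       % each component's share of variance

The components in Ntrans are ordered by decreasing variance, so discarding low-variance rows reduces the input dimensionality while keeping the dominant uncorrelated information.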
To study the effects of the PCA technique on MLP performance, Electrical Capacitance Tomography (ECT) data were chosen.
2. Electrical Capacitance Tomography: Problem

ECT is a technique used to obtain the internal spatial and temporal distribution of materials within a cross-section of process equipment, based on electrostatic field theory [6]. A schematic diagram of the ECT sensor is shown in Figure 2. The numbers denote the electrode sensors.
Practically, a change in the distribution of materials within the sensing area produces a change in the capacitance measurements between two electrode sensors [7]. The change is sensed by the data acquisition unit, which is responsible for obtaining the changes in capacitance readings between all possible pairs of primary electrodes for various material distributions.

Figure 2 - Cross-sectional diagram of the ECT sensor model used in this research.
Raw ECT data contain correlated information caused by the overlapping of the sensing regions of several electrode pairs. Figure 3 illustrates the overlap among several sensing regions each time capacitance measurements are made between primary electrode 1 and all other electrodes of a 12-electrode ECT sensor.

The lines from each end of electrode 1 to the ends of all other electrodes show the sensing regions between electrode 1 and the other electrodes. Schematically, it can be seen that almost all sensing regions involve overlapping electric field lines. This phenomenon contributes correlated information to the changes in the capacitance measurements of ECT data. Therefore, the ECT data must be pre-processed to eliminate the correlation before they can be used by an MLP.
Figure 3 – A schematic diagram showing the overlapping sensing regions between electrode 1 and all other electrodes.
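This correlation can be inspected directly. Assuming C holds the capacitance measurements of a 12-electrode sensor (66 electrode-pair readings per flow pattern, one pattern per column — our assumed layout, not stated in the paper), a short core-Matlab check is:

    % C: 66-by-M matrix of ECT measurements (66 electrode pairs).
    R = corrcoef(C');                  % correlation between measurement pairs
    imagesc(abs(R)); colorbar;         % strong off-diagonal entries reveal
    title('Correlation of ECT pairs'); % correlated information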
3. Experimental Methods

The effects of applying the PCA method to an MLP system were investigated for the estimation of oil fraction from gas-oil flows based on ECT data. Oil fraction refers to the ratio of the cross-sectional area covered by oil to the entire cross-sectional area of the pipe. For ease of evaluation, the fraction has been normalized to the range 0 to 1, where 0 means no oil and 1 means that the cross-section is full of oil. For the investigation, simulated ECT data corresponding to various gas-oil flows were generated using an ECT simulator. The data were then divided into three datasets: training, validation and test. The training set was used to train the MLP, the validation set was used for early stopping of the training process, and the test set was used to evaluate the MLP's performance after completion of the training process.
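A minimal sketch of such a split (the actual proportions are not stated in the paper; 60/20/20 here is only an assumption):

    M   = size(C, 2);                          % number of simulated patterns
    idx = randperm(M);                         % random shuffle
    nTr = round(0.6*M);  nVa = round(0.2*M);   % assumed proportions
    Ctrain = C(:, idx(1:nTr));                 % training set
    Cval   = C(:, idx(nTr+1:nTr+nVa));         % validation (early stopping)
    Ctest  = C(:, idx(nTr+nVa+1:end));         % test set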
Two types of PCA data processors were implemented for this purpose. The first, called the PCA pre-processor, is responsible for pre-processing the raw ECT data to eliminate correlation in the training samples. The second, called the PCA post-processor, is used to transform the validation and test datasets according to their principal components. The implementation and simulation were carried out with the aid of built-in functions supported by the MATLAB® Neural Network Toolbox [8].
3.1 Implementation of PCA Pre-Processor

Recall that the PCA technique uses the SVD method to order the input components in descending order of importance and uncorrelatedness. In this way, the most important and least correlated input components are given higher priority than the less important and highly correlated ones. The use of the PCA function in Matlab involves specifying a fraction value corresponding to the smallest contribution an input component may make. For example, a fraction value of 0.02 means that input components which contribute less than 2% of the total variation in the data set will be discarded. From this point onwards, this fraction value will simply be referred to as the "PC variance".

Before a set of ECT measurements can be used for ANN training, they have to be pre-processed to extract the relevant features from the data. Figure 4 illustrates the PCA data pre-processing procedure used in this research. The ECT measurements, C (in matrix notation), were first normalized so that they had zero mean and unit variance. The SVD method was then used to compute the principal components from the normalized ECT measurements, N, together with the mean and variance values. This generated a transformation matrix, TransMat, and produced a transformed set of measurements, Ntrans, consisting of orthogonal, or simply uncorrelated, components. The matrix TransMat was stored for later use during the data post-processing stage. The uncorrelated components of matrix Ntrans were ordered according to the magnitude of their variances. They were then passed to an MLP together with their corresponding target output values for a network training process based on a selected PC variance value. Several MLPs were trained using different PC variance values in order to determine the optimum percentage of the total variation in the dataset.
Figure 4 – The stages of PCA data pre-processing: the raw measurements C are normalized (with the mean and variance stored) to give N; the SVD stage then produces the transformation matrix TransMat and the uncorrelated measurements Ntrans, which are passed to the MLP according to the selected PC variance value.
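In the toolbox release cited in [8], these stages correspond to the prestd and prepca functions; the sketch below is our reconstruction of the calls (function names and signatures are those of that era's toolbox, later renamed mapstd and processpca):

    % C: raw ECT training measurements, one sample per column.
    [N, meanC, stdC]   = prestd(C);        % zero mean, unit variance
    [Ntrans, TransMat] = prepca(N, 0.02);  % PC variance 0.02: drop components
                                           % below 2% of the total variation
    % Ntrans feeds the MLP; TransMat, meanC and stdC are kept for
    % post-processing the validation and test data.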
3.2 Implementation of PCA Post-Processor

During each training process, an MLP's validation and generalization performances on the sets of validation and test patterns were assessed. Each vector of the validation or test ECT data has to be post-processed using the post-PCA technique before it can be used by a trained ANN to estimate a flow parameter. See Figure 5 for an illustration of the post-processing procedure.

As in the pre-processing procedure, the validation or test ECT data, C_val/test, were first normalized so that they had zero mean and unit variance. Then, the normalized measurements, N_val/test, were post-processed using the transformation matrix, TransMat (obtained during the pre-processing stage), to produce a new transformed matrix, Ntrans_val/test, consisting of uncorrelated components. Using the PC variance value currently in use by the MLP undergoing training, and the uncorrelated ECT measurements, Ntrans_val/test, a reduced set of uncorrelated ECT measurements was generated.
Figure 5 – The stages of PCA data post-processing: C_val/test is normalized to N_val/test, transformed with the stored TransMat to give Ntrans_val/test, and passed to the MLP according to the PC variance value.
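Again assuming the same era's toolbox API, the post-processing stage reuses the statistics and transformation stored by the pre-processor (trastd and trapca are the counterparts of prestd and prepca):

    % Cvt: validation or test measurements, one sample per column.
    Nvt      = trastd(Cvt, meanC, stdC);  % normalize with TRAINING statistics
    NtransVt = trapca(Nvt, TransMat);     % project onto the stored components
    % NtransVt is then reduced according to the PC variance value in use
    % and passed to the trained MLP.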
The trained MLP used these reduced uncorrelated measurements, together with the optimum network weights obtained from the training process, to estimate the oil fraction of gas-oil flows from unseen ECT data.
3.3 Training, Testing and Selecting MLPs

In this investigation, all MLPs were trained and tested with the same sets of training and test patterns, respectively. For each PC variance value, several MLPs were trained with the uncorrelated training data (Ntrans) and the corresponding oil fraction output values, using the Bayesian Regularization training algorithm. The reason for training several MLPs was to obtain the optimum MLP structure, as well as to examine the optimum PC variance value for the task. Optimum structure in this case means the optimum number of neurons, or processing elements, an MLP should have in its hidden layer in order to perform well. This criterion is important for producing an MLP with the best generalization capability. The optimum PC variance value determines the optimum number of uncorrelated input values each set of data should have to facilitate MLP learning. During the training process, the duration of training was recorded.
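A hedged sketch of one such training run, assuming the classic newff/train interface with trainbr for Bayesian Regularization (the paper does not list its exact calls or parameter settings):

    % Ntrans: uncorrelated training inputs; t: target oil fractions (0 to 1).
    nHidden = 10;                          % varied across runs (assumption)
    net = newff(minmax(Ntrans), [nHidden 1], {'tansig','purelin'}, 'trainbr');
    tic;
    net = train(net, Ntrans, t);           % Bayesian Regularization training
    trainTime = toc;                       % training duration, as recorded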
After the MLPs had been trained, their generalization performance on a set of test data was assessed in order to select the best MLP for the estimation of oil fraction. In doing so, the test ECT data were first post-processed using the PCA data post-processor already implemented. After post-processing, a set of reduced uncorrelated test data was produced and fed into the MLPs to obtain the oil fraction values corresponding to each test set. Each MLP's performance was calculated based on the Mean Absolute Error (MAE), given by
MAE = \frac{1}{P} \sum_{i=1}^{P} |T_i - O_i|    (1)
where P is the total number of test patterns, T_i is the actual flow parameter value for the i-th test pattern, and O_i is the MLP's estimate of the flow parameter for the i-th test pattern.
The MAEs of all the MLPs were evaluated, and the MLP which gave the smallest MAE was selected as the best-performing MLP for the task. The selected MLP was then evaluated on unseen data.
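Evaluating Eq. (1) for a trained network is then a one-liner; a sketch assuming the sim() call of the same toolbox era and a row vector T of true fractions:

    O   = sim(net, NtransTest);    % MLP estimates for the P test patterns
    MAE = mean(abs(T - O));        % Eq. (1): mean absolute error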
4. Results and Discussion

The graph in Figure 6 shows the MAE of the MLPs trained with uncorrelated data at the various PC variance values. It can be seen that the MAE values tend to fall with increasing PC variance up to 0.05%. Beyond this point, the MAE values start to increase again and never come back down.
Figure 7 shows the number of input components produced at each PC variance value investigated. It can be observed from the figure that the number of input components falls as the PC variance value increases. This is expected, since a larger PC variance value means that more input components fall below the contribution threshold and are thus eliminated.
Figures 6 and 7 are comparable in terms of the MAE and the number of input components used for training. Comparing the two figures, it can be clearly seen that the MAE falls as the number of input components is reduced, up to the 0.05% PC variance value (corresponding to 27 input components, the point denoted (0.05, 27) in Figure 7), after which the MAEs increase again. This suggests that too many input components introduce too much correlation into the data, causing confusion such that the MLP cannot properly distinguish between the various features in the data (i.e. for PC variances below 0.05%). At the 0.05% PC variance value, with 27 input components, the MLPs gave the lowest MAE of about 0.09%. This suggests that this number of inputs is optimal for the MLP to learn the distinct features in the data and perform a better input/output mapping. Beyond the 0.05% PC variance value, the number of input components becomes too small, creating a lack of information such that the MLP cannot become sufficiently "intelligent" to estimate the oil fraction. The overall results demonstrate that an MLP generalizes better when the number of input components presented to it is optimally sufficient and does not include too many correlated components.
Figure 8 shows the average training time required to train an MLP. It can be seen that the network training time increases exponentially as the number of input components increases.

The results demonstrate that applying the PCA technique to the input data reduces the number of input components and consequently reduces the network training time, since the network deals with fewer weight parameters. This is especially important when training an MLP to solve a complex problem involving a large input dimensionality, or a problem consisting of several thousand training samples. Clearly, the PCA technique is useful for improving the generalization performance of an MLP as well as reducing the network's training time.
Figure 6 - Generalization performance (MAE on test data) of MLPs trained with ECT data at various PC variance values (log scale, 0.001% to 1%).
Figure 7 – The number of input components produced at various PC variance values.

Figure 8 - The average training time (s) of ten MLPs for different PC variance values.
5. Conclusions

This work concerns the use of the PCA technique to eliminate correlation in raw ECT data in order to boost the learning capability and generalization of an MLP system. Two PCA data processors were implemented in the Matlab environment to investigate the effects of eliminating correlation in the ECT measurements for the task of estimating oil fraction from gas-oil flows.

The results have shown that it is feasible to use the PCA technique to eliminate correlation in raw ECT data, resulting in an improved MLP oil fraction estimation capability. Besides boosting the generalization capability of an MLP, the PCA technique also reduces the network training time, owing to the reduction in input space dimensionality. The findings therefore suggest that the PCA data processing method is useful for improving the performance of MLP systems, particularly in solving complex problems involving large numbers of input data.
Acknowledgement

The authors would like to acknowledge the funding provided by Universiti Sains Malaysia for this research work.
References

[1] Haykin S. (1999), Neural Networks: A Comprehensive Foundation, Macmillan College, London.
[2] Bishop C. M. (1994), "Neural networks and their applications", Review of Scientific Instruments, vol. 65, no. 6, pp. 1803-1832.
[3] Jolliffe I. T. (1986), Principal Component Analysis, Springer-Verlag, New York.
[4] Charytoniuk W. and Chen M. S. (2000), "Neural Network Design for Short-term Load Forecasting", Proceedings of the International Conference on Electric Utility Deregulation and Restructuring and Power Technologies, London, pp. 554-561.
[5] Tabe H., Simons S. J. R., Savery J., West R. M. and Williams R. A. (1999), "Modelling of Multiphase Processes Using Tomographic Data for Optimisation and Control", Proceedings of the 1st World Congress on Industrial Process Tomography, April 14-17, Buxton, pp. 84-89.
[6] Beck M. S. and Williams R. A. (1996), "Process Tomography: A European innovation and its applications", Measurement Science and Technology, vol. 7, pp. 215-224.
[7] Xie C. G., Huang S. M., Hoyle B. S., Thorn R., Lenn C., Snowden D. and Beck M. S. (1992), "Electrical capacitance tomography for flow imaging: system model for development of image reconstruction algorithms and design of primary sensors", IEE Proceedings G, vol. 139, no. 1, pp. 89-98.
[8] Demuth H. and Beale M. (1998), MATLAB® Neural Network Toolbox User's Guide, Version 3.0, The MathWorks Inc.