New RBF neural network classifier with optimized hidden neurons number
Larbi Beheim, Adel Zitouni, Fabien Belloir
Laboratoire d’Automatique et Microélectronique
Université de Reims Champagne-Ardenne
Campus du Moulin de la Housse,
B.P. 1039, 51687 Reims Cedex 2
FRANCE
Abstract: - This article presents a noticeable performance improvement of a neural classifier based on an RBF network. Built on the Mahalanobis distance, this new classifier increases the recognition rate while markedly decreasing the number of hidden-layer neurons. We thus obtain a new, very general and very simple RBF classifier, requiring no tuning parameter and presenting an excellent performance/number-of-neurons ratio. A comparative study of its performance is presented and illustrated by examples on artificial and real databases.
Key-Words: - RBF neural networks, Mahalanobis distance, clustering, training algorithms, optimization of the number of hidden neurons, buried tag identification.
1 Introduction
The radial basis function (RBF) neural network has become, in recent years, a serious alternative to the traditional Multi-Layer Perceptron (MLP) network for multidimensional approximation problems. RBF networks were already in use in the seventies under the name of potential functions, and it was only later that [1] and [2] rediscovered this particular structure in neural form. Since then, this type of network has benefited from many theoretical studies, such as [3], [4] and [5]. In pattern recognition, the RBF network is very attractive because of its locality property, which makes it possible to discriminate complex classes such as nonconvex ones.
We consider in this article the Gaussian RBF classifier, whose m outputs s_j are evaluated according to the following formula:
s_j(X) = \sum_{l=1}^{N_h} w_{lj} \varphi_l(X) = \sum_{l=1}^{N_h} w_{lj} \exp\left( -\frac{1}{2} (X - C_l)^T \Sigma_l^{-1} (X - C_l) \right)    (1)
where X = [x_1 … x_n]^T ∈ R^n is a prototype to be classified and N_h is the total number of hidden neurons. Each of these nonlinear neurons is characterized by a center C_l ∈ R^n and a covariance matrix Σ_l.
Given a training set S_train = {X_p, ω_p}, p = 1…N, made up of couples of a prototype X_p and its membership class ω_p ∈ {1,…,m}, the supervised training of the RBF classifier amounts to determining its structure, i.e. the number of hidden neurons N_h and the different parameters appearing in the output equation (1).
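As a concrete reading of (1), here is a minimal sketch in Python/NumPy; the function and variable names (rbf_outputs, centers, covs, W) are illustrative assumptions of ours, not taken from the paper.

import numpy as np

def rbf_outputs(X, centers, covs, W):
    """Evaluate the m outputs s_j(X) of the Gaussian RBF classifier (1).

    X:       (n,) input prototype
    centers: list of N_h center vectors C_l
    covs:    list of N_h covariance matrices Sigma_l
    W:       (N_h, m) weight matrix
    """
    phi = np.empty(len(centers))
    for l, (C, S) in enumerate(zip(centers, covs)):
        d = X - C
        # Squared Mahalanobis distance (X - C_l)^T Sigma_l^{-1} (X - C_l)
        phi[l] = np.exp(-0.5 * (d @ np.linalg.solve(S, d)))
    return phi @ W  # vector of s_j(X), j = 1..m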
While these parameters can be computed by different heuristics, the estimation of N_h is often delicate. To this end, many methods have been developed, among which we can cite [6], [7] and [8]. Whether basic or very sophisticated, these methods generally require a very significant computational load without, however, guaranteeing significant performance. Moreover, they often require a certain number of parameters that must be fixed a priori and optimized for a particular problem. These methods therefore cannot be applied systematically, and without particular precautions, to any type of classification problem. The article [9] proposed a very simple algorithm which automatically generates a powerful RBF network without any optimization or introduction of parameters fixed a priori. Indeed, the algorithm automatically selects the number of hidden-layer neurons. Although this network is characterized by its great simplicity, it nevertheless presents a major limitation, owing to the fact that it requires a rather large number of neurons in the hidden layer. This limitation makes it very heavy, requiring very long training times for very large databases.
In this article we propose a solution to this problem by introducing the Mahalanobis distance. We thus obtain a new, very general and very simple RBF network presenting an excellent performance/number-of-neurons ratio.
The organization of the article is as follows: in section 2, we describe the construction principle of the new RBF network and present the associated algorithm. Its operation is illustrated on an example with an artificial database. In section 3, we study its performance on both artificial and real problems.
2 Algorithm
In this section, we describe the proposed construction principle as well as the algorithm allowing its implementation. We then illustrate its operation on a classification problem involving two classes, one of which is nonconvex.
2.1 Principle
The principle of the algorithm rests on [9]. Owing to the exponential nature of the functions φ_l(.) of each hidden neuron, the activation state of each of them decreases quickly as the input vector X moves away from the neuron center. Hence only a region of the input space centered at C_l provides a significantly non-null activation state. Contrary to [9], our algorithm regards this region as a hyperellipsoid centered at C_l. Indeed, the use of the Mahalanobis distance makes it possible to take into account the statistical distribution of the prototypes around the centers C_l and thus to better represent the shape of the classes. Our algorithm proposes to divide a nonconvex class into a set of hyperellipsoids called clusters. Each cluster corresponds to a hidden neuron and is thus characterized by a center placing it in the input space, a covariance matrix indicating the privileged directions, and a width determining the extension of the hyperellipsoid. In what follows, we no longer distinguish between a neuron and a cluster.
2.2 Description
Before describing the construction algorithm of the RBF classifier, we introduce some notations used hereafter. At the k-th iteration, we define C^(k)_ij as the i-th center (i = 1…m^(k)_j) characterizing the class Ω_j. With each center are associated a covariance matrix Σ^(k)_ij and a width L^(k)_ij. We denote by H^(k)_ij the hyperellipsoid of center C^(k)_ij such that:
H^{(k)}_{ij} = \left\{ X_p \in \Omega_j : (X_p - C^{(k)}_{ij})^T (\Sigma^{(k)}_{ij})^{-1} (X_p - C^{(k)}_{ij}) < L^{(k)}_{ij} \right\}    (2)
Each class is characterized by the region R^(k)_j defined as the union of all the hyperellipsoids H^(k)_ij (i = 1…m^(k)_j). We will also use the distance d(R^(k)_j, X) between a point X ∈ Ω_j and its associated region R^(k)_j. It is defined as the Mahalanobis distance between X and the nearest center C^(k)_ij of R^(k)_j:
d(R^{(k)}_j, X) = \min_{i = 1 \ldots m^{(k)}_j} \; (X - C^{(k)}_{ij})^T (\Sigma^{(k)}_{ij})^{-1} (X - C^{(k)}_{ij})    (3)
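The following short sketch implements the membership test of (2) and the region distance of (3). It assumes NumPy and per-class lists of centers, covariances and widths; the helper names are ours, not the paper's.

import numpy as np

def mahalanobis_sq(X, C, S):
    # Squared Mahalanobis distance (X - C)^T S^{-1} (X - C)
    d = X - C
    return d @ np.linalg.solve(S, d)

def in_region(X, centers_j, covs_j, widths_j):
    # Membership in R_j: X lies inside at least one hyperellipsoid H_ij of (2)
    return any(mahalanobis_sq(X, C, S) < L
               for C, S, L in zip(centers_j, covs_j, widths_j))

def region_distance(X, centers_j, covs_j):
    # d(R_j, X) of (3): minimum over the centers of the class
    return min(mahalanobis_sq(X, C, S) for C, S in zip(centers_j, covs_j))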
Step 0 (initialization): For k = 0, we define m clusters whose centers are the barycenters of the different classes Ω_j (N_j is the number of elements of Ω_j):

m^{(0)}_j = 1 \quad \text{and} \quad C^{(0)}_{1j} = \frac{1}{N_j} \sum_{X_p \in \Omega_j} X_p, \qquad j = 1 \ldots m    (4)

\Sigma^{(0)}_{1j} = \frac{1}{N_j - 1} \sum_{X_p \in \Omega_j} (X_p - C^{(0)}_{1j})(X_p - C^{(0)}_{1j})^T, \qquad j = 1 \ldots m    (5)
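A minimal sketch of this initialization step, assuming NumPy arrays and classes labeled 0…m-1 (an indexing assumption of ours):

import numpy as np

def initialize(X_train, y_train, m):
    # One cluster per class: barycenter (4) and sample covariance (5).
    # np.cov with rowvar=False uses the 1/(N_j - 1) normalization of (5).
    centers = {j: [X_train[y_train == j].mean(axis=0)] for j in range(m)}
    covs = {j: [np.cov(X_train[y_train == j], rowvar=False)] for j in range(m)}
    return centers, covs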
Step 1 (adjustment of the widths): The width L^(k)_ij associated with the center C^(k)_ij is defined as half the Mahalanobis distance between this center and the nearest center of another class:

L^{(k)}_{ij} = \frac{1}{2} \min_{t \neq j, \; s = 1 \ldots m^{(k)}_t} (C^{(k)}_{ij} - C^{(k)}_{st})^T (\Sigma^{(k)}_{ij})^{-1} (C^{(k)}_{ij} - C^{(k)}_{st}), \qquad i = 1 \ldots m^{(k)}_j, \; j = 1 \ldots m    (6)
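One possible reading of (6) in code, reusing the mahalanobis_sq helper above (our naming, and assuming the centers/covs dictionaries of the earlier sketches):

def adjust_widths(centers, covs):
    # L_ij is half the Mahalanobis distance from C_ij to the nearest
    # center belonging to a different class, per (6)
    widths = {}
    for j in centers:
        others = [C_st for t in centers if t != j for C_st in centers[t]]
        widths[j] = [0.5 * min(mahalanobis_sq(C, C_st, S) for C_st in others)
                     for C, S in zip(centers[j], covs[j])]
    return widths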
Step 2 (search for an orphan point): We seek a point X_i ∈ S_train not belonging to its associated region R^(k)_{ω_i} and farthest from it:

X_i = \arg\max_{X_s \in S_{train}} \; d(R^{(k)}_{\omega_s}, X_s)    (7)

If no such point exists, go to step 5.
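A sketch of this search, combining the membership test (2) with the distance (3); it returns None when every point is covered (helper names as before, assumed):

def find_orphan(X_train, y_train, centers, covs, widths):
    best, best_d = None, -1.0
    for X, w in zip(X_train, y_train):
        # Skip points already covered by their class region R_w
        if in_region(X, centers[w], covs[w], widths[w]):
            continue
        d = region_distance(X, centers[w], covs[w])
        if d > best_d:
            best, best_d = (X, w), d
    return best  # (orphan point, its class), or None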
Step 3 (creation of a new center): The point X_i found at step 2 becomes a new center of the class j = ω_i:

m^{(k)}_j = m^{(k)}_j + 1, \qquad C^{(k)}_{m^{(k)}_j, j} = X_i    (8)
Step 4 (reorganization of the centers): The K-means clustering algorithm [17] is applied to the points of S_train belonging to the class ω_i in order to distribute the m^(k)_j centers as well as possible. The new covariance matrices are then computed:

\Sigma^{(k)}_{ij} = \frac{1}{Card(H^{(k)}_{ij}) - 1} \sum_{X_p \in H^{(k)}_{ij}} (X_p - C^{(k)}_{ij})(X_p - C^{(k)}_{ij})^T    (9)

Set k = k + 1 and go to step 1.
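A hedged sketch of this reorganization, using scikit-learn's KMeans as one possible implementation of Lloyd's algorithm [17] (the paper does not prescribe a library) and approximating the set H_ij of (9) by the K-means partition of the class:

import numpy as np
from sklearn.cluster import KMeans

def reorganize_class(X_class, centers_j):
    # Redistribute the centers of one class with K-means, seeded with the
    # current centers, then re-estimate each cluster covariance as in (9).
    # Assumes each cluster keeps at least two points so np.cov is defined.
    init = np.vstack(centers_j)
    km = KMeans(n_clusters=len(centers_j), init=init, n_init=1).fit(X_class)
    new_centers = list(km.cluster_centers_)
    new_covs = [np.cov(X_class[km.labels_ == i], rowvar=False)
                for i in range(len(new_centers))]
    return new_centers, new_covs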
Step 5 (determination of the weights): The weight matrix W* which minimizes an error function, chosen here as the sum of squared classification errors, is given by:

W^* = (H^T H)^{-1} H^T T    (10)

where H and T are the matrices gathering, respectively, the activation function states and the target outputs. The latter are set to 1 when they correspond to the class of the point and 0 elsewhere.
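In code, (10) is the ordinary least-squares solution. This sketch uses lstsq rather than forming the explicit pseudo-inverse, a standard substitution for numerical stability rather than the paper's literal formula:

import numpy as np

def output_weights(H, T):
    # Least-squares solution of H W = T, equivalent to (H^T H)^{-1} H^T T
    W, *_ = np.linalg.lstsq(H, T, rcond=None)
    return W

def one_hot_targets(y_train, m):
    # Targets fixed at 1 for the class of the point and 0 elsewhere
    T = np.zeros((len(y_train), m))
    T[np.arange(len(y_train)), y_train] = 1.0
    return T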
2.3 Discussion
The initialization of the algorithm (step 0) could have proceeded by randomly placing a given number of centers. This technique is very common when defining an RBF network. Choosing the initial centers as barycenters of the points X_p avoids this unpredictable character and moreover provides the number of these centers. Thus, the result of the algorithm depends only on the composition of the training data. In certain cases where the classes are nonconvex, the barycenter of a class may fall inside another class. This situation is not harmful to the algorithm, since this center will be moved during the following iterations. The covariance matrix corresponding to each center is obtained from the associated training data. We will see further on that other definitions of this matrix can give different results. In step 1, L^(k)_ij is defined relative to the minimal distance between the center C^(k)_ij and the centers of the other classes. This means that a partial overlap between clusters of the same class is allowed. From a practical point of view, this optimizes the occupation of the attribute space by the various zones of receptivity and thus reduces the number of clusters needed to compose each class: the elliptic volume covered by each cluster is maximal without encroaching on neighboring classes. In step 2, choosing the point farthest from the region R^(k)_j improves the effectiveness of the K-means clustering used at step 4. It should be noted that the latter involves only the centers constituting the same class, since the other centers have not changed position. It moreover guarantees a fast growth of this region. In the network training (step 5), the target outputs are arbitrarily fixed at 1 when they correspond to the class of the point and 0 elsewhere. The motivation for this practice is to artificially create a sharp drop of the membership degree at the geometric border of the class.
After k iterations, all the points of S_train belong to a cluster, and the algorithm has generated m+k clusters defining as many subclasses. The RBF network thus built comprises N_h = m+k hidden neurons. We can note that the algorithm necessarily converges: in the "worst case", where none of the classes is separable, a cluster is created for each point of S_train.
2.4 Illustration of operation
We illustrate the significant phases of the algorithm on a classification problem with two concentric classes from the databases of the "ELENA" project [10] [11]. This database makes it possible to determine the capacity of a classifier to separate two classes that do not overlap but of which one is enclosed within the other.
Fig. 1a. Initialization of the algorithm.
Fig. 1b. 1st iteration of the algorithm.
Fig. 1c. 2nd iteration of the algorithm.
Fig. 1d. Result of classification of the algorithm.
The RBF network comprises 2 inputs and 2 outputs. Figure 1a shows the 2 initial centers {C_1, C_2} obtained after step 0. We can see that the two centers almost coincide. Each induced cluster is delimited by an ellipse whose width is computed at step 1. Obviously, the cluster of center C_2 is not sufficient to entirely represent the class Ω_2; it will thus be subdivided into several subclasses. At the first iteration of the algorithm, the point noted X_i in figure 1b is the farthest from the center C_2 and lies outside the corresponding cluster. The addition of a new center at this point leads, after application of K-means, to the new distribution {C_1, C_2, C_3} illustrated in figure 1b. The point X_j is now the farthest from the center C_3 in this figure. After application of K-means to this new configuration, we obtain figure 1c. After 4 iterations, the 2 classes are perfectly discriminated and the neural classifier comprises a total of 5 neurons (see figure 1d). After having determined the number of necessary centers and their positions, the weights of the network are computed according to the equation of step 5.
The algorithm thus manages to separate the two classes with only 5 neurons, against 108 neurons for the old algorithm using the Euclidean distance, and with a slightly higher recognition rate: 98% against 97.75% for the old RBF.
3 Results
The purpose of this section is to evaluate the performance of the RBF classifier built by the algorithm presented in section 2. To this end, we applied the classifier to various classification problems comprising a variable number of attributes and classes, and bearing on synthetic data as well as data from the real world.
3.1 Benchmarks
The benchmarks carried out here are studied in detail in the ELENA project [10]. For each classification problem, we report the results of the RBF classifier generated by the proposed algorithm as well as the performance of some classifiers studied in [11]. These are the "k-nearest neighbor" (kNN) classifier [12], which gives the best approximation of the Bayes recognition error, and the Multi-Layer Perceptron (MLP) classifier, very widespread in connectionist pattern recognition [13]. The Learning Vector Quantization (LVQ) classifier proposed by Kohonen is a simple adaptive vector quantization method. For other types of neural classifiers, see [11] and the references therein. The RBFE classifier is the one proposed in [9], which uses the Euclidean distance.
For each classifier, we compute the average recognition error (in %) on the test set, obtained over 5 different experiments with the "hold-out" method for counting the classification errors. The experimental protocol, which follows that used in the ELENA project, consists in training the classifier on half of the data, then testing its performance on the second half of the base.
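As an illustration of this protocol, here is a minimal sketch of a 5-run 50/50 hold-out estimate. The fit/predict callables and the random splitting are our assumptions; the paper only states that it follows the ELENA protocol.

import numpy as np

def holdout_error(fit, predict, X, y, runs=5, seed=0):
    # Average test error (in %) over `runs` random half/half splits
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(runs):
        perm = rng.permutation(len(X))
        half = len(X) // 2
        train, test = perm[:half], perm[half:]
        model = fit(X[train], y[train])
        errors.append(np.mean(predict(model, X[test]) != y[test]))
    return 100.0 * np.mean(errors)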
The first database is created artificially to highlight certain properties or shortcomings of the tested classifiers. The objective of the "Clouds" problem is to study the influence of two interlaced classes with nonlinear borders. The three last databases come from real problems. The "Phoneme" problem relates to the speech recognition studied in the European project "ROARS project SPIRIT" [14]. The principal difficulty of this problem is the great asymmetry in the number of instances of each class. We will not present the "Iris" data, very well known in pattern recognition [15]. Finally, the data of the "Texture" file relate to the recognition of 11 natural micro-textures such as grass, sand, paper or certain textiles [16]. Various information concerning the statistics and the principal component analysis of these data files can be found in [10] and the references therein.
Figure 2 presents the results on these various problems. The performance of the RBF classifier is slightly lower than that of the other classifiers for the first problem. This is explained by the significant interlacing of the two classes: the algorithm generates a number of neurons close to the number of training points, and the generalization capacity on the test set is thus very bad.
The error rate of our RBF classifier is generally the lowest for each of the last three problems. This holds whatever the number of classes to be distinguished and the quantity of data available for training.
Fig. 2. Classification results on four different databases.
3.2 Study according to the number of neurons
The purpose of this study is to show the excellent performance/number-of-neurons ratio of our new classifier. Since the number of neurons of the MLP network is not available, this study is limited to the RBF networks of the preceding section.
Table 1 presents a comparison between the old RBF and the new one. We can see in this table the "compact" quality of our new classifier, which gives comparable or even lower error rates while minimizing the number of hidden neurons N_h. Training times are thus much shorter. For the "Texture" database, for example, the error rate is divided by 4 while the number of neurons is divided by 39: less than two minutes were needed to train our classifier, against more than one hour for the old one.
Training times are given here for an execution of the algorithm under MATLAB on an AMD Athlon XP 1800+ PC.
Database   Classifier        Error (%)   N_h   Learning time (s)
Clouds     Old classifier    13.60       162   60
           This classifier   13.25       72    30
Phoneme    Old classifier    10.90       227   100
           This classifier   10.43       59    24
Iris       Old classifier    2.90        24    0.5
           This classifier   1.94        3     0.1
Texture    Old classifier    1.73        858   3900
           This classifier   0.41        22    100
Table 1. Comparison of the performance, number of neurons and training times of the two RBF classifiers.
3.3 The choice of the covariance matrix
One of the limits of this classifier is the estimation of the covariance matrix. The larger the clusters, the better the estimate of this matrix; the computation of this matrix can therefore sometimes reduce the recognition rates. To remedy this problem, other computations of this matrix can be proposed so as to take more prototypes into account in its estimation. Table 2 gives examples of such computations, with the errors and the corresponding numbers of neurons. One can see in this table that a choice of covariance matrix different from the one proposed in section 2 can increase or decrease the error rate, but the number of hidden neurons can only increase. This number nevertheless always remains far below the number of neurons produced by the old RBF.
Covariance matrix     Error (%)   N_h
Σ = cov(C)    (a)     0.14        247
Σ = cov(J_i)  (b)     0.27        38
Σ = cov(JC_i) (c)     0.41        22
Σ = cov(J)    (d)     0.32        283
Table 2. Error rate on the Texture database according to the choice of the covariance matrix: (a) covariance of the centers, (b) covariance of the data of each class, (c) covariance of the data of each center, (d) covariance of the whole database.
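A sketch of these four estimates as we read them from the caption; the variable names and the exact grouping of the data are our assumptions:

import numpy as np

def covariance_variant(kind, all_centers, X_class, X_center, X_all):
    # (a) covariance of the centers, (b) of the data of the cluster's class,
    # (c) of the data assigned to the center, (d) of the whole database
    data = {"a": np.vstack(all_centers),
            "b": X_class,
            "c": X_center,
            "d": X_all}[kind]
    return np.cov(data, rowvar=False)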
3.4 Application to code identification
The goal of our application is to reliably detect and identify different buried metallic codes with a smart eddy-current sensor. Based on the principle of the induction balance, our detector measures the modifications of the magnetic field emitted by a coil. These modifications are due to the presence of the metal codes buried on top of the drains. A code is built from a succession of different metal pieces separated by empty spaces; the identification of the codes thus allows the identification and localization of the pipes (water, gas, …) [18].
Several hardware improvements were carried out on our detector [21], but the identification of the codes still poses problems because of the similarity between the codes, the nonlinearity of the response as a function of the depth, and the choice of a suitable coding of the signals [22]. To solve these problems, various classification methods were proposed. The first methods were based on fuzzy logic theory, the Kohonen SOM, and an RBF classifier. The methods based on fuzzy logic theory are the well-known Fuzzy Pattern Matching (FPM) [19] and the distributed rules (DR) [20] developed among others by Ishibuchi. Among all these methods, it is the RBFE classifier (Euclidean RBF) which gives the best results. However, these results remain insufficient at great depths. This is why we developed this new classifier, to try to decrease the error rates and the number of neurons with a view to a future integration of the classifier on programmable microchips.
A comparison is made between these different methods and the new RBF classifier.
           This classifier   RBFE   SOM    FPM   DR
Error (%)  5.0               6.2    11.3   8.3   7.1
N_h        68                135    -      -     -
Table 3. Results of code misclassification for the 5 pattern recognition methods implemented.
For a burying depth of up to 80 cm, we obtain the results given in table 3. We can notice that the result of the new RBF classifier is better than the others, again with a smaller number of hidden neurons.
4 Conclusion
We have proposed a noticeable performance improvement of a neural classifier based on an RBF network. The new classifier is very general and simple: it automatically generates a powerful RBF network without any introduction of parameters fixed a priori.
The number of hidden neurons is highly optimized, which will allow its use on very large databases. Indeed, the new classifier obtains excellent recognition results on a variety of different databases.
References:
[1] D.S. Broomhead and D. Lowe, Multivariable
functional interpolation and adaptive networks,
Complex Systems, Vol.2, 1988, pp. 321-355.
[2] J. Moody and C.J. Darken, Fast Learning in Networks of Locally-Tuned Processing Units, Neural Computation, Vol.1, 1989, pp. 281-294.
[3] F. Girosi and T. Poggio, Networks and The Best
Approximation Property, Technical Report C.B.I.P.
No. 45, Artificial Intelligence Laboratory,
Massachusetts Institute of Technology, 1989.
[4] J. Park and I.W. Sandberg, Universal Approximation Using Radial-Basis-Function Networks, Neural Computation, Vol.3, 1991, pp. 246-257.
[5] M. Bianchini, P. Frasconi and M. Gori, Learning
without Local Minima in Radial Basis Function
Networks, IEEE Transactions on Neural Networks,
Vol.6:3, 1995, pp. 749-756.
[6] B. Fritzke, Supervised Learning with Growing Cell
Structures, In Advances in Neural Processing
Systems 6, J.C. Cowan, Tesauro G. and Alspector J.
(eds.), Morgan Kaufmann, San Mateo, CA., 1994.
[7] B. Fritzke, Transforming Hard Problems into Linearly Separable Ones with Incremental Radial Basis Function Networks, In M.J. Van Der Heyden, J. Mrsic-Flögel and K. Weigel (eds.), HELNET International Workshop on Neural Networks, Proceedings Volume I/II 1994/1995, VU University Press, 1996.
[8] C.G. Looney, Pattern Recognition Using Neural Networks - Theory and Algorithms for Engineers and Scientists, Oxford University Press, Oxford - New York, 1997.
[9] F. Belloir, A. Fache and A. Billat, A General
Approach to Construct RBF Net-Based
Classifier, Proc. of the European Symposium on
Artificial Neural Networks (ESANN’99), April 21-
23, Bruges Belgium, 1999, pp. 399-404.
[10] C. Aviles-Cruz, A. Guerin-Dugué, J.L. Voz and
D. Van Cappel, Deliverable R3-B1-P Task B1:
Databases, Technical Report ELENA ESPRIT Basic
Research Project Number 6891, June 1995.
[11] F. Blayo, Y. Cheneval, A. Guerin-Dugué,
R. Chentouf, C. Aviles-Cruz, J. Madrenas,
M. Moreno and J.L. Voz, Deliverable R3-B4-P Task
B4: Benchmarks, Technical Report ELENA ESPRIT
Basic Research Project Number 6891, June 1995.
[12] R. Duda and P. Hart, Pattern Classification and Scene Analysis, J. Wiley & Sons, 1973.
[13] D.E. Rumelhart and J.L. McClelland, Parallel
Distributed Processing: Explorations in the
Microstructure of Cognition, MIT Press, 1986.
[14] P. Alinat, Periodic Progress Report 4, Technical
Report, ROARS Project ESPRIT II-Number 5516,
Thomson Report TS. ASM 93/S/EGS/NC/079,
February 1993.
[15] G.W. Gates, The Reduced Nearest Neighbor Rule, IEEE Trans. on Information Theory, Vol. IT-18, May 1972, pp. 431-433.
[16] A. Guerin-Dugué and C. Aviles-Cruz, High Order
Statistics from Natural Textured Images, ATHOS
Workshop on System Identification and High Order
Statistics, Sophia-Antipolis, France, September 1993.
[17] S.P. Lloyd, Least Squares Quantization in PCM, IEEE Transactions on Information Theory, Vol. IT-28:2, 1982, pp. 129-137.
[18] F. Belloir, F. Klein and A. Billat, Pattern
Recognition Methods for Identification of Metallic
Codes Detected by Eddy Current Sensor, Signal and
Image Processing (SIP'97), Proceedings of the
IASTED International Conference, 1997, pp. 293-
297.
[19] M. Grabisch and M. Sugeno, A Comparison of some Methods of Fuzzy Classification on Real Data, Proc. of IIZUKA'92, Iizuka, Japan, July 1992, pp. 659-662.
[20] H. Ishibuchi, K. Nozaki and H. Tanaka, Selecting Fuzzy If-Then Rules for Classification Problems Using Genetic Algorithms, IEEE Transactions on Fuzzy Systems, Vol.3, No.3, 1995.
[21] L. Beheim, A. Zitouni, F. Belloir, Problem of
Optimal Pertinent Parameter Selection in Buried
Conductive Tag Recognition, Proceedings of
WISP’2003, IEEE International Symposium on
Intelligent Signal Processing, Budapest (Hungary), 4-
6 September 2003, pp. 87-91.
[22] F. Belloir, L. Beheim, A. Zitouni, N. Liebaux, D.
Placko, Modélisation et Optimisation d'un Capteur à
Courants de Foucault pour l'Identification d'Ouvrages
Enfouis, 3e Colloque Interdisciplinaire en
Instrumentation (C2I’2004), Cachan (France), 29-30
janvier 2004.