New RBF neural network classifier with optimized hidden neurons number

January 2004

Larbi Beheim, Adel Zitouni, Fabien Belloir

Laboratoire d’Automatique et Microélectronique

Université de Reims Champagne-Ardenne

Campus du Moulin de la Housse,

B.P. 1039, 51687 Reims Cedex 2

FRANCE

Abstract: - This article presents a notable performance improvement for a neural classifier based on an RBF network. By relying on the Mahalanobis distance, the new classifier somewhat increases the recognition rate while markedly reducing the number of hidden-layer neurons. We thus obtain a very general and very simple RBF classifier that requires no tuning parameter and offers an excellent ratio of performance to number of neurons. A comparative study of its performance is presented and illustrated with examples on artificial and real databases.

Key-Words: - RBF neural networks, Mahalanobis distance, clustering, training algorithms, optimization of the number of hidden neurons, buried tag identification.

1 Introduction

The radial basis function (RBF) neural network has become, in recent years, a serious alternative to the traditional Multi-Layer Perceptron (MLP) network for multidimensional approximation problems. RBF networks have been used since the 1970s under the name of potential functions, and it was only later that [1] and [2] rediscovered this particular structure in neural form. Since then, this type of network has benefited from many theoretical studies such as [3], [4] and [5]. In pattern recognition, the RBF network is very attractive because of its locality property, which makes it possible to discriminate complex classes such as nonconvex ones.

We consider in this article the Gaussian RBF classifier, in which each of the m outputs $s_j$ is evaluated according to the following formula:

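$$
s_j(X) = \sum_{l=1}^{N_h} w_{lj}\,\phi_l(X) = \sum_{l=1}^{N_h} w_{lj} \exp\left[-(X - C_l)^T \Sigma_l^{-1} (X - C_l)\right] \tag{1}
$$

where $X = [x_1 \ldots x_n]^T \in \mathbb{R}^n$ is a prototype to be classified and $N_h$ represents the total number of hidden neurons. Each of these nonlinear neurons is characterized by a center $C_l \in \mathbb{R}^n$ and a covariance matrix $\Sigma_l$.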
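As a minimal sketch of how equation (1) can be evaluated, the following NumPy function computes the outputs for a single input vector. The function name `rbf_outputs`, the array layout and the toy values are assumptions made for illustration, and so is the final step of taking the class with the largest output as the decision.

```python
import numpy as np

def rbf_outputs(X, centers, covariances, W):
    """Evaluate equation (1) for a single input vector X.

    centers:     (N_h, n) array, one center C_l per hidden neuron
    covariances: (N_h, n, n) array, one covariance matrix Sigma_l per neuron
    W:           (N_h, m) array of weights w_lj
    Returns the m outputs s_j(X).
    """
    diffs = X - centers                                  # (N_h, n)
    inv_covs = np.linalg.inv(covariances)                # batched matrix inverse
    # Quadratic form (X - C_l)^T Sigma_l^{-1} (X - C_l) for every neuron l
    d2 = np.einsum("li,lij,lj->l", diffs, inv_covs, diffs)
    phi = np.exp(-d2)                                    # hidden activations
    return phi @ W                                       # (m,)

# Toy example (hypothetical values): 2 hidden neurons, 2 inputs, 2 outputs
centers = np.array([[0.0, 0.0], [2.0, 0.0]])
covariances = np.array([np.eye(2), np.eye(2)])
W = np.eye(2)
s = rbf_outputs(np.array([0.0, 0.0]), centers, covariances, W)
predicted_class = int(np.argmax(s))   # assumed decision rule: largest output wins
```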

From a training set $S_{train} = \{X_p, \omega_p\}$, $p = 1 \ldots N$, made up of pairs of a prototype $X_p$ and its membership class $\omega_p \in \{1, \ldots, m\}$, the supervised training problem of the RBF classifier amounts to determining its structure, i.e. the number of hidden neurons $N_h$ and the different parameters appearing in the output equation (1). Whereas these parameters can be calculated by various heuristics, estimating $N_h$ is often delicate. Many methods have been developed for this purpose, among which we can cite [6], [7] and [8]. Whether basic or very sophisticated, these methods generally require a heavy computational load without guaranteeing good performance. Moreover, they often require a certain number of parameters that must be fixed a priori and optimized for a particular problem. These methods therefore cannot be applied systematically, without particular precautions, to any type of classification problem. The article [9] proposed a very simple algorithm that automatically generates a powerful RBF network without any optimization or parameters fixed a priori. In particular, the algorithm automatically selects the number of hidden-layer neurons. Although this network is characterized by its great simplicity, it has a major limitation: it requires a rather large number of neurons in the hidden layer. This limitation makes the network very heavy and leads to very long training times for very large databases.

In this article we propose a solution to this problem by introducing the Mahalanobis distance. We thus obtain a new RBF network that is very general, very simple, and offers an excellent ratio of performance to number of neurons.

The article is organized as follows. In section 2, we describe the construction principle of the new RBF network and present the associated algorithm. Its operation is illustrated on an example from an artificial database. In section 3, we study its performance on artificial problems as well as real ones.

2 Algorithm

In this section, we describe the proposed construction principle as well as the algorithm that implements it. We then illustrate its operation on a classification problem with two classes, one of which is not convex.

2.1 Principle

The principle of the algorithm builds on [9]. Owing to the exponential nature of the function $\phi_l(\cdot)$ of each hidden neuron, the activation of each neuron decreases quickly as the input vector X moves away from the neuron's center. So only a region of the input space centered on $C_l$ will produce a significantly nonzero activation. Contrary to [9], our algorithm regards this region as a hyperellipsoid centered on $C_l$. Indeed, the use of the Mahalanobis distance makes it possible to take into account the statistical distribution of the prototypes around the centers $C_l$ and thus to represent the shape of the classes better. Our algorithm divides a nonconvex class into a set of hyperellipsoids called clusters. Each cluster corresponds to a hidden neuron and is thus characterized by a center locating it in the input space, a covariance matrix indicating its privileged directions, and a width determining the extent of the hyperellipsoid. In what follows, we no longer distinguish between a neuron and a cluster.

2.2 Description

Before describing the construction algorithm of the RBF classifier, we introduce some notation used below. At the k-th iteration, we define $C_{ij}^{(k)}$ as the i-th center ($i = 1 \ldots m_j^{(k)}$) characterizing the class $\Omega_j$. Each center is associated with a covariance matrix $\Sigma_{ij}^{(k)}$ and a width $L_{ij}^{(k)}$. We denote by $H_{ij}^{(k)}$ the hyperellipsoid of center $C_{ij}^{(k)}$ such that:

$$
H_{ij}^{(k)} = \left\{ X_p \in \Omega_j \;\middle|\; (X_p - C_{ij}^{(k)})^T (\Sigma_{ij}^{(k)})^{-1} (X_p - C_{ij}^{(k)}) < L_{ij}^{(k)} \right\} \tag{2}
$$

Each class is characterized by the region $R_j^{(k)}$, defined as the union of all the hyperellipsoids $H_{ij}^{(k)}$ ($i = 1 \ldots m_j^{(k)}$). We will also use the distance $d(R_j^{(k)}, X)$ between a point $X \in \Omega_j$ and its associated region $R_j^{(k)}$. It is defined as the Mahalanobis distance between X and the nearest center $C_{ij}^{(k)}$ of $R_j^{(k)}$:

$$
d(R_j^{(k)}, X) = \min_{i = 1 \ldots m_j^{(k)}} (X - C_{ij}^{(k)})^T (\Sigma_{ij}^{(k)})^{-1} (X - C_{ij}^{(k)}) \tag{3}
$$
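The membership test of a hyperellipsoid and the distance to a class region can be sketched as follows. The helper names and the toy values are hypothetical. The sketch assumes that the quadratic form is used as written, without a square root, which standard terminology calls the squared Mahalanobis distance.

```python
import numpy as np

def mahalanobis_form(X, center, cov):
    """Quadratic form (X - C)^T Sigma^{-1} (X - C)."""
    diff = X - center
    return float(diff @ np.linalg.solve(cov, diff))

def in_hyperellipsoid(X, center, cov, width):
    """Membership test: is X strictly inside the hyperellipsoid of this cluster?"""
    return mahalanobis_form(X, center, cov) < width

def distance_to_region(X, centers, covs):
    """Distance from X to the nearest center of its class region."""
    return min(mahalanobis_form(X, c, s) for c, s in zip(centers, covs))

# Toy example (hypothetical values): one class made of two clusters
centers = [np.array([0.0, 0.0]), np.array([4.0, 0.0])]
covs = [np.diag([4.0, 1.0]), np.eye(2)]
X = np.array([2.0, 0.0])
d = distance_to_region(X, centers, covs)                       # min(4/4, 4/1)
inside = in_hyperellipsoid(X, centers[0], covs[0], width=1.5)
```

Using `np.linalg.solve` instead of an explicit inverse is a design choice of the sketch, not something the article prescribes.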

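Step 0 (initialization): For k = 0, we define m clusters whose centers are the barycentres of the different classes $\Omega_j$ ($N_j$ is the number of elements of $\Omega_j$):

$$
m_j^{(0)} = 1 \quad \text{and} \quad C_{1j}^{(0)} = \frac{1}{N_j} \sum_{X_p \in \Omega_j} X_p, \qquad j = 1 \ldots m \tag{4}
$$

$$
\Sigma_{1j}^{(0)} = \frac{1}{N_j - 1} \sum_{X_p \in \Omega_j} (X_p - C_{1j}^{(0)})(X_p - C_{1j}^{(0)})^T, \qquad j = 1 \ldots m \tag{5}
$$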
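A minimal sketch of this initialization is given below. The function name, the dictionary layout (class label mapped to a list of clusters) and the toy data are assumptions. `np.cov` normalizes by the number of points minus one by default, so each class needs at least two points.

```python
import numpy as np

def initialize_clusters(X, y):
    """Step 0 sketch: one cluster per class, centered at the class barycentre.

    X: (N, n) training prototypes, y: (N,) integer class labels.
    Returns two dicts mapping class label -> list of centers / covariances.
    """
    centers, covs = {}, {}
    for label in np.unique(y):
        points = X[y == label]
        centers[int(label)] = [points.mean(axis=0)]          # class barycentre
        covs[int(label)] = [np.cov(points, rowvar=False)]    # 1 / (N_j - 1) scaling
    return centers, covs

# Toy example (hypothetical values): two well separated square-shaped classes
X = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0],
              [10.0, 10.0], [12.0, 10.0], [10.0, 12.0], [12.0, 12.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
centers, covs = initialize_clusters(X, y)
```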

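Step 1 (adjustment of the widths): The width $L_{ij}^{(k)}$ associated with the center $C_{ij}^{(k)}$ is defined as half the Mahalanobis distance between this center and the nearest center of another class:

$$
L_{ij}^{(k)} = \frac{1}{2} \min_{\substack{t \neq j \\ s = 1 \ldots m_t^{(k)}}} (C_{ij}^{(k)} - C_{st}^{(k)})^T (\Sigma_{ij}^{(k)})^{-1} (C_{ij}^{(k)} - C_{st}^{(k)}), \qquad i = 1 \ldots m_j^{(k)},\; j = 1 \ldots m \tag{6}
$$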
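The width adjustment can be sketched as below, assuming the same hypothetical dictionary layout (class label mapped to a list of clusters) and using the quadratic form without a square root. The function name and the toy values are illustrative only.

```python
import numpy as np

def adjust_widths(centers, covs):
    """Step 1 sketch: half the distance to the nearest center of another class.

    centers, covs: dicts mapping class label -> list of centers / covariance
    matrices. Returns a dict with the same layout holding the widths.
    """
    widths = {}
    for j, class_centers in centers.items():
        # Centers that belong to every other class
        others = [c for t, cs in centers.items() if t != j for c in cs]
        widths[j] = []
        for c, cov in zip(class_centers, covs[j]):
            inv_cov = np.linalg.inv(cov)
            dists = [(c - o) @ inv_cov @ (c - o) for o in others]
            widths[j].append(0.5 * min(dists))
    return widths

# Toy example (hypothetical values): class 0 has one cluster, class 1 has two
centers = {0: [np.array([0.0, 0.0])],
           1: [np.array([4.0, 0.0]), np.array([0.0, 2.0])]}
covs = {0: [np.eye(2)], 1: [np.eye(2), np.eye(2)]}
widths = adjust_widths(centers, covs)
```

Note that the covariance of the cluster being sized is the one used in the quadratic form, so two clusters facing each other generally receive different widths.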

Step 2 (search for an orphan point): We look for the point $X_i \in S_{train}$ that does not belong to its associated region $R_j^{(k)}$ and is the most distant from it:

$$
X_i = \arg\max_{X \in S_{train}} d(R_j^{(k)}, X) \tag{7}
$$

If no such point exists, go to step 5.
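A sketch of the orphan search follows. It assumes the hypothetical dictionary layout used for illustration here (class label mapped to one list entry per cluster) and returns `None` when every training point is covered, which corresponds to jumping to step 5.

```python
import numpy as np

def find_orphan(X, y, centers, covs, widths):
    """Step 2 sketch: index of the uncovered training point that is farthest
    from the region of its own class, or None when every point is covered."""
    best_index, best_distance = None, -np.inf
    for p, (x, label) in enumerate(zip(X, y)):
        label = int(label)
        forms = [
            float((x - c) @ np.linalg.solve(cov, x - c))
            for c, cov in zip(centers[label], covs[label])
        ]
        # Covered if the point lies inside at least one cluster of its class
        covered = any(f < w for f, w in zip(forms, widths[label]))
        distance = min(forms)          # distance to the nearest center of the class
        if not covered and distance > best_distance:
            best_index, best_distance = p, distance
    return best_index

# Toy example (hypothetical values): one class with a single cluster
X = np.array([[0.5, 0.0], [3.0, 0.0], [5.0, 0.0]])
y = np.array([0, 0, 0])
centers = {0: [np.array([0.0, 0.0])]}
covs = {0: [np.eye(2)]}
orphan = find_orphan(X, y, centers, covs, {0: [2.0]})      # widest gap: last point
none_left = find_orphan(X, y, centers, covs, {0: [30.0]})  # everything covered
```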

Step 3 (creation of a new center): The point $X_i$ found at step 2 becomes a new center of the class $\Omega_j$:

$$
m_j^{(k)} = m_j^{(k)} + 1, \qquad C_{m_j^{(k)} j}^{(k)} = X_i \tag{8}
$$

Step 4 (reorganization of the centers): The K-means clustering algorithm [17] is applied to the points of $S_{train}$ belonging to the class $\Omega_j$ in order to distribute the $m_j^{(k)}$ centers as well as possible. Compute the new covariance matrices:

$$
\Sigma_{ij}^{(k)} = \frac{1}{\mathrm{Card}(H_{ij}^{(k)}) - 1} \sum_{X_p \in H_{ij}^{(k)}} (X_p - C_{ij}^{(k)})(X_p - C_{ij}^{(k)})^T \tag{9}
$$

Set k = k + 1 and go to step 1.
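The reorganization step can be sketched with a plain Lloyd iteration started from the current centers of the class, the new orphan center included. Two points are assumptions of this sketch rather than statements of the article: the points attached to a center by K-means are taken as the set over which the covariance is computed, and clusters with fewer than two points are not handled.

```python
import numpy as np

def reorganize_centers(points, centers, n_iter=100):
    """Step 4 sketch: K-means on the points of one class, then one covariance
    matrix per cluster, normalized by the cluster size minus one."""
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        # Assign every point to its nearest center (Euclidean, as in K-means)
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        new_centers = np.array([
            points[assign == i].mean(axis=0) if np.any(assign == i) else centers[i]
            for i in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    covs = [np.cov(points[assign == i], rowvar=False) for i in range(len(centers))]
    return centers, covs

# Toy example (hypothetical values): two groups of four points in one class
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0]])
new_centers, new_covs = reorganize_centers(points, [[0.5, 0.5], [11.0, 11.0]])
```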

Step 5 (determination of the weights): The weight matrix $W^*$ that minimizes an error function, here chosen as the sum of squared classification errors, is given by:

$$
W^* = (H^T H)^{-1} H^T T \tag{10}
$$

where H and T are the matrices gathering, respectively, the activation states of the hidden neurons and the target outputs. The targets are set to 1 when they correspond to the class of the point and 0 elsewhere.
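The weight computation can be sketched with a least-squares solver. The function name and the activation values are hypothetical, and class labels are assumed to be integers starting at 0. `np.linalg.lstsq` returns the same solution as the normal equations when H has full column rank, without forming the inverse explicitly.

```python
import numpy as np

def solve_weights(H, labels, n_classes):
    """Step 5 sketch: least-squares weights W* = (H^T H)^{-1} H^T T.

    H: (N, N_h) hidden activations, labels: (N,) integers in 0..n_classes-1.
    """
    T = np.eye(n_classes)[labels]              # targets: 1 for the class, 0 elsewhere
    W, *_ = np.linalg.lstsq(H, T, rcond=None)
    return W

# Toy example (hypothetical activations): 4 points, 2 hidden neurons, 2 classes
H = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]])
labels = np.array([0, 0, 1, 1])
W = solve_weights(H, labels, n_classes=2)
W_normal = np.linalg.inv(H.T @ H) @ H.T @ np.eye(2)[labels]   # closed form, for comparison
predictions = (H @ W).argmax(axis=1)
```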

2.3 Discussion

The initialization of the algorithm (step 0) could have been done by randomly placing a given number of centers. This technique is very common in the definition of an RBF network. Choosing the initial centers as the barycentres of the points $X_p$ of each class avoids this unpredictability and, moreover, provides the number of these centers. Thus, the result of the algorithm depends only on the composition of the training data. In certain cases where the classes are nonconvex, the barycentre of a class may lie inside another class. This situation does not harm the algorithm, since this center will be moved during the following iterations. The covariance matrix corresponding to each center is obtained from the associated training data. We will see later that other definitions of this matrix can give different results.

In step 1, $L_{ij}^{(k)}$ is defined relative to the minimal distance between the center $C_{ij}^{(k)}$ and the centers of the other classes. This means that a partial overlap between clusters of the same class is allowed. From a practical point of view, this makes it possible to optimize the occupation of the attribute space by the various receptive zones and thus to reduce the number of clusters necessary to compose each class. The elliptic volume covered by each cluster is maximal without encroaching on neighboring classes.

In step 2, choosing the point furthest from the region $R_j^{(k)}$ improves the effectiveness of the K-means clustering algorithm used at step 4. It should be noted that K-means concerns only the centers of the same class, since the other centers have not changed position. Choosing the furthest point also guarantees a fast growth of this region.

In the network training (step 5), the target outputs are arbitrarily set to 1 when they correspond to the class of the point and 0 elsewhere. The motivation for this practice is to artificially create a sharp drop in the membership degree at the geometrical border of the class.

After k iterations, all the points of $S_{train}$ belong to a cluster, and the algorithm has generated m+k clusters defining as many subclasses. The RBF network thus built comprises $N_h = m + k$ hidden neurons. We can note that the algorithm necessarily converges. Indeed, in the "worst case" where none of the classes is separable, a cluster will be created for each point of $S_{train}$.

2.4 Illustration of operation

We illustrate the significant phases of the algorithm on a classification problem with two concentric classes from the databases of the "ELENA" project [10] [11]. This database makes it possible to assess the capacity of a classifier to separate two classes that do not overlap but of which one is enclosed in the other.

Fig. 1a. Initialization of the algorithm.

Fig. 1b. 1st iteration of the algorithm.

Fig. 1c. 2nd iteration of the algorithm.

Fig. 1d. Result of classification of the algorithm.

The RBF network comprises 2 inputs and 2 outputs. Figure 1a shows the 2 initial centers obtained following step 0. We can see that the two centers almost coincide. Each induced cluster is delimited by an ellipse whose width is calculated at step 1. Obviously, for one of the two classes a single cluster is not sufficient to represent the class entirely. This class will therefore be subdivided into several subclasses. At the first iteration of the algorithm, the point marked X in figure 1b is the furthest from the center of this class and lies outside the corresponding cluster. Adding a new center at this point led, after application of K-means, to the new distribution of centers illustrated in figure 1b. Another point X is now the furthest from its class center in this figure. After application of K-means to this new configuration, we obtain figure 1c. After 4 iterations, the 2 classes are perfectly discriminated and the neural classifier comprises a total of 5 neurons (see figure 1d). Once the necessary number of centers and their positions have been determined, the weights of the network are calculated according to equation (10) of step 5.

The algorithm thus manages to separate the two classes with only 5 neurons, against 108 neurons for the old algorithm using the Euclidean distance, and with a slightly higher recognition rate: 98% against 97.75% for the old RBF.

3 Results

The purpose of this section is to evaluate the performance of the RBF classifier built by the algorithm presented in section 2. To that end, we applied the classifier to various classification problems with varying numbers of attributes and classes, based on synthetic data as well as real-world data.

3.1 Benchmarks

The benchmarks carried out here are studied in detail in the ELENA project [10]. For each classification problem, we report the results of the RBF classifier generated by the proposed algorithm as well as the performance of certain classifiers studied in [11]. These are the k-nearest-neighbor (kNN) classifier [12], which gives the best approximation of the Bayes recognition error, and the Multi-Layer Perceptron (MLP) classifier, which is very widespread in connectionist pattern recognition [13]. The Learning Vector Quantization (LVQ) classifier proposed by Kohonen is a simple adaptive method of vector quantization. For other types of neural classifiers, see [11] and the references therein. The RBFE classifier is the one proposed in [9], which uses the Euclidean distance. For each classifier, we calculate the average recognition error (in %) on the test set over 5 different experiments, using the "hold out" method for counting classification errors. The experimental protocol, which follows that used in the ELENA project, consists in training the classifier on half of the data and then testing its performance on the other half of the database.
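The hold-out protocol can be sketched as follows. The article does not say how the two halves are drawn, so the random permutation, the seed and the nearest-centroid stand-in classifier used in the demonstration are all assumptions of this sketch, not the ELENA procedure itself.

```python
import numpy as np

def holdout_error(X, y, fit, predict, n_runs=5, seed=0):
    """Average test error (%) over n_runs random half/half hold-out splits."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_runs):
        order = rng.permutation(len(X))
        half = len(X) // 2
        train, test = order[:half], order[half:]
        model = fit(X[train], y[train])
        errors.append(100.0 * np.mean(predict(model, X[test]) != y[test]))
    return float(np.mean(errors))

# Stand-in classifier (hypothetical): nearest class centroid
def fit_centroids(X, y):
    labels = np.unique(y)
    return labels, np.array([X[y == c].mean(axis=0) for c in labels])

def predict_centroids(model, X):
    labels, centroids = model
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return labels[d2.argmin(axis=1)]

# Synthetic, well separated data (hypothetical values)
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(10, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
error = holdout_error(X, y, fit_centroids, predict_centroids)
```

Any classifier exposing a `fit` and a `predict` function with these signatures could be plugged in instead of the stand-in.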

The first database is created artificially to highlight certain properties or shortcomings of the tested classifiers. The objective of the "Clouds" problem is to study the influence of two interlaced classes with nonlinear borders. The last three databases come from real problems. The "Phoneme" problem relates to speech recognition as studied in the European ESPRIT project "ROARS" [14]. The principal difficulty of this problem is the great imbalance in the number of instances of each class. We do not describe the "Iris" data, which are very well known in pattern recognition [15]. Finally, the data of the "Texture" file relate to the recognition of 11 natural micro-textures such as grass, sand, paper or certain textiles [16]. Various information concerning the statistics and the principal component analysis of these data files can be found in [10] and the references therein.

Figure 2 presents the results on these various problems. The performance of the RBF classifier is slightly lower than that of the other classifiers on the first problem. This is explained by the significant interlacing of the two classes: the algorithm generates a number of neurons close to the number of training points, and the generalization capacity on the test set is therefore very poor.

The error rate of our RBF classifier is generally the lowest for each of the last three problems. This holds whatever the number of classes to be distinguished and the quantity of data available for training.

Fig. 2. Classification results on four different bases.

3.2 Study according to the neurons number

The purpose of this study is to show the excellent ratio of performance to number of neurons of our new classifier. Since we do not have the number of neurons of the MLP networks, this study is limited to the RBF networks of the preceding section.

Table 1 presents a comparison between the old RBF and the new one. The table shows the "compact" quality of our new classifier, which gives comparable or even lower error rates while minimizing the number of hidden neurons $N_h$. Training times are consequently much shorter. For the "Texture" database, for example, the error rate is divided by 4, while the number of neurons is divided by 39. Training took less than two minutes for our classifier and more than one hour for the old one. Training times are given here for an execution of the algorithm under MATLAB on a PC with an AMD Athlon XP 1800+.

Database   Classifier        Error (%)   N_h   Learning time (s)
Clouds     Old classifier    13.60       162   60
Clouds     This classifier   13.25       72    30
Phoneme    Old classifier    10.90       227   100
Phoneme    This classifier   10.43       59    24
Iris       Old classifier    2.90        24    0.5
Iris       This classifier   1.94        3     0.1
Texture    Old classifier    1.73        858   3900
Texture    This classifier   0.41        22    100

Table 1. Comparison of the performance, number of neurons and training times of the two RBF classifiers.
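The reduction factors quoted in the text can be checked directly from the values of Table 1. The short script below is only an arithmetic illustration; the dictionary layout is an assumption.

```python
# (error %, hidden neurons, learning time in s) copied from Table 1
table1 = {
    "Clouds":  {"old": (13.60, 162, 60),  "new": (13.25, 72, 30)},
    "Phoneme": {"old": (10.90, 227, 100), "new": (10.43, 59, 24)},
    "Iris":    {"old": (2.90, 24, 0.5),   "new": (1.94, 3, 0.1)},
    "Texture": {"old": (1.73, 858, 3900), "new": (0.41, 22, 100)},
}

# Ratio old / new for each quantity: values above 1 favor the new classifier
ratios = {
    name: tuple(old / new for old, new in zip(rows["old"], rows["new"]))
    for name, rows in table1.items()
}

for name, (err, neurons, time) in ratios.items():
    print(f"{name:8s} error /{err:.2f}  neurons /{neurons:.1f}  time /{time:.1f}")
```

For "Texture" this gives an error ratio of about 4.2 and a neuron ratio of exactly 39, in line with the figures quoted above.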

3.3 The choice of the covariance matrix

One of the limits of this classifier is the estimation of the covariance matrix. The larger the clusters, the better the estimate of this matrix; with small clusters, the calculation of this matrix can therefore sometimes reduce the recognition rate. To remedy this problem, other ways of computing this matrix can be proposed that take more prototypes into account in its estimation. Table 2 gives examples of such calculations with the corresponding errors and numbers of neurons. The table shows that a choice of covariance matrix different from the one proposed in section 2 can increase or decrease the error rate, but the number of hidden neurons can only increase. This number nevertheless always remains far lower than the number of neurons required by the old RBF.

Covariance Σ computed from    Error (%)   N_h
(a) the centers               0.14        247
(b) the data of each class    0.27        38
(c) the data of each center   0.41        22
(d) the total database        0.32        283

Table 2. Error rate on the Texture database according to the choice of the covariance matrix: (a) covariance of the centers, (b) covariance of the data of each class, (c) covariance of the data of each center, (d) covariance of the total database.

3.4 Application in code identification

The goal of our application is to reliably detect and identify different buried metallic codes with a smart eddy current sensor. Based on the principle of the induction balance, our detector measures the modifications of the magnetic field emitted by a coil. These modifications are due to the presence of metal codes buried above the drains. A code is built from a succession of different metal pieces separated by empty spaces. The identification of the codes thus allows the identification and localization of the pipes (water, gas, …) [18]. Several hardware improvements have been made to our detector [21], but the identification of the codes still poses problems because of the similarity between the codes, the nonlinearity of the response as a function of depth, and the choice of a suitable coding of the signals [22]. To solve these problems, various classification methods have been proposed. The first methods were based on fuzzy logic theory, the Kohonen SOM, and an RBF classifier. The methods based on fuzzy logic theory are the well-known Fuzzy Pattern Matching (FPM) [19] and the distributed rules (DR) [20] developed among others by Ishibuchi. Among all these methods, the RBFE (Euclidean RBF) classifier gives the best results. These results nevertheless remain insufficient at great depths. This is why we developed this new classifier, in an attempt to decrease the error rate and the number of neurons with a view to a future integration of the classifier on programmable chips.

A comparison is made between these different methods and the new RBF classifier.

            This classifier   RBFE   SOM    FPM   DR
Error (%)   5.0               6.2    11.3   8.3   7.1
N_h         68                135    -      -     -

Table 3. Code misclassification results for the 5 pattern recognition methods implemented.

For a burial depth of up to 80 cm, we obtain the results given in table 3. The new RBF classifier gives better results than the other methods, and does so with fewer hidden neurons.

4 Conclusion

We proposed a notable performance improvement for a neural classifier based on an RBF network. The new classifier is very general and simple. It automatically generates a powerful RBF network without any parameters fixed a priori.

The number of hidden neurons is greatly reduced, which will allow its use on very large databases. Moreover, the new classifier obtains excellent recognition results on a variety of databases.

References:

[1] D.S. Broomhead and D. Lowe, Multivariable Functional Interpolation and Adaptive Networks, Complex Systems, Vol.2, 1988, pp. 321-355.

[2] J. Moody and C.J. Darken, Fast Learning in Networks of Locally-Tuned Processing Units, Neural Computation, Vol.1, 1989, pp. 281-294.

[3] F. Girosi and T. Poggio, Networks and the Best Approximation Property, Technical Report C.B.I.P. No. 45, Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 1989.

[4] J. Park and I.W. Sandberg, Universal Approximation Using Radial-Basis-Function Networks, Neural Computation, Vol.3, 1991, pp. 246-257.

[5] M. Bianchini, P. Frasconi and M. Gori, Learning without Local Minima in Radial Basis Function Networks, IEEE Transactions on Neural Networks, Vol.6:3, 1995, pp. 749-756.

[6] B. Fritzke, Supervised Learning with Growing Cell Structures, In Advances in Neural Information Processing Systems 6, J.D. Cowan, G. Tesauro and J. Alspector (eds.), Morgan Kaufmann, San Mateo, CA, 1994.

[7] B. Fritzke, Transforming Hard Problems into Linearly Separable Ones with Incremental Radial Basis Function Networks, In M.J. Van Der Heyden, J. Mrsic-Flögel and K. Weigel (eds.), HELNET International Workshop on Neural Networks, Proceedings Volume I/II 1994/1995, VU University Press, 1996.

[8] C.G. Looney, Pattern Recognition Using Neural Networks - Theory and Algorithms for Engineers and Scientists, Oxford University Press, Oxford - New York, 1997.

[9] F. Belloir, A. Fache and A. Billat, A General Approach to Construct RBF Net-Based Classifier, Proc. of the European Symposium on Artificial Neural Networks (ESANN'99), April 21-23, Bruges, Belgium, 1999, pp. 399-404.

[10] C. Aviles-Cruz, A. Guerin-Dugué, J.L. Voz and D. Van Cappel, Deliverable R3-B1-P Task B1: Databases, Technical Report ELENA ESPRIT Basic Research Project Number 6891, June 1995.

[11] F. Blayo, Y. Cheneval, A. Guerin-Dugué, R. Chentouf, C. Aviles-Cruz, J. Madrenas, M. Moreno and J.L. Voz, Deliverable R3-B4-P Task B4: Benchmarks, Technical Report ELENA ESPRIT Basic Research Project Number 6891, June 1995.

[12] R. Duda and P. Hart, Pattern Recognition and Scene Analysis, J. Wiley & Sons, 1973.

[13] D.E. Rumelhart and J.L. McClelland, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, MIT Press, 1986.

[14] P. Alinat, Periodic Progress Report 4, Technical Report, ROARS Project ESPRIT II-Number 5516, Thomson Report TS. ASM 93/S/EGS/NC/079, February 1993.

[15] G.W. Gates, The Reduced Nearest Neighbor Rule, IEEE Trans. on Information Theory, May 1972, pp. 431-433.

[16] A. Guerin-Dugué and C. Aviles-Cruz, High Order Statistics from Natural Textured Images, ATHOS Workshop on System Identification and High Order Statistics, Sophia-Antipolis, France, September 1993.

[17] S.P. Lloyd, Least Squares Quantization in PCM, IEEE Transactions on Information Theory, Vol. IT-28:2, 1982, pp. 129-137.

[18] F. Belloir, F. Klein and A. Billat, Pattern Recognition Methods for Identification of Metallic Codes Detected by Eddy Current Sensor, Signal and Image Processing (SIP'97), Proceedings of the IASTED International Conference, 1997, pp. 293-297.

[19] M. Grabisch and M. Sugeno, A Comparison of some Methods of Fuzzy Classification on Real Data, Proc. of IIZUKA'92, Iizuka, Japan, July 1992, pp. 659-662.

[20] H. Ishibuchi, K. Nozaki and H. Tanaka, Selecting Fuzzy If-Then Rules for Classification Problems Using Genetic Algorithms, IEEE Transactions on Fuzzy Systems, Vol.3, No.3, 1995.

[21] L. Beheim, A. Zitouni and F. Belloir, Problem of Optimal Pertinent Parameter Selection in Buried Conductive Tag Recognition, Proceedings of WISP'2003, IEEE International Symposium on Intelligent Signal Processing, Budapest (Hungary), 4-6 September 2003, pp. 87-91.

[22] F. Belloir, L. Beheim, A. Zitouni, N. Liebaux and D. Placko, Modélisation et Optimisation d'un Capteur à Courants de Foucault pour l'Identification d'Ouvrages Enfouis, 3e Colloque Interdisciplinaire en Instrumentation (C2I'2004), Cachan (France), 29-30 January 2004.

“It Looks All the Same to Me”: Cross-Index Training for Long-Term Financial Series Prediction

Chapter

Full-text available

Feb 2024

Stanislav Selitskiy

We investigate a number of Artificial Neural Network architectures (well-known and more “exotic”) in application to the long-term financial time-series forecasts of indexes on different global markets. The particular area of interest of this research is to examine the correlation of these indexes’ behaviour in terms of Machine Learning algorithms cross-training. Would training an algorithm on an index from one global market produce similar or even better accuracy when such a model is applied for predicting another index from a different market? The demonstrated predominately positive answer to this question is another argument in favour of the long-debated Efficient Market Hypothesis of Eugene Fama.

Band selection pipeline for maturity stage classification in bell peppers: From full spectrum to simulated camera data

Article

Nov 2023
J FOOD ENG

High-energy nuclear physics meets machine learning

Article

Full-text available

Jun 2023

Although seemingly disparate, high-energy nuclear physics (HENP) and machine learning (ML) have begun to merge in the last few years, yielding interesting results. It is worthy to raise the profile of utilizing this novel mindset from ML in HENP, to help interested readers see the breadth of activities around this intersection. The aim of this mini-review is to inform the community of the current status and present an overview of the application of ML to HENP. From different aspects and using examples, we examine how scientific questions involving HENP can be answered using ML.

STUDYING THE POSSIBILITY OF USING RBF FOR DETERMINING SMURF ATTACKS BASED ON THE KDDCUP DATABASE

Article

Jan 2022

Improving the Accuracy of a Robot by Using Neural Networks (Neural Compensators and Nonlinear Dynamics)

Article

Full-text available

Aug 2022

The subject of this paper is a programmable con trol system for a robotic manipulator. Considering the complex nonlinear dynamics involved in practical applications of systems and robotic arms, the traditional control method is here replaced by the designed Elma and adaptive radial basis function neural network—thereby improving the system stability and response rate. Related controllers and compensators were developed and trained using MATLAB-related software. The training results of the two neural network controllers for the robot programming trajectories are presented and the dynamic errors of the different types of neural network controllers and two control methods are analyzed.

Weak Relation Enforcement for Kinematic-Informed Long-Term Stock Prediction with Artificial Neural Networks

Chapter

Jun 2024

Stanislav Selitskiy

Explicit Model Memorisation to Fight Forgetting in Time-series Prediction

Conference Paper

Mar 2024

Stanislav Selitskiy

State-of-the-art review on energy and load forecasting in microgrids using artificial neural networks, machine learning, and deep learning techniques

Article

Sep 2023
ELECTR POW SYST RES

Forecasting renewable energy efficiency significantly impacts system management and operation because more precise forecasts mean reduced risk and improved stability and reliability of the network. There are several methods for forecasting and estimating energy production and demand. This paper discusses the significance of artificial neural network (ANN), machine learning (ML), and Deep Learning (DL) techniques in predicting renewable energy and load demand in various time horizons, including ultra-short-term, short-term, medium-term, and long-term. The purpose of this study is to comprehensively review the methodologies and applications that utilize the latest developments in ANN, ML, and DL for the purpose of forecasting in microgrids, with the aim of providing a systematic analysis. For this purpose, a comprehensive database from the Web of Science was selected to gather relevant research studies on the topic. This paper provides a comparison and evaluation of all three techniques for forecasting in microgrids using tables. The techniques mentioned here assist electrical engineers in becoming aware of the drawbacks and advantages of ANN, ML, and DL in both load demand and renewable energy forecasting in microgrids, enabling them to choose the best techniques for establishing a sustainable and resilient microgrid ecosystem.


