ISSN 8756-6990, Optoelectronics, Instrumentation and Data Processing, 2019, Vol. 55, No. 4, pp. 1–10. © Allerton Press, Inc., 2019.
Original Russian Text © A.V. Kugaevskikh, A.A. Sogreshilin, 2019, published in Avtometriya, 2019, Vol. 55, No. 4, pp. 118–128.
ANALYSIS AND SYNTHESIS OF SIGNALS AND IMAGES
Analyzing the Efficiency of Segment Boundary Detection Using Neural Networks
A. V. Kugaevskikh (a, b, c) and A. A. Sogreshilin (b)
(a) Novosibirsk State Technical University, pr. Karla Marksa 20, Novosibirsk, 630073 Russia
(b) Novosibirsk State University, ul. Pirogova 1, Novosibirsk, 630090 Russia
(c) Institute of Automation and Electrometry, Siberian Branch, Russian Academy of Sciences, pr. Akademika Koptyuga 1, Novosibirsk, 630090 Russia
E-mail: a-kugaevskikh@yandex.ru
Received April 4, 2019; revision received May 23, 2019; accepted for publication May 27, 2019
Abstract: This paper describes the architecture of a neural network for edge detection. Different filters for first-layer neurons are compared. Neural network learning based on a cosine measure algorithm shows much worse results than an error backpropagation algorithm. Optimal parameters for first-layer neuron operation are given. The proposed architecture fulfills the stated tasks on edge selection.
Keywords: edge selection, Gabor filter, cosine measure, neural networks, sombrero wavelet, hyperbolic tangent.
DOI:?
INTRODUCTION
Image brightness segmentation is a basis for computer vision systems. The results of brightness segmentation are used to construct color constancy mechanisms, motion analysis, and recognition via detection of local features, which are usually represented by line intersections. The solution of the edge detection problem is based on a sharp difference in brightness at the segment boundary. This sharp change can be detected by the Sobel filter and the Canny algorithm [1]. Brightness variation in a certain vicinity serves as a basis for angle detectors, such as the Harris detector [2], the Förstner detector [3], and the SUSAN detector [4]; however, angle detectors are sensitive to noise and often give false alarms.
In neural network models, edge detection and angle detection mechanisms (for example, a Gaussian filter or a Gabor filter) have also been used to determine local features. Models including HMAX [5] and LPREEN [6] were based on a Gabor filter, which served as a basis for the single-opponent and double-opponent neuron models [7–10]. A Gaussian filter as a basis for an opponent process was used in [11]. Single and double opponency based on a Gabor filter or a Gaussian filter was applied to simulate the color constancy effect in order to adjust the scene illumination level.
Aside from convolutional and deep networks, the problem of neural network segmentation also involves
classical architectures: pulse-coupled neural network [12], multilayer perceptron [13], and Kohonen self-
organizing maps [14]. Independently of the neural network model, local features are detected by determining
the brightness difference.
A Gabor filter is most often used to detect textures due to its periodicity, with the “Mexican hat” (“sombrero”) wavelet serving as its alternative in the edge detection problem. Further construction of computer vision systems requires analyzing the applicability of different filters from the standpoint of the accuracy of edge detection and learning.
Fig. 1. Neural networks of edge detection (diagram labels: 36 neurons respond to angles; 630 neurons respond to angles; weights are the receptive fields under study; not learning; learning; the pairs of angles are connected; unit response vector).
ARCHITECTURE OF THE NEURAL NETWORK OF EDGE DETECTION
It is proposed to carry out edge detection on images using a convolutional neural network for a number of reasons. First, the convolution operation, which is equivalent to a weighted sum, detects lines of strictly defined length and orientation because the kernel is configured for a specific length and orientation; this, in turn, makes it possible to obtain local features for classification. Second, this neural network is functionally similar to the processes occurring in the primary visual cortex; convolution with a Gabor filter kernel is widely used for line detection precisely because of this similarity. Third, the architecture of the convolutional neural network allows adding color segmentation, motion detection, scene plan construction, and object recognition on top of the edge detection data.
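The L component of the image in the CIE Lab color space is fed to the neural network input [15]:

$$L = 116\, f(Y/Y_n) - 16. \tag{1}$$

Here $Y$ denotes the Y component of the CIE XYZ color space, $Y_n = 100$ is the Y coordinate of the white point for the D65 light source and the 2° observer, and

$$f(x) = \begin{cases} x^{1/3}, & x > (6/29)^3, \\ \dfrac{1}{3}\left(\dfrac{29}{6}\right)^{2} x + \dfrac{4}{29}, & \text{otherwise.} \end{cases}$$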
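As an illustration, Eq. (1) can be evaluated with the following minimal Python sketch. The function names `f` and `lightness` are chosen here for illustration and are not taken from the paper; the piecewise function follows the standard CIE definition with the white point value $Y_n = 100$ stated above.

```python
def f(x: float) -> float:
    """Piecewise cube-root function used in the CIE L*a*b* lightness formula."""
    delta = 6.0 / 29.0
    if x > delta ** 3:
        return x ** (1.0 / 3.0)
    # Linear branch near zero: (1/3) * (29/6)^2 * x + 4/29
    return x / (3.0 * delta ** 2) + 4.0 / 29.0


def lightness(y: float, y_n: float = 100.0) -> float:
    """Lightness L from the Y component of CIE XYZ, Eq. (1)."""
    return 116.0 * f(y / y_n) - 16.0


if __name__ == "__main__":
    for y in (0.0, 1.0, 18.0, 50.0, 100.0):
        print(f"Y = {y:6.1f} -> L = {lightness(y):7.3f}")
```

With these definitions, Y = 0 maps to L = 0 and Y = 100 maps to L = 100, so the network input lies in the range from 0 to 100.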
Brightness segmentation can be carried out using a two-layer neural network (Fig. 1). Lines of a certain orientation are detected on the first layer. The second layer is responsible for combinations of lines, including angles. Unlike learned convolutional layers, such as the first layer of simple cells in a neocognitron network, the first-layer neurons here use previously configured receptive fields, which increases the predictability and interpretability of the network operation results.
Each layer contains four types of neurons, which differ in the receptive field configuration. The connections between the layers are organized in a special way: each second-layer neuron (UC2) is connected to only two first-layer neurons (US1) (Fig. 2). Thus, the second-layer neurons detect lines and angles (in the case of a Gabor filter) and quadrangles (in the case of a hyperbolic tangent).
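The pairwise wiring can be sketched as follows. The sketch assumes that every second-layer neuron corresponds to one unordered pair of distinct first-layer orientations; this is an interpretation, but it reproduces the count of 630 second-layer neurons quoted in Fig. 1 for 36 first-layer orientations. The function name `orientation_pairs` is hypothetical.

```python
from itertools import combinations


def orientation_pairs(step_deg: int = 10):
    """Unordered pairs of distinct preferred orientations (in degrees).

    Assumption: one second-layer (UC2) neuron per pair of first-layer (US1)
    orientations; the paper only states that each UC2 neuron has two inputs.
    """
    orientations = range(0, 360, step_deg)
    return list(combinations(orientations, 2))


if __name__ == "__main__":
    pairs = orientation_pairs(10)
    print(len(pairs))        # 630
    print((0, 70) in pairs)  # True
```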
OPTOELECTRONICS, INSTRUMENTATION AND DATA PROCESSING Vol. 55 No. 4 2019
Fig. 2. Principles of organization of interlayer connections between first-layer (US1) and second-layer (UC2) neurons.
RECEPTIVE FIELDS OF NEURONS
The types of first-layer neurons, also known as simple cells, are given in Table 1. Each type of neuron is represented by an orientation group comprised of 36 neurons. Each neuron is sensitive to a line of a certain orientation, with an orientation step of Δθ = 10°.
The operation of first-layer neurons is the two-dimensional convolution of the window of a brightness
pixel array and the corresponding weight array determined using the corresponding function.
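The operation described above can be sketched in Python with NumPy. This is a minimal illustration, not the authors' implementation: the function name `first_layer_response` is hypothetical, the weighted sum is computed without flipping the kernel, and only positions where the window fits entirely inside the image are evaluated.

```python
import numpy as np


def first_layer_response(image, weights):
    """Slide the weight array over the brightness array and sum the products.

    Assumptions: no kernel flip, no padding, unit stride.
    """
    image = np.asarray(image, dtype=float)
    weights = np.asarray(weights, dtype=float)
    kh, kw = weights.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(window * weights)
    return out


if __name__ == "__main__":
    brightness = np.ones((5, 5))
    kernel = np.ones((3, 3))
    print(first_layer_response(brightness, kernel))
```

For a uniformly filled window the response equals the brightness multiplied by the sum of the weights, which is why the constant component of the filter matters for parasitic activation.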
The first two types of neurons react to the lines of preferred orientation θ. Their receptive fields are
formed using the Gabor filter [16], which is chosen for the two following reasons. First, its function allows
forming a receptive field of a simple neuron for detecting the lines of certain orientation. Second, unlike the
sombrero wavelet, its constant component is much closer to zero, which reduces the parasitic activation of
the first-layer neurons. The uniform filling of the receptive field of such a neuron yields a value close to zero.
The Gabor filter allows detecting bright lines on a dark background and dark lines on a bright background:
$$G_{1,2} = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right) \cos\left(2\pi \frac{x'}{\lambda} + \varphi\right). \tag{2}$$
It was shown in [17] that Eq. (2) has the optimal constant component of the filter, thereby reducing convolution-induced noise. Another ratio optimal for filtering is σ/λ = 0.56, which is derived from a bandwidth of 1 octave. For the edge detection problem, the Gabor filter has the following parameters: γ = 0.1 and θ ∈ [0°; 350°].
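A receptive field of this kind can be generated with the following NumPy sketch. It assumes the rotated coordinates of Eq. (5), the ratio σ/λ = 0.56, and γ = 0.1 given in the text; the function name `gabor_kernel`, the integer pixel grid centred on zero, and the choice of a 13 × 13 kernel in the usage example are illustrative assumptions.

```python
import numpy as np


def gabor_kernel(size, lam, theta_deg, phi=0.0, gamma=0.1, sigma_ratio=0.56):
    """Gabor receptive field of Eq. (2) on a size x size pixel grid."""
    half = size // 2
    # y varies along rows, x along columns
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    theta = np.deg2rad(theta_deg)
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    sigma = sigma_ratio * lam
    envelope = np.exp(-(x_r ** 2 + gamma ** 2 * y_r ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * x_r / lam + phi)


# One orientation group: 36 kernels with a step of 10 degrees.
bank = {theta: gabor_kernel(13, 3.0, theta) for theta in range(0, 360, 10)}

if __name__ == "__main__":
    k = bank[0]
    print(k.shape, k[6, 6], k.sum())
```

Setting `phi=0.0` gives a positive central peak (bright line on a dark background), and `phi=np.pi` inverts the sign (dark line on a bright background). The printed sum is the response to a uniformly filled window of unit brightness.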
The sombrero wavelet is often used as an alternative to the Gabor filter:
$$G = \exp\left(-\frac{x^2 + \gamma^2 y^2}{2\sigma^2}\right) \left(1 - \frac{x^2 + \gamma^2 y^2}{\sigma^2}\right). \tag{3}$$
Figure 3 compares the graphs of the Gabor filter (λ = 2 and λ = 3) and the sombrero wavelet.
With a kernel size of 5 × 5, the Gabor filter with λ = 2 has a secondary lobe inside the kernel, which works well in texture analysis but distorts the activation value of a neuron tuned to a line of a particular orientation. Choosing λ = 3 shifts the secondary lobe beyond the kernel boundaries of the Gabor filter, while a peak thickness of 1 pixel is retained. The Gabor filter is also preferable to the sombrero wavelet because it is easy to select σ as a function of the filter kernel size (Fig. 4).
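The effect of λ on a 5 × 5 kernel can be checked numerically on the central row of the Gabor filter. The sketch below assumes θ = 0, ϕ = 0, and σ = 0.56λ; the function name `gabor_profile` is hypothetical.

```python
import numpy as np


def gabor_profile(lam, half_width=2, sigma_ratio=0.56):
    """Central row (y = 0, theta = 0, phi = 0) of the Gabor filter of Eq. (2)."""
    x = np.arange(-half_width, half_width + 1, dtype=float)
    sigma = sigma_ratio * lam
    return np.exp(-x ** 2 / (2.0 * sigma ** 2)) * np.cos(2.0 * np.pi * x / lam)


if __name__ == "__main__":
    for lam in (2.0, 3.0):
        profile = gabor_profile(lam)
        positives = int((profile > 0).sum())
        print(f"lambda = {lam}: {np.round(profile, 3)}, positive samples: {positives}")
```

Under these assumptions the λ = 2 profile has three positive samples (the centre and the two secondary peaks at x = ±2), whereas the λ = 3 profile is positive only at the centre, so the peak stays 1 pixel thick and the secondary peaks at x = ±3 fall outside the 5 × 5 kernel.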
The next two types of neurons are needed to detect regions of brightness difference. Their receptive field is formed using a smooth function so that effects such as blurring in fog or shadows are accounted for. For this reason, the Haar wavelet cannot be used, but the receptive field can be configured with the help of a hyperbolic tangent (Fig. 5):

$$G_{3,4} = \tanh(2kx'). \tag{4}$$

To account for different sizes of the brightness difference regions, the λ parameter is introduced into Eq. (4). Expressions (2) and (4) use the rotation matrix

$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta. \tag{5}$$
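The hyperbolic tangent receptive field can be sketched in the same way. The code follows Eq. (4) as printed, so the λ parameter mentioned in the text is not included because its exact placement is not shown; the function name `tanh_kernel` and the 17 × 17 grid in the usage example are illustrative assumptions.

```python
import numpy as np


def tanh_kernel(size, theta_deg, k=1.0):
    """Receptive field of Eq. (4): tanh(2 k x') with x' rotated by theta, Eq. (5)."""
    half = size // 2
    # y varies along rows, x along columns
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    theta = np.deg2rad(theta_deg)
    x_r = x * np.cos(theta) + y * np.sin(theta)
    return np.tanh(2.0 * k * x_r)


if __name__ == "__main__":
    field = tanh_kernel(17, 0)
    print(field[8, 0], field[8, 8], field[8, 16])
    print(field.sum())
```

The field is antisymmetric about the line x' = 0, so a uniformly filled window gives a response close to zero, while a step in brightness across that line gives a strong response. Mathematically, changing the sign of k flips the polarity of the field.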
Fig. 3. Filter graph comparison: (1) the Gabor filter for λ = 2, (2) the Gabor filter for λ = 3, and (3) the sombrero filter.
Fig. 4. Maps of receptive fields: Gabor filter (a) and sombrero wavelet (b).
Fig. 5. Receptive field using the hyperbolic tangent equation.
Table 1. Types of receptive fields of first-layer neurons
No.  Figure (sign pattern)    Function            Parameters
1    (image) + + + − −        Gabor filter        γ = 0.1, ϕ = 0
2    (image) − − − + +        Gabor filter        γ = 0.1, ϕ = π
3    (image) − + − − + +      Hyperbolic tangent  k = 1
4    (image) + − + + − −      Hyperbolic tangent  k = 1
Fig. 6. Some images for learning and testing of the neuron for line and angle detection.
The first-layer neurons are activated by lines within ±20° of the preferred orientation. This can cause problems for accurately determining a feature on the next layer.
NETWORK LEARNING
A delta rule using a cosine measure was tested as a working hypothesis for neural network learning. A detailed algorithm was published in [18–20]. The cosine measure (6) is a measure of the divergence of the vectors $A = (u_1 w_1, u_2 w_2)$ and $B = (u_3 w_3, u_4 w_4)$, which determine the image angles found using the second layer of the brightness segmentation model:

$$m_{\mathrm{sim}} = \frac{A \cdot B}{\|A\|\,\|B\|}. \tag{6}$$
Fig. 7. Comparing the efficiency of learning algorithms (curves: before learning, cosine, back-prop): for the Gabor filter (λ = 2) (a), for the Gabor filter (λ = 3) (b), and for the sombrero wavelet (c).
Iterative learning maximizes the cosine measure. On each iteration, the corresponding weights of two second-layer neurons connected to first-layer neurons with different angles ($\theta_{l1}$ and $\theta_{l2}$) are recalculated:

$$\Delta w_l = q\, m_{\mathrm{sim}}\, u_l \cos(\theta_{l2} - \theta_{l1}).$$

Here $u_l$ is the output of the second-layer neuron being trained, and $q$ is the learning rate.
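One step of this update rule can be written as a short function. The sketch assumes that the angles are given in degrees and that the argument of the cosine is the difference of the two first-layer angles; the function name `delta_w` and the numerical values in the usage example are illustrative.

```python
import math


def delta_w(q, m_sim, u_l, theta_l1_deg, theta_l2_deg):
    """Weight increment: q * m_sim * u_l * cos(theta_l2 - theta_l1)."""
    angle = math.radians(theta_l2_deg - theta_l1_deg)
    return q * m_sim * u_l * math.cos(angle)


if __name__ == "__main__":
    w = 0.3  # current weight (illustrative)
    w += delta_w(q=0.1, m_sim=0.5, u_l=2.0, theta_l1_deg=10, theta_l2_deg=70)
    print(w)
```

The increment is largest when the two first-layer orientations coincide and vanishes when they differ by 90°.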
EXPERIMENTAL TESTING
Learning and experimental testing of the behavior of different filters are carried out using sets of black-and-white images. Figure 6 shows some of the 630 images used for learning by the neuron of line and angle detection.
The proposed cosine measure based algorithm is compared with the backpropagation algorithm (the measure is the root-mean-square error energy, and the delta rule is applied using a local gradient; denoted back-prop) for each type of filter (Fig. 7). Learning algorithms and receptive field configurations are assessed by the accuracy of operation of the second-layer neurons, i.e., detection of angles and lines of strictly defined orientation. For example, if the image contains an angle formed by lines with orientations of 0° and 70°, then the second-layer neuron 0°–70° should be activated. The network learns in the same way. The accuracy is defined as the ratio of the number of line combinations correctly recognized by the second layer to the total number of such combinations (630 for the Gabor filter based neuron and 35 for the hyperbolic tangent based neuron for each receptive field size).
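The accuracy defined above can be computed as in the following sketch. It assumes that each test image is labelled with a pair of orientations and that the network answer is the pair of the most strongly activated second-layer neuron; the function name `accuracy_percent` and the sample pairs are hypothetical.

```python
def accuracy_percent(predicted_pairs, true_pairs):
    """Share (in %) of line combinations recognized correctly.

    A pair counts as correct regardless of the order of its two orientations.
    """
    if not true_pairs or len(predicted_pairs) != len(true_pairs):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(
        frozenset(p) == frozenset(t) for p, t in zip(predicted_pairs, true_pairs)
    )
    return 100.0 * correct / len(true_pairs)


if __name__ == "__main__":
    true_pairs = [(0, 70), (10, 20), (30, 120), (40, 50)]
    predicted = [(70, 0), (10, 30), (30, 120), (40, 50)]
    print(accuracy_percent(predicted, true_pairs))  # 75.0
```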
As seen from Fig. 7, the proposed cosine measure based algorithm is less effective as compared with the
backpropagation algorithm.
Table 2 lists the experimental results regarding the accuracy of operation of the second-layer neurons of line and angle detection. Considering these data, the optimal approach is to use the Gabor filter (λ = 3) as a basis for the neuron of line and angle detection because this neuron yields a detection quality of 100% with a minimal angular deviation Δθ = 10° and a filter kernel size Ag = 13. These parameters correlate with some neurophysiological data on the structure of neurons in the primary visual cortex [21].

Table 2. Comparing the operational efficiency of the filters (in %) before and after the backpropagation algorithm based learning
Δθ, deg  State            Kernel size Ag
                          9       11      13      15      17      19      21
Gabor filter (λ = 2)
10       Before learning  13.65   26.03   30.48   40.79   23.97   56.19   100
         After learning   32.22   69.37   60.48   61.59   40      99.84   100
20       Before learning  32.03   50.98   43.79   57.52   43.14   59.48   100
         After learning   69.93   91.5    95.42   96.73   67.32   100     100
30       Before learning  100     42.42   42.42   100     63.64   42.42   100
         After learning   100     100     100     100     83.33   100     100
Gabor filter (λ = 3)
10       Before learning  16.83   28.73   43.17   49.84   46.03   58.41   100
         After learning   58.41   90.95   100     96.35   100     97.3    100
20       Before learning  47.06   67.32   81.7    88.89   99.35   92.81   100
         After learning   100     100     100     100     100     100     100
30       Before learning  100     53.03   100     100     100     93.94   100
         After learning   100     100     100     100     100     100     100
Sombrero wavelet
10       Before learning  16.83   29.05   44.44   49.21   37.62   57.62   100
         After learning   58.1    87.78   99.37   94.6    100     96.19   100
20       Before learning  47.06   64.71   72.55   82.35   99.35   77.78   100
         After learning   100     99.35   100     100     100     100     100
30       Before learning  100     50      87.88   100     100     65.15   100
         After learning   100     98.48   100     100     100     100     100

Fig. 8. Set of images for the learning of the hyperbolic tangent neuron.

Table 3. Region detection quality (in %) before and after backpropagation algorithm based learning
Δθ, deg  State            Kernel size Ag
                          9       11      13      15      17      19      21
10       Before learning  0       0       0       0       0       0       0
         After learning   19.52   23.65   35.71   34.13   44.29   37.62   56.1
20       Before learning  0.65    0       0       0       0       0       0
         After learning   55.56   67.97   85.62   86.3    100     100     100
30       Before learning  0       0       0       0       0       0
         After learning   100     100     100     100     100     100     100

Fig. 9. Application of the neural network of edge detection on an image: with small (a) and large (b) saturation differences.
Since the backpropagation algorithm performs much better than the cosine measure, the hyperbolic tangent neuron is trained using the former. Figure 8 shows images for learning and testing of this neuron.
The quality of region detection by the hyperbolic tangent neuron improves after learning (Table 3). In this case, the minimal kernel size is 17 × 17, which is larger than for the neuron of line and angle detection.
Figure 9 shows the results of detection of angles and lines of different orientation with respect to the maximal value of activation of second-layer neurons. Aside from clearly expressed object boundaries, this method detects insignificant brightness differences corresponding to small details, and there is a high noise level due to parasitic activation of neurons. This drawback can be corrected by applying inhibitory and corrective feedback connections, which is a direction for further development of the proposed approach.
CONCLUSIONS
The experimental testing shows that the first layer of the neural network of edge detection should be based on the Gabor filter (λ = 3, Δθ = 10°, and Ag = 13) for line detection. At the same time, learning is necessary in order to maximize the edge detection accuracy. The region brightness difference is determined using the hyperbolic tangent with parameters Δθ = 20° and Ag = 17.
The working hypothesis of cosine measure based learning of the neural network can be regarded as rejected. Using the backpropagation algorithm compensates for false activation of the first-layer neurons and significantly improves the operational quality of the second-layer neurons.
This paper was financially supported by the Russian Foundation for Basic Research (Grant No. 18-37-00029).
REFERENCES
1. J. Canny, “A Computational Approach to Edge Detection,” IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8 (6), 679–698 (1986).
2. C. Harris and M. Stephens, “A Combined Corner and Edge Detector,” in Proc. of the 4th Alvey Vision Conf. Manchester, UK, 31 Aug.–2 Sept., 1988. DOI: 10.5244/C.2.23.
3. W. Förstner and E. Gülch, “A Fast Operator for Detection and Precise Location of Distinct Points, Corners and Centres of Circular Features,” in Proc. of the ISPRS Intercommission Conf. on Fast Processing of Photogrammetric Data. Interlaken, Switzerland, 2 June, 1987.
4. S. M. Smith and M. Brady, “SUSAN — A New Approach to Low Level Image Processing,” Intern. Journ.
Comput. Vis. 23 (1), 45–78 (1997).
5. T. Serre, L. Wolf, S. Bileschi, et al., “Robust Object Recognition with Cortex-Like Mechanisms,” IEEE Trans. Pattern Anal. Mach. Intell. 29 (3), 411–426 (2007).
6. F. J. Díaz-Pernas, M. Martínez-Zarzuela, M. Antón-Rodríguez, and D. González-Ortega, “Learning and Surface Boundary Feedbacks for Colour Natural Scene Perception,” Appl. Soft Comput. 61, 30–41 (2017).
7. N. Graham and S. S. Wolfson, “Is There Opponent-Orientation Coding in the Second-Order Channels of Pattern
Vision?,” Vis. Res. 44 (27), 3145–3175 (2004).
8. G. Loffler, “Perception of Contours and Shapes: Low and Intermediate Stage Mechanisms,” Vis. Res. 48
(20), 2106–2127 (2008).
9. M. Teichmann, J. Wiltschut, and F. Hamker, “Learning Invariance from Natural Images Inspired by Observations
in the Primary Visual Cortex,” Neural Comput. 24 (5), 1271–1296 (2012).
10. F.-P. Wang, W.-C. Zhang, Z.-F. Zhou, and L. Zhu, “Corner Detection Using Gabor Filters,” IET Image Process. 8 (11), 639–646 (2014).
11. Q. Wang and M. W. Spratling, “Contour Detection in Colour Images Using a Neurophysiologically Inspired Model,” Cognitive Comput. 8 (6), 1027–1035 (2016).
12. A. N. Skourikhine, L. Prasad, and B. R. Schlei, “Neural Network for Image Segmentation,” Proc. SPIE 4120, 28–35 (2000).
13. S. W. Franklin, “Retinal Vessel Segmentation Employing ANN Technique by Gabor and Moment Invariants-
Based Features,” Appl. Soft Comput. 22, 94–100 (2014).
14. G. Gemignani and A. Rozza, “A Robust Approach for the Background Subtraction Based on Multi-Layered
Self-Organizing Maps,” IEEE Trans. Image Process. 25 (11), 5239–5251 (2016).
15. ISO 11664-4:2008. Colorimetry. Part 4: CIE 1976 L*a*b* colour space. 2008-11-01.
16. P. Kruizinga and N. Petkov, “Nonlinear Operator for Oriented Texture,” IEEE Trans. Image Process. 8
(10), 1395–1407 (1999).
17. A. V. Kugaevskii, “Comparing the Gabor Filter Parameters for Effective Edge Detection,” Informatsionnye
Tekhnologii 23 (8), 598–605 (2017).
18. H. V. Nguyen and L. Bai, “Cosine Similarity Metric Learning for Face Verification,” in Computer Vision Vol. 6493.
Eds. R. Kimmel, R. Klette, and A. Sugimoto (Springer, Berlin – Heidelberg, 2011).
19. X. Wu, Z.-G. Shi, and L. Liu, “Quasi Cosine Similarity Metric Learning,” in Computer Vision Vol. 9010. Eds.
C. V. Jawahar and S. Shan (Springer International Publishing, Cham, 2015).
20. P. Xia, L. Zhang, and F. Li, “Learning Similarity with Cosine Similarity Ensemble,” Inform. Sci. 307, 39–52 (2015).
21. A. Akbarinia and C. A. Parraga, “Feedback and Surround Modulated Boundary Detection,” Intern. Journ.
Comput. Vis. 126 (12), 1367–1380 (2018).
OPTOELECTRONICS, INSTRUMENTATION AND DATA PROCESSING Vol. 55 No. 4 2019
... Определив точно края, углы, концы линий и длинные линии, что расширяет представление характеристических точек, мы можем детектировать ребра фигуры Канижа (UR4), на основании которых определяются пары ребер (слой UP5) и сама фигура (слой UI6). Ранее мы подробно описывали модели нейронов детекции краев, углов и фона [8]. Для выделения краев и углов в нейросетевых моделях часто используют фильтр Габора. ...
... Нейроны этого типа также описывались нами в [8] ...
Article
Full-text available
This paper presents the result of designing the architecture of a neural network on bio-inspired neurons, whose task is to work out the mechanism for recognizing an illusory contour using the example of “Kanizsa’s figures”. The neural network made it possible to achieve invariance to the number of corners of the figure and does not lose recognition quality when changing the size of the illusory contour. The main application of the approach can be found in the problem of separating “figure-background” in images.
... В предлагаемой модели выделение движения осуществляется на третьем слое нейронной сети. Первые два слоя осуществляют выделение краев, рисунок 1. Для этого была разработана нейронная сеть [19], основанная на использовании фильтра Габора и гиперболического тангенса. Изображения, поступающие на вход, представлена L* компонентой пространства CIE L*a*b*. ...
Conference Paper
Full-text available
The article presents a new model of the MT neuron (neuron of the middle temporal region), which allows motion detecting and determining its direction and speed without the use of recurrent communication. The model is based on signal accumulation and is organized using a space-time vector that sets the weighting coefficients. The space-time vector is formed using the product of the Gaussian, which defines the spatial component, and the "Mexican hat" wavelet, which sets the time vector of the change in the receptive field. This configuration allows not only to motion detect, but also to make the model not sensitive to uniform or textural fill. The model is presented in variants for determining linear and rotational motion. Motion, in this case, is the sequential activation of several edge selection neurons located in the same direction in a certain neighborhood over time i.e. with a change of frame. To assess the motion, the models were tested on the MPI Sintel dataset. The model developed by us shows results better than Spatio-Temporal Gabor. The best accuracy of determining the direction of movement can be obtained with the size of the space-time vector (7*7, 7).
... Для выделения краев мы используем двухслойную нейронную сеть (рис. 2), предложенную нами в работе [7]. Выходы первого слоя такой сети помимо своей основной функции (выделения краев) служат также основой для работы нейронов конца линий. ...
Article
Full-text available
This article is dedicated to modeling the end-stopped neuron. This type of neuron gives the maximum response at the end of the line and is used to refine the edge. The article provides an overview of different models of end-stopped neurons. I have proposed a simpler and more accurate model of an end-stopped neuron based on the use of Gabor filters in antiphase. For this purpose, the models of simple and complex cells whose output is used in the proposed model are also described. Simple cells are based on the use of a Gabor filter, the parameters of which are also described in this article. The proposed model has shown its effectiveness.
... The motion analysis begins by highlighting the edges. For this purpose, we constructed a two-layer neural network [9] based on the use of the Gabor filter and the hyperbolic tangent as functions for generating receptive fields of neurons. ...
Conference Paper
Full-text available
The paper presents a new model of the MT-neuron (Middle temporal area neuron), which allows detecting movement and determining its direction and speed, without using recurrent connection. The model is based on the accumulation of the signal and is organized using a space-time vector that sets the weight coefficients. Despite the combinatorial redundancy, it is assumed that the model is more resistant to glare in comparison with the optical flow.
Article
Full-text available
In this paper discusses the current situation in Russia and the world in the field of development of sign languages translation system. The main problems are formulated, and ways to solve them are given. One of the most important unresolved tasks is the task of recognizing the gestures of the deaf. To effectively solve it, an approach based on the development of bio-inspired neural networks is proposed. The architecture of a bio-inspired neural network, including four types of neurons, is described. New simpler MT neuron model proposed.
Article
Full-text available
The current problem is considered in the article - early diagnosis of skin melanoma, unsatisfactory results of early diagnosis can be solved, among other things, by training primary contact specialists - oncologists, dermatologists, and surgeons. The authors conducted training in a master class format according to the authors patented methodology of specialists, which led to an improvement in the sensitivity, specificity, and accuracy of diagnosis of skin melanoma. The training was conducted in 3 cities - Samara, Moscow, Chelyabinsk. It is concluded that the use of training methods for specialists in the master class format can significantly improve the diagnosis of skin melanoma and reduce the unjustified number of biopsies.
Article
Full-text available
In this paper I proposed generalized formula of Gabor filter. It was made according to the extensive review of sources. I carried out comparing of different coefficients influencing the form of a kernel on a DC component metrics. Thus the optimum formula was defined. The significant parameters (wavelength and scale of the filter) influencing quality of edges detection were revealed. The metrics of assessment on which comparing of different relations of significant parameters is carried out was offered. As a result the optimum ratio of significant parameters was found. It is proved that the equation offered by Petkov is optimum (equation 8). That was confirmed by the calculation of the DC component and the analytical comparison of the influence of different wavelengths on the form of the kernel graphs. Also, the description of the selection methods for the ratio of the wavelength and filter scale was performed. Quality assessment of edge detection metrics SNRe was proposed. Its application made possible to speak of the optimality ratio the use of the half-response spatial frequency bandwidth (equation 23).
Article
Full-text available
Edges are key components of any visual scene to the extent that we can recognise objects merely by their silhouettes. The human visual system captures edge information through neurons in the visual cortex that are sensitive to both intensity discontinuities and particular orientations. The “classical approach” assumes that these cells are only responsive to the stimulus present within their receptive fields, however, recent studies demonstrate that surrounding regions and inter-areal feedback connections influence their responses significantly. In this work we propose a biologically-inspired edge detection model in which orientation selective neurons are represented through the first derivative of a Gaussian function resembling double-opponent cells in the primary visual cortex (V1). In our model we account for four kinds of receptive field surround, i.e. full, far, iso- and orthogonal-orientation, whose contributions are contrast-dependant. The output signal from V1 is pooled in its perpendicular direction by larger V2 neurons employing a contrast-variant centre-surround kernel. We further introduce a feedback connection from higher-level visual areas to the lower ones. The results of our model on three benchmark datasets show a big improvement compared to the current non-learning and biologically-inspired state-of-the-art algorithms while being competitive to the learning-based methods.
Article
Full-text available
Background The predictive coding/biased competition (PC/BC) model of V1 has previously been applied to locate boundaries defined by local discontinuities in intensity within an image. Objective Here PC/BC is extended to perform contour detection for colour images. Methods The proposed extensions are inspired by neurophysiological data from single neurons in macaque primary visual cortex (V1). Results The behaviour of this extended model is consistent with the neurophysiological experimental results. Furthermore, when compared to methods used for contour detection in computer vision, the colour PC/BC model of V1 slightly outperforms some recently proposed algorithms which use more cues and/or require a complicated training procedure. Conclusions The colour PC/BC model of V1 can successfully simulate the responses properties of orientation-selective double-opponent neuron in macaque V1 and has practical applications for contour detection in natural images.
Conference Paper
Consistency of image edge filtering is of prime importance for 3D interpretation of image sequences using feature tracking algorithms. To cater for image regions containing texture and isolated features, a combined corner and edge detector based on the local auto-correlation function is utilised, and it is shown to perform with good consistency on natural imagery.
Article
Boundary detection and segmentation are essential stages in object recognition and scene understanding. In this paper, we present a bio-inspired neural model of the ventral pathway for colour contour and surface perception, called LPREEN (Learning and Perceptual boundaRy rEcurrent dEtection Neural architecture). LPREEN models colour opponent processes and feedback interactions between cortical areas V1, V2, V4, and IT, which produce top-down and bottom-up information fusion. We suggest three feedback interactions that enhance and complete boundaries. Our proposed neural model contains a contour learning feedback that enhances the most probable contour positions in V1 according to a previous experience, and generates a surface perception in V4 through diffusion processes. We compared the proposed model with another bio-inspired model and two well-known contour extraction methods, using the Berkeley Segmentation Benchmark. LPREEN showed better performance than two methods and slightly worse performance than another one.
Conference Paper
It is vital to select an appropriate distance metric for many learning algorithm. Cosine distance is an efficient metric for measuring the similarity of descriptors in classification task. However, the cosine similarity metric learning (CSML) [3] is not widely used due to the complexity of its formulation and time consuming. In this paper, a Quasi Cosine Similarity Metric Learning (QCSML) is proposed to make it easy. The normalization and Lagrange multipliers are employed to convert cosine distance into simple formulation, which is convex and its derivation is easy to calculate. The complexity of the QCSML algorithm is O(\(t\times p\times d\)) (The parameters \(t\), \(p\), \(d\) represent the number of iterations, the dimensionality of descriptors and the compressed features.), while the complexity of CSML is O(\(r\times b\times g\times s\times d\times m\)) (From the paper [3], \(r\) is the number of iterations used to optimize the projection matrix, \(b\) is the number of values tested in cross validation process, \(g\) is the number of steps in the Conjugate Gradient method, \(s\) is the number of training data, \(d\) and \(m\) are the dimensions of projection matrix.). The experimental results of our method on UCI datasets for classification task and LFW dataset for face verification problem are better than the state-of-the-art methods. For classification task, the proposed approach is employed on Iris, Ionosphere and Wine dataset and the classification accuracy and the time consuming are much better than the compared methods. Moreover, our approach obtains \(92.33\,\%\) accuracy for face verification on unrestricted setting of LFW dataset, which outperforms the state-of-the-art algorithms.
Article
Motion detection in video streams is a challenging task for several computer vision applications. Indeed, segmenting moving and static elements in the scene makes it possible to increase the efficiency of several challenging tasks such as human-computer interface (HCI), robot vision, and intelligent surveillance systems. In this paper, we approach motion detection through a multilayered artificial neural network, which is able to build for each background pixel a multi-modal color distribution evolving over time through self-organization. According to the winner-take-all rule, each layer of the network models an independent state of the background scene, in response to external disturbing conditions, such as illumination variations, moving backgrounds, and jittering. As a result, our background subtraction method exhibits high generalization capabilities that, in combination with a post-processing filtering schema, make it possible to produce accurate motion segmentation. Moreover, we propose an approach to detect anomalous events (such as camera motion) that require background model re-initialization. We describe our method in full detail and compare it against the most recent background subtraction approaches. Experimental results for video sequences from the 2012 and 2014 CVPR Change Detection datasets demonstrate that our methodology outperforms many state-of-the-art methods in terms of detection rate.
Article
There is no doubt that similarity is a fundamental notion in the field of machine learning and pattern recognition. How to represent and measure similarity appropriately is a pursuit of many researchers. Many tasks, such as classification and clustering, can be accomplished perfectly when a similarity metric is well-defined. Cosine similarity is a widely used metric that is both simple and effective. This paper proposes a cosine similarity ensemble (CSE) method for learning similarity. In CSE, diversity is guaranteed by using multiple cosine similarity learners, each of which makes use of a different initial point to define the pattern vectors used in its similarity measures. The CSE method is not limited to measuring similarity using only pattern vectors that start at the origin. In addition, the thresholds of these separate cosine similarity learners are adaptively determined. The idea of using a selective ensemble is also implemented in CSE, and the proposed CSE method outperforms other compared methods on various data sets.
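The idea of measuring cosine similarity between pattern vectors that do not start at the origin can be illustrated with a minimal sketch. This is not the CSE method itself: the function name `cosine_similarity` and its `origin` argument are hypothetical, and the code only shows how shifting the initial point changes the angle between two patterns.

```python
import math


def cosine_similarity(u, v, origin=None):
    """Cosine of the angle between u and v, with both vectors
    measured from `origin` (defaults to the coordinate origin)."""
    if origin is None:
        origin = [0.0] * len(u)
    # Re-express both points as vectors starting at the chosen initial point.
    a = [ui - oi for ui, oi in zip(u, origin)]
    b = [vi - oi for vi, oi in zip(v, origin)]
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("a pattern coincides with the initial point")
    return dot / (norm_a * norm_b)
```

With the same two patterns, `cosine_similarity([2, 1], [1, 2])` and `cosine_similarity([2, 1], [1, 2], origin=[1, 1])` give different values, which is the source of diversity that an ensemble of such learners could exploit.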
Article
This study proposes a contour-based corner detector using the magnitude responses of the imaginary part of the Gabor filters on contours. Unlike the traditional contour-based methods that detect corners by analysing the shape of the edge contours and searching for local curvature maxima points on planar curves, the proposed corner detector combines the pixels of the edge contours and their corresponding grey-variation information. Firstly, edge contours are extracted from the original image using Canny edge detector. Secondly, the imaginary parts of the Gabor filters are used to smooth the pixels on the edge contours. At each edge pixel, the magnitude responses at each direction are normalised by their values and the sum of the normalised magnitude response at each direction is used to extract corners from edge contours. Thirdly, both the magnitude response threshold and the angle threshold are used to remove the weak or false corners. Finally, the proposed detector is compared with five state-of-the-art detectors on some grey-level images. The results from the experiment reveal that the proposed detector is more competitive with respect to detection accuracy, localisation accuracy, affine transforms and noise-robustness.
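For reference, the imaginary (odd, sine-phase) part of a Gabor filter can be sketched as follows. This uses the common textbook form, a Gaussian envelope multiplied by a sine carrier; the function name, the parameter names, and the default aspect ratio are assumptions for illustration and are not taken from the study above.

```python
import math


def gabor_imaginary_kernel(size, sigma, theta, wavelength, gamma=1.0):
    """Return a size x size kernel (list of rows) holding the
    imaginary part of a Gabor filter oriented at angle theta."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a centre pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope; gamma is the spatial aspect ratio.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            # Sine carrier: the odd-symmetric (imaginary) component.
            row.append(envelope * math.sin(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel
```

Because the sine carrier is odd, the kernel is antisymmetric about its centre, so it responds to grey-level transitions rather than to uniform regions. Responses at several orientations would be obtained by calling the function with different `theta` values.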
Article
Diabetic retinopathy (DR) is the major ophthalmic pathological cause of loss of eyesight due to changes in blood vessel structure. The retinal blood vessel morphology helps to identify the successive stages of a number of sight-threatening diseases and thereby paves the way to classifying their severity. This paper presents an automated retinal vessel segmentation technique using a neural network, which can be used in computer analysis of retinal images, e.g., in automated screening for diabetic retinopathy. Furthermore, the algorithm proposed in this paper can be used for the analysis of vascular structures of the human retina. Changes in retinal vasculature are one of the main symptoms of diseases like hypertension and diabetes mellitus. Since a typical retinal vessel is only a few pixels wide, it is critical to obtain precise measurements of vascular width using automated retinal image analysis. This method classifies each image pixel as vessel or nonvessel, and this classification is in turn used for automatic recognition of the vasculature in retinal images. Retinal blood vessels are identified by means of a multilayer perceptron neural network, for which the inputs are derived from Gabor and moment invariants-based features. The backpropagation algorithm, which provides an efficient technique for changing the weights in a feedforward network, is utilized in our method. The performance of our technique is evaluated and tested on the publicly available DRIVE database, and we have obtained illustrative vessel segmentation results for those images.