ISSN 8756-6990, Optoelectronics, Instrumentation and Data Processing, 2019, Vol. 55, No. 4, pp. 1–10. © Allerton Press, Inc., 2019.
Original Russian Text © A.V. Kugaevskikh, A.A. Sogreshilin, 2019, published in Avtometriya, 2019, Vol. 55, No. 4, pp. 118–128.
ANALYSIS AND SYNTHESIS OF SIGNALS AND IMAGES
Analyzing the Efficiency of Segment Boundary
Detection using Neural Networks
A. V. Kugaevskikh (a, b, c) and A. A. Sogreshilin (b)
a Novosibirsk State Technical University, pr. Karla Marksa 20, Novosibirsk, 630073 Russia
b Novosibirsk State University, ul. Pirogova 1, Novosibirsk, 630090 Russia
c Institute of Automation and Electrometry, Siberian Branch, Russian Academy of Sciences, pr. Akademika Koptyuga 1, Novosibirsk, 630090 Russia
E-mail: a-kugaevskikh@yandex.ru
Received April 4, 2019; revision received May 23, 2019; accepted for publication May 27, 2019
Abstract—This paper describes the architecture of a neural network for edge detection. Different filters for first-layer neurons are compared. Neural network learning based on a cosine measure algorithm shows much worse results than the error backpropagation algorithm. Optimal parameters for first-layer neuron operation are given. The proposed architecture fulfills the stated tasks of edge selection.
Keywords: edge selection, Gabor filter, cosine measure, neural networks, sombrero wavelet, hyperbolic tangent.
DOI:?
INTRODUCTION
Image brightness segmentation is a basis for computer vision systems. The results of brightness segmentation are used to construct color constancy mechanisms, motion analysis, and recognition via detection of local features, usually represented by line intersections. The solution of the edge detection problem is based on a sharp difference in brightness at the segment boundary. This sharp change can be determined by the Sobel filter and the Canny algorithm [1]. Brightness variation in a certain vicinity serves as a basis for angle detectors, such as the Harris detector [2], the Förstner detector [3], and the SUSAN detector [4]; however, these angle detectors are noisy and often give false alarms.
In neural network models, edge detection and angle detection mechanisms (for example, a Gaussian filter or a Gabor filter) were also used to determine local features. The models including HMAX [5] and LPREEN [6] were based on a Gabor filter, which served as a basis for the neuron models of simple opposition and double opposition [7–10]. A Gaussian filter as a basis for an opponent process was used in [11]. Simple and double opposition based on a Gabor filter or a Gaussian filter was applied to simulate the color constancy effect in order to adjust the scene illumination level.
Aside from convolutional and deep networks, the problem of neural network segmentation also involves classical architectures: the pulse-coupled neural network [12], the multilayer perceptron [13], and Kohonen self-organizing maps [14]. Regardless of the neural network model, local features are detected by determining the brightness difference.
A Gabor filter is most often used to detect textures due to its periodicity, with the “Mexican hat” (“som-
brero”) wavelet serving as its alternative in the edge detection problem. Further construction of computer
vision systems requires analyzing the applicability of different filters from the standpoint of accuracy of edge
detection and learning.
Fig. 1. Neural networks of edge detection (36 first-layer neurons and 630 second-layer neurons respond to angles; the pairs of angles are connected; the first-layer weights are the receptive fields under study and are not learned, while the second layer is learned; the output is a unit response vector).
ARCHITECTURE OF THE NEURAL NETWORK OF EDGE DETECTION
It is proposed to carry out edge detection on images using a convolutional neural network for a number of reasons. First, the convolution operation, equivalent to a weighted sum, allows detecting lines of strictly defined length and orientation due to a kernel configured for a specific length and orientation, which, in turn, makes it possible to obtain local features for classification. Second, convolution with a Gabor filter kernel for line detection is functionally similar to the processes taking place in the primary visual cortex. Third, the architecture of the convolutional neural network allows adding color segmentation, motion detection, scene plan construction, and object recognition using the edge detection data.
The image component L* of the CIE L*a*b* color space is fed to the neural network input [15]:

L* = 116 f(Y/Yn) − 16.                                                   (1)

Here Y denotes the component of the CIE XYZ color space, and Yn = 100 is the Y coordinate of the white point for light source D65 and the 2° standard observer;

f(x) = x^(1/3) if x > (6/29)^3, and f(x) = x/(3(6/29)^2) + 4/29 otherwise.
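Equation (1) can be sketched as follows (function names are illustrative; the second branch below uses the standard CIE form of f):

```python
def cie_f(x):
    # Piecewise function from the CIE L*a*b* definition
    if x > (6 / 29) ** 3:
        return x ** (1 / 3)
    return x / (3 * (6 / 29) ** 2) + 4 / 29

def lightness(Y, Yn=100.0):
    # L* component fed to the network input, Eq. (1)
    return 116 * cie_f(Y / Yn) - 16
```

For the white point (Y = Yn = 100) this yields L* = 100, and for Y = 0 it yields L* = 0, as expected of the lightness scale.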
Brightness segmentation can be carried out using a two-layer neural network (Fig. 1). Lines of a certain orientation are detected on the first layer. The second layer is responsible for combining lines, including into angles. The difference from learning convolutional layers, namely the first layer of simple cells in a neocognitron network, comes down to the use of previously configured receptive fields of first-layer neurons, which increases the predictability and interpretability of the network operation results.
Each layer contains four types of neurons, differing in receptive field configuration. At the same time, the interlayer connections are organized in a special way: each second-layer neuron (UC2) is connected to only two first-layer neurons (US1) (Fig. 2). Thus, the second-layer neurons allow detecting lines and angles (in the case of a Gabor filter) and quadrangles (in the case of a hyperbolic tangent).
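The wiring counts in Fig. 1 can be checked with a small sketch (the code and names are illustrative, not from the paper): 36 preferred orientations in 10° steps give exactly 630 unordered pairs, one per second-layer angle neuron.

```python
import itertools

# 36 first-layer orientations, 0° to 350° in 10° steps
orientations = list(range(0, 360, 10))

# Each second-layer (UC2) neuron is wired to exactly two first-layer
# (US1) neurons, i.e., to one unordered pair of orientations.
pairs = list(itertools.combinations(orientations, 2))

print(len(orientations), len(pairs))  # 36 630
```

So the 630 second-layer neurons of Fig. 1 correspond to C(36, 2) = 630 orientation pairs.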
Fig. 2. Principles of organization of the interlayer connections between first-layer (US1) and second-layer (UC2) neurons.
RECEPTIVE FIELDS OF NEURONS
The types of first-layer neurons, also known as simple cells, are given in Table 1. Each type of neuron is represented by an orientation group comprised of 36 neurons. Each neuron is sensitive to a line of a certain orientation with a deviation step Δθ = 10°.
The operation of a first-layer neuron is a two-dimensional convolution of a window of the brightness pixel array with the corresponding weight array determined using the corresponding function.
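A minimal sketch of this operation (a sliding-window weighted sum; strictly speaking cross-correlation, which is what "convolution" usually means in neural networks; the names below are illustrative):

```python
import numpy as np

def layer1_response(image, weights):
    # Slide the receptive-field weight array over the brightness array
    # and take the weighted sum at every valid window position.
    kh, kw = weights.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * weights)
    return out
```

In practice the loop would be replaced by an optimized convolution routine; the sketch only makes the weighted-sum definition explicit.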
The first two types of neurons react to the lines of preferred orientation θ. Their receptive fields are
formed using the Gabor filter [16], which is chosen for the two following reasons. First, its function allows
forming a receptive field of a simple neuron for detecting the lines of certain orientation. Second, unlike the
sombrero wavelet, its constant component is much closer to zero, which reduces the parasitic activation of
the first-layer neurons. The uniform filling of the receptive field of such a neuron yields a value close to zero.
The Gabor filter allows detecting bright lines on a dark background and dark lines on a bright background:

G1,2 = exp(−(x′² + γ²y′²)/(2σ²)) cos(2πx′/λ + φ).                        (2)

It was shown in [17] that Eq. (2) has the optimal constant component of the filter, thereby reducing convolution-induced noise. Another ratio optimal for filtering is σ/λ ≈ 0.56, which is derived from a bandwidth of 1 octave. For the edge detection problem, the Gabor filter has the following parameters: γ = 0.1 and θ ∈ [0°; 350°].
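Equation (2), with the rotation of Eq. (5) folded in, can be sketched as follows (the function name, defaults, and grid construction are illustrative choices, not from the paper):

```python
import numpy as np

def gabor_kernel(size=13, theta_deg=0.0, lam=3.0, gamma=0.1, phi=0.0):
    # sigma/lambda ≈ 0.56 corresponds to a bandwidth of one octave [17]
    sigma = 0.56 * lam
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    theta = np.deg2rad(theta_deg)
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotation, Eq. (5)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + gamma ** 2 * yr ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * xr / lam + phi)
```

With φ = 0 the kernel peaks at its center (a bright-line detector); φ = −π inverts the sign pattern for dark lines.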
The sombrero wavelet is often used as an alternative to the Gabor filter:

G = exp(−(x² + γ²y²)/(2σ²)) (1 − (x² + γ²y²)/σ²).                        (3)
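A matching sketch for Eq. (3), under the same illustrative grid conventions; here σ is a free parameter rather than being tied to λ, and the default value below is an assumption:

```python
import numpy as np

def sombrero_kernel(size=13, gamma=0.1, sigma=2.0):
    # "Mexican hat" (sombrero) wavelet of Eq. (3)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = (x ** 2 + gamma ** 2 * y ** 2) / sigma ** 2
    return np.exp(-r2 / 2) * (1 - r2)
```

The kernel has a positive central peak surrounded by a negative ring, which is the source of the parasitic activation discussed above.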
Figure 3 shows the comparison of the graphs of the Gabor filter (λ = 2 and λ = 3) and the sombrero wavelet.
With a kernel size of 5 × 5, the Gabor filter (λ = 2) has a secondary wavelet, which works well in texture analysis but distorts the activation value of a neuron learned for a line of particular orientation. The choice of λ = 3 shifts the secondary wavelet beyond the kernel boundaries of the Gabor filter, while a peak thickness of 1 pixel is retained. The Gabor filter is also preferable to the sombrero wavelet because it is easy to select σ as a function of the filter kernel size (Fig. 4).
The following types of neurons are necessary for determining the brightness difference regions, with the receptive field being formed using a smooth function so that different nuances, such as blurring in fog or shadows, are accounted for. For this reason, the Haar wavelet cannot be used, but the receptive field can be configured with the help of a hyperbolic tangent (Fig. 5):

G3,4 = tanh(2kx′/λ).                                                     (4)

To account for different sizes of the brightness difference regions, the λ parameter is introduced into Eq. (4). Expressions (2) and (4) use the rotation matrix

x′ = x cos θ + y sin θ,
y′ = −x sin θ + y cos θ.                                                 (5)
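Types 3 and 4 of Table 1 then reduce to a rotated smooth step; a sketch under the same illustrative conventions (the defaults are assumptions):

```python
import numpy as np

def tanh_kernel(size=17, theta_deg=0.0, k=1, lam=3.0):
    # Smooth step receptive field of Eq. (4) with the rotation of Eq. (5)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    theta = np.deg2rad(theta_deg)
    xr = x * np.cos(theta) + y * np.sin(theta)
    return np.tanh(2 * k * xr / lam)
```

Setting k = −1 mirrors the step, so the two kernel types are exact negatives of each other.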
Fig. 3. Filter graph comparison: (1) the Gabor filter for λ = 2, (2) the Gabor filter for λ = 3, and (3) the sombrero filter.
Fig. 4. Maps of receptive fields: Gabor filter (a) and sombrero wavelet (b).
Fig. 5. Receptive field using the hyperbolic tangent equation.
Table 1. Types of receptive fields of first-layer neurons

No.   Receptive field                          Function             Parameters
1     Excitatory center, inhibitory flanks     Gabor filter         γ = 0.1, φ = 0
2     Inhibitory center, excitatory flanks     Gabor filter         γ = 0.1, φ = −π
3     Inhibitory-to-excitatory step (−/+)      Hyperbolic tangent   k = 1
4     Excitatory-to-inhibitory step (+/−)      Hyperbolic tangent   k = −1
Fig. 6. Some images for learning and testing of the neuron for line and angle detection.
The first-layer neurons are activated by lines within a deviation range of ±20° from the preferred orientation. This can cause problems in accurately determining a feature on the next layer.
NETWORK LEARNING
A delta rule using a cosine measure was tested as a working hypothesis for neural network learning. A detailed algorithm was published in [18–20]. The cosine measure itself (6) is a metric of the spread of the vectors A = (u1w1, u2w2) and B = (u3w3, u4w4), which determine the image angles found using the second layer of the brightness segmentation model:

m_sim = A · B / (|A| |B|).                                               (6)
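Equation (6) is the standard cosine similarity; a minimal sketch (names are illustrative):

```python
import numpy as np

def cosine_measure(A, B):
    # Eq. (6): cosine of the angle between vectors A and B built from
    # products of first-layer outputs u_i and weights w_i
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    return float(A @ B / (np.linalg.norm(A) * np.linalg.norm(B)))
```

The measure reaches 1 for collinear vectors and 0 for orthogonal ones, which is what the learning procedure below maximizes.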
Fig. 7. Comparing the efficiency of learning algorithms: for the Gabor filter (λ = 2) (a), for the Gabor filter (λ = 3) (b), and for the sombrero wavelet (c). Each panel plots accuracy (0–100%) versus kernel size (9–21) before learning, after cosine measure learning, and after back-propagation learning.
Iterative learning results in maximization of the cosine measure. On each iteration, the corresponding weights of a second-layer neuron connected to two first-layer neurons with different preferred angles θl1 and θl2 are recalculated:

Δwl = q m_sim ul cos(θl2 − θl1),

where ul is the output of the learning second-layer neuron and q is the learning rate.
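The weight update above is a one-line rule; a sketch (names are illustrative, angles taken in degrees):

```python
import math

def delta_w(q, m_sim, u_l, theta_l1_deg, theta_l2_deg):
    # Weight increment of the learning second-layer neuron:
    # q — learning rate, u_l — neuron output, m_sim — cosine measure (6),
    # theta_l1, theta_l2 — preferred angles of the two first-layer neurons
    return q * m_sim * u_l * math.cos(math.radians(theta_l2_deg - theta_l1_deg))
```

The cos(θl2 − θl1) factor weakens the update as the two preferred orientations approach perpendicularity.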
EXPERIMENTAL TESTING
Learning and experimental testing of the behavior of the different filters are carried out using black-and-white image packages. Figure 6 shows some of the 630 images for learning of the neuron of line and angle detection.
The proposed cosine measure based algorithm and the backpropagation algorithm (in which the measure is the root-mean-square error energy and the delta rule is applied using a local gradient, "back-prop") are compared for each type of filter (Fig. 7). The indicator of the learning algorithms and receptive field configurations is the accuracy of operation of the second-layer neurons, i.e., detection of angles and lines of strictly certain orientation. For example, if an angle is formed by lines with orientations of 0° and 70°, then the second-layer neuron for 0–70° should activate. The network learns in the same way. The accuracy is defined as the ratio of the number of line combinations correctly recognized by the second layer to the total number of such combinations (630 for the Gabor filter based neurons and 35 for the hyperbolic tangent based neurons for each receptive field size).
As seen from Fig. 7, the proposed cosine measure based algorithm is less effective than the backpropagation algorithm.
Table 2 lists the experimental results regarding the accuracy of operation of the second-layer neurons of line and angle detection. Considering these data, the optimal approach is to use the Gabor filter (λ = 3) as a basis for the neuron of line and angle detection because this neuron yields a detection quality of 100%
Table 2. Comparing the operational efficiency of the filters before and after the backpropagation algorithm based learning

Δθ   State              Kernel size Ag
                        9      11     13     15     17     19     21
Gabor filter (λ = 2)
10   Before learning    13.65  26.03  30.48  40.79  23.97  56.19  100
     After learning     32.22  69.37  60.48  61.59  40     99.84  100
20   Before learning    32.03  50.98  43.79  57.52  43.14  59.48  100
     After learning     69.93  91.5   95.42  96.73  67.32  100    100
30   Before learning    100    42.42  42.42  100    63.64  42.42  100
     After learning     100    100    100    100    83.33  100    100
Gabor filter (λ = 3)
10   Before learning    16.83  28.73  43.17  49.84  46.03  58.41  100
     After learning     58.41  90.95  100    96.35  100    97.3   100
20   Before learning    47.06  67.32  81.7   88.89  99.35  92.81  100
     After learning     100    100    100    100    100    100    100
30   Before learning    100    53.03  100    100    100    93.94  100
     After learning     100    100    100    100    100    100    100
Sombrero wavelet
10   Before learning    16.83  29.05  44.44  49.21  37.62  57.62  100
     After learning     58.1   87.78  99.37  94.6   100    96.19  100
20   Before learning    47.06  64.71  72.55  82.35  99.35  77.78  100
     After learning     100    99.35  100    100    100    100    100
30   Before learning    100    50     87.88  100    100    65.15  100
     After learning     100    98.48  100    100    100    100    100
Fig. 8. Set of images for the learning of the hyperbolic tangent neuron.
Table 3. Region detection quality (in %) before and after backpropagation algorithm based learning

Δθ   State              Kernel size Ag
                        9      11     13     15     17     19     21
10   Before learning    0      0      0      0      0      0      0
     After learning     19.52  23.65  35.71  34.13  44.29  37.62  56.1
20   Before learning    0.65   0      0      0      0      0      0
     After learning     55.56  67.97  85.62  86.3   100    100    100
30   Before learning    0      0      0      0      0      0      0
     After learning     100    100    100    100    100    100    100
Fig. 9. Application of the neural network of edge detection on an image: with small (a) and large (b) saturation differences.
with a minimal angular deviation Δθ = 10° and a filter kernel size Ag = 13. These parameters correlate with some neurophysiological data on the structure of neurons in the primary visual cortex [21].
As the backpropagation algorithm demonstrates a much better performance than the cosine measure, the hyperbolic tangent neurons are likewise trained with it on a dedicated image set. Figure 8 shows images for learning and testing of this neuron.
The quality of region detection by the hyperbolic tangent neuron improves after learning (Table 3). In this case, the minimal kernel size is 17 × 17, as opposed to the neuron of line and angle detection.
Figure 9 shows the results of detection of angles and lines of different orientations with respect to the maximal activation value of the second-layer neurons. Aside from explicitly expressed object boundaries, this method detects insignificant brightness differences corresponding to small details, and there is a high noise level arising from parasitic activation of neurons. This drawback can be corrected by applying inhibitory and inverse corrective connections, which is a further development of the proposed approach.
CONCLUSIONS
The experimental testing shows that the first layer of the neural network of edge detection should be based on the Gabor filter (λ = 3, Δθ = 10°, and Ag = 13) for line detection. At the same time, learning is necessary in order to maximize the edge detection accuracy. The region brightness difference is determined using the hyperbolic tangent with parameters Δθ = 20° and Ag = 17.
The working hypothesis in the form of cosine measure based learning of the neural network can be regarded as invalid. Using the backpropagation algorithm compensates for the false activation of the first-layer neurons and significantly improves the operational quality of the second-layer neurons.
This paper was financially supported by the Russian Foundation for Basic Research (Grant No. 18-37-
00029).
REFERENCES
1. J. Canny, “A Computational Approach to Edge Detection,” IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8 (6), 679–698 (1986).
2. C. Harris and M. Stephens, “A Combined Corner and Edge Detector,” in Proc. of the 4th Alvey Vision Conf.
Manchester, UK, 31 Aug.– 2 Sept., 1988. DOI: 10.5244/C.2.23.
3. W. Förstner and E. Gülch, “A Fast Operator for Detection and Precise Location of Distinct Points, Corners and Centres of Circular Features,” in Proc. of the ISPRS Intercommission Conf. on Fast Processing of Photogrammetric Data. Interlaken, Switzerland, 2 June, 1987.
4. S. M. Smith and M. Brady, “SUSAN — A New Approach to Low Level Image Processing,” Intern. Journ.
Comput. Vis. 23 (1), 45–78 (1997).
5. T. Serre, L. Wolf, S. Bileschi, et al., “Robust Object Recognition with Cortex-Like Mechanisms,” IEEE Trans. Pattern Anal. Mach. Intell. 29 (3), 411–426 (2007).
6. F. J. Díaz-Pernas, M. Martínez-Zarzuela, M. Antón-Rodríguez, and D. González-Ortega, “Learning and Surface Boundary Feedbacks for Colour Natural Scene Perception,” Appl. Soft Comput. 61, 30–41 (2017).
7. N. Graham and S. S. Wolfson, “Is There Opponent-Orientation Coding in the Second-Order Channels of Pattern
Vision?,” Vis. Res. 44 (27), 3145–3175 (2004).
8. G. Loffler, “Perception of Contours and Shapes: Low and Intermediate Stage Mechanisms,” Vis. Res. 48
(20), 2106–2127 (2008).
9. M. Teichmann, J. Wiltschut, and F. Hamker, “Learning Invariance from Natural Images Inspired by Observations
in the Primary Visual Cortex,” Neural Comput. 24 (5), 1271–1296 (2012).
10. F.-P. Wang, W.-C. Zhang, Z.-F. Zhou, and L. Zhu, “Corner Detection Using Gabor Filters,” IET Image Process.
8(11), 639–646 (2014).
11. Q. Wang and M. W. Spratling, “Contour Detection in Colour Images Using a Neurophysiologically Inspired
Model,” Cognitive Comput. 8(6), 1027–1035 (2016).
12. A. N. Skourikhine, L. Prasad, and B. R. Schlei, “Neural Network for Image Segmentation,” Proc. SPIE. 4120, 28–
35 (2000).
13. S. W. Franklin, “Retinal Vessel Segmentation Employing ANN Technique by Gabor and Moment Invariants-
Based Features,” Appl. Soft Comput. 22, 94–100 (2014).
14. G. Gemignani and A. Rozza, “A Robust Approach for the Background Subtraction Based on Multi-Layered
Self-Organizing Maps,” IEEE Trans. Image Process. 25 (11), 5239–5251 (2016).
15. ISO 11664-4:2008. Colorimetry. Part 4: CIE 1976 L*a*b* colour space. 2008-11-01.
16. P. Kruizinga and N. Petkov, “Nonlinear Operator for Oriented Texture,” IEEE Trans. Image Process. 8
(10), 1395–1407 (1999).
17. A. V. Kugaevskii, “Comparing the Gabor Filter Parameters for Effective Edge Detection,” Informatsionnye
Tekhnologii 23 (8), 598–605 (2017).
18. H. V. Nguyen and L. Bai, “Cosine Similarity Metric Learning for Face Verification,” in Computer Vision Vol. 6493.
Eds. R. Kimmel, R. Klette, and A. Sugimoto (Springer, Berlin – Heidelberg, 2011).
19. X. Wu, Z.-G. Shi, and L. Liu, “Quasi Cosine Similarity Metric Learning,” in Computer Vision Vol. 9010. Eds.
C. V. Jawahar and S. Shan (Springer International Publishing, Cham, 2015).
20. P. Xia, L. Zhang, and F. Li, “Learning Similarity with Cosine Similarity Ensemble,” Inform. Sci. 307, 39–52 (2015).
21. A. Akbarinia and C. A. Parraga, “Feedback and Surround Modulated Boundary Detection,” Intern. Journ.
Comput. Vis. 126 (12), 1367–1380 (2018).