Evaluation of Image Feature Detection and Matching Algorithms
Yiwen Ou
School of Information Science and Engineering
Fujian University of Technology
Fuzhou, China
e-mail: 1172704941@qq.com
Zhiming Cai*
National Demonstration Center for Experimental
Electronic Information and Electrical Technology
Education,
Fujian University of Technology
Fuzhou, China
Corresponding author, e-mail: caizm@fjut.edu.cn
Jian Lu
School of Information Science and Engineering
Fujian University of Technology
Fuzhou, China
e-mail: 573843470@qq.com
Jian Dong
School of Information Science and Engineering
Fujian University of Technology,
Fuzhou, China
e-mail: 2711713088@qq.com
Yufeng Ling
School of Information Science and Engineering
Fujian University of Technology,
Fuzhou, China
e-mail: 1504662829@qq.com
Abstract—Image feature detection and matching algorithms play an important role in the field of machine vision. The computational efficiency and robustness of the feature detector and descriptor chosen by an algorithm strongly affect the accuracy and time consumption of image matching. This paper comprehensively evaluates the typical SIFT, SURF, ORB, BRISK, KAZE, and AKAZE algorithms. The Oxford dataset is used to compare the robustness of the algorithms under illumination, rotation, scale, blur, and viewpoint transformations, and a jittery video is used to compare their anti-jitter ability. The indicators compared are: feature detection time, image matching time, total running time, number of detected feature points, accuracy, number of repeated feature points (correspondence), and repetition rate (repeatability). Experimental results show that, under the different transformations, each algorithm has its own advantages and disadvantages.
Keywords-feature detection and matching; comprehensive evaluation; robustness
I. INTRODUCTION
Feature point detection and matching algorithms have been widely used in many machine vision fields, such as real-time localization and 3D reconstruction [1], pose estimation [2], object recognition [3], intelligent device applications [4], SLAM (simultaneous localization and mapping) [5, 6], automatic driving, robot navigation [7], AR [8], etc. These algorithms can be classified into two categories.
A. Algorithms Based on Blob Detection
The scale-invariant feature transform (SIFT) algorithm [9] was proposed by David G. Lowe in 1999 and improved in 2004 [10]. The SURF (Speeded Up Robust Features) algorithm, a robust local feature detection method, was first proposed by Bay et al. [11] in 2006 and improved in 2008 [12]. KAZE [13], a feature detection algorithm more stable than SIFT, appeared at ECCV 2012. In 2013, P. F. Alcantarilla et al. presented the Accelerated-KAZE (AKAZE) algorithm [14], which adopts nonlinear diffusion filtering. AKAZE improves repeatability and distinctiveness compared with SIFT and SURF.
B. Algorithms Based on Corner Detection
ORB (Oriented FAST and Rotated BRIEF) was proposed by Rublee et al. [15] in 2011. The BRISK (Binary Robust Invariant Scalable Keypoints) method was proposed by Leutenegger et al. [16], also in 2011; it realizes the detection, description, and matching of image feature points.
Image feature detection and matching algorithms usually have the following steps: 1) detection and description of feature points; 2) matching of feature points; 3) refinement of the rough matches, using the RANSAC method
978-1-7281-6136-5/20/$31.00 ©2020 IEEE
Authorized licensed use limited to: Fujian University of Technology. Downloaded on July 07,2020 at 08:33:48 UTC from IEEE Xplore. Restrictions apply.
to "purify" them (that is, to remove outliers); 4) the inlier features obtained from step 3) form the final set of good matches.
In this paper, the SIFT, SURF, ORB, BRISK, KAZE, and AKAZE algorithms are evaluated with the Oxford dataset, which provides image sequences for rotation, scale, illumination, blur, and viewpoint transformations. We also use a jittery video stream to verify robustness against jitter.
II. FUNDAMENTALS OF EVALUATION
A. Experimental Setup
OpenCV 3.4 has been used for the experiments presented in this paper. Specifications of the computer system used are: Intel(R) Core(TM) i5-4210U CPU @ 1.70 GHz (2.40 GHz) and 4.00 GB RAM.
B. Datasets
Two groups of experiments are performed: the robustness of the algorithms under different transformations, and their anti-jitter ability.
The Oxford dataset (http://www.robots.ox.ac.uk/~vgg/research/affine/) is used as the first set of experimental data to evaluate the robustness of each algorithm under illumination, blur, scale, rotation, and viewpoint transformations. The leuven and graf image packages are used as provided. For the other packages, the first image is kept and the remaining images are deleted. For the bikes package, the first image is filtered with 5 × 5 mean, Gaussian, and median filters; the resulting images are placed together in what we denote the updated bikes package. For the boats package, the first image is down-sampled by factors of 0.2 and 0.5 and up-sampled by factors of 1.5 and 2. For the bark package, the first image is rotated by 15°, 30°, 45°, 60°, 90°, and 180°. In each case the resulting images are placed in the corresponding updated package.
The second set of experimental data is a video suffering from strong rolling-shutter artifacts (http://web.cecs.pdx.edu/~fliu/project/subspace_stabilization/). Two frames of the video are extracted to verify the anti-jitter performance of each algorithm.
This paper uses the following indicators to describe the performance of the algorithms: 1) feature detection time; 2) image matching time; 3) total running time; 4) number of detected feature points; 5) accuracy (the number of matches kept by RANSAC divided by the number of rough matches); 6) correspondence (the number of repeated feature point pairs found by the detector); 7) repeatability (correspondence divided by the smaller number of feature points detected in the two images).
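The two ratio indicators reduce to simple divisions of the counts defined above; the helper names and example counts here are ours, not the paper's:

```python
# Hypothetical helpers for the paper's ratio indicators.

def accuracy(ransac_inliers, rough_matches):
    # Accuracy: matches surviving RANSAC / rough matches.
    return ransac_inliers / rough_matches

def repeatability(correspondence, detected_img1, detected_img2):
    # Repeatability: repeated point pairs / smaller detected-point count.
    return correspondence / min(detected_img1, detected_img2)

print(accuracy(180, 200))            # 0.9
print(repeatability(150, 300, 250))  # 0.6
```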
III. EVALUATION OF ROBUSTNESS OF VARIOUS
ALGORITHMS
To evaluate the robustness of each algorithm, the first image of each updated package is matched with the remaining images one by one. As some algorithms may fail under certain transformations, this paper takes the average over the successfully matched image pairs as the experimental result. To obtain reasonable results, each pair of images is matched 5 times and the detection and matching times are averaged.
A. Evaluate the Indicators of Each Algorithm under Each
Transformation
1) The number of features detected and correspondence
As shown in Table I, the numbers of feature points detected by SIFT, SURF, and BRISK are several times those of the other three algorithms, and ORB detects the fewest.
Except under the scale transformation, SURF detects the highest number of repeated feature points (correspondence), while ORB detects the fewest.
TABLE I. THE NUMBER OF FEATURES DETECTED AND CORRESPONDENCE OF VARIOUS ALGORITHMS
For each of SIFT, SURF, ORB, BRISK, KAZE, and AKAZE, the table lists two indicators, a) the number of features detected and b) the correspondence, on each dataset: illumination transformation (original leuven package), blur transformation (updated bikes package), scale transformation (updated boats package), rotation transformation (updated bark package), viewpoint transformation (original graf package), and the jitter video. [The numeric entries of the table did not survive in this copy.]
Note: a-The number of features detected; b-Correspondence.
2) The time of detecting features, matching features and
total time
The time indicators of all the algorithms on the different datasets are shown in Figures 1-5. Under every transformation, ORB and AKAZE take much less time to detect and match features than the other algorithms, while KAZE consumes the most time; BRISK is similar to SIFT. For all the tests, ORB spends the least time detecting and matching features. In most situations SIFT outperforms SURF, except under the scale transformation. The figures also show that detection costs more time than matching, except under the scale transformation. KAZE and AKAZE have nearly equal matching times. Overall, ORB and AKAZE perform best in terms of detection and matching time.
Figure 1. Illumination transformation.
Figure 2. Blur transformation.
Figure 3. Scale transformation.
Figure 4. Rotation transformation.
Figure 5. Viewpoint transformation.
3) Accuracy and Repeatability
The accuracy and repeatability on the different datasets are shown in Figures 6-10. They show that the accuracy of every algorithm is very high under the blur transformation, whereas accuracy is generally low under the viewpoint transformation, where KAZE is the lowest. Under the illumination transformation, SIFT has the lowest accuracy and AKAZE the highest. Under the rotation transformation, KAZE again has the lowest accuracy.
From Figures 6-10 it can also be concluded that: 1) under the illumination, scale, and blur transformations, AKAZE has the highest repeatability; 2) under the rotation and viewpoint transformations, KAZE has the highest repeatability; 3) under the illumination and scale transformations, ORB has the lowest repeatability.
Based on the above indicators, we find that some algorithms are robust under certain transformations. For instance, SURF performs well under the illumination, blur, scale, and rotation transformations, and SIFT performs well under the scale and rotation transformations. Under the viewpoint transformation, the BRISK and AKAZE algorithms also perform well.
Figure 6. Illumination transformation.
Figure 7. Blur transformation.
Figure 8. Scale transformation.
Figure 9. Rotation transformation.
Figure 10. Viewpoint transformation.
4) Anti-jitter performance
The jitter-video rows of Table I and Figure 11 show the experimental results of each algorithm under video jitter:
a) Number of features detected: BRISK > SIFT > SURF > KAZE > ORB > AKAZE;
b) Feature detection time: KAZE > BRISK > SURF > SIFT > ORB > AKAZE;
c) Feature matching time: BRISK > SIFT > SURF ≈ KAZE > ORB > AKAZE;
d) Total time: KAZE > BRISK > SURF > SIFT > ORB > AKAZE;
e) Accuracy: excluding failed matches, all the algorithms achieve superior results, with an accuracy of 1;
f) Repeatability: KAZE > ORB > BRISK > SURF > AKAZE > SIFT;
g) Correspondence: BRISK > KAZE > SIFT > SURF > ORB > AKAZE.
From these anti-jitter results, we can see that KAZE takes the most time but has the highest repeatability, while ORB takes little time but detects fewer feature points and fewer repeated feature points. SIFT detects more feature points, but its repeatability is the lowest. BRISK takes longer, but its number of feature points, repeatability, and number of repeated feature points are all high. Although AKAZE spends little time detecting feature points, its number of feature points is small and its repeatability is low. SURF has only average performance in the number of detected feature points, time consumption, repeatability, and correspondence.
In summary, although BRISK is time-consuming, it performs well on the other indicators, so the anti-jitter performance of the BRISK and SURF algorithms is relatively good.
(a) Feature detection and matching time
(b) The accuracy and repeatability of all algorithms
Figure 11. Anti-jitter performance.
IV. CONCLUSION
In this paper, a large number of experiments are performed to evaluate several feature detection and matching algorithms (SIFT, SURF, ORB, BRISK, KAZE, AKAZE), using a set of robustness indicators to measure their performance. The experimental results show that under the illumination and blur transformations the SURF algorithm is more robust; under the scale and rotation transformations the SIFT algorithm performs better; and under the viewpoint transformation the BRISK and AKAZE algorithms perform better. The second set of experiments shows that the BRISK and SURF algorithms have better anti-jitter performance.
The experimental data also show that, although some algorithms are more robust than others under certain transformations, they all share a common trade-off: accuracy is lower when less time is taken, and higher when more time is taken. In other words, none of these algorithms suits applications that demand both low time consumption and high accuracy, and achieving both at once remains a challenge. In future work, we should therefore pay attention not only to reducing the time consumption of the algorithms but also to improving their accuracy.
REFERENCES
[1] Mouragnon E, Dekeyser F, Sayd P, et al. Real Time Localization and
3D Reconstruction. IEEE Computer Society Conference on
Computer Vision & Pattern Recognition, 2006. 363-370.
[2] Fleer D, Möller R. Comparing holistic and feature-based visual
methods for estimating the relative pose of mobile robots. Robotics
and Autonomous Systems, 2017, 89: 51-74.
[3] Pillai S, Leonard J. Monocular SLAM Supported Object Recognition.
Computer Science, 2015.
[4] Hu Z, Jiang Y. An improved ORB, gravity-ORB for target detection
on mobile devices. 2016 12th World Congress on Intelligent Control
and Automation (WCICA); 12-15 June 2016, 2016. 1708-1713.
[5] Mur-Artal R, Montiel J M M, Tardos J D. ORB-SLAM: A Versatile
and Accurate Monocular SLAM System. IEEE Transactions on
Robotics, 2015, 31(5): 1147-1163.
[6] Mur-Artal R, Tardos J D. ORB-SLAM2: An Open-Source SLAM
System for Monocular, Stereo, and RGB-D Cameras. IEEE
Transactions on Robotics, 2017, 33(5): 1255-1262.
[7] Geiger A, Lenz P, Stiller C, et al. Vision meets robotics: the KITTI
dataset. The International Journal of Robotics Research, 2013, 32:
1231-1237.
[8] Marchand E, Uchiyama H, Spindler F. Pose estimation for
augmented reality: a hands-on survey. IEEE Transactions on
Visualization & Computer Graphics, 2016, 22(12): 2633-2651.
[9] Lowe D G. Object recognition from local scale-invariant features.
Proceedings of the Seventh IEEE International Conference on
Computer Vision; 20-27 Sept. 1999, 1999. 1150-1157 vol.1152.
[10] Lowe D G. Distinctive Image Features from Scale-Invariant
Keypoints. International Journal of Computer Vision, 2004, 60(2):
91-110.
[11] Bay H, Tuytelaars T, Van Gool L. SURF: Speeded Up Robust
Features. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. 404-
417.
[12] Bay H, Ess A, Tuytelaars T, et al. Speeded-Up Robust Features
(SURF). Computer Vision and Image Understanding, 2008, 110(3):
346-359.
[13] Alcantarilla P F, Bartoli A, Davison A J. KAZE Features. European
Conference on Computer Vision, 2012. 214-227.
[14] Alcantarilla P F, Nuevo J, Bartoli A. Fast Explicit Diffusion for Accelerated Features in Nonlinear Scale Spaces. British Machine Vision Conference (BMVC), 2013.
[15] Rublee E, Rabaud V, Konolige K, et al. ORB: an efficient alternative to SIFT or SURF. International Conference on Computer Vision, 2011.
[16] Leutenegger S, Chli M, Siegwart R Y. BRISK: Binary Robust Invariant Scalable Keypoints. International Conference on Computer Vision, 2011.