Fig 7 - uploaded by Tomáš Krajník
The SAFHG core block diagram.  

Source publication
Article
Full-text available
We present a complete hardware and software solution of an FPGA-based computer vision embedded module capable of carrying out the SURF image feature extraction algorithm. Aside from image analysis, the module embeds a Linux distribution that allows running programs specifically tailored for particular applications. The module is based on a Virtex-5 FXT...

Context in source publication

Context 1
... image generation. The structure and principle of operation of this core can be seen in Figure 6. As mentioned above, the resulting integral image is sent not only to the Fast-Hessian Generator, but also to the main memory for later reuse by the descriptor calculator, see Figure 4. The SURF Accelerator Fast-Hessian Generator IP core (SAFHG) (Fig. 7) is a key component of SURF detector acceleration. It calculates the Fast-Hessian responses from the integral image and forms the entire scale space used by the detector. An important factor influencing the performance of the determinant calculation is the optimization of memory access, which is performed by the MasterController ...
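As a rough software illustration of what "calculating Fast-Hessian responses from the integral image" involves, the sketch below evaluates one response for the smallest (9x9) box filter. It is not the SAFHG hardware design: the box layout follows the arrangement commonly used in software SURF implementations, the 0.9 weight on the mixed term comes from the original SURF formulation, and the function names `integral_image`, `box_sum` and `fast_hessian_9x9` are hypothetical.

```python
def integral_image(img):
    """Zero-padded integral image: ii[y][x] is the sum of img[0:y][0:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, row, col, rows, cols):
    """Sum of the rows x cols box with top-left pixel (row, col): four lookups."""
    return (ii[row + rows][col + cols] - ii[row][col + cols]
            - ii[row + rows][col] + ii[row][col])


def fast_hessian_9x9(ii, r, c, weight=0.9):
    """Approximate Hessian determinant at pixel (r, c) for the 9x9 box filter.

    Assumes (r, c) lies at least 4 pixels away from every image border.
    """
    # Second derivative along x: 5x9 box minus three times its central 5x3 lobe.
    dxx = box_sum(ii, r - 2, c - 4, 5, 9) - 3 * box_sum(ii, r - 2, c - 1, 5, 3)
    # Second derivative along y: 9x5 box minus three times its central 3x5 lobe.
    dyy = box_sum(ii, r - 4, c - 2, 9, 5) - 3 * box_sum(ii, r - 1, c - 2, 3, 5)
    # Mixed derivative: four 3x3 lobes placed diagonally around the centre.
    dxy = (box_sum(ii, r - 3, c + 1, 3, 3) + box_sum(ii, r + 1, c - 3, 3, 3)
           - box_sum(ii, r - 3, c - 3, 3, 3) - box_sum(ii, r + 1, c + 1, 3, 3))
    area = 9 * 9  # normalise by the filter area
    dxx, dyy, dxy = dxx / area, dyy / area, dxy / area
    return dxx * dyy - (weight * dxy) ** 2
```

Every response needs only a fixed number of integral-image reads regardless of filter size, which is consistent with the excerpt's point that memory access, rather than arithmetic, dominates the performance of the determinant calculation.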

Citations

... An FPGA-based image feature detection algorithm improves the speed of image feature detection. Cornelis [9] proposed an FPGA-based real-time image feature detection and matching system based on the SIFT algorithm to improve the speed. Pedre [10] proposed an FPGA-based image feature extraction system using the SURF algorithm. Oruklu [11] proposed an FPGA-based traffic sign detection system. ...
Article
Full-text available
Integral image generation improves speed by reducing the number of computations, such as additions and multiplications, in computer vision applications such as image feature detectors. There are different algorithms for image feature detection, such as SURF, SIFT, HOG, Harris-Laplace feature detection, FAST, etc. Integral image generation is used in the SURF detector, which detects salient points in an image and computes descriptors of their surroundings that are invariant to scale, rotation and illumination changes; hence it can be used in many applications. The proposed integral image generator in the SURF detector uses recursive addition equations for a 320x240 image, which improves the speed and reduces the hardware. This integral image generator is implemented on a Virtex-7 FPGA using Verilog HDL.
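The abstract does not spell out its recursive addition equations. A minimal sketch, assuming the standard two-step recursion (a running row sum s(x, y) = s(x - 1, y) + i(x, y), followed by ii(x, y) = ii(x, y - 1) + s(x, y)), could look like this; the function name `integral_image_recursive` is hypothetical.

```python
def integral_image_recursive(img):
    """Integral image built with two additions per pixel.

    img is a list of equal-length rows of numbers; the result has the same
    shape, and ii[y][x] holds the sum of all pixels with row <= y and col <= x.
    """
    height = len(img)
    width = len(img[0]) if height else 0
    ii = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # s(x, y): running sum along the current row
        for x in range(width):
            row_sum += img[y][x]                   # first addition
            above = ii[y - 1][x] if y > 0 else 0   # ii(x, y - 1)
            ii[y][x] = above + row_sum             # second addition
    return ii
```

For a 320x240 image this amounts to 2 x 76,800 additions and no multiplications, and only the previous output row has to be kept available while a new row is produced.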
... An FPGA-based image feature detection algorithm improves the speed of image feature detection. Cornelis [9] proposed an FPGA-based real-time image feature detection and matching system based on the SIFT algorithm to improve the speed. Pedre [10] proposed an FPGA-based image feature extraction system using the SURF algorithm. Oruklu [11] proposed an FPGA-based traffic sign detection system. ...
Conference Paper
Full-text available
Integral image generation improves speed by reducing the number of computations, such as additions and multiplications, in computer vision applications such as image feature detectors. There are different algorithms for image feature detection, such as SURF, SIFT, HOG, Harris-Laplace feature detection, FAST, etc. Integral image generation is used in the SURF detector, which detects salient points in an image and computes descriptors of their surroundings that are invariant to scale, rotation and illumination changes; hence it can be used in many applications. The proposed integral image generator in the SURF detector uses recursive addition equations for a 320x240 image, which improves the speed and reduces the hardware. This integral image generator is implemented on a Virtex-7 FPGA using Verilog HDL.
... Čížek et al. [41] proposed a processor-centric FPGA-based architecture for latency reduction in vision-based robotic navigation. Krajník et al. [42] presented a complete hardware and software solution of an FPGA-based computer vision embedded module that can carry out the SURF image feature extraction algorithm. Yao et al. [43] proposed an architecture of optimized SIFT feature detection for an FPGA implementation of image matching. ...
... The results indicated that the uniform distribution of the matching points and the matching rate are affected by the number of matching points and the textures of the object. Additionally, some errors occur in the matched point pairs; however, these points can be eliminated by using robust fitting methods, such as RANSAC [15,27,42,81] or a combined algorithm of slope-based rejection (SR) and correlation-coefficient-based rejection (CCR) [48]. ...
Article
Full-text available
Feature points that are obtained from the combined speeded-up robust feature (SURF) detector and binary robust independent elementary features (BRIEF) descriptor have a highly robust performance. These points are previously considered the ground control points (GCPs) for building a connection between the image coordinates and the corresponding geodetic coordinates. This paper proposes a novel architecture to automatically and intelligently extract GCPs based on field programmable gate arrays (FPGAs). The parallelized SURF detector, BRIEF descriptor and BRIEF matching are implemented in a single Xilinx XC7VX980T FPGA system. Word length reduction (WLR), memory-efficient parallel architecture (MEPA), shift and subtraction strategies (SAS), a sliding window for separable convolution, and an optimized multispacer-scale are used to optimize the SURF detector. Improved parallel adder trees are used to accelerate the BRIEF matching. The proposed system achieves 380 frames per second (fps) with a 100 MHz clock frequency, which satisfies the real-time and low-power requirements of embedded devices. The results of the experiment demonstrate that the proposed architecture, when mapped onto a Xilinx Virtex-7 XC7VX980T FPGA device, can select the robust feature points.
... Battezzati et al. [28] proposed another architecture to implement SURF algorithm on FPGA for use in industrial applications. In [29] a complete hardware and software solution was also proposed for use in applications with power and spatial constraints such as small mobile robots. ...
Article
Full-text available
Scale and rotation invariant salient point detection and matching algorithms are variously used in computer vision applications such as image matching, 3D localization and pose estimation. Recently, hardware implementation of image and video processing algorithms has emerged as a viable solution to handle the high computational complexity of applications like 3D pose estimation with several processing stages. The hardware implementation of various stages of these algorithms can be executed in a pipelined manner to ensure real-time operation. In this paper, a new and fully pipelined hardware architecture is proposed for salient point detection using the Binary Robust Invariant Scalable Keypoints (BRISK) algorithm. The BRISK algorithm is a binary keypoint extractor that detects salient points by constructing a scale-space pyramid; therefore, its fixed-point hardware implementation in a pipelined manner is challenging because of the required synchronization for various layers in the scale domain. The proposed hardware architecture was implemented using Verilog Hardware Description Language, and the functionality of the design was validated through several experiments. The proposed design was synthesized by using an ASIC digital design flow utilizing 180 nm CMOS technology as well as a Virtex-4 FPGA. The design is clocked at 90.91 MHz in the ASIC implementation and achieves a processing rate of 169.29 frames/s while running on input images with 800 × 600 resolution. The throughput of the FPGA implementation is 180.44 frames/s with a 96.89 MHz clock frequency for the same input image resolution. Experimental results confirm the efficiency of the proposed hardware architecture in comparison with a software implementation.
... Cížek [59]. ...
Thesis
Full-text available
This habilitation thesis presents research that aims to enable long-term deployment of mobile robots in changing environments. The presented approaches encompass methods that ensure robustness of autonomous visual navigation in outdoor environments for prolonged time periods, spatio-temporal representations that explicitly model the environment changes over time, and supporting software modules that enable robust and accurate robot localisation. The main contribution of the thesis is a novel approach that allows to incorporate the notion of time into most stationary environment models used in mobile robotics. This is achieved by representing the uncertainty of the environment states not by fixed probabilities, but by probabilistic functions of time, represented in the frequency domain. The method allows to integrate unlimited numbers of sparse and irregular observations obtained during long-term deployments of mobile robots into memory-efficient models that reflect the persistence and recurrence of environment variations. The frequency-enhanced spatio-temporal models allow to predict the future environment states, which improves the efficiency of mobile robot operation in changing environments. In this thesis, we present a series of articles, which demonstrate that the proposed approach improves mobile robot localization, path and task planning, activity recognition, human-robot interaction and allows for life-long spatio-temporal exploration of perpetually-changing environments.
... II. RELATED WORK Probably the first FPGA implementation of SURF extraction [8] has been proposed by Šváb et al. in [12] which has been followed by several further deployments, e.g., [13], [21]. The main computational improvements are achieved by dedicated units to calculate the Hessian responses for individual pixels, but descriptors are computed at the embedded PowerPC CPU of the utilized board with Virtex 5. ...
Article
Full-text available
In this paper, we propose a novel architecture for efficient detection of Speeded Up Robust Features (SURF) for Field-programmable gate array (FPGA). The main benefits of the proposed architecture are in real-time low-latency performance and scalability. The proposed solution provides a significant acceleration of salient points extraction which is fundamental image processing technique for vision-based methods including the simultaneous localization and mapping (SLAM). Based on the presented practical results, the proposed architecture is capable of processing streaming image data at the rate of 140 Megapixels per second which roughly scales from the 640×480@420fps up to 1920×1080@60fps video streams on a low-end, low-cost FPGA solution (Cyclone V). Moreover, the proposed feature detection utilizes only about 20% of logic elements of the FPGA which supports further parallel processing of multiple inputs.
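The quoted resolutions can be checked against the 140 Megapixels per second figure with simple arithmetic. The sketch below is only that check; the helper names `pixel_rate_mpx` and `max_fps` are hypothetical, and it assumes every pixel of every frame passes through the detector exactly once.

```python
def pixel_rate_mpx(width, height, fps):
    """Pixel throughput in megapixels per second for a given video stream."""
    return width * height * fps / 1e6


def max_fps(width, height, rate_mpx=140.0):
    """Highest frame rate a given pixel throughput can sustain at a resolution."""
    return rate_mpx * 1e6 / (width * height)


# 640x480 at 420 fps needs about 129.0 Mpx/s; 1920x1080 at 60 fps about 124.4 Mpx/s.
vga_rate = pixel_rate_mpx(640, 480, 420)
full_hd_rate = pixel_rate_mpx(1920, 1080, 60)
```

Both streams sit below 140 Mpx/s, so the stated operating points leave roughly 8 to 11 percent of headroom under that assumption.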
... For the latter, we build upon our previous work [11], which is inspired by the flexibility of designing custom hardware, which can be achieved by using a field programmable gate array (FPGA). FPGAs have been used in the past to implement custom hardware architectures for SIFT [8,9,14,15,31], and speeded up robust features (SURF) [5,18,36,38], enabling these descriptors to run in real time. However, our approach is not a hardware architecture translation of the ORB descriptor, but a hardware implementation of the construction process, which is leveraged with our new way of arranging the pairwise tests such that data memory organization is exploited. ...
Article
Full-text available
Binary descriptors have won their place as efficient and effective visual descriptors in several vision tasks. In this context, one of the most widely used binary descriptors to date is the ORB descriptor. ORB is robust against rotation changes, and it uses a learning procedure to generate sampling pairwise tests to construct the descriptor. However, this construction involves a sequential memory access of as many steps as the binary string size. From the latter and motivated by the fact that modern computer vision tasks may require the construction of thousands, if not millions of binary descriptors, we propose to accelerate the construction process of the ORB descriptor via an FPGA-based hardware architecture. The latter is leveraged with a novel arrangement of pairwise tests, which takes advantage of a dual random access memory scheme achieving an acceleration of up to 17 times when compared against the sequential way. The empirical assessment indicates that ORB descriptors obtained from the proposed approach keep a similar performance to that of the original ORB.
... The proposed implementation achieved 356 fps with 156 MHz clock [22]. Krajnik T. et al. (2014) presented a complete hardware and software solution of an FPGA-based computer vision embedded module capable of carrying out SURF image features extraction algorithm [23]. Chen et al. (2015) proposed an FPGA architecture of OpenSURF. ...
Article
Full-text available
This paper presents an FPGA-based method for on-board detection and matching of feature points. With the proposed method, a parallel processing model and a pipeline structure are presented to ensure a high frame rate with low power consumption. To save FPGA resources and increase the processing speed, a model which combines the modified SURF detector and a BRIEF descriptor is presented as well. Three pairs of images with different land coverages are used to evaluate the performance of the FPGA-based implementation. The experimental results demonstrate that (1) for image pairs with artificial features (such as buildings and roads), the performance of the FPGA-based implementation is better than for image pairs with natural features (such as woods); (2) the proposed FPGA-based method is capable of processing at a high frame rate, achieving 304 fps under a 100 MHz clock frequency. The proposed implementation is about 27 times faster than the PC-based implementation.
... As hardware platforms, field programmable gate arrays (FPGAs) and graphics processing units (GPUs) are adopted in many previous designs for realizing parallel architectures in embedded vision applications. [13][14][15][16][17][18] The resulting practical designs have enhanced computational capability significantly and enable the handling of enormous data volumes. Specifically, the design cycle using an FPGA is shorter and costs less than an application-specific integrated circuit (ASIC) for small amounts of produced systems. ...
... Compared to central processing unit (CPU)-based or GPU-based systems, Ref. 16 presented a standalone FPGA-based embedded module for the SURF implementation which can process up to 10 XGA (1024 × 768 pixels) frames per second while the FPGA-based solution in Ref. 17 can support up to 56 VGA (640 × 480 pixels) frames per second. Both works implemented the complete calculation of SURF including detection and description stages and are based on the integral-image concept. ...
... 5,[22][23][24][25][26][27][28][29] Consequently, many prior works on the FPGA implementation of the SURF descriptor have been based on the integral-image concept. [13][14][15][16][17][30] These integral images allow fast computation of rectangular Haar-like features at a high constant speed, independent of filter size. Although the calculation of integral images only consists of a few simple addition operations per pixel, in total a massive number of operations is necessary, owing to the generally large amount of necessary pre-stored image data. ...
Article
Intelligent analysis of image and video data requires image-feature extraction as an important processing capability for machine-vision realization. A coprocessor with pixel-based pipeline (CFEPP) architecture is developed for real-time Haar-like cell-based feature extraction. Synchronization with the image sensor's pixel frequency and immediate usage of each input pixel for the feature-construction process avoids the dependence on memory-intensive conventional strategies like integral-image construction or frame buffers. One 180 nm CMOS prototype can extract the 1680-dimensional Haar-like feature vectors, applied in the speeded up robust features (SURF) scheme, using an on-chip memory of only 96 kb (kilobit). Additionally, a low power dissipation of only 43.45 mW at 1.8 V supply voltage is achieved during VGA video processing at 120 MHz frequency with more than 325 fps. The Haar-like feature-extraction coprocessor is further evaluated by the practical application of vehicle recognition, achieving the expected high accuracy which is comparable to previous work.