Figure - uploaded by Solomon Negussie Tesema
Available resources onboard ZYNQ-7020 and ZCU102.

Source publication
Article
The success of deep convolutional neural networks in solving age-old computer vision challenges, particularly object detection, came with high requirements in terms of computation capability, energy consumption, and a lack of real-time processing capability. However, FPGA-based inference accelerations have recently been receiving more attention fro...

Context in source publication

Context 1
... targeted two Xilinx boards for the implementation of YOLOv2-based object detection inference: the ZYNQ-7000 SoC (specifically the Z-7020CGL484-1) and the ZCU102 development board from the ZYNQ UltraScale+ MPSoC family. As seen in Table 3, the Z-7020CGL484-1 has far fewer resources than the ZCU102. Since double buffering requires twice as many on-chip buffers as an implementation without double buffering, we had to use different tile sizes for the two boards. ...
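The link between double buffering, tile size, and on-chip memory can be illustrated with a rough estimate. The sketch below is not taken from the source publication: the function names, the tile parameters (`tr`, `tc`, `tm`, `tn`), the 3x3 kernel, and the 16-bit word size are all assumptions, and it uses a common tiling model for a convolution layer in which each tile needs an input buffer, a weight buffer, and an output buffer.

```python
def tile_buffer_words(tr, tc, tm, tn, k=3, stride=1):
    """Words needed to hold one convolution tile on chip (hypothetical model).

    tr, tc: output tile rows and columns
    tm, tn: output and input channels processed per tile
    k, stride: kernel size and stride (assumed values, not from the source)
    """
    # An output tile of tr x tc needs a slightly larger input window.
    in_rows = (tr - 1) * stride + k
    in_cols = (tc - 1) * stride + k
    inputs = tn * in_rows * in_cols
    weights = tm * tn * k * k
    outputs = tm * tr * tc
    return inputs + weights + outputs


def on_chip_bytes(tr, tc, tm, tn, k=3, stride=1, bytes_per_word=2,
                  double_buffered=True):
    """Total buffer bytes; double buffering keeps two copies of every buffer."""
    copies = 2 if double_buffered else 1
    return copies * tile_buffer_words(tr, tc, tm, tn, k, stride) * bytes_per_word


# Hypothetical tile shapes: a smaller tile for the smaller device.
small_tile = on_chip_bytes(tr=13, tc=13, tm=16, tn=4)
large_tile = on_chip_bytes(tr=26, tc=26, tm=32, tn=8)
print(f"small tile, double-buffered: {small_tile} bytes")
print(f"large tile, double-buffered: {large_tile} bytes")
```

Under this model, a device with less on-chip memory has to shrink the tile dimensions until the doubled buffer total fits, which is consistent with the excerpt's statement that the two boards used different tile sizes.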

Citations

... However, the existing work [17] indicates that the throughput of the SBL algorithm implemented solely on FPGA is notably low, falling far short of real-time processing requirements. To leverage the parallel processing capabilities of the FPGA more effectively, the authors propose a software-hardware co-processing system for target detection in [18] with the computationally intensive layer placed on the FPGA and the non-computational layer on the ARM. Inspired by this, we intend to explore the application of such a structure to accelerate the SBL algorithm. ...
... Firstly, by leveraging the capabilities of ZYNQ MPSoC (multiprocessor system-on-chip) [18], which combines FPGA and ARM processors, we propose a hardware and software (HW&SW) co-implementation method for the SBL algorithm. The main body of the algorithm is implemented on the FPGA part, while the control of data input and iteration termination is handled by the ARM processor. ...
Article
In the field of sparse signal reconstruction, sparse Bayesian learning (SBL) offers excellent performance, but at the cost of extremely high computational complexity. This paper presents an efficient SBL hardware and software (HW&SW) co-implementation method using the ZYNQ series MPSoC (multiprocessor system-on-chip). Firstly, considering the inherent challenges in parallelizing iterative algorithms like SBL, we propose an architecture in which the iterative calculations are implemented on the PL side (FPGA) and the iteration control and input management are handled by the PS side (ARM). This structure lets us take advantage of task-level pipelines on the FPGA side, making effective use of time and space resources. Secondly, we use LDL decomposition to invert the Hermitian matrix; compared with alternative methods, it has the lowest computational complexity, requires fewer computational resources, and is better suited to parallel pipelining. Furthermore, the algorithm processes datasets sequentially, using the parameters derived from the previous dataset as prior information to initialize the next dataset, which helps to reduce the number of iterations required. Finally, using Vitis HLS 2022.2 and Vivado, we developed the hardware design and implemented it on the ZYNQ UltraScale+ MPSoC ZCU102 platform. We also solved a direction of arrival (DOA) estimation problem using horizontal line arrays to verify the practical feasibility of the method.
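The LDL-based inversion mentioned in the abstract can be sketched in software as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' HLS implementation: the function names `ldl_decompose` and `ldl_inverse` are hypothetical, the input is assumed to be Hermitian positive definite so that no pivoting is needed, and the triangular solve uses the general `np.linalg.solve` for brevity. It relies on the identity A = L D L^H, which gives A^{-1} = L^{-H} D^{-1} L^{-1}.

```python
import numpy as np


def ldl_decompose(a):
    """Factor a Hermitian positive definite matrix as A = L D L^H (no pivoting).

    Returns the unit lower-triangular L and the real diagonal of D as a vector.
    """
    n = a.shape[0]
    l = np.eye(n, dtype=complex)
    d = np.zeros(n)
    for j in range(n):
        # d_j = A_jj - sum_{k<j} |L_jk|^2 d_k
        d[j] = (a[j, j] - np.sum(np.abs(l[j, :j]) ** 2 * d[:j])).real
        for i in range(j + 1, n):
            # L_ij = (A_ij - sum_{k<j} L_ik conj(L_jk) d_k) / d_j
            acc = np.sum(l[i, :j] * np.conj(l[j, :j]) * d[:j])
            l[i, j] = (a[i, j] - acc) / d[j]
    return l, d


def ldl_inverse(a):
    """Invert A via its LDL^H factors: A^{-1} = L^{-H} D^{-1} L^{-1}."""
    l, d = ldl_decompose(a)
    n = a.shape[0]
    # General solver used for brevity; L is triangular, so forward
    # substitution would suffice in a hardware-oriented version.
    l_inv = np.linalg.solve(l, np.eye(n, dtype=complex))
    return l_inv.conj().T @ np.diag(1.0 / d) @ l_inv


# Usage on a randomly generated Hermitian positive definite test matrix.
rng = np.random.default_rng(0)
b = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
a = b @ b.conj().T + 4 * np.eye(4)
a_inv = ldl_inverse(a)
print(np.allclose(a_inv @ a, np.eye(4)))
```

Unlike a Cholesky factorization, this form needs no square roots, only multiplications and one division per column. That property is commonly cited as a reason LDL suits fixed hardware pipelines, although the abstract itself does not spell out this reasoning.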