Figure 8 - uploaded by Wu Nanjian (吴南健)
The diagram of row-parallel processor architecture. 

Source publication
Article
Full-text available
A programmable vision chip with variable resolution and row-pixel-mixed parallel image processors is presented. The chip consists of a CMOS sensor array, with row-parallel 6-bit Algorithmic ADCs, row-parallel gray-scale image processors, pixel-parallel SIMD Processing Element (PE) array, and instruction controller. The resolution of the image in th...

Contexts in source publication

Context 1
... algorithmic ADC has been widely used in CMOS image sensors for many years [23]. In the vision chip, we chose a traditional algorithmic ADC structure. The diagram of the ADC is shown in Figure 7. V in is the output signal of one pixel in the sensor array. V bias is the circuit bias voltage. V ref and V offset are two off-chip reference voltages for analog-to-digital conversion. Φ 1 and Φ 2 are non-overlapping two-phase clocks. Φ A, Φ B, Φ C and Φ D are switch signals derived from Φ 1 and Φ 2. The sampled signal is multiplied by 2 in the operational amplifier 'Op_1' and held in 'Op_2'. The output voltage of 'Op_2' is then compared with a reference voltage in the comparator 'Comp'. After each comparison, the ADC outputs one digital bit per clock cycle of Φ 1 or Φ 2. At the next clock, the output voltage of 'Op_2' is recycled to the ADC input to generate the next bit of the digital output. Converting an analog signal into a 6-bit digital signal takes seven clock cycles: one for sampling and six for the 6-bit output. The row-parallel processor is designed to calculate the sum, difference and comparison of two 6-bit data. Its diagram is shown in Figure 8. The 'Buf' converts the serial input data into parallel data. 'D_Shift_Enable' controls the data transfer column by column. 'B_Sel' switches the input of the ALU. Because the maximum sum of nine 6-bit data is less than 11 bits, the data width of the ALU is designed as 11 bits; it is composed of eleven single-bit ALUs. The operating instruction, 'Operation', comes from off-chip circuits. The search chain performs a function that finds the first logic '1' in a series of bits along a given direction. The length L of the search chain is defined as the number of bits being searched in the chain. An example of a search chain with L = 8 is given in Figure 9. The search ...
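The bit-per-cycle recursion of the algorithmic ADC and the search chain's first-'1' lookup described in this context can be sketched as a behavioral model. This is an illustrative software sketch of the two operations, not the chip's circuit; the function names `cyclic_adc` and `search_first_one` are our own.

```python
def cyclic_adc(vin, vref, bits=6):
    """Behavioral model of a 1-bit-per-cycle algorithmic (cyclic) ADC.

    Each cycle the residue is doubled (the multiply-by-2 stage, 'Op_1'),
    held ('Op_2'), and compared against the reference ('Comp'); the
    comparison result is the next output bit and the residue is fed
    back to the input for the next cycle. Assumes 0 <= vin < vref.
    """
    code = 0
    residue = vin
    for _ in range(bits):
        residue *= 2.0                # multiply-by-2 stage
        bit = 1 if residue >= vref else 0
        if bit:
            residue -= vref           # subtract reference when the bit is 1
        code = (code << 1) | bit      # MSB first, one bit per clock cycle
    return code


def search_first_one(chain, reverse=False):
    """Find the index of the first logic '1' along a direction in a bit chain.

    Returns -1 when the chain contains no '1'.
    """
    order = range(len(chain) - 1, -1, -1) if reverse else range(len(chain))
    for i in order:
        if chain[i] == 1:
            return i
    return -1
```

For example, a half-scale input converts to the code with only the MSB set, and the search over `[0, 0, 1, 0, 1]` finds index 2 in the forward direction and index 4 in the reverse direction.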
Context 2
... row-parallel processor is designed to calculate the sum, difference and comparison of two 6-bit data. Its diagram is shown in Figure 8. The 'Buf' converts the serial input data into parallel data. ...

Similar publications

Conference Paper
Full-text available
Searching for humans lost in vast stretches of ocean has always been a difficult task. In this paper, a range of machine vision approaches are investigated as candidate tools to mitigate the risk of human fatigue and complacency after long hours performing these kind of search tasks. Our two-phased approach utilises point target detection followed...

Citations

... To better support high-level processing, many research groups have designed application-specific vision processors for particular scenarios, such as edge extraction [32], motion detection [33] and object tracking [34]. [38]. In this architecture, the PE array can perform neighborhood image processing and the RP array can perform fast feature extraction, but feature classification still has to be completed by the MCU in the architecture, so feature classification and recognition are slow. ...
... A hierarchical parallel vision processor is a device that integrates multiple levels of processors exhibiting different parallelisms and complexities. Such processors can be extensively applied in areas including industrial automation and security monitoring [1][2][3][4][5][6]. With the rapid growth of computation requirements in space image-processing missions [7][8][9], vision processors exhibit excellent prospects for performing various image-processing tasks. ...
Article
This paper proposes novel single event upset (SEU) failure probability evaluation and periodic scrubbing techniques for hierarchical parallel vision processors. To automatically evaluate the SEU failure probability and identify all the critical elements in a processor, complementary fault injection methods based on a logic circuit simulator and Perl scripts are proposed. These methods can randomly inject faults into D flip-flops (DFFs) and various types of memory at the register transfer level (RTL) and evaluate the vision processor's performance. Based on the evaluation results, an accurate periodic scrubbing technique is proposed to increase processor availability. The results show that the peak availability of the processor over a period of one year can be improved from 18% to 99.9% by scrubbing the RISC program memory with a period of 10⁴ s. Therefore, we can improve the fault-tolerance performance of a vision processor while avoiding unnecessary area and power costs, using techniques ranging from evaluation to mitigation.
... The hierarchical parallel processing layers can store the image data and implement image processing algorithms in parallel. Many FD vision chips have been reported [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19]. The early vision chips consist of a two-dimensional (2D) array of processing elements (PEs) [6][7][8][9][10][11]. ...
... The digital vision chip can implement more complicated image processing algorithms and is more flexible [8][9][10][11][12][13][14][15]. The digital vision chips include application-specific and general-purpose chips. ...
... The chips show impressive performance but low flexibility. The general-purpose chip includes massively parallel programmable PEs with good flexibility [12][13][14][15]. The vision chip can reconfigure its hardware dynamically by chaining PEs and can perform edge detection, block matching, image centroid and optical flow calculation through programming [12]. ...
Article
The paper reviews the progress of neuromorphic vision chip research over the past decades. It focuses on two kinds of neuromorphic vision chips: frame-driven (FD) and event-driven (ED). The FD and ED vision chips differ greatly in system architecture, image sensing, image information coding, image processing algorithms and design methodology. Vision chips can overcome the serial data transmission and processing bottlenecks of traditional image processing systems and can perform high-speed image capture and real-time image processing. This paper selects one typical chip from each kind and introduces their architectures, image sensing schemes, image processing processors and system operation. The FD neuromorphic reconfigurable vision chip comprises a high-speed image sensor, a processing element array and a self-organizing map neural network. The FD vision chip has advantages in image resolution, static object detection, time-multiplexed image processing, and chip area. The ED neuromorphic vision chip system is based on an address-event-representation image sensor and an event-driven multi-kernel convolution network. The ED vision chip has advantages in fast sensing, low communication bandwidth, brain-like processing, and high energy efficiency. Finally, this paper discusses the architecture and challenges of future neuromorphic vision chips and indicates that a reconfigurable vision chip with integrated left- and right-brain functions in three-dimensional (3D) large-scale integration (LSI) technology is becoming a trend in vision chip research.
... A vision chip integrates image sensors with multilevel heterogeneous parallel processors on a single chip and performs real-time image processing. [1][2][3][4][5][6] Such chips are found in a wide range of critical application domains, such as video and image processing, 7) defect detection, robot vision, and control systems. Our device under verification (DUV) is a heterogeneous parallel processor for real-time vision applications. ...
Article
Implementing functional verification in a fast, reliable and effective manner is a challenging task in a vision chip verification process, mainly because of the stepwise nature of existing functional verification techniques. The verification complexity also stems from the fact that in most vision chip design cycles, extensive effort is focused on optimizing chip metrics such as performance, power, and area, while functional verification is not explicitly considered at the earlier stages where the soundest decisions are made. In this paper, we propose a semi-automatic property-driven verification technique in which the implementation of all verification components is based on design properties. We introduce a low-dimension property space between the specification space and the implementation space. The aim of this technique is to speed up the verification process for high-performance parallel processing vision chips. Our experimental results show that the proposed technique can improve the verification effort by up to 20% for a complex vision chip design while reducing simulation and debugging overheads.
... Compared with a serial implementation, the speed-up factor of the algorithm is roughly the number of PPUs, which is 64 in our case. The implemented high-speed tracking is also more robust than its counterparts [5,8,10,[17][18][19], because the object feature is invariant to illumination changes and is updated every frame to adapt to scale and rotation. The cooperation between the PE array and the PPU array greatly facilitates the object search procedure, thus making high-speed robust tracking possible. ...
Article
Full-text available
This paper proposes a heterogeneous parallel processor for a high-speed vision chip. It contains four levels of processors with different parallelisms and complexities: a processing element (PE) array processor, a patch processing unit (PPU) array processor, a self-organizing map (SOM) neural network processor and a dual-core microprocessor (MPU). The fine-grained PE array processor, middle-grained PPU array processor and SOM neural network processor carry out image processing in pixel-parallel, patch-parallel and distributed-parallel fashions, respectively. The MPU controls the overall system and executes some serial algorithms. The processor can significantly improve total system performance from low-level to high-level image processing. A prototype is implemented with a 64 × 64 PE array, an 8 × 8 PPU array, a 16 × 24 SOM network and a dual-core MPU. The proposed heterogeneous parallel processor introduces a new degree of parallelism, namely patch parallelism, which serves parallel local feature extraction and feature detection. It can flexibly perform state-of-the-art computer vision as well as various image processing algorithms at high speed. Various complicated applications including feature extraction, face detection, and high-speed tracking are demonstrated.
... Figure 1 shows the architecture of the vision SoC chip based on heterogeneous parallel processors. The vision SoC contains four levels of heterogeneous processors with different parallelisms and complexities: a processing element (PE) array processor [13], a patch processing unit (PPU) array processor, a self-organizing map (SOM) neural network processor [14] and a dual-core microprocessor (MPU). The PE circuit consists of a main memory, a FIFO, an ALU unit and some multiplexers. ...
Conference Paper
Full-text available
The demand for higher-performance systems on chip (SoC) based on massively parallel processors has increased significantly throughout the last decades, and design verification has become one of the major challenges in microelectronics. This paper proposes an efficient Layered Assertion-Based Verification (L-ABV) methodology for a vision system on chip based on heterogeneous parallel processors, focusing on pre-silicon verification solutions. First, we discuss how to reduce the degree of dependency between the verification task and the design task. Then we split the verification task into different logic layers. L-ABV has been successfully used in the vision SoC to increase verification productivity, and the results show that it effectively shortens the verification time.
... A large number of vision chips have been reported [3][4][5][6][7][8][9][10][11][12][13]. Most of these chips employ two-dimensional (2D) pixel-parallel array processors [3][4][5][6][7][12][13] and one-dimensional (1D) row-parallel array processors [8][9][10][11] to speed up low- and mid-level image processing, respectively. Usually, the structures of the processing units in the array processors must be very simple to meet reasonable chip area constraints. ...
... One typical processing unit contains only an adder and some simple logic gates, without dedicated multipliers. Conventionally, only a few classic low- and mid-level algorithms could be performed on these vision chips, such as image filtering [4,[7][8][9][10][11][12], edge detection [4,[7][8][9][10][11][12], image subtraction [6,10,11,13], thresholding [4,7,[9][10][11][12], mathematical morphology [3,5,6,11,12], intensity statistics [9][10][11], and moment calculation [3,10], because these algorithms have inherent massive parallelism and involve very simple operations. They can be directly and effectively mapped onto the vision chip architecture. ...
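As an illustration of why such operator-light algorithms map well onto adder-only processing units, both thresholding and a 3×3 neighborhood sum need only comparisons and additions. The Python below is our own sketch of the arithmetic involved, not code from any of the cited chips:

```python
def threshold(img, t):
    """Per-pixel thresholding: a single comparison per pixel,
    which maps directly onto a simple processing element."""
    return [[1 if p >= t else 0 for p in row] for row in img]


def box_sum3(img):
    """3x3 neighborhood sum using only additions (no multiplications),
    the kind of kernel an adder-only processing unit supports.
    Out-of-bounds neighbors are treated as zero."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            s = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        s += img[rr][cc]
            out[r][c] = s
    return out
```

On the chips, each pixel's comparison or accumulation runs in its own processing unit, so the doubly nested pixel loop above collapses into a constant number of parallel steps.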
... Since the area of the processing circuit is much larger than that of the pixel, such an architecture suffers from low sensor resolution and a small fill factor. Our chip separates the pixel array from the PE array to overcome these drawbacks [28]. In the separated architecture, the sizes of the pixel array and the PE array can be designed independently. ...
... Fig. 4 shows the three flexible mapping relationships between the pixel array and the PE array, with different sample intervals and slice sizes. The sub-sampling manner can be dynamically changed by the MPU to emulate a bio-inspired glance-stare vision [28]-[30]: in the first frame, a 4:1 sub-sampled image in a large slice is roughly processed (glanced) to quickly locate the object of interest; then in successive frames, only a 2:1 or 1:1 sub-sampled image in a smaller slice containing that object is processed further in detail (stared). Fig. 5 shows the reconfigurable PE circuit. ...
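The glance-stare sub-sampling described in this context can be sketched as a minimal model of n:1 sub-sampling. The frame contents and slice coordinates below are illustrative, not taken from the paper:

```python
def subsample(image, factor):
    """factor:1 sub-sampling: keep every factor-th pixel in each dimension."""
    return [row[::factor] for row in image[::factor]]

# Illustrative 8x8 frame with pixel value = row * 8 + col.
frame = [[r * 8 + c for c in range(8)] for r in range(8)]

# Glance: coarse 4:1 sample of the large slice to locate the object quickly.
glance = subsample(frame, 4)              # 2x2 coarse view
# Stare: 1:1 (full-resolution) sample of a smaller slice around the object.
stare = [row[2:6] for row in frame[2:6]]  # 4x4 detailed view
```

The glance step touches only 1/16 of the pixels, so a coarse pass over the whole frame costs about as much as a detailed pass over a slice a quarter of the frame's width.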
Article
Full-text available
This paper proposes a vision chip hybrid architecture with a dynamically reconfigurable processing element (PE) array processor and a self-organizing map (SOM) neural network. It integrates a high-speed CMOS image sensor, three von Neumann-type processors, and a non-von Neumann-type bio-inspired SOM neural network. The processors consist of a pixel-parallel PE array processor with O(N × N) parallelism, a row-parallel row-processor (RP) array processor with O(N) parallelism and a thread-parallel dual-core microprocessor unit (MPU) with O(2) parallelism. They execute low-, mid- and high-level image processing, respectively. The SOM network speeds up high-level processing in pattern recognition tasks by O(N/4 × N/4), which improves the chip performance remarkably. The SOM network can be dynamically reconfigured from the PE array to largely save chip area. A prototype chip with a 256 × 256 image sensor, a reconfigurable 64 × 64 PE array processor / 16 × 16 SOM network, a 64 × 1 RP array processor and a dual-core 32-bit MPU was implemented in a 0.18 µm CMOS image sensor process. The chip can perform image capture and various levels of image processing at high speed and in a flexible fashion. Various complicated applications, including M-S functional solution, horizon estimation, hand gesture recognition and face recognition, are demonstrated at high speed from several hundred to >1000 fps.
... A large number of vision chips have been reported [3][4][5][6][7][8][9][10][11][12][13]. Most of these chips employ two-dimensional (2D) pixel-parallel array processors [3][4][5][6][7][12][13] and one-dimensional (1D) row-parallel array processors [8][9][10][11] to speed up low- and mid-level image processing, respectively. Usually, the structures of the processing units in the array processors must be very simple to meet reasonable chip area constraints. ...
... One typical processing unit contains only an adder and some simple logic gates, without dedicated multipliers. Conventionally, only a few classic low- and mid-level algorithms could be performed on these vision chips, such as image filtering [4,[7][8][9][10][11][12], edge detection [4,[7][8][9][10][11][12], image subtraction [6,10,11,13], thresholding [4,7,[9][10][11][12], mathematical morphology [3,5,6,11,12], intensity statistics [9][10][11], and moment calculation [3,10], because these algorithms have inherent massive parallelism and involve very simple operations. They can be directly and effectively mapped onto the vision chip architecture. ...
Article
Full-text available
This paper proposes a massively parallel keypoint detection and description (MP-KDD) algorithm for a vision chip with parallel array processors. The MP-KDD algorithm largely reduces the computational overhead by removing all floating-point and multiplication operations while preserving the essence of the popular SIFT and SURF algorithms. The MP-KDD algorithm can be directly and effectively mapped onto the pixel-parallel and row-parallel array processors of the vision chip. The vision chip architecture is also enhanced to provide direct memory access (DMA) and random access to the array processors so that the MP-KDD algorithm can be executed more effectively. An FPGA-based vision chip prototype is implemented to test and evaluate the MP-KDD algorithm. Its image processing speed reaches 600–760 fps with high accuracy for complex vision applications such as scene recognition.
... The vision chip integrates the image sensor and the processing circuits on a single silicon device to achieve high-speed, low-power image sensing and processing. [1][2][3][4] Recent research shows that the vision chip has broad prospects not only in high-speed tracking 5,6 but also in machine learning and pattern recognition. 7,8 The pixel-parallel processor is a key circuit module in the digital vision chip. ...
Conference Paper
Full-text available
Local memory architecture plays an important role in a high-performance massively parallel vision chip. In this paper, we propose an enhanced memory architecture with compact circuit area, designed in a full-custom flow. The memory consists of separate master-stage static latches and shared slave-stage dynamic latches. We use split transmission transistors on the input data path to enhance tolerance to charge sharing and to achieve random read/write capability. The memory is designed in a 0.18 µm CMOS process, with an area overhead of 16.6 µm²/bit. Simulation results show that the maximum operating frequency reaches 410 MHz and the corresponding peak dynamic power consumption of a 64-bit memory unit is 190 µW under a 1.8 V supply voltage.