Flow diagram of a QNN for object detection.

Source publication
With the recent growth of the Internet of Things (IoT) and the demand for faster computation, quantized neural networks (QNNs) and QNN-enabled IoT devices can offer better performance than conventional convolutional neural networks (CNNs). With the aim of reducing memory access costs and increasing computation efficiency, QNN-enabled devices are expected...
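The memory-access argument can be made concrete with a minimal sketch of uniform weight quantization (plain NumPy; the function name, the 8-bit choice, and the tensor shapes are illustrative assumptions, not taken from the article). Storing k-bit integer codes plus a single scale in place of float32 weights cuts weight memory traffic by roughly 32/k.

```python
import numpy as np

def quantize_uniform(w, bits=8):
    """Uniformly quantize a float32 tensor to signed `bits`-bit codes.

    Valid for bits <= 8 here, since the codes are stored as int8.
    Storing the codes plus one scale instead of float32 weights
    reduces weight memory traffic by roughly 32 / bits.
    """
    qmax = 2 ** (bits - 1) - 1                  # e.g. 127 for 8 bits
    scale = np.abs(w).max() / qmax              # map the largest |weight| to qmax
    codes = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return codes, scale

w = np.random.randn(64, 64).astype(np.float32)  # hypothetical weight matrix
codes, scale = quantize_uniform(w)
w_hat = codes.astype(np.float32) * scale        # dequantized approximation
print("max quantization error:", np.abs(w - w_hat).max())  # bounded by ~scale / 2
```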

Context in source publication

Context 1
... a suitable quantization architecture should be able to store useful information in continuous variables and is critical to network performance. Figure 7 describes the typical blocks of a QNN flowchart. ...
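As a rough illustration of such a block, the sketch below chains a binarized-weight layer, a full-precision batch normalization, and a binarizing activation; the batch-norm statistics and float accumulator are the continuous variables that carry information between binary layers. The layer is written as a matrix multiply rather than a convolution for brevity, and all names are assumptions rather than the figure's notation.

```python
import numpy as np

def binarize(x):
    # Quantize to {-1, +1}; zero maps to +1 by convention.
    return np.where(x >= 0, 1.0, -1.0).astype(np.float32)

def qnn_block(x, w_fp, gamma, beta, eps=1e-5):
    """One QNN block: binary-weight layer -> batch norm -> binary activation.

    The float accumulator `y` and the batch-norm parameters (gamma, beta)
    are the continuous variables preserved between binary layers.
    """
    y = x @ binarize(w_fp)                          # binary-weight "conv" (as matmul)
    mu, var = y.mean(axis=0), y.var(axis=0)
    y_bn = gamma * (y - mu) / np.sqrt(var + eps) + beta
    return binarize(y_bn)                           # quantized activation

x = binarize(np.random.randn(32, 128))              # binary inputs from a previous block
w = np.random.randn(128, 64).astype(np.float32)     # latent full-precision weights
out = qnn_block(x, w,
                gamma=np.ones(64, np.float32),
                beta=np.zeros(64, np.float32))
print(out.shape, np.unique(out))                    # (32, 64) [-1.  1.]
```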

Citations

... However, GPUs are no longer an ideal computing platform for deploying neural networks, owing to their high power consumption. In recent years, field-programmable gate arrays (FPGAs) have attracted increasing attention from academia and industry for CNN accelerators [7,8], owing to their abundance of computing units and on-chip memory blocks. ...
Speech recognition has progressed tremendously in the area of artificial intelligence (AI). However, the performance of real-time offline Chinese speech recognition neural network accelerators for edge AI needs to be improved. This paper proposes a configurable convolutional neural network accelerator based on a lightweight speech recognition model, which can dramatically reduce hardware resource consumption while guaranteeing an acceptable error rate. For convolutional layers, the weights are binarized to reduce the number of model parameters and improve computational and storage efficiency. A multichannel shared computation (MCSC) architecture is proposed to maximize the reuse of weight and feature-map data. A binary weight-sharing processing engine (PE) is designed to avoid limiting the number of multipliers. A custom instruction set is established according to the variable length of the voice input to configure parameters for adapting to different network structures. Finally, a ping-pong storage method is used when the feature map is the input. We implemented this accelerator on a Xilinx ZYNQ XC7Z035 at a working frequency of 150 MHz. The processing times for 2.24 s and 8 s of speech were 69.8 ms and 189.51 ms, respectively, and the convolution performance reached 35.66 GOPS/W. Compared with other computing platforms, the accelerator performs better in terms of energy efficiency, power consumption, and hardware resource consumption.
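A small sketch of why binarized weights pay off in hardware: with weights constrained to {-1, +1}, each multiply-accumulate reduces to an add or a subtract of the input sample, so a processing engine needs no multipliers. This illustrates only the general principle; the MCSC architecture and the custom instruction set from the paper are not reproduced here, and all names and shapes are assumptions.

```python
import numpy as np

def binary_weight_conv1d(x, w_sign):
    """1-D convolution with weights constrained to {-1, +1}.

    Each output is computed without multiplies: samples aligned with a
    +1 weight are added, samples aligned with a -1 weight are subtracted.
    """
    k = len(w_sign)
    out = np.empty(len(x) - k + 1, dtype=np.float32)
    for i in range(len(out)):
        window = x[i:i + k]
        # multiplier-free MAC: add where the weight is +1, subtract where -1
        out[i] = window[w_sign > 0].sum() - window[w_sign < 0].sum()
    return out

x = np.random.randn(16).astype(np.float32)
w = np.where(np.random.randn(5) >= 0, 1, -1)        # binarized kernel
ref = np.array([x[i:i + 5] @ w for i in range(len(x) - 4)])  # naive MAC reference
np.testing.assert_allclose(binary_weight_conv1d(x, w), ref, rtol=1e-5, atol=1e-5)
print("multiplier-free result matches the multiply-accumulate reference")
```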