The quantization of the convolution architecture.


Source publication
Mixed-precision quantization mostly predetermines the model's bit-width settings before actual training, because the bit-width sampling process is non-differentiable, which yields sub-optimal performance. Worse still, the conventional static, quality-consistent training setting, i.e., all data is assumed to be of the same quality across training and inference,...

Context in source publication

Context 1
... For a CNN with L convolution layers, we define Θ as the set of learnable parameters, and ω_l ∈ Θ as the vanilla full-precision weight parameters of layer l. A typical quantization-aware training CNN structure can be described as an intertwined pipeline of quantization → convolution → dequantization. As shown in Fig. 2, the left rectangle shows the quantization of the convolutional input, where ω_l and x_l are the full-precision model weights and activations of layer l, respectively, s is the minimum scale that can be represented after fixed-point quantization, and z represents the quantized fixed-point value corresponding to the ...
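As a rough illustration of the quantization → convolution → dequantization pipeline described above, the following is a minimal PyTorch-style sketch. It assumes per-tensor affine (scale s, zero-point z) fake quantization; the function names (quantize, dequantize, calc_scale_zero_point, quant_conv) and the choice of 8-bit range are illustrative assumptions, not the source paper's implementation.

```python
# Sketch of quantization -> convolution -> dequantization for one layer l.
# Assumes per-tensor affine quantization; not the paper's exact method.
import torch
import torch.nn.functional as F

def quantize(x, s, z, num_bits=8):
    """Map full-precision values to fixed-point integers using scale s and zero-point z."""
    qmin, qmax = 0, 2 ** num_bits - 1
    return torch.clamp(torch.round(x / s + z), qmin, qmax)

def dequantize(q, s, z):
    """Map fixed-point integers back to approximate full-precision values."""
    return (q - z) * s

def calc_scale_zero_point(x, num_bits=8):
    """Derive s (the smallest representable step after quantization) and z from the value range."""
    qmax = 2 ** num_bits - 1
    x_min, x_max = x.min(), x.max()
    s = (x_max - x_min).clamp(min=1e-8) / qmax
    z = torch.round(-x_min / s)
    return s, z

def quant_conv(x_l, w_l, num_bits=8):
    """Fake-quantized convolution: quantize weights/activations, convolve, dequantize."""
    s_x, z_x = calc_scale_zero_point(x_l, num_bits)
    s_w, z_w = calc_scale_zero_point(w_l, num_bits)
    x_q = dequantize(quantize(x_l, s_x, z_x, num_bits), s_x, z_x)
    w_q = dequantize(quantize(w_l, s_w, z_w, num_bits), s_w, z_w)
    return F.conv2d(x_q, w_q, padding=1)

# Usage: one 3x3 convolution with full-precision weights w_l and activations x_l.
x_l = torch.randn(1, 16, 32, 32)
w_l = torch.randn(32, 16, 3, 3)
y_l = quant_conv(x_l, w_l)
```

In this sketch the quantize/dequantize pair is applied to both weights and activations before the floating-point convolution, which is the usual way quantization-aware training simulates fixed-point arithmetic while keeping the forward pass differentiable in practice.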