Using the default 32-bit floating-point parameters vs. using clustered parameters.

Source publication
Preprint
Transformers provide promising accuracy and have become popular in various domains such as natural language processing and computer vision. However, because of their massive number of model parameters and their memory and computation requirements, they are not suitable for resource-constrained, low-power devices. Even with high-performance and specializ...

Contexts in source publication

Context 1
... use the clustered parameters, the corresponding 8-bit index is fetched instead of the 32-bit parameter in the baseline. The index is used to pick the corresponding centroid from the very small table of centroids as shown in Figure 5. In this paper, we explore scalar clustering in which every single parameter in the model will be directly represented by an index. ...
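
The index-plus-centroid-table scheme described in Context 1 can be illustrated with a small NumPy sketch. This is not the source publication's implementation: the helper names `cluster_parameters` and `lookup`, the use of 1-D k-means, the evenly spaced initialisation, and the iteration count are all assumptions made for illustration. The only details taken from the text are that each scalar parameter is replaced by an 8-bit index and that the index selects a centroid from a very small table (at most 256 entries for an 8-bit index).

```python
import numpy as np


def cluster_parameters(weights, n_clusters=256, n_iters=10):
    """Hypothetical helper: cluster every scalar parameter with 1-D k-means.

    Returns (indices, centroids), where indices has the shape of `weights`
    and dtype uint8, and centroids is a float32 table of n_clusters values.
    """
    flat = weights.astype(np.float32).ravel()
    # Assumption: start with centroids spread evenly over the value range.
    centroids = np.linspace(flat.min(), flat.max(), n_clusters, dtype=np.float32)
    for _ in range(n_iters):
        # Assign each parameter to its nearest centroid.
        idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each non-empty centroid to the mean of its members.
        sums = np.bincount(idx, weights=flat, minlength=n_clusters)
        counts = np.bincount(idx, minlength=n_clusters)
        nonempty = counts > 0
        centroids[nonempty] = (sums[nonempty] / counts[nonempty]).astype(np.float32)
    # Final assignment against the updated table.
    idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return idx.astype(np.uint8).reshape(weights.shape), centroids


def lookup(indices, centroids):
    """Indirect access: fetch the 8-bit index, then pick its centroid."""
    return centroids[indices]


# Toy weight matrix standing in for one layer's float32 parameters.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(64, 64)).astype(np.float32)

indices, centroids = cluster_parameters(w)
w_hat = lookup(indices, centroids)

baseline_bytes = w.nbytes                             # 4 bytes per parameter
clustered_bytes = indices.nbytes + centroids.nbytes   # 1 byte per parameter + table
print(baseline_bytes, clustered_bytes)
```

In this sketch the per-parameter storage drops from four bytes to one, plus a fixed table of 256 float32 centroids that is shared by all parameters. The extra `centroids[indices]` step is the indirect access that the contexts refer to.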
Context 2
... it shows that, despite the extra instructions and kernel overhead needed to perform the indirect accesses, as shown in Figure 5, the reduced pressure on the memory system that results from clustered parameters provides a significant benefit, especially on GPUs with more computing resources, such as the GPU in Conf-3. To demonstrate the advantages of using clustered data, the results are obtained while putting maximum pressure on the memory subsystem. ...

Similar publications

Article
Thanks to the boom in computer vision techniques and artificial intelligence algorithms, artificial rearing of animals has become more achievable in real production scenarios. Improving the accuracy of chicken day-age detection is one such case, and it is of great importance for chicken rearing. To solve this problem, we proposed an att...
Preprint
Abstract: Pathology slides of malignancies are segmented using lightweight convolutional neural networks (CNNs) that may be deployed on mobile devices. This is made possible by preprocessing candidate images to make CNN analysis t...