Algorithm in C and its corresponding x86-64 instructions.

Source publication

Instruction Set Design for Coarse-Grained Reconfigurable Architectures

Conference Paper

Full-text available

Aug 2019

Maria D. Vieira

In the last ten years, the demand for performance improvements in computing systems has not been fulfilled by CPU enhancements. A solution widely applied in different areas of computing is the use of hardware accelerators. In the industrial scenario, accelerators such as Graphics Processing Units (GPUs) are more popular because they offer a well-defined and establ...

Context 1

... Section compares the x86-64 instruction set and our instruction set. In Figure 4, we show an algorithm implemented in C (a) and in x86-64 assembly (b). We choose to compare our instruction set with x86-64 assembly because we aim to compare the i5 with our architecture. ...

Context 2

... 3 (b) shows our instruction set, which represents the operations graph mapped in Figure 1. These instructions are equivalent to the x86-64 instructions shown in Figure 4 (b). That is, the operations of the graph in 3 (b) correspond to the algorithm in 4 (a). ...

Permutation Regularization and Bit Flipping:...

Explanations of the NEPs construction in conventional manner

Explanations of the NEPs construction in uniformed manner

Energy efficient noise error pattern generator for guessing decoding in bursty channels

Article

Full-text available

Feb 2024

For the hard guessing random additive noise decoding Markov order (GRAND-MO) algorithm, it is crucial to develop an efficient noise error patterns (NEPs) generator to facilitate its application in bursty channels. This paper proposes a practical hardware realization by generating the NEPs in a sequential manner. Based on classification of the four...

This figure presents the steps that are necessary to process a search...

This is the memory layout of the hash table on the GPU. We use three...

Benchmark Framework Structure: the framework consists of three...

Benchmark with throughput in case of an increasing number of threads, b...

Hybrid throughput with increase in long string amount, 128 Threads GPU,...

Hybrid CPU/GPU/APU accelerated query, insert, update and erase operations in hash tables with string keys

Article

Full-text available

May 2023

Modern computer systems can use different types of hardware acceleration to achieve massive performance improvements. Some accelerators, such as FPGAs and dedicated GPUs (dGPUs), need optimized data structures for the best performance and often use dedicated memory. In contrast, APUs, which are a combination of a CPU and an integrated GPU (iGPU), support s...

Fig. 1 Explanations of the NEPs construction in conventional manner.

Fig. 4 Architecture design of the permutation generation module

Fig. 6 PER performance simulation and comparison.

Energy Efficient Noise Error Pattern Generator for Guessing Decoding in Bursty Channels

Preprint

Full-text available

Oct 2023

For the hard guessing random additive noise decoding Markov order (GRAND-MO) algorithm used in bursty channels, this paper presents an efficient noise error patterns (NEPs) generator. By converting the NEPs generation process into practical engineering realization, the "1" and "0" burst permutations are generated in a sequential manner. Then th...

A Run-time Hardware Routing Implementation for CGRA Overlays

Conference Paper

Full-text available

Jul 2020

Maria D. Vieira

Accelerators have become a wide-reaching solution for increasing computing systems' performance. However, they bring a trade-off between ease of programming and energy efficiency. FPGAs are highly energy-efficient accelerators, but complex to program. CGRA Overlays offer a more straightforward programming interface for FPGAs and can use dataflows...

Figure 2: Diagrams with layer-by-layer details of the two architectures...

Fast Neural Network Inference on FPGAs for Triggering on Long-Lived Particles at Colliders

Preprint

Full-text available

Jul 2023

Experimental particle physics demands a sophisticated trigger and acquisition system capable of efficiently retaining the collisions of interest for further investigation. Heterogeneous computing with the employment of FPGA cards may emerge as a trending technology for the triggering strategy of the upcoming high-luminosity program of the Large Hadron...
