Multiply-Accumulate (MAC) RNS cell (adopted from [37])

Source publication

Residue arithmetic systems in cryptography: a survey on modern security applications

Article

Sep 2020

Dr. Dimitrios Schinianakis

In the last few years, the ancient residue number system has gained renewed scientific interest and has emerged as an interesting alternative in the field of secure hardware implementations. In this survey, however, we investigate some modern and non-typical applications of RNS in the areas of post-quantum cryptography, cloud infrastructures, and h...

Context 1

... categories share common characteristics; for example, all calculations are simplified down to modulo multiply-accumulate (MAC) operations (i.e., a multiplication of small RNS digits followed by an addition and a modulo reduction by the respective RNS modulus; the output is recursively fed to the input to repeat the process). An example of a suitable MAC hardware architecture that supports BC algorithm 3.3 is shown in Figure 3 [37]. Each MAC unit comprises a multiplier, an adder, and a modular reduction unit for each modulus of the RNS base (of the special form $2^r - \mu_i$). ...
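The MAC operation described above can be sketched in software as follows. This is a minimal illustration, not the architecture of [37]: the function names (`reduce_special`, `mac_channel`, `rns_mac`), the word size r = 8, and the example values of µ are all assumptions chosen for the example. The reduction step relies only on the identity 2^r ≡ µ (mod 2^r − µ).

```python
def reduce_special(x, r, mu):
    """Reduce x modulo m = 2**r - mu, using 2**r ≡ mu (mod m).

    Assumes 0 <= mu < 2**r and x >= 0 (illustrative helper, not from [37]).
    """
    m = (1 << r) - mu
    mask = (1 << r) - 1
    # Fold the high part down: x = hi * 2**r + lo ≡ hi * mu + lo (mod m).
    while x >> r:
        x = (x >> r) * mu + (x & mask)
    # Now x < 2**r, so a few conditional subtractions finish the job.
    while x >= m:
        x -= m
    return x


def mac_channel(a_digits, b_digits, r, mu):
    """One RNS channel: multiply, add to the accumulator, reduce, feed back."""
    acc = 0
    for a, b in zip(a_digits, b_digits):
        acc = reduce_special(a * b + acc, r, mu)
    return acc


def rns_mac(xs, ys, params):
    """Run the MAC independently in every channel; params is a list of (r, mu)."""
    result = []
    for r, mu in params:
        m = (1 << r) - mu
        a_digits = [x % m for x in xs]  # small RNS digits of the operands
        b_digits = [y % m for y in ys]
        result.append(mac_channel(a_digits, b_digits, r, mu))
    return result


if __name__ == "__main__":
    # Hypothetical base: moduli 255, 253, 251 (pairwise coprime).
    params = [(8, 1), (8, 3), (8, 5)]
    print(rns_mac([17, 200, 91], [5, 123, 250], params))
```

Each channel works only on r-bit digits and never communicates with the others, which is why the hardware version can use one small multiplier, adder, and reduction unit per modulus.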

Fig. 1. Number of operations of the iterations 1-5 for the OM algorithm.

Fig. 2. Number of operations of the iterations 6-10 for the OM algorithm.

Fig. 3. Number of operations of the iterations 11-14 for the OM algorithm.

Fig. 4. Number of operations of the iterations 15-18 for the OM algorithm.

Fig. 6. Execution of BM algorithm when b=10.

Speeding up the Multiplication Algorithm for Large Integers

Article

Dec 2020

Multiplication is one of the basic operations that influence the performance of many computer applications such as cryptography. The main challenge of the multiplication operation is the cost of the operation as compared to other basic operations such as addition and subtraction, especially when the size of the numbers is large. In this work, we in...

Area-Power-Delay-Efficient Multi-Modulus Multiplier Based on Area-Saving Hard Multiple Generator Using Radix-8 Booth-Encoding Scheme on Field Programmable Gate Array

Article

Jan 2024

A multi-modulus architecture based on the radix-8 Booth encoding of a modulo (2ⁿ − 1) multiplier, a modulo (2ⁿ) multiplier, and a modulo (2ⁿ + 1) multiplier is proposed in this paper. It uses the original single circuit and shares many common circuit characteristics with a small extra circuit to carry out multi-modulus operations. Compared with a previous radix-4 study, the radix-8 architecture can increase the modulation multiplication encoding selection from three codes to four codes. This reduces the use of partial products from ⌊n/2⌋ to ⌊n/3⌋ + 1, but it increases the operation complexity for multiplication by three circuits. A hard multiple generator (HMG) is used to address this problem. Two judgment signals in the multi-modulus circuit can be used to perform three operations of the modulo (2ⁿ − 1) multiplier, modulo (2ⁿ) multiplier, and modulo (2ⁿ + 1) multiplier at the same time. The weighted representation is used to reduce the number of partial products. Compared with previously reported methods in the literature, the proposed approach achieves better performance: it is more area-efficient, is faster, consumes less power, and has a lower area-delay product (ADP) and power-delay product (PDP). With the multi-modulus HMG, the proposed modified architecture can save 34.48–55.23% of hardware area. Compared with previous studies on the multi-modulus multiplier, the proposed architecture can save 22.78–35.46%, 4.12–11.15%, 12.59–24.73%, 27.88–38.88%, and 20.49–27.85% of hardware area, delay time, dissipation power, ADP, and PDP, respectively. Xilinx field programmable gate array (FPGA) Vivado 2019.2 tools and the Verilog hardware description language are used for synthesis and implementation. The Xilinx Artix-7 XC7A35T-CSG324-1 chipset is adopted to evaluate the performance.

Measuring 3- and 4-Moduli Sets Delay Per Bit in Residue Number System: A Survey

Article

May 2023
IETE J RES

Given the efficiency of the residue number system in high-speed and low-power applications, a wide variety of moduli sets have been introduced in the literature. Selecting an appropriate moduli set is one of the most important issues of using a residue number system. Moduli sets can be evaluated by several factors such as dynamic range, balanced channels, fast and low-power modular operations, and efficient reverse converter. In this paper, 3- and 4-moduli sets are separately reviewed and compared from a different perspective. Furthermore, the most efficient 3- and 4-moduli sets are suggested. A new parameter for average delay comparison is introduced, which shows the delay of every bit in the dynamic range providing a fair comparison of moduli sets with different numbers of moduli. Our comparison showed that either balanced channels or the delay of computations are essential factors for designing an efficient new moduli set.

A lightweight Encryption Method For Privacy-Preserving in Process Mining

Preprint

Apr 2023

Novel technological achievements in the fields of business intelligence, business management and data science are based on real-time and complex virtual networks. Data sharing among a large number of organizations, which leads to systems with high computational complexity, is one of the notable characteristics of current business networks. Discovery, conformance and enhancement of the business processes are performed using the generated event logs. In this regard, one of the overlooked challenges is privacy preservation in the field of process mining in industry. To preserve data privacy with the low-computational-complexity structure that current digital business technology requires, a novel lightweight encryption method based on the Haar transform and a private key is proposed in this paper. We compare the proposed method with the well-known homomorphic cryptosystem and Walsh-Hadamard encryption (WHE) in terms of cryptography, computational complexity and structure vulnerability. The analyses show that the proposed method anonymizes the event logs with significantly lower complexity and higher accuracy than the two aforementioned cryptosystems.

Design of unsigned 2ⁿ+1 parallel residue arithmetic multiplier

Conference Paper

Apr 2023

Residue multiplication operations are extensively used in Residue Number System (RNS) based cryptosystem architectures. Aiming to increase the speed of RNS crypto processors, a new parallel unsigned 2ⁿ+1 residue multiplier is designed in this work. The mathematical model, algorithm, architecture and FPGA implementation are presented. The proposed residue multiplier is described in Verilog HDL and synthesized in an Application-Specific Integrated Circuit (ASIC) environment. Cadence RTL Compiler estimates the area, power and delay performance parameters using various CMOS libraries. The proposed 2ⁿ+1 residue multiplication scheme saves 13% of the area, improves the speed by 19% and the PDP by 23% compared to recent 2ⁿ+1 residue multipliers.

Semi-primitive roots and the discrete logarithm modulo $2^k$

Preprint

Nov 2022

Bianca Sosnovski

We establish a connection between semi-primitive roots of the multiplicative group of integers modulo $2^{k}$ where $k\geq 3$, and the logarithmic base in the algorithm introduced by Fit-Florea and Matula (2004) for computing the discrete logarithm modulo $2^{k}$. Fit-Florea and Matula used properties of the semi-primitive root 3 modulo $2^{k}$ to obtain their results and provided a conversion formula for other possible bases. We show that their results can be extended to any semi-primitive root modulo $2^{k}$ and also present a generalized version of their algorithm to find the discrete logarithm modulo $2^{k}$. Various applications in cryptography, symbolic computation, and others can potentially benefit from higher precision hardware integer arithmetic. The algorithm is suitable for hardware support of applications where fast arithmetic computation is desirable.

Design of Modified RNS-PPA Based FIR Filter for High-Speed Application

Article

Oct 2022

The primary aim of this paper is to present the design and implementation of area-efficient, high-performance FIR filters based on the RNS (Residue Number System), with 4, 8 and 16 taps and 8-bit input. Additionally, RNS arithmetic is a valuable tool for theoretical research into the limits of fast arithmetic. The proposed techniques also involve a few addition operations; using a conventional adder for these decreases the speed of operation and increases the number of logic gates. To overcome these issues, we use a Ladner-Fischer parallel prefix adder to lower the delay and area. First, the multiplier is designed using the RNS approach, in which the delay is decreased by 78.57% and the power dissipation is likewise reduced to 64.65% for the RNS_PPA multiplier. A combination of these algorithms yields a new structure with high speed and low implementation area in a single multiplier for the FIR filter, using Xilinx 14.7.

An extendable key space integer image-cipher using 4-bit piece-wise linear cat map

Article

Sep 2022
MULTIMED TOOLS APPL

This paper presents a multiplierless image-cipher, with extendable 2048-bit key-space, based on a 4-dimensional (4D) quantized piece-wise linear cat map (PWLCM). The quantized PWLCM exhibits limit-cycles of 4-bit encoded integers with periods greater than 10⁷. The synthesis of the PWLCM in a finite state space makes it possible to eliminate the undesirable finite precision effect due to the hardware realization. The proposed image-cipher combines chaos, modular arithmetic, and lattice-based cryptography to encrypt a color image by performing pixel permutation and diffusion in a single operation. Further, an image-dependent confusion operation based on an 8-bit 2D-PWLCM is performed on the whole image to enhance security. In order to increase the key-space without key duplication, 16 × 16 sub-images are modified using sub-keys of different lattice length vectors generated from the external key. Both simulations and security analyses confirm that the proposed algorithm can resist common cipher attacks, in addition to its advantages such as simplicity, ease of implementation on low-end processors and an extensible key-space that allows it to adapt easily even to future post-quantum computing attacks.

Generating Very Large RNS Bases

Article

Jul 2022

Residue Number Systems (RNS) are proven to be effective in speeding up computations involving additions and products. For these representations, there exist efficient modular reduction algorithms that can be used in the context of arithmetic over finite fields or modulo large numbers, especially in cryptographic engineering. Their independence allows random draws of bases, which also makes it possible to protect against side-channel attacks, or even to detect them using redundancy. These systems are easily scalable; however, the existence of large bases for some specific uses remains a difficult question. In this article, we present four techniques to extract RNS bases from specific sets of integers, giving better performance and flexibility than previous works in the literature. While our techniques do not solve every possible case efficiently, we provide techniques to provably and efficiently find the largest possible available RNS bases in several cases, improving the state of the art on various works of the recent literature.

Evaluation of the reverse transformation methods complexity of the residual number system for secure data storage

Article

Jan 2022

Serhii Kulyna

The methods of conversion from the residual number system to the decimal number system based on the classical Chinese remainder theorem (CRT) and its improvements, CRT I and CRT II, are considered in this paper. Analytical dependences of the time complexity of these methods are constructed and analyzed. As a result of the investigation, it is established that CRT II is more efficient than the other methods mentioned above. Examples of the implementation of direct and reverse conversion of RNS based on the application of CRT, CRT I and CRT II are given.
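As a rough illustration of direct and reverse conversion, the sketch below uses only the classical CRT; the CRT I and CRT II variants discussed in the paper are not reproduced here. The function names `rns_forward` and `rns_reverse_crt` and the base {3, 5, 7} are assumptions made for the example, and the moduli are assumed to be pairwise coprime.

```python
from math import prod


def rns_forward(x, moduli):
    """Direct conversion: integer -> list of residues, one per modulus."""
    return [x % m for m in moduli]


def rns_reverse_crt(residues, moduli):
    """Reverse conversion with the classical CRT.

    x = sum(r_i * M_i * (M_i^{-1} mod m_i)) mod M, where M is the product
    of all moduli and M_i = M / m_i. Assumes pairwise coprime moduli.
    """
    M = prod(moduli)
    total = 0
    for r_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        inv = pow(M_i, -1, m_i)  # modular inverse (Python 3.8+)
        total += r_i * M_i * inv
    return total % M


if __name__ == "__main__":
    base = [3, 5, 7]                    # dynamic range M = 105
    digits = rns_forward(52, base)      # residues of 52 in this base
    print(digits, rns_reverse_crt(digits, base))
```

The final reduction modulo M operates on a sum that can be far larger than M, which is the main cost that improved CRT formulations aim to lower.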

Sign Detection and Signed Integer Comparison for the 3-Moduli Set {2^n±1,2^(n+k)}

Article

Sep 2021

Comparison, division and sign detection are considered complicated operations in the residue number system (RNS). A straightforward solution is to convert RNS numbers into binary formats and then perform complicated operations using conventional binary operators. If efficient circuits are provided for comparison, division and sign detection, the application of RNS can be extended to cases that include these operations. For RNS comparison in the 3-moduli set {2^n±1,2^(n+k)}, we have found only one hardware realization. In this paper, an efficient RNS comparator is proposed for this moduli set; it employs a sign detection method and operates more efficiently than its counterparts. The proposed sign detector and comparator utilize dynamic range partitioning (DRP), which has been recently presented for unsigned RNS comparison. The delay and cost of the proposed comparator are lower than those of previous works, making it appropriate for RNS applications with limited delay and cost.
