VHDL implementation of carry save adder

Source publication

Efficient hardware architectures for modular multiplication on FPGAs

Conference Paper

Sep 2005

The computational foundation of most public-key cryptosystems is modular multiplication. Improving the efficiency of modular multiplication directly improves the efficiency of the whole cryptosystem. This paper presents an implementation and comparison of three recently proposed, highly efficient architectures for modular multiplic...

Error Correction Coding in a Serial Digital Multi-Gigabit Communication System: Implementation and Results

Article

Jan 2006

An error correction code (ECC) design for a wired serial digital multi-gigabit communication system is presented. The code design combines a maximum run length code and a 2-error-correcting primitive BCH code. The implementation of the design in a field programmable gate array (FPGA) and the logic design of this code for low latency is discussed. R...

A novel SEU, MBU and SHE handling strategy for Xilinx Virtex-4 FPGAs

Conference Paper

Aug 2009

This paper presents a new single event upset (SEU), multiple bit upset (MBU) and single hardware error (SHE) mitigation strategy to be used in Virtex-4 FPGAs. This strategy aims to increase not only the effectiveness of traditional triple module redundancy (TMR), but also the overall system availability. Frame readback with ECC detection and frame...

Lightweight Architecture for Elliptic Curve Scalar Multiplication over Prime Field

Article

Jul 2022

In this paper, we present a novel lightweight elliptic curve scalar multiplication architecture for random Weierstrass curves over prime field Fp. The elliptic curve scalar multiplication is executed in Jacobian coordinates based on the Montgomery ladder algorithm with (X,Y)-only common Z coordinate arithmetic. At the finite field operation level, the adder-based modular multiplier and modular divider are optimized by the pre-calculation method to reduce the critical path while maintaining low resource consumption. At the group operation level, the point addition and point doubling methods in (X,Y)-only common Z coordinate arithmetic are modified to improve computation parallelism. A compact scheduling method is presented to improve the architecture’s performance, which includes appropriate scheduling of finite field operations and specific register connections. Compared with existing works, our design is implemented on the FPGA platform without using DSPs or BRAMs for higher portability. It utilizes 6.4k~6.5k slices in Kintex-7, Virtex-7, and ZYNQ FPGA and executes an elliptic curve scalar multiplication for a field size of 256-bit in 1.73 ms, 1.70 ms, and 1.80 ms, respectively. Additionally, our design is resistant to timing attacks, simple power analysis attacks, and safe-error attacks. This architecture outperforms most state-of-the-art lightweight designs in terms of area-time products.

Edge enhanced deep learning system for IoT edge device security analytics

Article

Dec 2021
CONCURR COMP-PRACT E

The processing of locally harvested data at the physically accessible edge devices opens a new avenue of security threats for edge enhanced analytics. Cryptographic algorithms are used to secure the data being processed on the edge device. However, the implementation weakness of the algorithms on the edge devices can lead to side-channel attack vulnerability, which is exacerbated with the application of machine-learning techniques. This research proposes a deep learning-based system integrated at the edge device to identify the side-channel leakages. To design such a deep learning-based system, one of the challenges is formulating the suitable attack model for the underlying target algorithm. Based on the previous findings, three machine learning-based side-channel attack models are curated and investigated for the edge device security evaluations. As a test case, the standard elliptic-curve cryptographic algorithm is selected. Moreover, quantitative analysis is provided for the best attack model selection using standard machine-learning evaluation metrics. A comparative analysis is performed on the raw unaligned data samples and reduced feature-engineered samples using edge enhanced security analytics. The investigation concludes that the vulnerable algorithm implementation can lead to the secret key recovery from the edge device, with 96% accuracy, using a neural-network-based algorithm to analyse side-channel attacks.

A Hardware-Efficient Elliptic Curve Cryptographic Architecture over GF (p)

Article

May 2021
MATH PROBL ENG

This paper proposes a hardware-efficient elliptic curve cryptography (ECC) architecture over GF(p), which uses adders to achieve scalar multiplication (SM) through a hardware-reuse method. At the algorithm level, the improved interleaved modular multiplication (IMM) algorithm and binary modular inverse (BMI) algorithm need two adders. In addition to the adders, the data registers are another optimization target. The design is synthesized with Design Compiler on a 0.13 µm CMOS ASIC platform. Performing scalar multiplication over field orders of 160, 192, 224, and 256 at 150 MHz takes 1.99–3.17 ms. Moreover, the gate area required for the different field orders is in the range of 35.65k–59.14k, which is 50%–91% less hardware than other processors.
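
As a rough illustration of why an adder-based datapath suffices for modular multiplication, here is a minimal Python sketch of the textbook interleaved modular multiplication algorithm. It is not the improved variant from the paper above, whose details the abstract does not give; the function name `interleaved_modmul` and its parameters are hypothetical.

```python
def interleaved_modmul(x: int, y: int, p: int, bits: int) -> int:
    """Textbook interleaved modular multiplication: (x * y) mod p.

    Scans x from the most significant bit and uses only shifts,
    additions, and subtractions (no multiplier). Hypothetical helper
    for illustration, not the paper's improved algorithm.
    """
    if not (0 <= x < p and 0 <= y < p):
        raise ValueError("operands must already be reduced modulo p")
    if p.bit_length() > bits:
        raise ValueError("bits must cover the width of p")

    acc = 0
    for i in reversed(range(bits)):
        acc <<= 1                 # doubling step: acc < 2p
        if (x >> i) & 1:
            acc += y              # conditional add: acc < 3p
        # At most two subtractions bring acc back below p.
        if acc >= p:
            acc -= p
        if acc >= p:
            acc -= p
    return acc


if __name__ == "__main__":
    p = 2**255 - 19               # example prime, chosen only for the demo
    a, b = 3**100 % p, 5**90 % p
    print(interleaved_modmul(a, b, p, 256) == (a * b) % p)
```

Each loop iteration needs one doubling, one conditional addition, and up to two conditional subtractions, so the cycle count grows linearly with the field size.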

A High-Performance Elliptic Curve Cryptographic Processor of SM2 over GF(p)

Article

Apr 2019

Elliptic curve cryptography (ECC) is widely used in practical applications because ECC has far fewer bits for operands at the same level of security than other public-key cryptosystems such as RSA. The performance of an ECC processor is usually determined by modular multiplication (MM) and point multiplication (PM) operations. For a recommended prime field, the MM operation can consist of multiplication and fast reduction operations. In this paper, a 256-bit multiplication operation is implemented by a 129-bit (half-word) multiplier using the Karatsuba–Ofman multiplication algorithm. The fast reduction is a modulo operation, which gets 512-bit input data from multiplication and outputs a 256-bit result (0 ≤ Z < p). We propose a two-stage fast reduction algorithm (TSFR) over SCA-256 prime field, which can obtain an intermediate result of 0 ≤ Z < 2p instead of 0 ≤ Z < 14p in the traditional algorithm, avoiding a lot of repetitive subtraction operations. The PM operation is implemented in width nonadjacent form (NAF) algorithm and its operational schedules are improved to increase the parallelism of multiplication and fast reduction operations. Synthesized with a 0.13 μm complementary metal oxide semiconductor (CMOS) standard cell library, the proposed processor costs an area of 280 k gates and PM operation takes 0.057 ms at the frequency of 250 MHz. The design is also implemented on Xilinx Virtex-6 platform, which consumes 27.655 k LUTs and takes 0.37 ms to perform one 256-bit PM operation, attaining six times speed-up over the state-of-the-art. The processor makes a tradeoff between area and performance, thus it is better than other methods.
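
The half-word multiplication mentioned in this abstract can be sketched with one level of the standard Karatsuba–Ofman decomposition. The Python below is only an illustration under that assumption; the function name `karatsuba_256` is hypothetical and the paper's actual datapath is not described in the abstract. One plausible reading of the 129-bit width is that the sums of two 128-bit halves need 129 bits, but that is an inference, not something the source states.

```python
def karatsuba_256(a: int, b: int, half: int = 128) -> int:
    """One Karatsuba-Ofman level: a 2*half-bit product from three
    half-word products instead of four. Hypothetical illustration."""
    if a < 0 or b < 0 or a >> (2 * half) or b >> (2 * half):
        raise ValueError("operands must fit in 2*half bits")

    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half

    low = a_lo * b_lo             # half x half product
    high = a_hi * b_hi            # half x half product
    # The two sums can be (half + 1) bits wide, i.e. 129 bits here.
    cross = (a_lo + a_hi) * (b_lo + b_hi) - low - high

    return (high << (2 * half)) + (cross << half) + low


if __name__ == "__main__":
    x = (1 << 256) - 12345
    y = (1 << 255) + 6789
    print(karatsuba_256(x, y) == x * y)
```

The result is the full 512-bit product, which in the processor described above would then be passed to the fast reduction stage.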

Machine-Learning-Based Side-Channel Evaluation of Elliptic-Curve Cryptographic FPGA Processor

Article

Dec 2018

Security of embedded systems is the need of the hour. A mathematically secure algorithm runs on a cryptographic chip on these systems, but secret private data can be at risk due to side-channel leakage information. This research focuses on retrieving secret-key information, by performing machine-learning-based analysis on leaked power-consumption signals, from Field Programmable Gate Array (FPGA) implementation of the elliptic-curve algorithm captured from a Kintex-7 FPGA chip while the elliptic-curve cryptography (ECC) algorithm is running on it. This paper formalizes the methodology for preparing an input dataset for further analysis using machine-learning-based techniques to classify the secret-key bits. Research results reveal how pre-processing filters improve the classification accuracy in certain cases, and show how various signal properties can provide accurate secret classification with a smaller feature dataset. The results further show the parameter tuning and the amount of time required for building the machine-learning models.

FPGA-Based Symmetric Re-Encryption Scheme to Secure Data Processing for Cloud-Integrated Internet of Things

Article

Aug 2018

A Low Hardware Consumption Elliptic Curve Cryptographic Architecture over GF(p) in Embedded Application

Article

Jul 2018

In this paper, a low hardware consumption design of elliptic curve cryptography (ECC) over GF(p) in embedded applications is proposed. The adder-based architecture is explored to reduce the hardware consumption of performing scalar multiplication (SM). The Interleaved Modular Multiplication Algorithm and Binary Modular Inversion Algorithm are improved and implemented with two full-word adder units. The full-word register units for data storage are also optimized. The design is based on two full-word adder units and twelve full-word register units in a pipeline structure and was implemented on the Xilinx Virtex-4 platform. Design Compiler is used to synthesize the proposed architecture with a 0.13 μm CMOS standard cell library. For field orders of 160, 192, 224, and 256, the proposed architecture consumes 5595, 7080, 8423, and 9370 slices, respectively, and saves 17.58∼54.93% of slice resources on the FPGA platform when compared with other design architectures. The synthesized result uses 35.43 k, 43.37 k, 50.38 k, and 57.05 k gate area and saves 52.56∼91.34% in terms of gate count in comparison. The design takes 2.56∼4.07 ms to perform the SM operation over the different field orders at a frequency of 150 MHz. The proposed architecture is safe from simple power analysis (SPA). Thus, it is a good choice for embedded applications.

Acceleration of AES Encryption Algorithm Using Field Programmable Gate Arrays

Article

Dec 2016

This article deals with encryption on Field Programmable Gate Arrays (FPGAs). The first part describes the current state of symmetric and asymmetric cryptography. The following part focuses on the AES algorithm and its implementation in VHDL. The last part shows test results of this implementation on the NFB-40G2 card, which contains a Xilinx Virtex-7 FPGA.

Dual Field Dual Core Secure Cryptoprocessor on FPGA Platform

Article

Feb 2013

This paper is devoted to the design of a dual core crypto processor for executing both prime field and binary field instructions. The proposed design is specifically optimized for the Field Programmable Gate Array (FPGA) platform. The combined execution of instructions from two different fields (prime field GF(p) and binary field GF(2^m)) is analysed. The design is implemented on Spartan-3E and Virtex-5, and the performance results of both are compared. The implementation result shows parallel execution using dual field instructions.

Quad Core Dual Field Cryptoprocessor on FPGA Platform

Article

Jan 2013

C. Veeraraghavan
