Conference Paper

Integer Syndrome Decoding in the Presence of Noise

... Workshop (ITW) 2022 [21]. We extend this short version by providing the following contributions. ...
... - We give full proofs of the results in [21] with additional comments. We extend the results by providing sharper statements, quantifying exactly the error probability, e.g., Corollary 2 and Theorem 2 from [21]. On top of that, the technical details of the proofs reflect the gain of our method compared to [25]. ...
Article
Full-text available
Code-based cryptography received attention after the NIST started the post-quantum cryptography standardization process in 2016. A central NP-hard problem is the binary syndrome decoding problem, on which the security of many code-based cryptosystems relies. The best known methods to solve this problem all stem from the information-set decoding strategy, first introduced by Prange in 1962. A recent line of work considers augmented versions of this strategy, with hints typically provided by side-channel information. In this work, we consider the integer syndrome decoding problem, where the integer syndrome is available but might be noisy. We study how the performance of the decoder is affected by the noise. First, we identify the noise model as being close to a binomial distribution centered at zero. Second, we model the probability of success of the ISD-score decoder in the presence of binomial noise. Third, we demonstrate that with high probability our algorithm finds the solution as long as the noise parameter d is linear in t (the Hamming weight of the solution) and t is sub-linear in the code length. We provide experimental results on cryptographic parameters for the BIKE and Classic McEliece cryptosystems, which are both candidates for the fourth round of the NIST standardization process.
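As a rough illustration of the setting described in this abstract, the Python sketch below plants a weight-t error, computes its integer syndrome, adds binomial noise centered at zero, and ranks code positions by a simple correlation score. It is a toy stand-in for the ISD-score decoder, not the authors' algorithm; the dimensions, the noise parameter d and the scoring rule are illustrative assumptions.

```python
import numpy as np

# Toy sketch (not the authors' ISD-score decoder): rank code positions by the
# sum of the noisy integer syndrome entries of the parity checks they appear
# in, then guess the top-t positions as the error support.
rng = np.random.default_rng(0)
n, r, t, d = 200, 100, 10, 3                  # illustrative parameters

H = rng.integers(0, 2, size=(r, n))           # random binary parity-check matrix
support = rng.choice(n, size=t, replace=False)
e = np.zeros(n, dtype=int)
e[support] = 1                                # planted error of Hamming weight t

s_int = H @ e                                 # integer syndrome, computed over Z
noise = rng.binomial(2 * d, 0.5, size=r) - d  # binomial noise centered at zero, range [-d, d]
s_noisy = s_int + noise

scores = H.T @ s_noisy                        # score_j = sum of noisy syndrome entries over checks containing j
guess = np.argsort(scores)[-t:]               # top-t scoring positions
print("recovered", len(set(guess) & set(support)), "of", t, "error positions")
```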
Chapter
Among the fourth-round finalists of the NIST post-quantum cryptography standardization process for public-key encryption algorithms and key encapsulation mechanisms, three rely on hard problems from coding theory. Key encapsulation mechanisms are frequently used in hybrid cryptographic systems: a public-key algorithm for key exchange and a secret-key algorithm for communication. A major point is thus the initial key exchange, which is performed by means of a key encapsulation mechanism. In this paper, we analyze side-channel vulnerabilities of the key encapsulation mechanism implemented by the Classic McEliece cryptosystem, whose security is based on the syndrome decoding problem. We use side-channel leakages to reduce the complexity of the syndrome decoding problem by reducing the length of the code considered. The columns punctured from the original code reduce the complexity of a hard problem from coding theory. This approach leads to efficient profiled side-channel attacks that recover the session key with high success rates, even in noisy scenarios.
Keywords: Post-quantum cryptography, Code-based cryptography, Side-channel attacks
Technical Report
Full-text available
We introduce HQC, an efficient encryption scheme based on coding theory. HQC stands for Hamming Quasi-Cyclic. This proposal has been published in IEEE Transactions on Information Theory. HQC is a code-based public-key cryptosystem with several desirable properties: • It is proved IND-CPA assuming the hardness of (a decisional version of) Syndrome Decoding on structured codes. By construction, HQC perfectly fits the recent KEM-DEM transformation of [23], and allows one to obtain a hybrid encryption scheme with strong security guarantees (IND-CCA2), • In contrast with most code-based cryptosystems, the assumption that the family of codes being used is indistinguishable from random codes is no longer required, and • It features a detailed and precise upper bound for the decryption failure probability analysis.
Chapter
Full-text available
The selection of secure parameter sets requires an estimation of the attack cost to break the respective cryptographic scheme instantiated under these parameters. The current NIST standardization process for post-quantum schemes makes this an urgent task, especially considering the announcement to select final candidates by the end of 2021. For code-based schemes, recent estimates seemed to contradict the claimed security of most proposals, leading to a certain doubt about the correctness of those estimates. Furthermore, none of the available estimates include the most recent algorithmic improvements on decoding linear codes, which are based on information set decoding (ISD) in combination with nearest neighbor search. In this work we observe that all major ISD improvements are built on nearest neighbor search, explicitly or implicitly. This allows us to derive a framework from which we obtain practical variants of all relevant ISD algorithms, including the most recent improvements. We derive formulas for the practical attack costs and make them available online in an easy-to-use estimator tool written in Python and C. Eventually, we provide classical and quantum estimates for the bit security of all parameter sets of current code-based NIST proposals.
Keywords: ISD, syndrome decoding, nearest neighbor, estimator, code-based
Article
Full-text available
In this article, we model a variant of the well-known syndrome decoding problem as a linear optimization problem. Most common algorithms used for solving optimization problems, e.g. the simplex algorithm, fail to find a valid solution for the syndrome decoding problem over a finite field. However, our simulations show that a slightly modified version of the syndrome decoding problem can be solved by the simplex algorithm. More precisely, the algorithm returns a valid error vector when the syndrome vector is an integer vector, i.e., the matrix-vector multiplication is realized over Z instead of F_q.
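A minimal sketch of this linear-programming formulation, assuming a random binary parity-check matrix and SciPy's HiGHS-based linprog solver (illustrative choices, not the authors' exact experimental setup): minimize the weight of e subject to H e = s, where s is the integer syndrome computed over Z and each coordinate of e is relaxed to the interval [0, 1].

```python
import numpy as np
from scipy.optimize import linprog

# Sketch of the LP relaxation: min sum(e) s.t. H e = s, 0 <= e_j <= 1,
# with the syndrome s computed over Z. For small error weight the relaxation
# typically returns the planted 0/1 error vector.
rng = np.random.default_rng(1)
n, r, t = 60, 30, 4                           # illustrative parameters

H = rng.integers(0, 2, size=(r, n))
e = np.zeros(n)
e[rng.choice(n, size=t, replace=False)] = 1
s = H @ e                                     # integer syndrome over Z, not over F_2

res = linprog(c=np.ones(n), A_eq=H, b_eq=s, bounds=[(0, 1)] * n, method="highs")
e_hat = np.round(res.x).astype(int)
print("LP solution matches planted error:", np.array_equal(e_hat, e.astype(int)))
```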
Article
Full-text available
We propose an algorithm to compute the nearest codeword in a BCS. For a linear code of length n and rate R, the algorithm executes in time of order 2^{n(1-R)/2}. For codes with distance d increasing linearly with length, we propose an algorithm capable of correcting [(d-1)/2]+const errors which involves a linearly increasing number of attempts to correct [(d-1)/2] errors.
Conference Paper
Full-text available
Decoding random linear codes is a fundamental problem in complexity theory and lies at the heart of almost all code-based cryptography. The best attacks on the most prominent code-based cryptosystems such as McEliece directly use decoding algorithms for linear codes. The asymptotically best decoding algorithm for random linear codes of length n was for a long time Stern’s variant of information-set decoding running in time \(\tilde{\mathcal{O}}\left(2^{0.05563n}\right)\). Recently, Bernstein, Lange and Peters proposed a new technique called Ball-collision decoding which offers a speed-up over Stern’s algorithm by improving the running time to \(\tilde{\mathcal{O}}\left(2^{0.05558n}\right)\). In this paper, we present a new algorithm for decoding linear codes that is inspired by a representation technique due to Howgrave-Graham and Joux in the context of subset sum algorithms. Our decoding algorithm offers a rigorous complexity analysis for random linear codes and brings the time complexity down to \(\tilde{\mathcal{O}}\left(2^{0.05363n}\right)\).
Article
Full-text available
Background: Genome-wide association studies have revealed that rare variants are responsible for a large portion of the heritability of some complex human diseases. This highlights the increasing importance of detecting and screening for rare variants. Although the massively parallel sequencing technologies have greatly reduced the cost of DNA sequencing, the identification of rare variant carriers by large-scale re-sequencing remains prohibitively expensive because of the huge challenge of constructing libraries for thousands of samples. Recently, several studies have reported that techniques from group testing theory and compressed sensing could help identify rare variant carriers in large-scale samples with few pooled sequencing experiments and a dramatically reduced cost.
Results: Based on quantitative group testing, we propose an efficient overlapping pool sequencing strategy that allows the efficient recovery of variant carriers in numerous individuals with much lower costs than conventional methods. We used random k-set pool designs to mix samples, and optimized the design parameters according to an indicative probability. Based on a mathematical model of sequencing depth distribution, an optimal threshold was selected to declare a pool positive or negative. Then, using the quantitative information contained in the sequencing results, we designed a heuristic Bayesian probability decoding algorithm to identify variant carriers. Finally, we conducted in silico experiments to find variant carriers among 200 simulated Escherichia coli strains. With the simulated pools and publicly available Illumina sequencing data, our method correctly identified the variant carriers for 91.5–97.9% of variants with the variant frequency ranging from 0.5 to 1.5%.
Conclusions: Using the number of reads, variant carriers could be identified precisely even though samples were randomly selected and pooled. Our method performed better than the published DNA Sudoku design and compressed sequencing, especially in reducing the required data throughput and cost.
Conference Paper
Full-text available
Decoding random linear codes is a well studied problem with many applications in complexity theory and cryptography. The security of almost all coding and LPN/LWE-based schemes relies on the assumption that it is hard to decode random linear codes. Recently, there has been progress in improving the running time of the best decoding algorithms for binary random codes. The ball collision technique of Bernstein, Lange and Peters lowered the complexity of Stern's information set decoding algorithm to 2^{0.0556n}. Using representations this bound was improved to 2^{0.0537n} by May, Meurer and Thomae. We show how to further increase the number of representations and propose a new information set decoding algorithm with running time 2^{0.0494n}.
Conference Paper
Full-text available
We examine the tradeoff between privacy and usability of statistical databases. We model a statistical database by an n-bit string d_1, ..., d_n, with a query being a subset q ⊆ [n] to be answered by Σ_{i∈q} d_i. Our main result is a polynomial reconstruction algorithm of data from noisy (perturbed) subset sums. Applying this reconstruction algorithm to statistical databases, we show that in order to achieve privacy one has to add perturbation of magnitude Ω(√n). That is, smaller perturbation always results in a strong violation of privacy. We show that this result is tight by exemplifying access algorithms for statistical databases that preserve privacy while adding perturbation of magnitude Õ(√n). For time-T bounded adversaries we demonstrate a privacy-preserving access algorithm whose perturbation magnitude is ≈ √T.
Chapter
This paper studies how to incorporate small information leakages (called "hints") into information-set decoding (ISD) algorithms. In particular, the influence of these hints on solving the (n, k, t)-syndrome-decoding problem (SDP), i.e., generic syndrome decoding of a code of length n, dimension k, and an error of weight t, is analyzed. We motivate all hints by leakages obtainable through realistic side-channel attacks on code-based post-quantum cryptosystems. One class of studied hints consists of partial knowledge of the error or message, which allows the length, dimension, or error weight to be reduced using a suitable transformation of the problem. As a second class of hints, we assume that the Hamming weights of sub-blocks of the error are known, which can be motivated by a template attack. We present adapted ISD algorithms for this type of leakage. For each third-round code-based NIST submission (Classic McEliece, BIKE, HQC), we show how many hints of each type are needed to reduce the work factor below the claimed security level. E.g., for Classic McEliece mceliece348864, the work factor is reduced below \(2^{128}\) for 9 known error locations, 650 known error-free positions, or known Hamming weights of 29 sub-blocks of roughly equal size.
Keywords: Post-quantum cryptography, Code-based cryptography, Information set decoding, Side-channel attacks
Chapter
Code-based public-key cryptosystems are promising candidates for standardization as quantum-resistant public-key cryptographic algorithms. Their security is based on the hardness of the syndrome decoding problem. Computing the syndrome in a finite field, usually F_2, guarantees the security of the constructions. We show in this article that the problem becomes considerably easier to solve if the syndrome is computed in N instead. By means of laser fault injection, we illustrate how to compute the matrix-vector product in N by corrupting specific instructions, and validate it experimentally. To solve the syndrome decoding problem in N, we propose a reduction to an integer linear programming problem. We leverage the computational efficiency of linear programming solvers to obtain real-time message-recovery attacks against the code-based proposal to the NIST Post-Quantum Cryptography standardization challenge. We perform our attacks in the worst-case scenario, i.e. considering random binary codes, and retrieve the initial message within minutes on a desktop computer. Our attack targets the reference implementation of the Niederreiter cryptosystem in the NIST PQC competition finalist Classic McEliece and is practically feasible for all proposed parameter sets of this submission. For example, for the 256-bit security parameter sets, we successfully recover the message in a couple of seconds on a desktop computer. Finally, we highlight the fact that the attack is still possible if only a fraction of the syndrome entries are faulty. This makes the attack feasible even though the fault injection does not have perfect repeatability, and it reduces the computational complexity of the attack, making it even more practical overall.
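A toy example (illustrative matrix and error, not the fault-injection attack itself) of why the integer syndrome leaks so much more than the binary one: reducing the same matrix-vector product over N reveals how many error positions hit each parity check, instead of only their parity.

```python
import numpy as np

# Same product H e, reduced two ways.
H = np.array([[1, 1, 0, 1],
              [0, 1, 1, 1],
              [1, 0, 1, 0]])
e = np.array([1, 1, 0, 1])

s_f2 = (H @ e) % 2   # binary syndrome: parities only
s_n = H @ e          # integer syndrome: counts of error positions per check
print("over F_2:", s_f2)  # [1 0 1]
print("over N:  ", s_n)   # [3 2 1]
```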
Conference Paper
We propose a new decoding algorithm for random binary linear codes. The so-called information set decoding algorithm of Prange (1962) achieves worst-case complexity \(2^{0.121n}\). In the late 80s, Stern proposed a sort-and-match version of Prange's algorithm, on which all variants of the currently best known decoding algorithms are built. The fastest algorithm of Becker, Joux, May and Meurer (2012) achieves running time \(2^{0.102n}\) in the full distance decoding setting and \(2^{0.0494n}\) with half (bounded) distance decoding. In this work we point out that the sort-and-match routine in Stern's algorithm is carried out in a non-optimal way, since the matching is done in a two-step manner to realize an approximate matching up to a small number of error coordinates. Our observation is that such an approximate matching can be done by a variant of the so-called High Dimensional Nearest Neighbor Problem. Namely, out of two lists with entries from \({\mathbb F}_2^m\) we have to find a pair with closest Hamming distance. We develop a new algorithm for this problem with sub-quadratic complexity which might be of independent interest in other contexts. Using our algorithm for full distance decoding improves Stern's complexity from \(2^{0.117n}\) to \(2^{0.114n}\). Since the techniques of Becker et al. apply to our algorithm as well, we eventually obtain the fastest decoding algorithm for binary linear codes with complexity \(2^{0.097n}\). In the half distance decoding scenario, we obtain a complexity of \(2^{0.0473n}\).
Chapter
Group testing, introduced by Dorfman in 1943, increases the efficiency of screening individuals for low-prevalence diseases. A wider use of this kind of methodology is restricted by the loss of sensitivity inherent to the mixture of samples. Moreover, as this methodology attains greater cost reduction in the cases of lower prevalence (and, consequently, a higher optimal batch size), the phenomenon of rarefaction is crucial to understand that sensitivity reduction. Suppose, with no loss of generality, that an experimental individual test consists in determining whether the amount of substance exceeds some prefixed threshold l. For a pooled sample of size n, the amount of substance of interest is represented by \((Y_1, \cdots, Y_n)\), with mean \(\overline{Y}_n\) and maximum \(M_n\). The goal is to know if any of the individual samples exceeds the threshold l, that is, \(M_n > l\). It is shown that the dependence between \(\overline{Y}_n\) and \(M_n\) has a crucial role in deciding the use of group testing, since a higher dependence corresponds to more information about \(M_n\) given by the observed value of \(\overline{Y}_n\).
Article
We introduce a variation of the classic group testing problem referred to as group testing under sum observations. In this new formulation, when a test is carried out on a group of items, the result reveals not only whether the group is contaminated, but also the number of defective items in the tested group. We establish the optimal nested test plan within a minimax framework that minimizes the total number of tests for identifying all defective items in a given population. This optimal test plan and its performance are given in closed forms. It guarantees to identify all $d$ defective items in a population of $n$ items in ${O\left(d\log_2{\left( n/d \right)}\right)}$ tests. This new formulation is motivated by the heavy hitter detection problem in traffic monitoring in Internet and general communication networks. For such applications, it is often the case that a few abnormal traffic flows with exceptionally high volume (referred to as heavy hitters) make up most of the traffic seen by the entire network. To detect the heavy hitters, it is more efficient to group subsets of flows together and measure the aggregated traffic rather than testing each flow one by one. Since the volume of heavy hitters is much higher than that of normal flows, the number of heavy hitters in a group can be accurately estimated from the aggregated traffic load.
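An illustrative sketch of the idea (a simple halving strategy, not the paper's optimal nested test plan): each pooled test returns the number of defectives in the pool, which lets the search discard zero-count pools and stop early on pools that are entirely defective, so the test count grows roughly like d log(n/d) rather than n.

```python
# Adaptive splitting with sum observations: count_defectives(group) returns the
# number of defective items in the pooled group (the "sum observation").
def find_defectives(items, count_defectives):
    tests = 0

    def search(group):
        nonlocal tests
        tests += 1
        k = count_defectives(group)
        if k == 0:                      # clean pool: nothing to do
            return []
        if k == len(group):             # every item in the pool is defective
            return list(group)
        mid = len(group) // 2
        return search(group[:mid]) + search(group[mid:])

    return search(items), tests


defective = {3, 17, 42}                 # illustrative ground truth
items = list(range(64))
found, tests = find_defectives(items, lambda g: sum(i in defective for i in g))
print(sorted(found), "found in", tests, "tests")
```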
Conference Paper
The best known cryptanalytic attack on McEliece's public-key cryptosystem based on algebraic coding theory is to repeatedly select k bits at random from an n-bit ciphertext vector, which is corrupted by at most t errors, in the hope that none of the selected k bits are in error, until the cryptanalyst recovers the correct message. The method of determining whether the recovered message is the correct one has not been thoroughly investigated. In this paper, we suggest a systematic method of checking, and describe a generalized version of the cryptanalytic attack which reduces the work factor significantly (a factor of 2^{11} for the commonly used example of the n=1024 Goppa code case). Some further improvements are also given. We also note that these cryptanalytic algorithms can be viewed as generalized probabilistic decoding algorithms for any linear error-correcting codes.
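The attack loop sketched in this abstract is essentially plain information-set decoding. The toy Python sketch below (random binary code with made-up dimensions, no Goppa structure, and none of the paper's refinements) shows the basic idea: repeatedly guess k error-free coordinates, invert the corresponding sub-matrix of the generator matrix over GF(2), and accept the candidate message if re-encoding it lands within t errors of the received word.

```python
import numpy as np


def gf2_solve(A, b):
    """Solve A x = b over GF(2) by Gaussian elimination; return None if A is singular."""
    A, b = A.copy() % 2, b.copy() % 2
    m = A.shape[0]
    for col in range(m):
        pivot = next((r for r in range(col, m) if A[r, col]), None)
        if pivot is None:
            return None
        A[[col, pivot]], b[[col, pivot]] = A[[pivot, col]], b[[pivot, col]]
        for r in range(m):
            if r != col and A[r, col]:
                A[r] ^= A[col]
                b[r] ^= b[col]
    return b


rng = np.random.default_rng(2)
n, k, t = 40, 20, 2                            # illustrative toy parameters
G = rng.integers(0, 2, size=(k, n))            # random binary generator matrix
m_true = rng.integers(0, 2, size=k)
c = (m_true @ G) % 2
c[rng.choice(n, size=t, replace=False)] ^= 1   # received word with t bit errors

for _ in range(1000):
    I = rng.choice(n, size=k, replace=False)   # guessed error-free information set
    m_hat = gf2_solve(G[:, I].T, c[I])         # solve m_hat @ G[:, I] = c[I] over GF(2)
    if m_hat is not None and np.sum((m_hat @ G) % 2 != c) <= t:
        print("recovered message:", np.array_equal(m_hat, m_true))
        break
```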
Conference Paper
We describe a probabilistic algorithm, which can be used to discover words of small weight in a linear binary code. The work-factor of the algorithm is asymptotically quite large but the method can be applied for codes of a medium size. Typical instances that are investigated are codewords of weight 20 in a code of length 300 and dimension 150.
Article
This paper considers the problem of providing security to statistical databases against disclosure of confidential information. Security-control methods suggested in the literature are classified into four general approaches: conceptual, query restriction, data perturbation, and output perturbation. Criteria for evaluating the performance of the various security-control methods are identified. Security-control methods that are based on each of the four approaches are discussed, together with their performance with respect to the identified evaluation criteria. A detailed comparative analysis of the most promising methods for protecting dynamic online statistical databases is also presented. To date, no single security-control method prevents both exact and partial disclosures. There are, however, a few perturbation-based methods that prevent exact disclosure and enable the database administrator to exercise "statistical disclosure control." Some of these methods, however, introduce bias into query responses or suffer from the 0/1 query-set-size problem (i.e., partial disclosure is possible in the case of a null query set or a query set of size 1). We recommend directing future research efforts toward developing new methods that prevent exact disclosure and provide statistical-disclosure control, while at the same time not suffering from the bias problem and the 0/1 query-set-size problem. Furthermore, efforts directed toward developing a bias-correction mechanism and solving the general problem of small query-set size would help salvage a few of the current perturbation-based methods.
Article
The fact that the general decoding problem for linear codes and the general problem of finding the weights of a linear code are both NP-complete is shown. This strongly suggests, but does not rigorously imply, that no algorithm for either of these problems which runs in polynomial time exists.
Article
A class of decoding algorithms using encoding-and-comparison is considered for error-correcting code spaces. Code words, each of which agrees on some information set for the code with the word r to be decoded, are constructed and compared with r. An operationally simple algorithm of this type is studied for cyclic code spaces A. Let A have length n, dimension k over some finite field, and minimal Hamming distance m. The construction of fewer than n^2/2 code words is required in decoding a word r. The procedure seems to be most efficient for small minimal distance m, but somewhat paradoxically it is suggested on operational grounds that it may prove most useful in those cases where m is relatively large with respect to the code length n.
Article
A computer is generally considered to be a universal computational device; i.e., it is believed able to simulate any physical computational device with an increase in computation time of at most a polynomial factor. It is not clear whether this is still true when quantum mechanics is taken into consideration. Several researchers, starting with David Deutsch, have developed models for quantum mechanical computers and have investigated their computational properties. This paper gives Las Vegas algorithms for finding discrete logarithms and factoring integers on a quantum computer that take a number of steps which is polynomial in the input size, e.g., the number of digits of the integer to be factored. These two problems are generally considered hard on a classical computer and have been used as the basis of several proposed cryptosystems. (We thus give the first examples of quantum cryptanalysis.)
Hamming Quasi-Cyclic (HQC)
  • C. A. Melchor
  • N. Aragon
  • S. Bettaieb
  • L. Bidoux
  • O. Blazy
  • J.-C. Deneuville
Decoding random linear codes in O(2^{0.054n})
  • A. May
  • A. Meurer
  • E. Thomae
Message-recovery profiled side-channel attack on the Classic McEliece cryptosystem
  • B. Colombier
  • V.-F. Dragoi
  • P.-L. Cayrel
  • V. Grosso
Quantitative group testing and the rank of random matrices
  • U. Feige
  • A. Lellouche
Security-control methods for statistical databases: A comparative study
  • N. R. Adam
  • J. C. Worthmann