
Security Margin Evaluation of SHA-3 Contest Finalists through SAT-Based Attacks

September 2012


DOI:10.1007/978-3-642-33260-9_4

Conference: 11th International Conference on Information Systems and Industrial Management

Authors:

Ekawat Homsirikamol

George Mason University

Marcin Rogawski

Cadence Design Systems, Inc.

Marian Srebrny

Polish Academy of Sciences



Security margin evaluation of SHA-3 contest finalists through SAT-based attacks

Ekawat Homsirikamol², Paweł Morawiecki¹, Marcin Rogawski², and Marian Srebrny¹,³

¹ Section of Informatics, University of Commerce, Kielce, Poland
² Cryptographic Engineering Research Group, George Mason University, USA
³ Institute of Computer Science, Polish Academy of Sciences, Poland

Abstract. In 2007, the U.S. National Institute of Standards and Technology (NIST) announced a public contest aiming at the selection of a new standard for a cryptographic hash function. In this paper we evaluate the security margin of five SHA-3 candidates which have qualified to the final round of the contest. We assume that attacks must have practical complexities, i.e., can be practically verified. Our attacks use a method called logical cryptanalysis, where the original task is expressed as a SATisfiability problem. We use a new toolkit which greatly simplifies the most arduous stages of this type of cryptanalysis and helps to mount the attacks in a uniform way. We show that in the context of SAT-based attacks all the finalists have a substantially bigger security margin than the current standards SHA-256 and SHA-1.

Key words: cryptographic hash algorithm, SHA-3 competition, algebraic cryptanalysis, logical cryptanalysis, SATisfiability solvers

1 Introduction

In 2007, the U.S. National Institute of Standards and Technology (NIST) announced a public contest aiming at the selection of a new standard for a cryptographic hash function. The main motivation behind starting the contest was the security flaws identified in the SHA-1 standard in 2005. Similarities between SHA-1 and the most recent standard, SHA-2, are worrisome, and NIST decided that a new, stronger hash function is needed. 51 functions were accepted to the first round of the contest, and in July 2009 14 of those functions were selected for the second round. At the end of 2010 five finalists were announced: BLAKE [15], Groestl [21], JH [28], Keccak [13], and Skein [2]. The winning algorithm will be named 'SHA-3' and will most likely be selected in the second half of 2012.

Security, performance in software, performance in hardware and flexibility are the four primary criteria normally used in the evaluation of candidates. Of these four, security is the most important criterion, yet it is also the most difficult to evaluate and quantify. There are two main ways of estimating the security margin of a given cryptosystem. The first is to compare the complexities of the best attacks on the full cryptosystem. The problem with this approach is that for many modern designs there is no known successful attack on the full cryptosystem. Under this approach, the security margin would be the same for all algorithms for which nothing better than exhaustive search is known. The SHA-3 contest is no different: no attacks on the full functions have been reported, except for JH. For JH there is a preimage attack [19], but its time complexity is nearly equal to that of exhaustive search and its memory accesses exceed the exhaustive search bound. Therefore estimating the security margin using this approach tells us little or nothing about the differences between the candidates in terms of their security level. The second approach to measuring the security margin is to compare the number of broken rounds to the total number of rounds in a given cryptosystem. As the vast majority of modern ciphers and hash functions (in particular, the SHA-3 contest finalists) have an iterative design, this approach can be applied naturally. However, comparing security levels calculated this way is also problematic. For example, there is an attack on 7-round Keccak-512 with complexity $2^{507}$ [3] and there is an attack on 3-round Groestl-512 with complexity $2^{192}$ [23]. The first attack reaches relatively more rounds (29%) but with higher complexity, whereas the second attack has lower complexity but breaks fewer rounds (21%). Both attacks are completely impractical. It is unclear how such results help to judge which function is more secure.

In this paper we follow the second approach to measuring the security margin, but with an additional restriction. We assume that the attacks must have practical complexities, i.e., can be practically verified. This is very similar to the line of research recently presented in [5, 6]. This restriction puts the attacks in more 'real life' scenarios, which seems especially important for the SHA-3 standard. So far a large amount of cryptanalysis has been conducted on the finalists; however, the majority of papers focus on maximizing the number of broken rounds, which leads to extremely high data and time complexity. These theoretical attacks have great importance, but the lack of a practical approach is evident. We hope that our work helps to fill this gap to some extent.

The method of our analysis is a SAT-based attack. SAT was the first known NP-complete problem, as proved by Stephen Cook in 1971 [7]. A SAT solver decides whether a given propositional (boolean) formula has a satisfying valuation. Finding a satisfying valuation is infeasible in general, but many SAT instances can be solved surprisingly efficiently. There are many competing algorithms and many implementations; most of them have been developed over the last two decades as highly optimized versions of the DPLL procedure [10, 11].

SAT solvers can be used to solve instances typically described in Conjunctive Normal Form (CNF), into which any decision problem can be translated. Modern SAT solvers use highly tuned algorithms and data structures to find a solution to a given problem coded in this very simple form. To solve a problem: (1) translate it to SAT (in such a way that a satisfying valuation represents a solution to the problem); (2) run a SAT solver to find a solution. The first connection between SAT and cryptography dates back to [8], which suggested using cryptoformulae as hard benchmarks for propositional satisfiability checkers. The first application of SAT solvers in cryptanalysis, called logical cryptanalysis, was due to Massacci et al. [18]. They ran a SAT solver on DES key search and later, in [12], also used one to fake an RSA signature for a given message by finding the e-th roots of a (digitized) message m modulo n. Courtois and Pieprzyk [9] presented an approach to coding in SAT their algebraic cryptanalysis, with some gigantic systems of low-degree equations designed as potential procedures for breaking some ciphers. Soos et al. [26] proposed enhancing a SAT solver with some special-purpose algebraic procedures, such as Gaussian elimination. Mironov and Zhang [20] showed an application of a SAT solver supporting a non-automatic part of the attack [27] on SHA-1.

In this work we use SAT-based attacks to evaluate the security margin of the SHA-3 contest finalists and also compare them to the current standards, in particular SHA-256. We show that all five finalists have a big security margin against this kind of attack (although it differs slightly between functions) and are substantially more secure than SHA-1 and SHA-256. We also report some interesting results on particular functions or their building blocks. For 2-round Keccak with 256-bit hash size we mount preimage and collision attacks, and our attacks are currently the best practical preimage and collision attacks on reduced Keccak. For the 6-round Skein-256 compression function we find pseudo-collisions. For comparison, the Skein authors reached 8 rounds but found only a pseudo-near-collision [2]. In the attacks we use our toolkit, which is a combination of existing tools and some newly developed parts. The toolkit helps in mounting the attacks in a uniform way, and it can easily be used for cryptanalysis not only of hash functions but also of other cryptographic primitives such as block or stream ciphers.

2 Methodology of our SAT-based attacks

2.1 A toolkit for CNF formula generation

One of the key steps in attacking cryptographic primitives with SAT solvers is CNF (conjunctive normal form) formula generation. A CNF is a conjunction of clauses, i.e., of disjunctions of literals, where a literal is a boolean-valued variable or its negation. Thus, a formula is presented to a SAT solver as one big 'AND' of 'ORs'. A cryptographic primitive (or a segment of it) which is the target of a SAT-based attack has to be completely described by such a formula. Generating it is a non-trivial task and is usually very laborious. There are many ways to obtain a final CNF, and the results differ in the number of clauses, the average size of clauses and the number of literals. Recently we have developed a new toolkit which greatly simplifies the generation of CNF.
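As a small illustration of the 'AND of ORs' form, the sketch below represents a CNF as a list of clauses, checks a valuation against it, and prints it in the DIMACS format used by SAT solvers. The three-clause formula and all function names are made up for this example; they are not taken from the toolkit.

```python
from itertools import product

# A clause is a list of non-zero integers: +v means variable v, -v its negation.
# A CNF formula is a list of clauses, read as an AND of ORs.
# Toy formula (illustrative only): (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
cnf = [[1, -2], [2, 3], [-1, -3]]


def satisfies(cnf, assignment):
    """True if every clause contains at least one literal made true by the assignment."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


def to_dimacs(cnf, num_vars):
    """Render the formula in DIMACS CNF: a 'p cnf' header, then 0-terminated clauses."""
    lines = [f"p cnf {num_vars} {len(cnf)}"]
    lines += [" ".join(map(str, clause)) + " 0" for clause in cnf]
    return "\n".join(lines)


def all_models(cnf, num_vars):
    """Enumerate every satisfying valuation by brute force (fine for tiny formulas only)."""
    models = []
    for bits in product([False, True], repeat=num_vars):
        assignment = {v + 1: bits[v] for v in range(num_vars)}
        if satisfies(cnf, assignment):
            models.append(assignment)
    return models


print(to_dimacs(cnf, 3))
print(all_models(cnf, 3))
```

Brute-force enumeration is used here only to keep the example self-contained; a real attack hands the DIMACS file to a dedicated solver.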

Usually a cryptanalyst needs to put considerable effort into creating a final CNF. It involves writing a separate program dedicated only to the cryptographic primitive under consideration. To make it efficient, some minimizing algorithms (Karnaugh maps, the Quine-McCluskey algorithm or the Espresso algorithm) have to be used [17]. These have to be implemented in the program, or the intermediate results are sent to an external tool (e.g., the Espresso minimizer) and then the minimized form is sent back to the main program. Implementing all of these procedures requires a good deal of programming skill, some knowledge of logic synthesis algorithms and careful insight into the details of the primitive's operation. As a result, obtaining the CNF might become the most tedious and error-prone part of any attack. It could be especially discouraging for researchers who start their work from scratch and do not want to spend too much time writing thousands of lines of code.

To avoid those disadvantages we have recently proposed a new toolkit consisting basically of two applications. The first of them is Quartus II, a software tool released by Altera for analysis and synthesis of HDL (Hardware Description Language) designs, which enables developers to compile their designs and configure the target devices (usually FPGAs). We use the free-of-charge Quartus II Web Edition, which provides all the features that we need. The second application, written by us, converts boolean equations (generated by Quartus) to CNF encoded in the DIMACS format (the standard format for today's SAT solvers).

The complete process of CNF generation includes the following steps:

1. Code the target cryptographic primitive in HDL;

2. Compile and synthesize the code in Quartus;

3. Generate boolean equations using Quartus inbuilt tool;

4. Convert generated equations to CNF by our converter.

Steps 2, 3, and 4 are done automatically. With this method, the only effort a researcher has to make is to write the code in HDL. Normally, programming and 'thinking' in HDL is a bit different from typical high-level languages like Java or C. However, that is not the case here. For our needs, programming in HDL looks exactly the same as it would in a high-level language. There is no need to care about typical HDL-specific issues like the proper expression of concurrency or clocking, because we are not going to implement anything in an FPGA device. All we need is to obtain a system of boolean equations which completely describes the primitive we wish to attack. Once the boolean equations are generated by the Quartus inbuilt tool, they are converted into CNF by the separate application. The conversion implemented in our application is based on the boolean laws (commutativity, associativity, distributivity, identity, De Morgan's laws), and there are no complex algorithms involved.
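To make the equation-to-CNF step concrete, here is a minimal sketch of one generic way to turn a small boolean equation `output = f(inputs)` into clauses: for each input combination, emit one clause that forbids the wrong output value. This is a textbook truth-table encoding and only an assumption for illustration; it is not the converter described above, whose rewriting rules are based on the boolean laws. The function names and the choice of the SHA-1 majority function as the example are ours.

```python
from itertools import product


def equation_to_cnf(func, input_vars, output_var):
    """Clauses that hold exactly when output_var == func(inputs).

    For every combination of input values, one clause rules out the wrong output.
    With n inputs this yields 2**n clauses of n + 1 literals each (not minimized).
    """
    clauses = []
    for values in product([False, True], repeat=len(input_vars)):
        out = bool(func(*values))
        # Each input literal is false exactly on this combination of values.
        clause = [(-v if val else v) for v, val in zip(input_vars, values)]
        # The output literal is false exactly on the wrong output value.
        clause.append(output_var if out else -output_var)
        clauses.append(clause)
    return clauses


def satisfies(cnf, assignment):
    """True if every clause contains at least one literal made true by the assignment."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


def maj(b, c, d):
    """Majority of three bits, as used in rounds 40..59 of SHA-1."""
    return (b and c) or (b and d) or (c and d)


# Variables 1, 2, 3 are the inputs b, c, d; variable 4 is the output f.
maj_cnf = equation_to_cnf(maj, [1, 2, 3], 4)

# Check the encoding against the function on all 16 valuations.
encoding_is_exact = all(
    satisfies(maj_cnf, {1: b, 2: c, 3: d, 4: f}) == (f == maj(b, c, d))
    for b, c, d, f in product([False, True], repeat=4)
)
print(len(maj_cnf), encoding_is_exact)
```

Because every equation has at most 4 inputs, this naive encoding never produces more than 16 clauses per equation, which is why the unminimized form is still workable as an illustration.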

It must be noted that the Quartus programming environment gives us two important features which may help to create a compact CNF. First, it minimizes functions of up to 6 variables using Karnaugh maps. Second, all final equations have at most 5 variables (4 inputs, 1 output). This is because Quartus is dedicated to FPGA devices, which are built out of 'logic cells', each with 4 inputs and 1 output. (There are also FPGAs with different parameters, e.g., 5/2, but we chose the 4/1 architecture in all the experiments.) This feature is helpful when dealing with linear ANF (algebraic normal form) equations with many variables (also referred to as 'long XOR equations'). A simple conversion of such an equation to CNF gives an exponential number of clauses; an equation in n variables corresponds to $2^{n-1}$ clauses in CNF. A common way of dealing with this problem is to introduce new variables and cut the original equation into a few shorter ones.

Example 1. Let us consider an equation with 5 variables:

$a + b + c + d + e = 0$

A CNF corresponding to this equation consists of $2^{5-1} = 16$ clauses with 5 literals in each clause. However, introducing two new variables, x and y, we can rewrite it as a system of three equations:

$a + b + x = 0$

$c + d + y = 0$

$e + x + y = 0$

A CNF corresponding to this system of equations would consist of $2^2 + 2^2 + 2^2 = 12$ clauses.
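The clause counts in Example 1 can be checked with a short script. The sketch below uses the direct encoding of an XOR equation (one clause per odd-parity valuation) and confirms that the split system accepts exactly the same values of a, ..., e for some choice of x and y. Variable numbering and function names are our own.

```python
from itertools import product


def xor_zero_clauses(variables):
    """Direct CNF of v1 + v2 + ... + vn = 0 over GF(2).

    Every valuation with odd parity violates the equation and is blocked by one clause,
    giving 2**(n-1) clauses with n literals each.
    """
    clauses = []
    for values in product([False, True], repeat=len(variables)):
        if sum(values) % 2 == 1:
            clauses.append([(-v if val else v) for v, val in zip(variables, values)])
    return clauses


def satisfies(cnf, assignment):
    """True if every clause contains at least one literal made true by the assignment."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


a, b, c, d, e, x, y = range(1, 8)

direct = xor_zero_clauses([a, b, c, d, e])
split = (
    xor_zero_clauses([a, b, x])
    + xor_zero_clauses([c, d, y])
    + xor_zero_clauses([e, x, y])
)


def split_accepts(values):
    """True if some choice of the helper variables x, y satisfies the split system."""
    return any(
        satisfies(split, dict(zip([a, b, c, d, e, x, y], list(values) + [xv, yv])))
        for xv, yv in product([False, True], repeat=2)
    )


equivalent = all(
    satisfies(direct, dict(zip([a, b, c, d, e], values))) == split_accepts(values)
    for values in product([False, True], repeat=5)
)

print(len(direct), len(split), equivalent)  # prints: 16 12 True
```

The saving is modest for 5 variables but grows quickly: the direct encoding doubles with every added variable, while the split form grows linearly.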

Quartus automatically introduces new variables and cuts long equations to satisfy the requirements of the FPGA architecture. Consequently, a researcher need not worry that the CNF would be much affected by very long XOR equations (which may be a part of the original cryptographic primitive's description).

To the best of our knowledge, there are only two other tools which provide functionality similar to our toolkit, i.e., automate the CNF generation and help to mount uniform SAT-based attacks. The first is the solution proposed in [16], where the main idea is to change the behaviour of all the arithmetic and logical operators that the algorithm uses, in such a way that each operator produces a propositional formula corresponding to the operation performed. This is achieved with a C++ implementation and a feature of the C++ language called operator overloading. The authors tested their method on the MD4 and MD5 functions. The proposed method can be applied to other cryptographic primitives, but it is not clear how it would deal with more complex operations, e.g. an S-box described as a look-up table. The second tool is called Grain of Salt [25], and it incorporates some algorithms to optimize a generated CNF. However, it can only be used with a family of stream ciphers.

In comparison to these two tools, our proposal is the most flexible. It can be used with many different cryptographic primitives (hash functions, block and stream ciphers), and it does not limit an input description to only simple boolean operations. The toolkit handles XOR equations efficiently and also takes advantage of logic synthesis algorithms, which help to provide a more compact CNF.

2.2 Our SAT-based attack

All the attacks reported in the paper have a very similar form and consist of the following steps.

1. Generate the CNF formula by our toolkit;

2. Fix the hash and padding bits in the formula;

3. Run a SAT solver on the generated CNF.

The above scheme is used to mount a preimage attack, i.e., for a given hash value h, we try to find a message m such that h = f(m). CryptoMiniSat2, gold medalist at recent SAT competitions [24], is selected as our SAT solver. In preliminary experiments we also tried other state-of-the-art SAT solvers (Lingeling [4], Glucose [1]), but overall CryptoMiniSat2 solves our formulas faster.

We attack functions with a 256-bit hash. When constructing a CNF coding a hash function, one has to decide the size of the message (how many message blocks are taken as input to the function). It is easier for a SAT solver to tackle a single message block, because coding each further message block would make the formula twice as big. However, each of the five finalists has a different way of padding the message. If only one message block is allowed, BLAKE-256 can take at most 446 bits of message, which are padded to get a 512-bit block. On the other hand, Keccak-256 can take as many as 1086 bits of message in a single block. To avoid the situation where one formula has many more message bits for the SAT solver to search for than another, the message is fixed to 446 bits (the maximum value for BLAKE-256 with one message block processed; the other finalists allow more).

To find a second preimage or a collision, only a small adjustment to the aforementioned attack is required. Once the preimage is found, we run the SAT solver on exactly the same formula but with one message bit fixed to the opposite of its value in the preimage (the rest of the message bits are left unknown). It turns out that the SAT solver is able to solve such a slightly modified formula in a very similar time, providing a second preimage and a collision. Second preimages and collisions are expected because, with the size of the message fixed to 446 bits, we have a mapping from 446 bits to 256 bits.
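The attack scheme of this section (generate the CNF, fix the hash bits, run a solver, then flip one message bit for a second preimage) can be sketched end to end on a toy example. Everything below is hypothetical: the 6-bit-to-3-bit "hash", the truth-table gate encoding and the tiny DPLL-style solver merely stand in for the real hash functions, the toolkit and CryptoMiniSat2.

```python
from itertools import product

MSG_VARS = [1, 2, 3, 4, 5, 6]   # message bits m1..m6
HASH_VARS = [7, 8, 9]           # hash bits h1..h3

# Hypothetical toy "hash": each output bit is a small boolean function of message bits.
GATES = [
    (lambda m1, m2, m3, m4: m1 ^ m2 ^ (m3 & m4), [1, 2, 3, 4], 7),
    (lambda m3, m5: m3 ^ m5, [3, 5], 8),
    (lambda m1, m2, m4, m6: m4 ^ m6 ^ (m1 & m2), [1, 2, 4, 6], 9),
]


def toy_hash(msg):
    """Evaluate the toy hash directly on a tuple of six booleans."""
    env = dict(zip(MSG_VARS, msg))
    return tuple(bool(f(*(env[v] for v in ins))) for f, ins, _ in GATES)


def gate_clauses(func, inputs, output):
    """One clause per input combination, forbidding the wrong output value."""
    clauses = []
    for values in product([False, True], repeat=len(inputs)):
        clause = [(-v if val else v) for v, val in zip(inputs, values)]
        clause.append(output if func(*values) else -output)
        clauses.append(clause)
    return clauses


def solve(cnf, assignment=None):
    """Tiny DPLL-style search: returns a (possibly partial) satisfying assignment or None."""
    assignment = dict(assignment or {})
    remaining = []
    for clause in cnf:
        if any(abs(l) in assignment and assignment[abs(l)] == (l > 0) for l in clause):
            continue  # clause already satisfied
        open_lits = [l for l in clause if abs(l) not in assignment]
        if not open_lits:
            return None  # every literal is false: conflict
        remaining.append(open_lits)
    if not remaining:
        return assignment
    unit = next((c[0] for c in remaining if len(c) == 1), None)
    lit = unit if unit is not None else remaining[0][0]
    choices = [lit > 0] if unit is not None else [lit > 0, not (lit > 0)]
    for value in choices:
        result = solve(remaining, {**assignment, abs(lit): value})
        if result is not None:
            return result
    return None


def attack(target, extra_units=()):
    cnf = [c for f, ins, out in GATES for c in gate_clauses(f, ins, out)]  # step 1
    cnf += [[v if bit else -v] for v, bit in zip(HASH_VARS, target)]       # step 2
    cnf += [[lit] for lit in extra_units]                                  # optional fixed bits
    model = solve(cnf)                                                     # step 3
    if model is None:
        return None
    # Variables left unassigned by the solver are unconstrained; set them to False.
    return tuple(model.get(v, False) for v in MSG_VARS)


target = toy_hash((True, False, True, True, False, True))
first = attack(target)
# Second preimage: same formula, message bit m1 fixed to the opposite value.
second = attack(target, extra_units=[-1 if first[0] else 1])
print(first, second)
```

In this toy, 6 message bits map to 3 hash bits, so each hash value has several preimages, mirroring the 446-to-256-bit mapping that makes second preimages expected in the real experiments.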

3 Results

We have conducted the preimage attack described in Section 2.2 on the five finalists and also on the two standards SHA-256 and SHA-1. As a SAT solver we used CryptoMiniSat2, version 2.9.0, with the parameters gaussuntil=0 and restart=static. These settings were suggested by the author of CryptoMiniSat2. The experiments were carried out on an Intel Core i7 2.67 GHz with 8 GB RAM. Starting with 1-round variants of the functions, the SAT solver was run to solve the given formula and gave us the preimage. The time limit for each experiment was set to 30 hours. If a solution was found, we added one more round, encoded the function in CNF and gave it to the solver. The attack stopped when the time limit was exceeded or memory ran out. Table 1 shows the results. The second column contains the number of broken rounds in our preimage attack, and the third column shows the security margin, calculated as one minus the quotient of the number of broken rounds and the total number of rounds. For clarity, we report our preimage attack, but as explained above it can easily be modified to get a second preimage or a collision. Therefore the numbers in Table 1 remain valid for all three types of attacks.
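The security margin figures can be reproduced from the round counts reported in Table 1: the margin is the fraction of rounds left unbroken, i.e. one minus broken rounds over total rounds. In the sketch below, exact fractions avoid floating-point surprises. Rounding up to a whole percent is our assumption; it is the convention that matches every percentage printed in the table.

```python
from fractions import Fraction
from math import ceil

# (broken rounds, total rounds) as reported in Table 1.
ROUNDS = {
    "SHA-1": (Fraction(21), 80),
    "SHA-256": (Fraction(16), 64),
    "Keccak-256": (Fraction(2), 24),
    "BLAKE-256": (Fraction(1), 14),
    "Groestl-256": (Fraction(1, 2), 10),  # output transformation counted as half a round
    "JH-256": (Fraction(2), 42),
    "Skein-256": (Fraction(1), 72),
}


def security_margin(broken, total):
    """Fraction of rounds left unbroken: 1 - broken / total."""
    return 1 - Fraction(broken) / total


def as_percent(margin):
    """Whole-number percentage, rounded up (assumed convention, see lead-in)."""
    return ceil(100 * margin)


margins = {name: as_percent(security_margin(b, t)) for name, (b, t) in ROUNDS.items()}
for name, percent in margins.items():
    print(f"{name:12s} {percent}%")
```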

All the SHA-3 contest finalists have a substantially bigger security margin than the SHA-256 and SHA-1 standards. On the other hand, the finalists differ only slightly (by at most 7%) and all have a security margin over 90%. For Groestl we were not able to attack even a 1-round variant, nor a simplified Groestl with the output transformation replaced by a simple truncation. The only successful attack on Groestl (or rather a part of it) is the attack on the output transformation in the 1-round variant of Groestl. The output transformation is not a complete round, but giving a 100% security margin would not be fair either. Therefore we try to estimate 'a weight' of the output transformation. Essentially all the operations (equations) in the Groestl compression function come from two very similar permutations (P and Q). The output transformation is built on the P permutation only, so it can be treated as a half-operation of the compression function. Hence the attack on the output transformation in a 1-round variant of Groestl is shown in Table 1 as half a round.

All the reported attacks on the finalists took just a few seconds. Only for 16-round SHA-256 did the attack last longer: one hour. Although a conservative time limit (30 hours) was set for this type of experiment, it did not help to extend the attack by one more round. It seems that the time of the attack grows superexponentially in the number of rounds. The same behaviour was observed by Rivest et al. when they tested the MD6 function with their SAT-based analysis [22]. For MD6 with 256-bit hash size, they reached 10 rounds, which gives a security margin of 90%. For a reader interested in estimating the asymptotic complexity of our attacks, we note that it would be very difficult, mainly because Altera does not reveal details of the algorithms used in Quartus.

Table 1. Security margin comparison calculated from the results of our preimage attacks on round-reduced hash functions

Function      Broken rounds   Security margin
SHA-1         21              74% (21/80)
SHA-256       16              75% (16/64)
Keccak-256    2               92% (2/24)
BLAKE-256     1               93% (1/14)
Groestl-256   0.5*            95% (0.5/10)
JH-256        2               96% (2/42)
Skein-256     1               99% (1/72)

It is interesting to see whether the parameters of a CNF formula, that is, the number of variables and clauses, could be a good metric for measuring the hardness of the formula and consequently the security margin. Table 2 shows the numbers of variables and clauses for the full hash functions. The values are rounded to the nearest thousand. For SHA-1, SHA-256, BLAKE, and Skein, we generated the complete formula with our toolkit. For the other functions, we extrapolated the numbers from round-reduced variants, as the toolkit had some memory problems with those huge instances (over 1 million clauses). As every round in a given function is basically the same (it consists of the same type of equations), the linear extrapolation is straightforward. For the examined functions, the CNF formula parameters could be a good metric for measuring the hardness of the formula, but only to some extent. Indeed, the smallest formulas (SHA-1 and SHA-256) have the lowest security margin but, for example, BLAKE and Keccak have nearly the same security level while the Keccak formula is more than twice as big.

Table 2. The parameters of our CNF formulas coding hash functions

Function      Variables   Clauses
SHA-1         29 000      200 000
SHA-256       61 000      400 000
BLAKE-256     57 000      422 000
Keccak-256    88 000      1 075 000
Skein-256     148 000     1 041 000
JH-256        169 600     1 998 000
Groestl-256   279 000     3 568 000

Besides the attacks on (round-reduced) hash functions, we have also mounted attacks on the compression functions, the main building blocks of hash functions. First we tried the preimage attack on a given compression function, and if it did not succeed we attacked the function in a scenario where an adversary can choose the IV (initial value) and get a pseudo-preimage. Table 3 summarizes these attacks. As with the results in Table 1, the numbers remain valid for all three types of attacks (a preimage, a second preimage and a collision attack). Among the finalists, our best attack was on the 6-round Skein-256 compression function, for which we found pseudo-collisions. For comparison, the Skein authors reached 8 rounds but found only a pseudo-near-collision. For the Groestl compression function we were not able to mount any successful attack. Keccak has a completely different design from the MD hash function family: there is no typical compression function taking an IV and a block of a message. Therefore in Table 3 we do not report any result for these two functions.

Table 3. Attacks on the compression functions

Function     Type of attack    Broken rounds   Security margin
SHA-1        preimage          21              74% (21/80)
SHA-256      preimage          16              75% (16/64)
BLAKE-256    preimage          1               93% (1/14)
JH-256       preimage          2               96% (2/42)
Skein-256    pseudo-preimage   6               92% (6/72)

* Note to the Groestl-256 entry in Table 1: only the output transformation was broken. It is estimated as equivalent to one half-operation of the Groestl compression function.

It is very difficult to give a clear answer as to which designs or features of a hash function are harder for SAT solvers. One observation we have made is that those designs which use S-boxes (JH and Groestl) have the biggest CNF formulas and are among the hardest for SAT solvers. Equations describing an S-box are more complex than equations describing addition or boolean AND. Consequently, the corresponding CNF formula for the S-box is also more complex, with a greater number of variables and clauses than in the case of other typical operations. Before we give an example, let us first take a closer look at the addition operation. This operation is used in SHA-1, SHA-256, BLAKE, and Skein. In our toolkit the addition of two words is described by the following equations (the full adder equations):

$S_i = A_i \oplus B_i \oplus C_{i-1}$

$C_i = (A_i \cdot B_i) \oplus (C_{i-1} \cdot (A_i \oplus B_i))$

$S_i$ is the i-th bit of the sum of two n-bit words A and B, and $C_i$ is the i-th carry output.
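The full adder equations can be checked against ordinary modular addition with a short script. This is a sketch of the equations themselves, not of the toolkit's output; the function name `add_words` and the convention that the carry into bit 0 is zero are our assumptions.

```python
def add_words(a, b, width=32):
    """Add two width-bit words modulo 2**width, bit by bit, with the full adder equations."""
    carry = 0  # carry into the least significant bit (assumed to be 0)
    total = 0
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        si = ai ^ bi ^ carry                     # S_i = A_i xor B_i xor C_{i-1}
        carry = (ai & bi) ^ (carry & (ai ^ bi))  # C_i = (A_i and B_i) xor (C_{i-1} and (A_i xor B_i))
        total |= si << i
    return total


# Exhaustive check for 8-bit words against built-in addition modulo 2**8.
matches_builtin = all(
    add_words(a, b, width=8) == (a + b) % 256
    for a in range(256)
    for b in range(256)
)
print(matches_builtin)
```

The final carry out of the top bit is simply dropped, which is what makes the result addition modulo 2**width, as used in SHA-1, SHA-256, BLAKE and Skein.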

Now let us compare the CNF sizes of the addition operation and the AES S-box used in Groestl. A CNF of 32-bit addition has 411 clauses and 124 variables, while the AES S-box given to our toolkit as a look-up table gives a CNF with 4800 clauses and 900 variables. We also experimented with an alternative description of the AES S-box expressed as boolean logic equations instead of a look-up table [14]. This description reduces the CNF size by approximately half, but it is still an order of magnitude greater than the CNF of the 32-bit addition operation.

We have also observed that there is no clear limit on the size of CNF formulas beyond which a SAT solver fails. For example, the CNF of 2-round JH with 59 thousand clauses is solved within seconds, whereas the CNF of 2-round Skein with 27 thousand clauses was not solved within the 30-hour time limit. What exactly causes the difference in hardness between formulas is a good point for further research.

4 Conclusion

In this paper we have evaluated the security margin of the five finalists of the SHA-3 contest using our SAT-based cryptanalysis. We have proposed and developed a new toolkit which greatly simplifies the most tedious stages of this type of analysis and helps to mount the attacks in a uniform way. Our toolkit is more flexible than the existing tools and can be applied to various cryptographic primitives. Based on our methodology, we have shown that all the finalists have a substantially bigger security margin than the current standards SHA-1 and SHA-256. We stress that by 'bigger security margin' we refer only to the context of our SAT-based analysis. Using other techniques (e.g., linear cryptanalysis) could lead to a different conclusion.

As a side effect of our security margin evaluation, we have also carried out some attacks on compression functions and reported some new state-of-the-art results. For example, we have found pseudo-collisions for the 6-round Skein-256 compression function.

References

1. Audemard, G., Simon, L.: Glucose SAT Solver, http://www.lri.fr/~simon/?page=glucose
2. Schneier, B., et al.: The Skein Hash Function Family, http://www.skein-hash.info/sites/default/files/skein1.1.pdf
3. Bernstein, D.J.: Second preimages for 6 (7? (8??)) rounds of Keccak? NIST mailing list (2010), http://ehash.iaik.tugraz.at/uploads/6/65/NIST-mailing-list_Bernstein-Daemen.txt
4. Biere, A.: Lingeling, http://fmv.jku.at/lingeling
5. Biryukov, A., Dunkelman, O., Keller, N., Khovratovich, D., Shamir, A.: Key Recovery Attacks of Practical Complexity on AES Variants With Up To 10 Rounds. Cryptology ePrint Archive, Report 2009/374 (2009), http://eprint.iacr.org/2009/374
6. Bouillaguet, C., Derbez, P., Dunkelman, O., Keller, N., Rijmen, V., Fouque, P.A.: Low Data Complexity Attacks on AES. Cryptology ePrint Archive, Report 2010/633 (2010), http://eprint.iacr.org/2010/633
7. Cook, S.A.: The complexity of theorem-proving procedures. In: Proceedings of the third annual ACM symposium on Theory of computing. pp. 151–158. STOC '71, ACM, New York, NY, USA (1971)
8. Cook, S.A., Mitchell, D.G.: Finding hard instances of the satisfiability problem: A survey. pp. 1–17. American Mathematical Society (1997)
9. Courtois, N., Pieprzyk, J.: Cryptanalysis of block ciphers with overdefined systems of equations. In: Zheng, Y. (ed.) Advances in Cryptology - ASIACRYPT 2002, Lecture Notes in Computer Science, vol. 2501, pp. 267–287. Springer Berlin / Heidelberg (2002)
10. Davis, M., Logemann, G., Loveland, D.: A machine program for theorem-proving. Communications of the ACM 5(7), 394–397 (1962)
11. Davis, M., Putnam, H.: A computing procedure for quantification theory. Journal of the ACM 7, 201–215 (1960)
12. Fiorini, C., Martinelli, E., Massacci, F.: How to fake an RSA signature by encoding modular root finding as a SAT problem. Discrete Applied Mathematics 130, 101–127 (2003)
13. Bertoni, G., et al.: Keccak sponge function family main document, http://keccak.noekeon.org/Keccak-main-2.1.pdf
14. Gaj, K., Chodowiec, P.: FPGA and ASIC Implementations of AES. In: Koc, C.K. (ed.) Cryptographic Engineering, chap. 10, pp. 235–294. Springer (2009)
15. Aumasson, J.P., et al.: SHA-3 proposal BLAKE, http://www.131002.net/blake/
16. Jovanovic, D., Janicic, P.: Logical analysis of hash functions. In: Gramlich, B. (ed.) Frontiers of Combining Systems, Lecture Notes in Computer Science, vol. 3717, pp. 200–215. Springer Berlin / Heidelberg (2005)
17. Lala, P.K.: Principles of modern digital design. Wiley-Interscience (2007)
18. Massacci, F.: Using Walk-SAT and Rel-SAT for cryptographic key search. In: Proceedings of the International Joint Conference on Artificial Intelligence. pp. 290–295 (1999)
19. Mendel, F., Thomsen, S.: An Observation on JH-512. Available online (2008), http://ehash.iaik.tugraz.at/uploads/d/da/Jh_preimage.pdf
20. Mironov, I., Zhang, L.: Applications of SAT Solvers to Cryptanalysis of Hash Functions. In: Biere, A., Gomes, C. (eds.) Theory and Applications of Satisfiability Testing - SAT 2006. LNCS, vol. 4121, pp. 102–115. Springer Berlin / Heidelberg (2006)
21. Gauravaram, P., et al.: Grøstl - a SHA-3 candidate, http://www.groestl.info
22. Rivest, R., et al.: The MD6 hash function, http://groups.csail.mit.edu/cis/md6/
23. Schläffer, M.: Updated Differential Analysis of Groestl. Grøstl website (January 2011), http://groestl.info/groestl-analysis.pdf
24. Soos, M.: CryptoMiniSat 2.5.0. In: SAT Race competitive event booklet (July 2010), http://www.msoos.org/cryptominisat2
25. Soos, M.: Grain of Salt - An Automated Way to Test Stream Ciphers through SAT Solvers. In: Workshop on Tools for Cryptanalysis (2010)
26. Soos, M., Nohl, K., Castelluccia, C.: Extending SAT Solvers to Cryptographic Problems. pp. 244–257 (2009)
27. Wang, X., Yin, Y.L., Yu, H.: Finding Collisions in the Full SHA-1. In: Crypto. LNCS, vol. 3621, pp. 17–36. Springer Berlin / Heidelberg (2005)
28. Wu, H.: Hash Function JH, http://www3.ntu.edu.sg/home/wuhj/research/jh/

Appendix

For the reader's convenience, we provide example SystemVerilog code for SHA-1 used in the experiments with our toolkit. In many cases the code strongly resembles pseudocode defining a given cryptographic algorithm. A reader familiar with C or Java should have no trouble adjusting the code to our toolkit's needs.

module sha1(IN, OUT);
  input  [511:0] IN;   // input here means 512-bit message block
  output [159:0] OUT;  // output here means 160-bit hash
  reg    [159:0] OUT;
  reg [31:0] W_words [95:0];  // registers for W words (indices 0..79 are used)
  reg [31:0] h0, h1, h2, h3, h4;
  reg [31:0] a, b, c, d, e, f, k, temp, temp2;
  integer i;

  always @ (IN, OUT)
  begin
    // initial hash values
    h0 = 32'h67452301; h1 = 32'hEFCDAB89;
    h2 = 32'h98BADCFE; h3 = 32'h10325476;
    h4 = 32'hC3D2E1F0;
    a = h0; b = h1; c = h2; d = h3; e = h4;

    // load the 16 message words; W_words[0] comes from the most significant bits
    W_words[15] = IN[31:0];    W_words[14] = IN[63:32];
    W_words[13] = IN[95:64];   W_words[12] = IN[127:96];
    W_words[11] = IN[159:128]; W_words[10] = IN[191:160];
    W_words[9]  = IN[223:192]; W_words[8]  = IN[255:224];
    W_words[7]  = IN[287:256]; W_words[6]  = IN[319:288];
    W_words[5]  = IN[351:320]; W_words[4]  = IN[383:352];
    W_words[3]  = IN[415:384]; W_words[2]  = IN[447:416];
    W_words[1]  = IN[479:448]; W_words[0]  = IN[511:480];

    // message expansion to 80 words
    for (i=16; i<=79; i=i+1)
    begin
      W_words[i] = W_words[i-3] ^ W_words[i-8] ^ W_words[i-14] ^ W_words[i-16];
      W_words[i] = {W_words[i][30:0], W_words[i][31]}; // leftrotate 1
    end

    for (i=0; i<=79; i=i+1) // main loop
    begin
      // select the round function f and the constant k for the current step
      if ((i>=0) && (i<=19))
      begin
        f = (b & c) | ((~b) & d); k = 32'h5A827999;
      end
      if ((i>=20) && (i<=39))
      begin
        f = b ^ c ^ d; k = 32'h6ED9EBA1;
      end
      if ((i>=40) && (i<=59))
      begin
        f = (b & c) | (b & d) | (c & d); k = 32'h8F1BBCDC;
      end
      if ((i>=60) && (i<=79))
      begin
        f = b ^ c ^ d; k = 32'hCA62C1D6;
      end
      temp2 = {a[26:0], a[31:27]}; // a leftrotate 5
      temp = temp2 + f + e + k + W_words[i];
      e = d;
      d = c;
      c = {b[1:0], b[31:2]}; // b leftrotate 30
      b = a;
      a = temp;
    end // end of main loop

    // add the working variables to the initial hash values
    h0 = h0 + a; h1 = h1 + b;
    h2 = h2 + c; h3 = h3 + d; h4 = h4 + e;
    OUT = {h0, h1, h2, h3, h4}; // HASH
  end
endmodule
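The module above can be cross-checked against a software model before it is passed to the toolkit. The following Python sketch is not part of the original toolkit; it mirrors the module step by step, assuming that IN holds the 512-bit block with the first message word in its most significant bits and that OUT is the concatenation of h0 through h4. The names sha1_block and pad_single_block are illustrative helpers introduced here. Like the module, sha1_block processes exactly one block and applies no padding, so a short message has to be padded explicitly before the result can be compared with a library implementation of SHA-1.

```python
import hashlib
import struct

MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def sha1_block(block):
    """Process one 64-byte block from the fixed SHA-1 initial values.

    Mirrors the SystemVerilog module: no padding, a single block,
    and the result is h0..h4 concatenated (20 bytes).
    """
    if len(block) != 64:
        raise ValueError("block must be exactly 64 bytes (512 bits)")
    # W[0] is taken from the most significant 32 bits of the block.
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        w.append(rotl32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))

    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    a, b, c, d, e = h
    for i in range(80):
        if i <= 19:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif i <= 39:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif i <= 59:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        temp = (rotl32(a, 5) + f + e + k + w[i]) & MASK32
        a, b, c, d, e = temp, a, rotl32(b, 30), c, d

    out = [(x + y) & MASK32 for x, y in zip(h, (a, b, c, d, e))]
    return struct.pack(">5I", *out)


def pad_single_block(message):
    """Apply SHA-1 padding to a message short enough to fit in one block."""
    if len(message) > 55:
        raise ValueError("message must be at most 55 bytes to fit one block")
    zeros = b"\x00" * (55 - len(message))
    return message + b"\x80" + zeros + struct.pack(">Q", 8 * len(message))


if __name__ == "__main__":
    digest = sha1_block(pad_single_block(b"abc"))
    # The single-block result should agree with the library SHA-1.
    print(digest.hex(), digest == hashlib.sha1(b"abc").digest())
```

Interpreting the 64-byte block as a big-endian 512-bit integer gives the value to drive on IN, and interpreting the 20-byte result the same way gives the value expected on OUT, under the bit-ordering assumption stated above.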
