Security Margin Evaluation of SHA-3 Contest Finalists through SAT-Based Attacks

Ekawat Homsirikamol², Paweł Morawiecki¹,
Marcin Rogawski², and Marian Srebrny¹,³
¹ Section of Informatics, University of Commerce, Kielce, Poland
² Cryptographic Engineering Research Group, George Mason University, USA
³ Institute of Computer Science, Polish Academy of Sciences, Poland
Abstract. In 2007, the U.S. National Institute of Standards and Tech-
nology (NIST) announced a public contest aiming at the selection of a
new standard for a cryptographic hash function. In this paper we eval-
uate the security margin of five SHA-3 candidates which have qualified
to the final round of the contest. We assume that attacks must have
practical complexities, i.e., can be practically verified. Our attacks use a
method called logical cryptanalysis where the original task is expressed
as a SATisfiability problem. We use a new toolkit which greatly simplifies
the most arduous stages of this type of cryptanalysis and helps to mount
the attacks in a uniform way. We show that in the context of SAT-based
attacks all the finalists have a substantially bigger security margin than
the current standards SHA-256 and SHA-1.
Key words: cryptographic hash algorithm, SHA-3 competition, alge-
braic cryptanalysis, logical cryptanalysis, SATisfiability solvers
1 Introduction
In 2007, the U.S. National Institute of Standards and Technology (NIST) an-
nounced a public contest aiming at the selection of a new standard for a cryp-
tographic hash function. The main motivation behind starting the contest has
been the security flaws identified in the SHA-1 standard in 2005. Similarities between
SHA-1 and the most recent standard SHA-2 are worrisome and NIST decided
that a new, stronger hash function was needed. Fifty-one functions were accepted to the
first round of the contest and in July 2009 among those functions 14 were se-
lected to the second round. At the end of 2010 five finalists were announced:
BLAKE [15], Groestl [21], JH [28], Keccak [13], and Skein [2]. The winning al-
gorithm will be named ’SHA-3’ and most likely will be selected in the second
half of 2012.
Security, performance in software, performance in hardware and flexibility
are the four primary criteria normally used in evaluation of candidates. Out of
these four criteria, security is the most important criterion, yet it is also the
most difficult to evaluate and quantify. There are two main ways of estimating
the security margin of a given cryptosystem. The first one is to compare the
complexities of the best attacks on the full cryptosystem. The problem with this
approach is that for many modern designs there is no known successful attack
on the full cryptosystem. Security margin would be the same for all algorithms
where there is nothing better than the exhaustive search if this approach is used.
This is not different for SHA-3 contest where no known attacks on the full func-
tions, except JH, have been reported. For JH there is a preimage attack [19]
but its time complexity is nearly equal to the exhaustive search and memory
accesses are over the exhaustive search bound. Therefore estimating the secu-
rity margin using this approach tells us very little or nothing about differences
between the candidates in terms of their security level. The second approach of
how to measure the security margin is to compare the number of broken rounds
to the total number of rounds in a given cryptosystem. As a vast majority of
modern ciphers and hash functions (in particular, the SHA-3 contest finalists)
has an iterative design, this approach can be applied naturally. However, there is
also a problem with comparing security levels calculated this way. For example,
there is an attack on 7-round Keccak-512 with complexity $2^{507}$ [3] and there
is an attack on 3-round Groestl-512 with complexity $2^{192}$ [23]. The first attack
reaches relatively more rounds (29%) but with higher complexity whereas the
second attack has lower complexity but breaks fewer rounds (21%). Both attacks
are completely non-practical. It is very unclear how such results help to judge
which function is more secure.
In this paper we follow the second approach of measuring the security margin
but with an additional restriction. We assume that the attacks must have prac-
tical complexities, i.e., can be practically verified. It is very similar to the line
of research recently presented in [5, 6]. This restriction puts the attacks in more
’real life’ scenarios which seems especially important for SHA-3 standard. So far
a large amount of cryptanalysis has been conducted on the finalists, however the
majority of papers focuses on maximizing the number of broken rounds which
leads to extremely high data and time complexity. These theoretical attacks have
great importance but the lack of practical approach is evident. We hope that
our work helps to fill this gap to some extent.
The method of our analysis is a SAT-based attack. SAT was the first known
NP-complete problem, as proved by Stephen Cook in 1971 [7]. A SAT solver de-
cides whether a given propositional (boolean) formula has a satisfying valuation.
Finding a satisfying valuation is infeasible in general, but many SAT instances
can be solved surprisingly efficiently. There are many competing algorithms for
it and many implementations; most of them have been developed over the last
two decades as highly optimized versions of the DPLL procedure [10, 11].
SAT solvers can be used to solve instances typically described in the Con-
junctive Normal Form (CNF) into which any decision problem can be translated.
Modern SAT solvers use highly tuned algorithms and data structures to find a
solution to a given problem coded in this very simple form. To solve your prob-
lem: (1) translate the problem to SAT (in such a way that a satisfying valuation
represents a solution to the problem); (2) run your favorite SAT solver to find a
solution. The first connection between SAT and crypto dates back to [8], where a
suggestion appeared to use cryptoformulae as hard benchmarks for propositional
satisfiability checkers. The first application of SAT solvers in cryptanalysis was
due to Massacci et al. [18] called logical cryptanalysis. They ran a SAT solver
on DES key search, and then also for faking an RSA signature for a given message
by finding the e-th roots of a (digitized) message m modulo n [12].
Courtois and Pieprzyk [9] presented an approach to code in SAT their algebraic
cryptanalysis with some gigantic systems of low degree equations designed as
potential procedures for breaking some ciphers. Soos et al. [26] proposed en-
hancing a SAT solver with some special-purpose algebraic procedures, such as
Gaussian elimination. Mironov and Zhang [20] showed an application of a SAT
solver supporting a non-automatic part of the attack [27] on SHA-1.
In this work we use SAT-based attacks to evaluate the security margin of the
SHA-3 contest finalists and also compare them to the current standards, in
particular SHA-256. We show that all five finalists have a big security margin
(although it differs a bit between functions) against this kind of attack and are
substantially more secure than SHA-1 and SHA-256. We also report some interesting
results on particular functions or their building blocks. For 2-round Keccak with
256-bit hash size we mount preimage and collision attacks, and our attacks are
currently the best practical preimage and collision attacks on reduced Keccak.
For the 6-round Skein-256 compression function we find pseudo-collisions. For
comparison, the Skein authors reached 8 rounds but found only a
pseudo-near-collision [2]. In the attacks we use our toolkit which is a combination of the
existing tools and some newly developed parts. The toolkit helps in mounting
the attacks in a uniform way and it can be easily used for cryptanalysis not
only of hash functions but also of other cryptographic primitives such as block
or stream ciphers.
2 Methodology of our SAT-based attacks
2.1 A toolkit for CNF formula generation
One of the key steps in attacking cryptographic primitives with SAT solvers is
CNF (conjunctive normal form) formula generation. A CNF is a conjunction of
clauses, i.e., of disjunctions of literals, where a literal is a boolean valued variable
or its negation. Thus, a formula is presented to a SAT solver as one big ’AND’
of ’ORs’. A cryptographic primitive (or a segment of it) which is the target of a
SAT based attack has to be completely described by such a formula. Generating
it is a non-trivial task and usually is very laborious. There are many ways to
obtain a final CNF and the output results differ in the number of clauses, the
average size of clauses and the number of literals. Recently we have developed a
new toolkit which greatly simplifies the generation of CNF.
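To make the 'AND of ORs' structure concrete, the following is a minimal Python sketch; it is an illustration only and not part of the toolkit. It assumes the DIMACS convention in which variable k is written as the integer k and its negation as -k. The helper names and the example formula are hypothetical, and the exhaustive search is only a stand-in for a real SAT solver.

```python
from itertools import product

# A CNF is a list of clauses; a clause is a list of non-zero integers.
# Literal k means "variable k is true", literal -k means "variable k is false".
# Example formula: (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
cnf = [[1, -2], [2, 3], [-1, -3]]


def satisfies(cnf, assignment):
    """Return True if every clause contains at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )


def to_dimacs(cnf, num_vars):
    """Serialize the formula in DIMACS format (header line, then one clause per line)."""
    lines = [f"p cnf {num_vars} {len(cnf)}"]
    lines += [" ".join(map(str, clause)) + " 0" for clause in cnf]
    return "\n".join(lines)


def brute_force_models(cnf, num_vars):
    """Enumerate all satisfying valuations (feasible only for tiny formulas)."""
    models = []
    for bits in product([False, True], repeat=num_vars):
        assignment = {i + 1: bits[i] for i in range(num_vars)}
        if satisfies(cnf, assignment):
            models.append(assignment)
    return models


print(to_dimacs(cnf, 3))
print(brute_force_models(cnf, 3))
```

A real attack replaces `brute_force_models` with a SAT solver reading the DIMACS text; the data layout stays the same.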
Usually a cryptanalyst needs to put considerable effort into creating a final
CNF. It involves writing a separate program dedicated only to the cryptographic
primitive under consideration. To make it efficient, some minimizing algorithms
(Karnaugh maps, Quine-McCluskey algorithm or Espresso algorithm) have to
be used [17]. These have to be implemented in the program, or the intermedi-
ate results are sent to an external tool (e.g., Espresso minimizer) and then the
minimized form is sent back to the main program. Implementing all of these
procedures requires a good deal of programming skills, some knowledge of logic
synthesis algorithms and careful insight into the details of the primitive’s opera-
tion. As a result, obtaining CNF might become the most tedious and error-prone
part of any attack. It could be especially discouraging for researchers who start
their work from scratch and do not want to spend too much time on writing
thousands of lines of code.
To avoid those disadvantages we have recently proposed a new toolkit con-
sisting basically of two applications. The first of them is Quartus II — a software
tool released by Altera for analysis and synthesis of HDL (Hardware Description
Language) designs, which enables the developers to compile their designs and
configure the target devices (usually FPGAs). We use a free-of-charge version
Quartus II Web Edition which provides all the features that we need. The second
application, written by us, converts boolean equations (generated by Quartus)
to CNF encoded in DIMACS format (standard format for today’s SAT solvers).
The complete process of CNF generation includes the following steps:
1. Code the target cryptographic primitive in HDL;
2. Compile and synthesize the code in Quartus;
3. Generate boolean equations using Quartus inbuilt tool;
4. Convert generated equations to CNF by our converter.
Steps 2, 3, and 4 are done automatically. Using this method, the only effort
required of a researcher is to write the code in HDL. Normally programming and
’thinking’ in HDL is a bit different from typical high-level languages like Java or
C. However it is not the case here. For our needs, programming in HDL looks
exactly the same as it would be done in high-level languages. There is no need
to care about typical HDL specific issues like proper expressing of concurrency
or clocking. It is because we are not going to implement anything in a FPGA
device. All we need is to obtain a system of boolean equations which completely
describes the primitive we wish to attack. Once the boolean equations are gen-
erated by the Quartus inbuilt tool, the equations are converted into CNF by the
separate application. The conversion implemented in our application is based
on the boolean laws (commutativity, associativity, distributivity, identity, De
Morgan’s laws) and there are no complex algorithms involved.
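As an illustration of step 4, the sketch below turns one small boolean equation (an output defined by at most a handful of inputs, as produced for a 4-input logic cell) into CNF clauses. Note the assumption: it uses plain truth-table enumeration rather than the boolean-law rewriting used by our converter, and it performs no minimization. The function name, the variable numbering and the majority example are hypothetical.

```python
from itertools import product


def equation_to_cnf(func, input_vars, output_var):
    """Encode output_var == func(inputs) as CNF clauses.

    For every input combination, one clause forbids the wrong output value.
    This yields 2**k clauses for k inputs, which is acceptable for k <= 4.
    """
    clauses = []
    for bits in product([0, 1], repeat=len(input_vars)):
        # These literals are all false exactly when the inputs equal `bits`.
        clause = [-v if b else v for v, b in zip(input_vars, bits)]
        # Under these inputs the output literal must therefore be true.
        clause.append(output_var if func(*bits) else -output_var)
        clauses.append(clause)
    return clauses


def majority(a, b, c):
    """Example equation: the 3-input majority function."""
    return (a & b) | (a & c) | (b & c)


# Inputs are variables 1..3, the output is variable 4.
clauses = equation_to_cnf(majority, [1, 2, 3], 4)
for clause in clauses:
    print(" ".join(map(str, clause)), "0")  # DIMACS clause lines
```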
It must be noted that Quartus programming environment gives us two impor-
tant features which may help to create a possibly compact CNF. It minimizes
the functions up to 6 variables using Karnaugh maps. Additionally, all final
equations have at most 5 variables (4 inputs, 1 output). It is because Quartus
is dedicated to FPGA devices which are built out of ’logic cells’, each with 4
inputs/1 output. (There are also FPGAs with different parameters; e.g., 5/2.
But we chose 4/1 architecture in all the experiments.) This feature is helpful
when dealing with linear ANF equations with many variables (also referred to as
’long XOR equations’). A simple conversion of such an equation to CNF gives an
exponential number of clauses; an equation in n variables corresponds to $2^{n-1}$
clauses in CNF. A common way of dealing with this problem is to introduce new
variables and cut the original equation into a few shorter ones.
Example 1. Let us consider an equation with 5 variables:
$$a + b + c + d + e = 0$$
A CNF corresponding to this equation consists of $2^{5-1} = 16$ clauses with 5 literals
in each clause. However, introducing two new variables, we can rewrite it as a
system of three equations:
$$a + b + x = 0$$
$$c + d + y = 0$$
$$e + x + y = 0$$
A CNF corresponding to this system of equations would consist of $2^2 + 2^2 + 2^2 = 12$
clauses.
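The clause counts of Example 1 can be reproduced with a short Python sketch. It is an illustration only: the function name and the variable numbering are assumptions, and the encoding simply excludes every odd-parity assignment with one clause.

```python
from itertools import product


def xor_to_cnf(variables):
    """CNF for x1 + x2 + ... + xn = 0 over GF(2) (even parity).

    Each odd-parity assignment is excluded by one clause, so the result
    has 2**(n-1) clauses with n literals each.
    """
    clauses = []
    for bits in product([0, 1], repeat=len(variables)):
        if sum(bits) % 2 == 1:  # odd parity violates the equation
            clauses.append([-v if b else v for v, b in zip(variables, bits)])
    return clauses


# Variables a..e are numbered 1..5; x and y are the fresh variables 6 and 7.
a, b, c, d, e, x, y = range(1, 8)

direct = xor_to_cnf([a, b, c, d, e])
split = xor_to_cnf([a, b, x]) + xor_to_cnf([c, d, y]) + xor_to_cnf([e, x, y])

print(len(direct))  # 16
print(len(split))   # 12
```

The saving grows quickly with the length of the equation, at the price of extra variables.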
Quartus automatically introduces new variables and cuts long equations to
satisfy the requirements for FPGA architecture. Consequently a researcher need
not worry that the CNF would be much affected by very long XOR equations
(which may be a part of the original cryptographic primitive’s description).
To the best of our knowledge, there are only two other tools which provide
similar functionality to our toolkit, that is, automate the CNF generation and help
to mount uniform SAT-based attacks. The first is the solution proposed in [16]
where the main idea is to change the behaviour of all the arithmetic and logical
operators that the algorithm uses, in such a way that each operator produces a
propositional formula corresponding to the operation performed. It is obtained
by using a C++ implementation and a feature of the C++ language called operator
overloading. The authors tested their method on the MD4 and MD5 functions. The
proposed method can be applied to other crypto primitives but it is not clear
how it would deal with more complex operations, e.g. an S-box described as a
look-up table. The second tool is called Grain of Salt [25] and it incorporates
some algorithms to optimize a generated CNF. However, it can only be used with
a family of stream ciphers.
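The operator-overloading idea of [16] can be illustrated in a few lines. The sketch below is in Python rather than C++ and is not the code of [16]; the class `Bit` and the function `choose` are hypothetical names. The point is only that an unchanged algorithm, run on symbolic inputs, emits a formula instead of a value.

```python
class Bit:
    """A symbolic bit that records the formula computing it."""

    def __init__(self, formula):
        self.formula = formula

    def __xor__(self, other):
        return Bit(f"({self.formula} XOR {other.formula})")

    def __and__(self, other):
        return Bit(f"({self.formula} AND {other.formula})")

    def __or__(self, other):
        return Bit(f"({self.formula} OR {other.formula})")

    def __invert__(self):
        return Bit(f"(NOT {self.formula})")


def choose(b, c, d):
    """The SHA-1 'choice' function, written as ordinary code."""
    return (b & c) | (~b & d)


# On integers the function computes a value ...
print(choose(1, 0, 1) & 1)  # 0
# ... on symbolic bits the same code produces a propositional formula.
result = choose(Bit("b"), Bit("c"), Bit("d"))
print(result.formula)  # ((b AND c) OR ((NOT b) AND d))
```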
In comparison to these two tools, our proposal is the most flexible. It can
be used with many different cryptographic primitives (hash functions, block
and stream ciphers) and it does not limit an input description to only simple
boolean operations. The toolkit handles XOR equations efficiently and also takes
advantage of logic synthesis algorithms which help to provide a more compact
CNF.
2.2 Our SAT-based attack
All the attacks reported in the paper have a very similar form and consist of the
following steps.
1. Generate the CNF formula by our toolkit;
2. Fix the hash and padding bits in the formula;
3. Run a SAT solver on the generated CNF.
The above scheme is used to mount a preimage attack, i.e., for a given hash
value h, we try to find a message m such that h = f(m). CryptoMiniSat2, gold
medalist from recent SAT competitions [24], is selected as our SAT solver. In the
preliminary experiments, we also tried other state-of-the-art SAT solvers (Lingeling
[4], Glucose [1]) but overall CryptoMiniSat2 solves our formulas faster.
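The three steps can be sketched on a toy example. Everything here is an assumption made for illustration: the 3-bit to 2-bit 'hash', its hand-written CNF (standing in for the toolkit output), the helper names, and the exhaustive `solve` function that stands in for CryptoMiniSat2. Fixing known bits (hash bits, and padding bits in the same way) amounts to appending unit clauses.

```python
from itertools import product

# Toy "hash": message bits m1..m3 (variables 1-3), hash bits h1, h2 (variables 4-5),
# with h1 = m1 XOR m2 and h2 = m2 AND m3.
M1, M2, M3, H1, H2 = 1, 2, 3, 4, 5
cnf = [
    # h1 = m1 XOR m2
    [-M1, -M2, -H1], [M1, M2, -H1], [M1, -M2, H1], [-M1, M2, H1],
    # h2 = m2 AND m3
    [-H2, M2], [-H2, M3], [H2, -M2, -M3],
]


def fix_bits(cnf, values):
    """Fix variables to known values by appending unit clauses."""
    return cnf + [[var if bit else -var] for var, bit in values.items()]


def solve(cnf, num_vars):
    """Stand-in for a SAT solver: exhaustive search, tiny formulas only."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in cnf):
            return bits
    return None


target = {H1: 1, H2: 1}                  # the given hash value h = (1, 1)
model = solve(fix_bits(cnf, target), 5)  # satisfying valuation of all 5 variables
preimage = model[:3]                     # its message bits (m1, m2, m3)
print(preimage)
```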
We attack functions with a 256-bit hash. When constructing a CNF coding a
hash function, one has to decide the size of the message (how many message
blocks are taken as an input to the function). It is easier for a SAT solver to
tackle a single message block because coding each additional message block would
make the formula twice as big. However, each of the five finalists has a different way
of padding the message. If only one message block is allowed, BLAKE-256 can
take at most 446 bits of message, which are padded to get a 512-bit block. On
the other hand, Keccak-256 can take as many as 1086 bits of message in a single
block. To avoid the situation where one formula has many more message bits
for the SAT solver to search for than another formula, the message size is fixed to 446
bits (the maximum value for BLAKE-256 with one message block processed; the other
finalists allow more).
To find a second preimage or a collision, only a small adjustment to the
aforementioned attack is required. Once the preimage is found, we run the SAT solver
on exactly the same formula but with one message bit fixed to the opposite
value of that from the preimage (the rest of the message bits are left unknown).
It turns out that the SAT solver is able to solve such a slightly modified
formula in a very similar time, providing a second preimage and thus a collision. The
second preimages/collisions are expected because with the size of the message fixed
to 446 bits we have a 446-bit to 256-bit mapping.
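As a rough sanity check of this expectation, and assuming (an idealization, not a result of the experiments) that the round-reduced function behaves like a random mapping, the average number of 446-bit messages per 256-bit hash value is

$$\frac{2^{446}}{2^{256}} = 2^{190}.$$

Fixing one message bit to the opposite value still leaves $2^{445}$ candidate messages, i.e., about $2^{445}/2^{256} = 2^{189}$ expected preimages of the same hash value, so a second preimage is overwhelmingly likely to exist.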
3 Results
We have conducted the preimage attack described in Section 2.2 on the five
finalists and also on the two standards SHA-256 and SHA-1. As a SAT solver
we used CryptoMiniSat2, version 2.9.0, with the parameters gaussuntil=0 and
restart=static. These settings were suggested by the author of CryptoMiniSat2.
The experiments were carried out on an Intel Core i7 2.67 GHz with 8 GB of RAM.
Starting with 1-round variants of the functions, the SAT solver was run to solve
the given formula and gave us the preimage. The time limit for each experiment
was set to 30 hours. If the solution was found, we added one more round, encoded
in CNF and gave it to the solver. The attack stopped when the time limit was
exceeded or memory ran out. Table 1 shows the results. The second column
contains the number of broken rounds in our preimage attack and the third
column shows the security margin, calculated as the fraction of rounds left
unbroken, i.e., one minus the quotient of the number of broken rounds and the
total number of rounds. For clarity, we are reporting our
preimage attack but as explained above it can be easily modified to get a second
preimage or a collision. Therefore the numbers from Table 1 remain valid for all
three types of attacks.
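The round-by-round procedure described above can be summarized as a small driver loop. This is a sketch under stated assumptions: `build_cnf` and `solve` are hypothetical callables standing for the toolkit and the SAT solver, `solve` is assumed to return `None` on a timeout or memory exhaustion, and the toy stand-ins below exist only to make the example runnable.

```python
def max_broken_rounds(build_cnf, solve, total_rounds, time_limit_s):
    """Add one round at a time until the solver stops succeeding."""
    broken = 0
    for rounds in range(1, total_rounds + 1):
        cnf = build_cnf(rounds)           # formula for the round-reduced variant
        model = solve(cnf, time_limit_s)  # None signals timeout or out of memory
        if model is None:
            break
        broken = rounds
    return broken


# Toy stand-ins: the "formula" is just the round count, and the fake solver
# succeeds only for variants with at most 2 rounds.
def toy_build(rounds):
    return rounds


def toy_solve(cnf, time_limit_s):
    return "model" if cnf <= 2 else None


THIRTY_HOURS = 30 * 60 * 60  # time limit in seconds
print(max_broken_rounds(toy_build, toy_solve, 24, THIRTY_HOURS))  # 2
```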
All the SHA-3 contest finalists have a substantially bigger security margin than
the SHA-256 and SHA-1 standards. On the other hand, the finalists differ only slightly
(by at most 7%) and all have a security margin over 90%. For Groestl we were
not able to attack even a 1-round variant, nor a simplified Groestl with the
output transformation replaced by a simple truncation. The only successful attack
on Groestl (or rather part of it) is the attack on the output transformation in the
1-round variant of Groestl. The output transformation is not a complete round
but giving a 100% security margin would not be fair either. Therefore we try to
estimate ’a weight’ of the output transformation. Essentially all the operations
(equations) in the Groestl compression function come from two very similar
permutations (P and Q). The output transformation is built on the P permutation
only so it can be treated as a half-operation of the compression function. Hence
the attack on the output transformation in a 1-round variant of Groestl is shown
in Table 1 as half a round.
All the reported attacks on the finalists took just a few seconds. Only for
16-round SHA-256 did the attack last longer: one hour. Although a
conservative time limit (30 hours) was set for this type of experiment, it did not
help to extend the attack to reach one more round. It seems that the time of the
attack grows superexponentially in the number of rounds. The same behaviour
was observed by Rivest et al. when they tested MD6 function with their SAT-
based analysis [22]. For MD6 with 256-bit hash size, they reached 10 rounds
which gives 90% of security margin. For a reader interested in estimating the
asymptotic complexity of our attacks, we report that it would be very difficult
mainly because Altera does not reveal details of algorithms used in Quartus.
Table 1. Security margin comparison calculated from the results of our preimage
attacks on round-reduced hash functions
Function      No. of broken rounds   Security margin
SHA-1         21                     74% (21/80)
SHA-256       16                     75% (16/64)
Keccak-256    2                      92% (2/24)
BLAKE-256     1                      93% (1/14)
Groestl-256   0.5                    95% (0.5/10)
JH-256        2                      96% (2/42)
Skein-256     1                      99% (1/72)
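The percentages in Table 1 correspond to the share of rounds left unbroken, i.e., one minus broken rounds divided by total rounds. A minimal sketch of that arithmetic for several rows of the table (the function name is illustrative):

```python
def security_margin(broken_rounds, total_rounds):
    """Fraction of rounds left unbroken."""
    return 1 - broken_rounds / total_rounds


# (broken rounds, total rounds) taken from Table 1
results = {
    "SHA-1": (21, 80),
    "SHA-256": (16, 64),
    "Keccak-256": (2, 24),
    "BLAKE-256": (1, 14),
    "Groestl-256": (0.5, 10),
    "Skein-256": (1, 72),
}

percentages = {
    name: round(100 * security_margin(broken, total))
    for name, (broken, total) in results.items()
}
for name, percent in percentages.items():
    print(f"{name}: {percent}%")
```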
It is interesting to see if the parameters of CNF formula, that is the number
of variables and clauses, could be a good metric for measuring the hardness of
the formula and consequently the security margin. Table 2 shows the numbers
of variables and clauses for full hash functions. The values are rounded to the
nearest thousand. For SHA-1, SHA-256, BLAKE, and Skein, we have generated
the complete formula with our toolkit. For the other functions, we have
extrapolated the numbers from round-reduced variants as the toolkit had some memory
problems with those huge instances (over 1 million clauses). As every round
in the given function is basically the same (consists of the same type of equa-
tions), the linear extrapolation is straightforward. For the examined functions,
the CNF formula parameters could be a good metric for measuring the hardness
of the formula but only to some extent. Indeed, the smallest formulas (SHA-1
and SHA-256) have the lowest security margin but, for example, BLAKE and
Keccak have nearly the same security level while Keccak formula is more than
twice as big.
Table 2. The parameters of our CNF formulas coding hash functions
Function      Variables   Clauses
SHA-1         29 000      200 000
SHA-256       61 000      400 000
BLAKE-256     57 000      422 000
Keccak-256    88 000      1 075 000
Skein-256     148 000     1 041 000
JH-256        169 600     1 998 000
Groestl-256   279 000     3 568 000
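The linear extrapolation mentioned above can be written down explicitly. The sketch assumes that the formula size is a fixed part plus a constant amount per round and that two round-reduced variants have been measured; the function name is hypothetical, and the numbers in the usage line are made up for illustration and are not measurements from our experiments.

```python
def extrapolate(size_r1, size_r2, r1, r2, full_rounds):
    """Linearly extrapolate a CNF size (variables or clauses) to full_rounds.

    size_r1 and size_r2 are the sizes measured for r1 and r2 rounds.
    """
    per_round = (size_r2 - size_r1) / (r2 - r1)
    fixed = size_r1 - per_round * r1  # part that does not grow with the rounds
    return fixed + per_round * full_rounds


# Made-up example: 100 clauses at 1 round, 180 clauses at 2 rounds, 10 rounds in total.
print(extrapolate(100, 180, 1, 2, 10))  # 820.0
```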
Besides the attacks on (round-reduced) hash functions, we have also mounted
the attacks on the compression functions — main building blocks of hash func-
tions. First we tried the preimage attack on a given compression function and if
it did not succeed we attacked the function in a scenario where an adversary can
choose IV (initial value) and get a pseudo-preimage. Table 3 summarizes these
attacks. As with the results from Table 1, the numbers remain valid for
all three types of attacks (a preimage, a second preimage and a collision attack).
Among the finalists our best attack was on the 6-round Skein-256 compression
function, for which we found pseudo-collisions. For comparison, the Skein authors
reached 8 rounds but they found only a pseudo-near-collision. For the Groestl
compression function we were not able to mount any successful attack. Keccak has
a completely different design from the MD hash function family: there is no
typical compression function taking an IV and a block of a message. Therefore in
Table 3 we do not report any result for these two functions.
Table 3. Attacks on the compression functions
Function     Type of attack     No. of broken rounds   Security margin
SHA-1        preimage           21                     74% (21/80)
SHA-256      preimage           16                     75% (16/64)
BLAKE-256    preimage           1                      93% (1/14)
JH-256       preimage           2                      96% (2/42)
Skein-256    pseudo-preimage    6                      92% (6/72)
Footnote to the Groestl-256 entry in Table 1: Only output transformation broken. It is
estimated as an equivalent to one half-operation of the Groestl compression function.
It is very difficult to give a good and clear answer which designs or features
of a hash function are harder for SAT solvers. One observation we have made
is that those designs which use S-boxes (JH and Groestl) have the biggest CNF
formula and are among the hardest for SAT solvers. Equations describing an
S-box are more complex than equations describing addition or boolean AND.
Consequently, the corresponding CNF formula for the S-box is also more complex
with greater number of variables and clauses than in the case of other typical
operations. Before we give an example, let us first take a closer look at the
addition operation. This operation is used in SHA-1, SHA-256, BLAKE, and
Skein. In our toolkit the addition of two words is described by the following
equations (the full-adder equations):
$$S_i = A_i \oplus B_i \oplus C_{i-1}$$
$$C_i = (A_i \cdot B_i) \oplus (C_{i-1} \cdot (A_i \oplus B_i))$$
$S_i$ is the i-th bit of the sum of the two words A and B. $C_i$ is the i-th carry
output.
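The full-adder equations can be checked against ordinary integer addition with a few lines of Python. The sketch assumes bit 0 is the least significant bit, an initial carry of 0, and that the final carry is dropped (addition modulo $2^{32}$, as in SHA-1); the helper names are illustrative.

```python
def add_words(a_bits, b_bits):
    """Add two equal-length words given as lists of bits, least significant first.

    Implements S_i = A_i xor B_i xor C_{i-1} and
    C_i = (A_i and B_i) xor (C_{i-1} and (A_i xor B_i)), starting with carry 0.
    The final carry is dropped, i.e. the sum is taken modulo 2**n.
    """
    carry = 0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) ^ (carry & (a ^ b))
    return sum_bits


def to_bits(value, width=32):
    """Split an integer into `width` bits, least significant first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


x, y = 0x67452301, 0xEFCDAB89  # the SHA-1 initial values h0 and h1
total = from_bits(add_words(to_bits(x), to_bits(y)))
print(hex(total), total == (x + y) % 2**32)
```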
Now let us compare the CNF sizes of the addition operation and AES S-box
used in Groestl. A CNF of 32-bit addition has 411 clauses and 124 variables
while AES S-box given to our toolkit as a look-up table gives a CNF with 4800
clauses and 900 variables. We also experimented with an alternative description
of AES S-box expressed as boolean logic equations, instead of a look-up table
[14]. This description reduces the CNF size approximately by half but still it is
an order of magnitude greater than the CNF from the 32-bit addition operation.
We have also observed that there is no clear limit in size of CNF formulas
beyond which a SAT solver fails. For example the CNF of 2-round JH with 59
thousand clauses is solved within seconds whereas the CNF of 2-round Skein
with 27 thousand clauses was not solved within the 30-hour time limit. What
exactly causes the difference between hardness of formulas is a good point for
further research.
4 Conclusion
The security margin of the five finalists of the SHA-3 contest using our SAT-
based cryptanalysis has been evaluated in this paper. A new toolkit which greatly
simplifies the most tedious stages of this type of analysis and helps to mount
the attacks in a uniform way has been proposed and developed. Our toolkit is
more flexible than the existing tools and can be applied to various cryptographic
primitives. Based on our methodology, we have shown that all the finalists have
a substantially bigger security margin than the current standards SHA-256 and SHA-1. We
stress that by ’bigger security margin’ we refer only to the context of our SAT-based
analysis. Using other techniques (e.g., linear cryptanalysis) could lead to
a different conclusion.
As a side effect of our security margin evaluation, we have also carried out
some attacks on compression functions and reported some new state-of-the-art
results. For example, we have found pseudo-collisions for 6-round Skein-256 com-
pression function.
References
1. Audemard, G., Simon, L.: Glucose SAT Solver, http://www.lri.fr/~simon/
?page=glucose
2. B. Schneier et al.: The Skein Hash Function Family, http://www.skein-hash.
info/sites/default/files/skein1.1.pdf
3. Bernstein, D.J.: Second preimages for 6 (7? (8??)) rounds of Keccak? NIST mailing
list (2010), http://ehash.iaik.tugraz.at/uploads/6/65/NIST-mailing-list_
Bernstein-Daemen.txt
4. Biere, A.: Lingeling, http://fmv.jku.at/lingeling
5. Biryukov, A., Dunkelman, O., Keller, N., Khovratovich, D., Shamir, A.: Key Re-
covery Attacks of Practical Complexity on AES Variants With Up To 10 Rounds.
Cryptology ePrint Archive, Report 2009/374 (2009), http://eprint.iacr.org/
2009/374
6. Bouillaguet, C., Derbez, P., Dunkelman, O., Keller, N., Rijmen, V., Fouque,
P.A.: Low Data Complexity Attacks on AES. Cryptology ePrint Archive, Report
2010/633 (2010), http://eprint.iacr.org/2010/633
7. Cook, S.A.: The complexity of theorem-proving procedures. In: Proceedings of the
third annual ACM symposium on Theory of computing. pp. 151–158. STOC ’71,
ACM, New York, NY, USA (1971)
8. Cook, S.A., Mitchell, D.G.: Finding hard instances of the satisfiability problem: A
survey. pp. 1–17. American Mathematical Society (1997)
9. Courtois, N., Pieprzyk, J.: Cryptanalysis of block ciphers with overdefined systems
of equations. In: Zheng, Y. (ed.) Advances in Cryptology - ASIACRYPT 2002,
Lecture Notes in Computer Science, vol. 2501, pp. 267–287. Springer Berlin /
Heidelberg (2002)
10. Davis, M., Logemann, G., Loveland, D.: A machine program for theorem-proving.
Communications of the ACM 5(7), 394–397 (1962)
11. Davis, M., Putnam, H.: A computing procedure for quantification theory. Journal
of the ACM 7, 201–215 (1960)
12. Fiorini, C., Martinelli, E., Massacci, F.: How to fake an RSA signature by encoding
modular root finding as a SAT problem. Discrete Applied Mathematics 130, 101–
127 (2003)
13. G. Bertoni et al.: Keccak sponge function family main document, http://keccak.
noekeon.org/Keccak-main-2.1.pdf
14. Gaj, K., Chodowiec, P.: FPGA and ASIC Implementations of AES. In: Koc, C.K.
(ed.) Cryptographic Engineering, chap. 10, pp. 235–294. Springer (2009)
15. J. P. Aumasson et al.: SHA-3 proposal BLAKE, http://www.131002.net/blake/
16. Jovanovic, D., Janicic, P.: Logical analysis of hash functions. In: Gramlich, B. (ed.)
Frontiers of Combining Systems, Lecture Notes in Computer Science, vol. 3717, pp.
200–215. Springer Berlin / Heidelberg (2005)
17. Lala, P.K.: Principles of modern digital design. Wiley-Interscience (2007)
18. Massacci, F.: Using Walk-SAT and Rel-SAT for cryptographic key search. In:
Proceedings of the International Joint Conference on Artificial Intelligence. pp.
290–295 (1999)
19. Mendel, F., Thomsen, S.: An Observation on JH-512. Available online (2008),
http://ehash.iaik.tugraz.at/uploads/d/da/Jh_preimage.pdf
20. Mironov, I., Zhang, L.: Applications of SAT Solvers to Cryptanalysis of Hash
Functions. In: Biere, A., Gomes, C. (eds.) Theory and Applications of Satisfiability
Testing - SAT 2006. LNCS, vol. 4121, pp. 102–115. Springer Berlin / Heidelberg
(2006)
21. P. Gauravaram et al.: Grøstl — a SHA-3 candidate, http://www.groestl.info
22. R. Rivest et al.: The MD6 hash function, http://groups.csail.mit.edu/cis/
md6/
23. Schläffer, M.: Updated Differential Analysis of Groestl. Grøstl website (January
2011), http://groestl.info/groestl-analysis.pdf
24. Soos, M.: CryptoMiniSat 2.5.0. In: SAT Race competitive event booklet (July
2010), http://www.msoos.org/cryptominisat2
25. Soos, M.: Grain of Salt — An Automated Way to Test Stream Ciphers through
SAT Solvers. In: Workshop on Tools for Cryptanalysis (2010)
26. Soos, M., Nohl, K., Castelluccia, C.: Extending SAT Solvers to Cryptographic
Problems. pp. 244–257 (2009)
27. Wang, X., Yin, Y.L., Yu, H.: Finding Collisions in the Full SHA-1. In: Crypto.
LNCS, vol. 3621, pp. 17–36. Springer Berlin / Heidelberg (2005)
28. Wu, H.: Hash Function JH, http://www3.ntu.edu.sg/home/wuhj/research/jh/
Appendix
For the reader’s convenience, we provide an example SystemVerilog code for
SHA-1 used in the experiments with our toolkit. In many cases a code strongly
resembles a pseudocode defining a given cryptographic algorithm. A reader fa-
miliar with C or Java should have no trouble adjusting the code to our toolkit’s
needs.
module sha1(IN, OUT);
  input [511:0] IN;            // input here means 512-bit message block
  output [159:0] OUT;          // output here means 160-bit hash
  reg [159:0] OUT;
  reg [31:0] W_words [95:0];   // registers for W words
  reg [31:0] h0, h1, h2, h3, h4;
  reg [31:0] a, b, c, d, e, f, k, temp, temp2;
  integer i;

  always @ (IN, OUT)
  begin
    h0 = 32'h67452301; h1 = 32'hEFCDAB89;
    h2 = 32'h98BADCFE; h3 = 32'h10325476;
    h4 = 32'hC3D2E1F0;
    a = h0; b = h1; c = h2; d = h3; e = h4;

    W_words[15] = IN[31:0];    W_words[14] = IN[63:32];
    W_words[13] = IN[95:64];   W_words[12] = IN[127:96];
    W_words[11] = IN[159:128]; W_words[10] = IN[191:160];
    W_words[9]  = IN[223:192]; W_words[8]  = IN[255:224];
    W_words[7]  = IN[287:256]; W_words[6]  = IN[319:288];
    W_words[5]  = IN[351:320]; W_words[4]  = IN[383:352];
    W_words[3]  = IN[415:384]; W_words[2]  = IN[447:416];
    W_words[1]  = IN[479:448]; W_words[0]  = IN[511:480];

    for (i=16; i<=79; i=i+1)
    begin
      W_words[i] = W_words[i-3] ^ W_words[i-8] ^ W_words[i-14] ^ W_words[i-16];
      W_words[i] = {W_words[i][30:0], W_words[i][31]}; // leftrotate 1
    end

    for (i=0; i<=79; i=i+1) // main loop
    begin
      if ((i>=0) && (i<=19))
      begin
        f = (b & c) | ((~b) & d); k = 32'h5A827999;
      end
      if ((i>=20) && (i<=39))
      begin
        f = b ^ c ^ d; k = 32'h6ED9EBA1;
      end
      if ((i>=40) && (i<=59))
      begin
        f = (b & c) | (b & d) | (c & d); k = 32'h8F1BBCDC;
      end
      if ((i>=60) && (i<=79))
      begin
        f = b ^ c ^ d; k = 32'hCA62C1D6;
      end
      temp2 = {a[26:0], a[31:27]}; // a leftrotate 5
      temp = temp2 + f + e + k + W_words[i];
      e = d;
      d = c;
      c = {b[1:0], b[31:2]}; // b leftrotate 30
      b = a;
      a = temp;
    end // end of main loop

    h0 = h0 + a; h1 = h1 + b;
    h2 = h2 + c; h3 = h3 + d; h4 = h4 + e;
    OUT = {h0, h1, h2, h3, h4}; // HASH
  end
endmodule
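The module above compresses a single 512-bit block from the standard SHA-1 initial state and applies no padding itself. As a cross-check, the same steps can be mirrored in software and compared with a library implementation. The following Python sketch is only an illustration and is not part of the toolkit: the names `rotl32`, `sha1_block` and `pad_single_block` are hypothetical, and the padding helper assumes a message short enough (at most 55 bytes) to fit in one block.

```python
import hashlib
import struct

MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def sha1_block(block):
    """Compress one 64-byte block from the SHA-1 initial state (no padding)."""
    if len(block) != 64:
        raise ValueError("expected exactly one 512-bit block")
    # W[0] is the most significant word of the block, as in the listing above.
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        w.append(rotl32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))

    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    a, b, c, d, e = h
    for i in range(80):  # main loop
        if i < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif i < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif i < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        temp = (rotl32(a, 5) + f + e + k + w[i]) & MASK32
        a, b, c, d, e = temp, a, rotl32(b, 30), c, d

    out = [(x + y) & MASK32 for x, y in zip(h, (a, b, c, d, e))]
    return struct.pack(">5I", *out)


def pad_single_block(msg):
    """Apply SHA-1 padding to a message of at most 55 bytes (one block)."""
    if len(msg) > 55:
        raise ValueError("message does not fit in a single padded block")
    zeros = b"\x00" * (55 - len(msg))
    return msg + b"\x80" + zeros + struct.pack(">Q", 8 * len(msg))


if __name__ == "__main__":
    digest = sha1_block(pad_single_block(b"abc"))
    assert digest == hashlib.sha1(b"abc").digest()
    print(digest.hex())  # a9993e364706816aba3e25717850c26c9cd0d89d
```

Feeding the same padded block to the `IN` port of the module should give the same 160-bit value on `OUT`, which offers a quick sanity check before the code is handed to a SAT-based flow.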