Minds aren't magic

Paul Crowley

Month: January, 2020

More thoughts on TCRs

A few more notes regarding target collision resistant functions, following up from my $1000 competition announcement.

Second preimage resistance

There is a simple way to construct a secure TCR compression function given a second-preimage-resistant compression function—just generate a key which is the length of the input, and XOR the key with the input. So if we can build a fast second-preimage-resistant function, we can build a fast secure TCR.

The history of hash functions shows that we have been much more successful at achieving second-preimage resistance than collision resistance. From the excellent Lessons From The History Of Attacks On Secure Hash Functions:

The main result is that there is a big gap between the history of collision attacks and pre-image attacks. Almost all older secure hash functions have fallen to collision attacks. Almost none have ever fallen to pre-image attacks.

Secondarily, no new secure hash functions (designed after approximately the year 2000) have so far succumbed to collision attacks, either.

Tweakable target collision resistance

In the definition of target collision resistance, the attacker supplies a single message, but in practice, we usually want to hash many messages with the same key, eg when constructing a variable-length TCR from a compression function that takes a fixed-length message. This is OK because there’s a straightforward security reduction which shows that if an attacker can find a collision for a single message with probability ε, then they can forge a collision for any of n messages with probability at most .

However, when the messages to be signed are large as they are in Android, this linear falloff is kind of a shame. One possible advantage of TCRs is that it can be secure to use much shorter hash outputs, say 128 bits, which will make Merkle trees much smaller, saving disk space and improving performance. But if the hash function consumes, say, 128 bytes at a time (like BLAKE2) and the system partition is 1GB, it will be broken up into 223 messages, leaving us with only 105-bit security at best. I’d like to do a little better than that, and so I’d like to build multiple-message security into the security definition. I propose a new kind of primitive, a tweakable TCR, which takes a tweak as well as a key and a message. The attacker faces the following challenge:

  • Attacker chooses n messages m1mn and n distinct tweaks t1tn
  • Attacker learns random key K
  • Attacker chooses i, m
  • Attacker wins if  m’ ≠ mi but H(K, ti, m’) = H(K, ti, mi)

If each of the 223 messages gets a distinct tweak, we can preserve 128-bit security across large partitions. I therefore encourage people to design not just TCRs, but TTCRs!

$1000 TCR hashing competition

In my day job, I do cryptography for Android. I have a problem where I need to make some cryptography faster, and I’m setting up a $1000 competition funded from my own pocket for work towards the solution.

On Android devices, key operating system components are stored in read-only partitions such as the /system partition. To prevent an attacker tampering with these partitions, we hash them using a Merkle tree, and sign the root. We don’t check all the hashes when the device boots; that would take too long. Instead, at boot time we check only the root of the tree, and then we check other sectors against their hashes as we load them using a Linux kernel module called dm-verity.

This likely works pretty well on phones sold in the US, which will have the ARM CE instructions that accelerate SHA2-256. But a lot of devices sold in poorer countries don’t have these instructions, and SHA2-256 can be pretty slow, and hurt overall system performance. For example, on the 900MHz Cortex-A7 Broadcom BCM2836, hashing takes 28.86 cpb [eBACS 2019-03-31], limiting reading speed to 31.2 MB/s. One partial fix is to switch to a hash function that is faster on such processors. BLAKE2b is nearly twice as fast on that processor, at 15.32 cpb. However, this is still a lot slower than I’m happy with.

Where sender and receiver have a shared secret, authentication can be very fast; a universal function like NH can run at around 1.5 cpb on such a processor. But this isn’t an option for verified boot, because it’s hard to keep the key out of the attacker’s hands, and given the key it’s trivial to forge messages.

Inbetween these two notions of security is the idea of a “target collision resistant” function, once known as a “universal one-way hash function”. With a TCR, hashing is randomized with a key chosen at signing time once the message to be signed is known. This makes the attacker’s job much harder since they cannot simply search for a pair of colliding messages. Instead, they must choose the first message to be hashed, and only then do they learn the key that will be used at hashing time, after which they must generate a second message that hashes with the first using this key; this problem is more akin to second preimage finding than collision finding. While collision attacks against hash functions are plentiful, second preimage attacks are far rarer.

Collision resistance Target collision resistance Universal function
K \xleftarrow{\$} \mathcal{K}
A \leftarrow K \\ A \rightarrow m_1 \\ A \rightarrow m_2 A \rightarrow m_1 \\ A \leftarrow K \\ A \rightarrow m_2 A \rightarrow m_1 \\ A \rightarrow m_2 \\ A \leftarrow K
Attacker succeeds if m_1 \neq m_2 and H(K, m_1) = H(K, m_2)

In principle, the vastly harder job facing the attacker of a TCR should mean that secure TCRs much faster than hash functions are possible. However, the main impetus to research on TCRs was a desire to bolster existing hash functions, when attacks on MD5 and SHA-1 were new and we didn’t know which if any of our existing hash functions would be left standing. As a result several ways to construct a TCR from a hash function with good provable properties were proposed, but none of these could be faster than their underlying hash functions. As far as I know, no-one has ever proposed a TCR as a primitive, designed to be faster than existing hash functions, and that’s what I need. I’m probably not the only one who’d find a primitive for much faster broadcast authentication useful, either!

To me this looks like an interesting, overlooked problem in symmetric cryptology, and I’d really like it to get some attention. So I’m offering a $1000 prize from my own pocket, to be awarded at Real World Crypto 2021, for the work that in my arbitrary opinion does the most to move the state of the art forwards or is just the most interesting.

I offered a similar prize at the rump session at FSE 2019, promising to award it at the end of the year, but I neglected to really tell anyone and didn’t get any entrants. Hopefully this will be a more successful launch for the prize, and see some of you at Real World Crypto 2020!

I’ve moved some technical notes into a subsequent blog post.