0:00

So our goal for this segment is to build secure compression functions. So

Â compression functions are collision resistant. So just to remind you where we

Â are we looked at this Merkle Damguard construction which takes a little

Â compression function and builds from it a hash function for much, much larger

Â inputs. We proved this cute theorem that in fact if a little compression function h

Â is collision resistant then plugging in into the Merkle Damguard construction

Â gives us a collision resistant hash function for larger inputs. So now in this

Â segment our goal is basically to build a compression function that is collision

Â resistant. So we're going to see a couple of constructions. And so the first

Â question that comes to mind is well, can we build compression functions from

Â primitives that we already have? In particular, we spent a lot of work to

Â build block cyphers and the question is can we build compression functions from

Â block cyphers. And the answer is yes. And let me show you how to do it. So assume we

Â have a certain block cypher here that operates on N bits blocks, so the input is

Â an N bits, output is N bits. And then there's this classic construction called a

Â Davies-Meyer construction which I wrote down in symbols here. Basically says that

Â what you would do is, given the message block and the chaining variable, all we do

Â is we encrypt the chaining variable using the message block as the key. And then we

Â kind of do one more xor on the output. So this might seem a little bizarre, because

Â remember the message block is something that's completely under the control of the

Â adversary. He's trying to find the collision so he can choose the message

Â blocks however he wants. And yet we're using this message block as a key into a

Â block cypher. But nevertheless, we can argue that this construction, at least

Â when E is what's called an ideal cypher, we can argue that this construction is in

Â fact as collision resistant as possible. So let me state the theorem. The theorem

Â states that, as I said, if E is an ideal block cypher, meaning that it's a random

Â collection of K random permutations from 01 to the N to 01 to the N. Then under

Â that assumption that E's an ideal block cypher, in fact, finding the collision for

Â this compression function H takes time, two to the N over two. In particular, we

Â can show that anyone who is finding collisions has to evaluate the encryption

Â and decryption functions at least 2 to the N over 2 times. And if you think

Â about what that means, since the output of this compression function is only N bytes

Â long, we know that there's always a generic birthday attack that finds

Â collisions in time 2 to the N over 2. So basically this theorem says that this

Â collision resistant function is as collision resistant as possible. We can

Â find the collision in time 2 to the N over 2 using the birthday attack but

Â there is no algorithm that will do better than 2 to the n over 2. So this

Â is actually a very common compression function used in collision resistance

Â hashing in fact of a SHA functions all used Davies-Mayer. It turns

Â out that there is something special about the Davies-Mayer construction that

Â makes collision resistant. If you just try to guess the construction very likely you

Â will end up with something that is not collision resistant. And so let me ask you

Â the following puzzle. Suppose we actually define the compression function as

Â follows, namely all we do is we encrypt the chaining variable H using the current

Â message block as the key. So the difference is that we dropped this 'xor' H

Â in Davies-Mayer, we simply ignored it. So it's not there. And the puzzle for you

Â is show me that this compression function then is actually not collision resistant.

Â So, let's see, so we're trying to build a collision, namely a distinct pair of HM

Â and H' M' that happen to collide under this later function H. And my

Â question to you is how would you do it? So I'm already going to tell you that you're

Â going to choose H, M, and M' arbitrarily. The only requirement is that

Â M and M' are distinct. And then my question is, how would you construct an H'

Â that would cause a collision to pop out? So the answer is the first choice and

Â an easy way to see it is if we apply the encryption function using M' to both

Â sides. Then we get that this is basically E of M' applied to H', right.

Â this is what we get by applying encryption using M' to the left hand side. And

Â if we imply encryption using M' to the right hand side, the decryption

Â operator cancels out and we simply are left with the encryption of M, H, which is

Â exactly the collision that we wanted to find. So there. You can see that it's

Â basically by using the decryption function D, it's very easy to build collisions for

Â this compression function. Now I should tell you that in fact Davies-Meyer is not

Â unique. There are other ways to build collision resistant compression functions

Â from block ciphers. For example, here's a method called Miyaguchi Preneel. Miyaguchi

Â Preneel actually is used in WHIRLPOOL hash function that we saw earlier. Here is

Â another method that happens to result in a collision resistant compression function.

Â And there are twelve variants like this playing with XORs and placing the

Â variables in different slots that will actually give a collision resistant

Â mechanism. But there are also many, many variants of this like we saw in the

Â previous puzzle that are not collision resistant. So here's. Another example,

Â that's not collision resistant. And I'm gonna leave it as a homework problem to

Â figure out a collision for this construction. So now, basically, we have

Â all the ingredients to describe the [inaudible] 256 hash function. I'll tell

Â you that it's a Merkel-Damgard construction, exactly as the one that we

Â saw before. It uses a Davies-Mayer compression function. And so the only

Â question is, what's the underlying block cipher for Davies-Mayer? And that block

Â cipher is called SHACAL-2. And I'll just tell you it's parameters. It uses a

Â 512 bit key. And remember the key is taken from the message block. So, this is really

Â what the message block is. And it so happens to be 512 bits. Which means the

Â SHA-256 will process its input message 512 bits at a time. And in the

Â block size, for this block cipher is 256 bits. And these are the chaining variable.

Â So this would be H i-1. And this would be successive chaining variable.

Â So now you have a complete understanding of how SHA-256 works.

Â Module of this cipher SHACAL-2 I'm not going to describe here.

Â So as I said, one class of compression functions is built from block cyphers. It turns out there's another class of

Â compression functions that's built using hard problems from number theory, and I

Â want to very briefly show you one example. And we call these compression functions

Â provable because if you can find the collision on this compression function

Â then you're going to be able to solve a very hard number theoretic problem which

Â is believed to be intractable. And as a result, if the number theory problem is

Â intractable, the resulting compression function is provably a collision

Â resistant. So here's how this compression function works. Basically we're going to

Â choose a large prime piece, so this is a gigantic prime, something like 700 digits,

Â 2,000 bits. And then we're going to choose two random values, U and V, between

Â one and P. And now let's define the compression function as follows. It takes

Â two numbers between 0, and p-1, and it's gonna output one number between

Â 0, and p-1. So it's compression ratio is 2 to 1. And takes two

Â numbers. And outputs one number. In the range 0 to p-1.

Â And it does it basically by computing double exponentiation. It computes u to the H times v to the n.

Â And the theorem, which, right now, I'm just gonna state as a fact. This fact actually

Â turns out to be fairly straightforward to prove, and we're gonna do it later on when

Â we get to the number theoretic part of the course. The fact is, that if you can find

Â a collision for this compression function, then you can solve a standard heart

Â problem in number theory called a discreet log problem. Everyone believes discrete

Â log is hard, and as a result, this compression function is provably collision

Â resistant. So you might ask me why do people not use this compression function

Â in practice. Why would we not use this for SHA-256? And the answer is that this

Â gives very slow performance in comparison to what you get from a block cipher. So

Â this is not really used for any compression functions. However, if for

Â some reason you really only want to, say, MAC or sign. Just one long message, and

Â you have a whole day to do it, then certainly you can use this type of

Â compression function. And even though it's slow, you'll get something that's provably

Â collision resistant. Okay, so that's the end of this segment. And now we're finally

Â ready to talk about HMAC, which we're gonna do in the next segments.

Â