0:00

In this segment, we're gonna study the security of the ElGamal public key

Â encryption system. So let me remind you that when we first presented the

Â Diffie-Hellman protocol, we said that the security is based on the assumption that

Â says that given G, G to the A, G to the B, it's difficult to compute the

Â Diffie-Hellman secret, G to the AB. This is basically what the attacker has to do.

Â He sees Alice's contribution. He sees Bob's contribution and then his goal is to

Â figure out what the Diffie-Hellman secret is. And we said that this problem is

Â difficult. The statement that the problem is difficult is what's called the

Â computational Diffie-Hellman assumption. So, let's take this assumption more

Â precisely. So, as usual we're going to look at a finite cyclic group of order N,

Â so G is some group, in your head you should be thinking ZP star, but there could

Â be other groups, for example, like an ellipctic curve group. And then we say that

Â the computational Diffie-Hellman assumption. I've often used the shorthand

Â CDH for Computational Diffie Hellman. We'll say this assumption holds in G if

Â the following condition holds, namely for all efficient algorithms. If we choose a

Â random generator, and then we choose random exponents A, B in ZN. Then when

Â we give the algorithm G, G to the A, and G to the B, the probability that the

Â algorithm will produce the Diffie Hellman secret is negligible. If this is true for

Â all efficient algorithms, then we say that the CDH assumption holds for G. CDH, as I

Â said, stands for Computational Diffie Hellman. As it turns out, this assumption

Â is not ideal for analyzing the security of the ElGamal system. And instead I'm gonna

Â go ahead and make a slightly stronger assumption called a hash Diffie-Hellman

Â assumption. Okay. So what is hash Diffie-Hellman assumption? Again, we are

Â focusing on a particular group G and now we're also gonna introduce a hash function

Â H that maps pairs of elements in G into the key space of some symmetric encryption

Â system. And then we say that the hash Diffie-Hellman a ssumption, or HDH for

Â short, holds for this pair, G comma H for the group in the hash function if the

Â following is true. Basically, if I choose a random generator and then I choose

Â random exponents A and B in ZN and then I also choose a random R and K, then the

Â following distributions are computationally indistinguishable. That

Â is, if I give you G, G to the A, G to the B, and then this hash value, this should

Â look familiar to you. This is the hash value that's used in the ElGamal system to

Â derive the symmetric encryption key. What we're saying is that this distribution is

Â indistinguishable from a distribution where you're also given. G, G to the A, G

Â to the B. But now, instead of giving the hash, you're given just really random key.

Â So what this assumption says is that the symmetric key that was derived during

Â encryption in the ElGamal system, essentially is indistinguishable from a

Â truly random key that is derived independently from the rest of the

Â parameters of the system. It's a truly independent random key, looks basically

Â the same as the key that we used, to derive from the G to the A and the G to

Â the B. Now I'd like to point out that the Hash Diffie-Hellman assumption is actually

Â a stronger assumption than the Computational Diffie-Hellman assumption.

Â The way to see that is using the contra positive, that is suppose the

Â Computational Diffie-Hellman assumption happens to be easy in the group G. Then I

Â claim that in fact the Hash Diffie-Hellman assumption is also easy in the group G. In

Â fact, I'll say for G, H and this is true in fact for all hash functions where the

Â image of the hash function. It contains at least two elements. In other words all I

Â want to say is that the Hash Diffie-Hellman assumption and it's easy for all

Â hash functions to go match everything to a single point. Which is true for almost all

Â hash functions of interest to us. So actually, this is a really simple

Â statement to prove. Basically, if the Computational Diffie-Hellman assumption is easy, what that

Â says is that, given G to the A and G to the B, I can actually calculate G to the AB

Â myself. Cuz I know the hash function H, I can calculate this complete value here. So

Â if you give me a tuple that's sampled from the left or sampled from the

Â right. I can very easily tell where it's from. If it's sampled from the left, then

Â once I've calculated G to the AB myself, I can just test that the last element in the

Â tuple is, in fact, the hash of G to the B and G to the AB. If it is, I know

Â the sample is from the left. If it isn't, I know the sample is from the right. Okay,

Â so this gives me pretty high advantage, in distinguishing these two distributions. So

Â CDH is easy, it's very easy to see that these distributions are distinguishable.

Â And this basically says that if the Hash Diffie-Hellman is in fact hard, then CDH

Â must also be hard. Which then we say, that therefore the Hash Diffie-Hellman is a

Â stronger assumption. There might be group where CDH is hard, but HDH is not hard.

Â Okay. So we say HDH is a stronger assumption. If you found this

Â discussion of weak assumption versus strong assumption confusing, it's not

Â really that important, it's fine to ignore it. The important thing that I want to

Â show you is in fact that the Hash Diffie-Hellman assumption is sufficient to

Â prove the semantic security of the ElGamal system. Before we do that let me quickly

Â ask you the following puzzle just to make sure the Hash Diffie-Hellman assumption is

Â clear. So imagine our key space is {0, 1} to the 128. So 128 bit strings and our

Â hash function will map pairs of group elements into this 128 byte strings.

Â Suppose it so happens that we chose a hash function Such that it always outputs

Â strings that begin with zero. In other words, of the 128 bit strings the most

Â significant bit of the output is always zero. If we chose such a hash function,

Â does the Hash Diffie-Hellman assumption hold for this pair, G comma H. So the

Â answer is no it doesn't hold. And the reason is because it now very easy to

Â distinguish the two distributions that we have here. The distribution on the left

Â an the distribution on the right. And the way you would distinguish the two is

Â basically if I choose a truly random element in K, in {0, 1} to the 128,

Â then mostly it can very well be zero with probability one half. However, the tuple

Â that's given to me is chosen from the left distribution, then the most significant

Â bit of the hash will always be zero and therefore if I see zero, I'm gonna say the

Â distribution is a distribution on the left. If I see one, I'm gonna say the

Â distribution is a distribution on the right. That will give me advantage one

Â half in distinguishing these two distributions. And so as a result, the

Â Hash Diffie-Hillman assumption is false in this case. So clearly you could see that,

Â even though CDH might be hard in a certain group G, if you choose a bad hash

Â function, then HDH will not hold for the pair G comma H. Okay. But suppose it so

Â happens that we choose a group G and a hash function H for which the Hash Diffie

Â Hellman assumption holds. And in fact, if you choose the hash function to be just

Â SHA-256, for all we know, the Hash Diffie Hellman assumption holds in the

Â group ZP star, and holds in elliptic curve groups. My goal then is to show you that

Â ElGamal is semantically secure under the Hash Diffie-Hellman assumption. So let me

Â quickly remind you how theElGamal system works. While we're gonna choose a

Â random generator G, we're gonna choose a random 'a' in ZN, the public key is

Â gonna be G, and G to the A, the secret key is simply gonna be A. And now here I wrote

Â shorthand for the ElGamal encryption. Basically, what to encrypt, what we do is

Â we choose a random B. We hash G as being H to the B. Remember this H to the B is the

Â value G to the AB. This is the Diffie Hellman secret. The hash function gave us

Â a symmetric encryption key K. We encrypt the message with K, and we output G to the

Â B and the symmetric cipher text. To decrypt we have to value U, and the Diffie

Â Hellman secret, G to the A. To derive a symmetric key, we decrypt the cipher text.

Â And then we output the plaintext message m. So let's see how to argue that,

Â in fact, ElGamal is semantically secure under this Hash Diffie-Hillman assumption is

Â fairly simple. So well we have to argue, remember we had, in semantic security, we

Â have these two experiments. One experiment, the attacker is given the

Â encryption of m0. In the other experiment, the attacker is given the encryption of m1.

Â So I wrote the two experiments here. Here you notice that the attacker starts by

Â sending off the public key to the adversary. The adversary then chooses two

Â equal length messages m0 and m1. And then if he gets the ElGamal encryption of M0,

Â and a kind of rough shorthand for what ElGamal ciphertext is, okay? Similarly, in

Â encryption one, he simply gets the encryption of M1 instead of M0, and

Â everything else is the same about these two experiments. Now, because of the Hash

Â Diffie-Hellman assumption, we know that even though the attacker got to see G, G

Â to the a and G to the b, we know that the output of the hash of G to the b, G to the

Â ab is indistinguishable from random. Therefore, if we replace the actual hash

Â function by a truly generated random key K that's independent of everything else, by

Â the Hash Diffie-Hellman assumption, the attacker can't distinguish these two

Â games. And again, it's a simple exercise for you to show that if he could

Â distinguish these two games, then he would break the Hash Diffie-Hellman assumption.

Â But then the same is true for experiment one. The attacker also could not

Â distinguish the output of the hash function from a truly random key, that was

Â used the generate the challenge cipher text. Okay. So now basically we look at

Â these two experiments and we realize that, wait a minute, what's going on in these

Â two experiments? Basically everything is the same except in one experiment, the

Â attacker gets the encryption under a symmetric encryption system of m0. In the

Â other one, he gets the encryption under a symmetric encryption system of m1 and the

Â key is chosen at random independently over everything else. So by the fact that the

Â symmetric encryption system is semantically secure, these two games are

Â indistinguishable. If they were distinguishable, then the attacker could

Â break the semantic security of this symmetric encryption system. So now we

Â kinda prove this, you know, this chain of equivalences. And that proves that the

Â original games, in fact, are indistinguishable, computationally

Â indistinguishable. And therefore, the ElGamal system is semantically secure.

Â Okay so it's like I said a very simple proof by pictures and you can make this

Â into a rigorous proof without too much work. But semantic security isn't enough,

Â what we really want is chosen ciphertext security, and the question is ElGamal chosen ciphertext

Â secure? So it turns out to prove the chosen ciphertext security of ElGamal we

Â actually need a stronger assumption, Hash Diffie-Hellman or Computational Diffie-Hellman

Â are actually not enough to prove chosen ciphertext security of the system as far

Â as we know. So to prove chosen-ciphertext security, I'm going to introduce a new

Â assumption called Interactive Diffie Hellman assumption. And actually,

Â technically we really don't like this assumption. And we're going to try to get

Â rid of this, later on. But for now, we just want to analyze the security of the

Â basic ElGamal system against chosen ciphertext attack. So to argue that it is

Â chosen ciphertext secure, here is what the assumption says. Basically the challenger

Â starts, you know, plays a game with the adversary, he generates a random G,

Â generates two exponents, and then he says to the adversary as usual, G, G to the a

Â and G to the b. Now the adversary's goal is basically to figure out the

Â Diffie-Helmman's secrets, mainly g to the ab. He outputs the value of V and he wins

Â the game if V happens to be equal to G to the AB. So, so far this looks identical to

Â the Computational Diffie-Hellman assumption, there's no difference - this is

Â the Computational Diffie-Hellman assumption except in Interactive Diffie-Hellman would

Â give the attacker a little bit more power. So because the attacker has more power,

Â it's harder to satisfy the assumption, which is why we say that this assumption

Â is stronger than Computational Diffie-Hellman. Now what is this power to be

Â given? We are given the ability to make queries. In particular, he gets to submit

Â two elements from the group G, so U1, V1 disappear from the group G. And then he's

Â told whether U1 to the A is equal to V1, so he gets one. If you wanted the A equals

Â to V1 get zero, otherwise it's kind of a bizarre type of query. What, how does it

Â be possibly help the attacker? But it turns out we need to be able to answer

Â these types of queries to the adversary in order to be able to prove chosen

Â ciphertext security. And in fact, he can do these type of queries again and again

Â and again. So he can issue as many queries like these as he wants and then the

Â assumption says that despite all these queries he still can't figure out the

Â Diffie-Hellman secret, namely despite making all these queries, the probability

Â that he outputs the Diffie-Hellman secret is negligible. Okay.

Â So clearly if this assumption holds, that the Computational Diffie-Hellman assumption

Â holds, because here, the adversary has more power, and as a result we say that this assumption

Â is stronger that Computational Diffie-Hellman, the thing, we really don't like about this

Â assumption, is the fact, that, it's interactive, and that the adversary is allowed to

Â make queries to the challenger, and as I said, we're gonna trying to get rid

Â of this interaction later on. But for now I'll tell you that under this Interactive

Â Diffie-Hellman assumption and under the assumption that the symmetric encryption

Â system provides authenticated encryption, and under the assumption that the hash

Â function is kind of an ideal hash function, what we call the random oracle,

Â then in fact the ElGamal system is chosen ciphertext secure and again this

Â RO here denotes the fact that it's in the random oracle model. Which is not

Â important, so much for our purposes. The main thing to remember that under

Â kind of this weird, interactive assumption we can actually prove that

Â ElGamal is a chosen ciphertext secure. And as far as we know this assumption

Â actually holds for the group ZP star. So what we're gonna do next is basically

Â answer this question of whether we can get rid of the interactive assumption. Can we

Â prove security strictly based on CDH? And similarly can we proof security without

Â relying on random oracles? Just you know without having to assume, that the hash

Â function is ideal. Just you know, can we prove security using a concrete hash

Â function. And we'll do that very briefly in the next segment.

Â