0:00

In the previous lecture, we looked at a public key

Â encryption system that is built from the RSA,

Â or more generally from trapdoors functions.

Â In this lecture, we are going to look at public key encryption

Â schemes that are build from the Diffie-Hellman protocol.

Â So, first recall that a public key encryption system is

Â made up of three algorithms. There is a key

Â generation algorithm that I will denote by Gen,

Â that basically generates a public key and a

Â secret key. And then there are two algorithms: E

Â and D that encrypt and decrypt. And the point is

Â the encryption algorithm encrypts using the

Â public key and the decryption algorithm decrypts

Â using the secret key. The physical world analogy for

Â public key encryption is a locked box, where anyone

Â can put a message inside the box and then lock the

Â box, that corresponds to the public key and then

Â no one can open this box, except the person who has

Â the secret key, that has the key can put it in the

Â lock, unlock the box and the recover the message in

Â the box. Now, in the previous lecture, we looked at

Â a number of applications for public key encryption.

Â In particular, we looked at the key exchange

Â application, in fact, this is how public key encryption

Â is used in SSL, where the server sends its public

Â key to the browser, the browser chooses a secret

Â and then encrypts the secret using the server's

Â public key, sends it back to the server, the

Â server decrypts and now both the browser and the

Â server have a common secret that they can then

Â use to encrypt data, going back and forth, between

Â them. So, in the interactive settings, such as in a

Â networking protocol, public key encryption would

Â primarily be used for setting up shared symmetric key

Â which the two parties can then use to exchange

Â information. However, there are many settings

Â where interaction is simply not possible and then

Â public key encryption is directly used to encrypt

Â messages. One example of this is secure email.

Â The email system in some sense is designed to be non-

Â interactive, in the sense that the sender sends an

Â email, the email travels from relay to relay, to

Â relay, until it finally arrives at the destination

Â and the destination should be able to decrypt,

Â without interacting with the sender at that point.

Â That can be done basically, by the sender

Â encrypting the message using the recipient's public

Â key. The ciphertext would travel along the SMTP chain

Â until it reaches the recipient. The recipient would

Â use a secret key and recover the original sent

Â message. However, there are many other cases

Â where interaction is not possible and I want to show

Â you two such cases. The first example is

Â file systems. And in fact, public key encryption is a

Â good way to manage file sharing, in an encrypted file

Â system. So, let me show you what I mean

Â by that. So, imagine we have our friend Bob

Â here, who wants to store an encrypted file on some

Â storage server. So, he will go ahead and write the

Â encrypted file to the storage server. What he

Â actually writes in the server is basically the

Â following: he will generate a random file encryption key, we

Â will call it 'K sub F' and then he will use the

Â symmetric encryption system to encrypt the file

Â using the key 'K sub F'. Then, he will encrypt the

Â key 'K sub F', using his own public key. So public key

Â of Bob. This will give Bob access to the file at a later

Â time, right. Using his secret key, Bob can decrypt

Â this header, recover 'K sub F'. Then he will

Â decrypt the actual encrypted file and recover the

Â plaintext file. So, decryption will work in two steps.

Â However, Bob now wants also to give access to

Â Alice, to this file. What he will do is, he will go

Â ahead and in addition he will also include in the

Â file header, an encyption of 'K sub F' under Alice's

Â public key. OK. So, notice that there was no

Â interaction here. All that Bob knows is Alice's public

Â key and yet he was able to the make the file

Â accesible to Alice at a later time.

Â So, now Bob disappears and then Alice

Â comes and accesses the disk at a later time.

Â She will read the ciphertext, decrypt her part of

Â the header, recover Kf and then she can decrypt

Â the symmetrically encrypted file and recover the

Â actual file contents. Ok, so you can see that without

Â any interaction, Bob was able to write to the file

Â system and enable Alice to access the file, as well.

Â Again, at the time that Alice was reading the file, there is no

Â interaction with Bob, because maybe Bob is already

Â inaccesible and yet Alice can still read the file

Â recovered by herself.

Â 4:05

So, another example of a non-interactive

Â application of public key encryption is what's called

Â key escrow. Now, key escrow may actually sound

Â like a bad thing but in fact it is a mandatory

Â feature in corporate environments. So, the idea

Â here is this. So imagine Bob writes data to a

Â disk and then later Bob becomes inaccesible.

Â Maybe Bob is fired. Maybe Bob is out sick.

Â And somehow the company needs to have access to

Â Bob's files. So having Bob be the only one

Â able to decrypt these files is simply unacceptable in

Â corporate settings. The corporation might need

Â access to those files. So, the question is what to do.

Â And the answer is to introduce this entity called

Â a key escrow service. The way the system then

Â would work is as follows:

Â Basically, when Bob writes his file to disk, his

Â system, as it's writing the file to this shared

Â storage medium, what it would do of course, as before,

Â it would encrypt the file using the file encryption key Kf.

Â It would encrypt Kf using Bob's public key, but it would

Â also encrypt Kf using an escrow service. So here,

Â the escrow service is completely offline. We never

Â talk to it unless we actually need its services.

Â As Bob is writing the file, all he does is he simply writes

Â the encryption of Kf under the escrow authority's

Â public key into the file header. Now, later Bob

Â disappears and now the company needs to recover

Â Bob's file. What does it do? Well, at this point it would

Â contact the escrow service. The escrow service

Â would read this part of the header, use its secret

Â key to decrypt the header and recover Kf and then use Kf

Â to decrypt the actual file.

Â Ok. So, in this way again you notice that the escrow service was

Â completely offline. There was no interaction with the

Â escrow service until the point at which the escrow

Â services were actually needed. And again, you can

Â see that this is a very clean and elegant application

Â of public key encryption. So, as I said in the

Â previous lecture, we saw constructions for public

Â key encryption based on trapdoor functions.

Â In particular, we look at RSA. We looked at the

Â generic construction we called ISO standard.

Â We looked at constructions like OAEP+ and so on and so forth.

Â In this lecture, we are going to look at public key

Â constructions from the Diffie-Hellman protocol.

Â This is another family of public key systems and

Â I am going to show you how they work. These public

Â key systems are generally called ElGamal public key

Â encryption schemes. Taher ElGamal was actually Marty

Â Hellman's student. He came up with this ElGamal

Â encryption system as part of his PhD thesis.

Â And, in fact, ElGamal encryption, for historical

Â reasons, is used in an email encryption system called GPG,

Â the GNU Privacy Guard. As usual, when we

Â construct public key encryption systems, our goal is

Â to build systems that have chosen ciphertext security,

Â so that they are secure both against eavesdropping

Â and tampering attacks. So, before I show you the ElGamal

Â system let's do a very brief review of the Diffie-Hellman

Â protocol. So, in my description here, I am going

Â to abstract a little bit from the version that we saw last

Â week. In fact, I just going to use the concept

Â of a finite cyclic group. In fact, we have an arbitrary finite

Â cyclic group, for example, it could be the group (Zp) star,

Â but it could also be the points of an elliptic curve.

Â And as I mentioned, there are some benefits to doing

Â Diffie-Hellman over an elliptic curve. But, for simplicity,

Â I am just going to refer to G as an abstract finite

Â cyclic group, but in your heads you should be

Â thinking G is the group (Zp) and let's suppose

Â that the group has order n for some integer n.

Â Now, we are going to fix a generator g of this group

Â and all this means, is basically, if you look at the

Â successive powers of g, that basically you get

Â all the elements in the group G. You notice,

Â because the group has order n, we know that

Â g to the power of n is equal to 1. And, therefore

Â there is no reason to go beyond the n-1st power

Â of g. g to the n is equal to 1 so that we just wrap around.

Â 8:12

So, Bob sends over to Alice g to the b and of

Â course I should remind you that both g to the a and

Â g to the b are just elements in this group G. Now,

Â they can derive the shared secret, If you remember,

Â the shared secret is g to the ab, and these equalities

Â here shows that both sides can actually compute

Â the shared secret given the values at their disposal.

Â So, Alice for example, has B and she has a, and

Â so raising B to the power of a, gives her the shared

Â secret. The attacker, of course, the poor attacker

Â he gets to see A and B and his goal is now,

Â of course, he also gets to see the generated g, and

Â his goal now is to compute g to the ab. But, we said that

Â this is believed to be a hard problem. Given g, g to the a,

Â and g to the b, in a group like Zp<i>, computing g to </i>

Â the ab is difficult. So, now let's see how to convert to

Â the Diffie-Hellman protocol into an actual public key

Â system. As I said, this was a brilliant idea due to

Â Taher ElGamal. So, as before, we are going to fix our

Â cyclic group G and a generator g inside of G.

Â Now, here I wrote the Diffie-Hellman protocol again,

Â except now we are going to assume that these guys

Â are separated in time. These two steps do not have

Â to occur simultaneously; they could actually take place

Â at quite different times. The first step of the Diffie-Hellman

Â protocol is what we are going to view as key

Â generation. That is the public key is going to be

Â this capital A and the secret key is simply going to

Â be the little a. So, you notice of course that

Â extracting the secret key from the public key,

Â namely extracting the little a from capital A is a

Â discrete log problem. So, that recovering a secret

Â key is actually difficult. Ok, now this gives us our public key.

Â So, now at a later time Bob wants to encrypt a

Â message to Alice, encrypted using her public key.

Â So how does Bob encrypt? Well, what he is going to do is

Â he's going to compute his contribution to the Diffie-Hellman

Â protocol. Namely, he is going to send over g to the

Â little b. Of course, he is going to choose this little b

Â at random and now he is going to compute by

Â himself the shared secret. So, he is going to

Â compute by himself g to the ab. From this g to the ab

Â he is going to derive a symmetric key for a

Â symmetric encryption system and then he is going to

Â encrypt the message m using this symmetric key

Â that he just derived. And that is the pair he is going to

Â send. So, he is going to send over his contribution

Â to the Diffie-Hellman protocol plus the symmetric

Â encryption of the message m that he wants to send

Â over to Alice. Ok, so you can see basically, we are

Â doing the exact same thing as we would do in the Diffie-Hellman

Â protocol except now we, Bob directly immediately is using

Â his Diffie-Hellman secret to encrypt the message that he

Â wants to send over to Alice.

Â Now, what does Alice do to decrypt?

Â Basically, she is going also to compute the

Â Diffie-Hellman secret. Remember, that now she just

Â received Bob's contribution to the Diffie-Hellman

Â protocol and she has her secret key a. So, she can

Â compute also the Diffie-Hellman secret, namely

Â g to the ab, from which she is going to derive the

Â symmetric encryption key k. And then, she is going

Â to decrypt the message to recover the actual

Â plaintext. Ok, so that is the intuition for how we

Â convert the Diffie-Hellman protocol into a public key

Â system. By the way, this was kind of an interesting

Â development at the time that it came out, partially

Â because, you notice this is a randomized encryption scheme.

Â So, every time Bob encrypts a message, it is

Â required that he choose a new random b and

Â encrypt the message using this new random b.

Â So, let's see the ElGamal system, actually in more

Â detail. So, now actually let's view it as an actual

Â public key encryption system. Namely, algorithm Gen,

Â algorithm E and algorithm D.

Â So, as usual, we are going to fix our finite cyclic

Â group of order n. Another ingredient that we are going

Â to need is a symmetric encryption system. So, I am

Â going to refer to it as E sub s, and D sub s. These are the

Â encryption and decryption algorithms of a symmetric

Â encryption system that happens to provide

Â authenticated encryption. And, the key space for

Â this system is capital K. And, then we are also going

Â to need a hash function that maps pairs of elements

Â in the group, namely elements in G squared into

Â the key space. Now, here is how the public key

Â encryption system works. So, I have to describe

Â three algorithms. Algorithm that generates the

Â public key and the secret key, and then the encryption

Â and decryption algorithms. So, the key generation

Â algorithm works as follows: All we do is basically,

Â build up Alice's contribution to the Diffie-Hellman

Â protocol. What we are going to do is going to choose a

Â random generator g in G and then we are going to

Â choose a random exponent a. The secret key is

Â going to be a and then the public key is going to be

Â the generator g and Alice's contribution to the

Â Diffie-Hellman protocol. The reason, by the way,

Â we don't use a fix generator, is because it

Â allows us to somewhat use a weaker assumption,

Â improving security. And, so it is actually better to

Â choose a random generator every time. It is easy

Â enough to choose a random generator. All we do, is we

Â take the generator that we started with and then we

Â raise it to some power that is relatively prime to n.

Â That will give us another generator. A random

Â generator of the group capital G. Ok. So, you can

Â see here that again the public key is simply Alice's

Â contribution to the Diffie-Hellman protocol. And, the

Â secret key is the random a that she chose. Now, how

Â do we encrypt and decrypt? Well, when Bob wants

Â to encrypt the message, he is going to use the

Â public key. Remember, it consists of g and h.

Â Here, he wants to encrypt a message m.

Â So, here is what he is going to do. So, he is going

Â to choose his contribution to the Diffie-Hellman protocol.

Â So, this is the secret b that he would normally

Â choose in Diffie-Hellman. And, now he is going to

Â compute g to the b, which is actually his message,

Â that gets send to Alice in the Diffie-Hellman protocol.

Â He is going to compute the Diffie-Hellman secret,

Â that's h to the b. If you remember, h was g to the a,

Â therefore, this value here is really g to the ab.

Â That's the Diffie-Hellman secret. That is the one thing that

Â the attacker doesn't actually know. Next, he is going

Â to compute a symmetric key by basically hashing

Â this pair u comma v. So, u, of course, is something

Â that the attacker is going to know because that is

Â going to be sent as part of the ciphertext. But v, the

Â attacker isn't going to know. Again, for the proof of

Â security, actually it helps to hash both u and v. So, we

Â hash both of them together. Although, strictly speaking

Â we just needed to hash v, because v is the only value

Â that the attacker doesn't know. The attacker

Â already knows u, because that's going to be part of the data

Â that's sent on the network. So, anyhow, so Bob derives

Â this symmetric key k, by hashing u and v. Then, he

Â goes ahead and encrypts the message using the

Â symmetric key k and finally he outputs his

Â contribution to the Diffie-Hellman protocol.

Â The value u and then the symmetric ciphertext that

Â directly encrypts the message m.

Â That's it. So, the ciphertext consists of these two

Â parts and that is the thing that gets sent over the

Â network. Let's see, how does Alice decrypt now.

Â So, she is going to use her secret key a to decrypt

Â and she receives here, Bob's contribution to the

Â Diffie-Hellman protocol plus the symmetric

Â encryption of the message that Bob sent.

Â What she will do is she'll compute herself the Diffie-Hellman

Â secret. If you remember, u to the a is simply g to the b

Â to the a. Which is g to the ab. So, here Alice

Â computed the Diffie-Hellman secret. And, now let me

Â ask you, how does she derive the symmetric key k

Â given the Diffie-Hellman secret, g to the ab and the

Â ciphertext that she was given?

Â 15:22

Well, what she will do, is simply again, now she has

Â u from the ciphertext and she has v, because she just

Â computed it herself. So, now she can rederive the

Â symmetric encryption key by hashing u and v

Â together to get the symmetric encryption key and

Â then she just decrypts the ciphertext to get the

Â actual plaintext. OK, so that's it. That is the whole

Â encryption and decryption algorithm. In a picture,

Â the way the ciphertext would look, is also

Â as kind of what we saw in the last lecture. Basically,

Â there would be a short header that contains u.

Â Which as you recall, is g to the b. And, then the rest of the

Â ciphertext would be the encryption of the message

Â that is being sent, under the symmetric key k.

Â And, then to decrypt, Alice would use this header

Â to derive the Diffie-Hellman secret from which she

Â will derive k and then decrypts the body, to get the

Â original plaintext. By the way, I should note that

Â the way I describe this system here, is actually not

Â how ElGamal described it originally, this is

Â in some sense a modern view about the ElGamal

Â encryption, but it is pretty much equivalent to how

Â ElGamal viewed it. So, now let's look at the

Â performance of ElGamal. So, here what I wrote

Â is the, kind of the time intensive steps of ElGamal

Â encryption. Namely, during encryption, there are

Â these two exponentiations in the group G.

Â Exponentiation, remember is a cubic time algorithm

Â using the repeated squaring algorithm. And, as a

Â result, it is fairly time intensive. When I say, time

Â intensive, I mean, that on a modern processor it would

Â take a few milliseconds to compute these

Â exponentiations and during decryption, basically,

Â the decryptor computes one exponentiation,

Â namely, u to the a. This is the bottleneck during

Â decryption. Ok, so you would think that encryption

Â actually is, takes twice as long as decryption, because

Â encryption requires two exponentiations, while

Â decryption requires only one. It turns out, that

Â is not entirely accurate because, you notice that the exponentiation

Â during decryption is done to a variable basis.

Â Namely, u changes every time. Whereas during

Â encryption, the basis is fixed: g and h are derived

Â from the public key and are fixed forever.

Â So, in fact, in turns out that if you want to do

Â exponentiation to a fixed basis, you can do a lot of

Â precomputation. In particular, you can do all the

Â squaring steps in the repeated squaring algorithm

Â offline. So, here what you would do, you would

Â compute all powers of 2 of g. So, you would

Â compute g, g squared, g to the fourth, g to the

Â eighth, go to the sixteenth, g to the thirty two, and so

Â on and so forth. These are all the squaring steps of

Â the repeated squaring algorithm you would do offline.

Â The same thing for h. And, then when it comes time to

Â actually do the real exponentiation all you need to do is just do the

Â multiplications, to accumulate these powers of 2 into the

Â exponent b, that you're trying to compute.

Â So, if you think about it, this can actually speed up

Â exponentiation by a factor of 3. In fact, it would

Â speed up it even more if you allow me to store even

Â larger tables. This is called a windowed exponentiation.

Â But, regardless, if you allow the encryptor to

Â store large tables that are derived from the public

Â key, then in fact encryption is not going to be

Â slower than decryption. In fact, encryption will be

Â faster than decryption. But, again, this requires that

Â the encryptor to precompute these large tables

Â and store them around. So, if all the encryptor is

Â doing, is just constantly encrypting to a single

Â recipient, that can be done actually fairly fast using these

Â precomputed tables. If the encryptor, for every

Â message is encrypting to a different recipient, for

Â example, if every time you send an email, you send

Â an email to a different recipient, then, in fact, the

Â encryption will be twice as slow as decryption.

Â