0:00

In this segment we will look at how to use block ciphers to encrypt multiple messages

Â using the same key. This comes up in practice for example in file systems where

Â the same key's used to encrypt multiple files. It comes up in networking protocols

Â where the same key is used to encrypt multiple packets. So let's see how to do

Â it. The first thing we need to do is to define what is mean for a cipher to be

Â secure when the same key is used to encrypt multiple messages. When we use the

Â key more than once the result of that is that the adversary gets to see many cyber

Â text encrypted using the same key. As a result, when we define security, we're

Â gonna allow the adversary to mount what's called a chosen plain text attack. In

Â other words, the adversary can obtain the encryption of arbitrary messages of his

Â choice. So, for example, if the adversary's interacting with Alice. The

Â adversary can ask Alice to encrypt arbitrary messages of the adversary's

Â choosing. And Alice will go ahead and encrypt those messages and give the

Â adversary the resulting cipher texts. You might wonder why would Alice ever do this.

Â How could this possibly happen in real life? But it turns out this is actually

Â very common in real life. And in fact, this modeling is quite a conservative

Â modeling of real life. For example, the adversary might send Alice an email. When

Â Alice receives the email, the writes it to her encrypted disk, thereby encrypting the

Â adversary's email using her secret key. If later the adversary steals this disc, then

Â he obtains the encryption of an email that he sent Alice under Alice's secret key. So

Â that's an example of a chosen plain text attack, where the adversary provided Alice

Â with a message and she encrypted that message using her own key. And then later

Â the attacker was able to obtain the resulting cipher text. So that's the

Â adversary's power. And then the adversary's goal is basically to break

Â semantic security. So let's define this more precisely. As usual, we're gonna

Â define semantic security under a chosen plain text attack using two experiments,

Â experiment zero and experiment one, that are modeled as a game between a challenger

Â and an adversary. When the game begins, the challenger is gonna choose a random

Â key K. And now the adversary basically gets to query the challenger. So the

Â adversary now begins by submitting a semantic security query, namely, he

Â submits two messages, M zero and M one. I added another index, but let me ignore

Â that extra index for a while. So the adversary submits two messages, M zero and

Â M one, that happen to be of the same length. And then the adversary receives

Â the encryption of one of those messages, either of M zero or of M one. In

Â experiment zero, he receives the encryption of M zero. In experiment one,

Â he receives the encryption of M one. So, so far this would look familiar this looks

Â exactly like a standard semantic security [inaudible]. However, plain text attack

Â the adversary can now repeat this query again. So now you can issue a query with

Â two other plain texts, again of the same length, and again you would receive the

Â encryption of one of them. In experiment zero you would receive the encryption of M

Â zero. In experiment one you would receive the encryption of M one. And the attacker

Â can continue issuing queries like this. In fact we'll say that he can issue up to Q

Â queries of this type. And then, remember, every time he issues a pair of messages.

Â That happen to be of the same length and every time he either gets the encryption

Â of the left side or the right side again in experiment zero he will always get the

Â encryption of the left message in experiment one he will always get the

Â encryption of the left message. And, then adversary's goal is, basically, to figure

Â out whether he's in experimental zero or in experiment one. In other words, whether

Â he was constantly receiving the encryption of the left message or the encryption of

Â the right message. So, in some sense, this is a standard semantic security game just

Â iterated over many queries that the attacker can issue to adaptively one after

Â the other. Now the chosen plain text attack is captured by the fact that if the

Â attacker wants the encryption of a particular message m. What he could do is,

Â for example, use query J for sum J, where in this query J he'll set both the zero

Â message and the one message to be the exactly same message M. In other words,

Â both the left message and the right message are the same, and both are set to

Â the message M. In this case, what he will receive, since both messages are the same,

Â he knows that he's gonna receive the encryption of this message M that he was

Â interested in. So this is exactly what we meant by a chosen [inaudible] attack.

Â Where the advisory can submit a message m and receive the encryption of that

Â particular message m of his choice. So some of his queries might be of this chose

Â plain text flavor where the message on the left is equal to the message on the right,

Â but some of the queries might be standard semantic security queries where the two

Â messages are distinct and that actually gives him information on whether he's in

Â experiment zero or in experiment one. Now by now you should be used to this

Â definition where we say that the system is semantically secure under a chosen plain

Â text attack. If, for all efficient adversaries, they cannot distinguish

Â experiment zero from experiment one. In other words, the probability that, at the

Â end, the output, B Prime, which we're gonna denote by the output of experiment

Â B. This output will be the same whether [inaudible] experiment zero or experiment

Â one. So the attacker couldn't distinguish between always receiving encryptions of

Â the left messages, versus always receiving encryptions of the right messages. So in

Â your mind, I'd like you to be thinking of an adversary that is able to mount a

Â chosen plaintext attack, namely, be given the encryption of arbitrary messages of

Â his choice, and his goal is to break semantic security for some other challenge

Â cipher texts. And as I said in this [inaudible] model of the real world the

Â attacker is able to fool Alice into encrypting for him messages of his choice

Â and then the attacker's goal is to somehow break some challenge cypher text. So I

Â claim that all the ciphers that we've seen up until now, namely deterministic counter

Â mode or the one time pad, are insecure under a chosen plain text attack. More

Â generally, suppose we have an encryption scheme that always outputs the same cipher

Â text for a particular message M. In other words, if I ask the encryption scheme to

Â encrypt the message M once. And then I ask the encryption scheme to encrypt the

Â message m again. If in both cases the encryption scheme outputs the same cypher

Â text, then that system cannot possibly be secure under a chosen plain text attack.

Â And both deterministic counter mode and the one time pad were of that flavor. They

Â always output the same cipher text, given the same message. And so let's see why

Â that cannot be chosen plain text secure. And the attack is fairly simple, what the

Â attacker is gonna do, is he's gonna output the same message twice. This just says.

Â That he really wants the encryption of M0. So here the attacker is given C0 which is

Â the encryption of M0. So this was his chosen plain text query where he actually

Â received the encryption of the message M0 of his choice. And now he's going to break

Â semantic security. So what he does is he outputs two messages, M0 and M1 of the

Â same length, and he's going to be given the encryption of MB. But low and behold,

Â we said that the encryption system. Always outputs the same cipher text when its

Â encrypting the message, M0. Therefore, if B is=to zero, we know that C, this

Â challenged cipher text, is simply=to CO, because it's the encryption of M0.

Â However, if B is=to one. Then we know that this challenge cypher text is the

Â encryption of M1 which is something other than C zero so all the attacker does is he

Â just checks his C is = to C0 the output's zero in other words he outputs one. So, in

Â this case, the attacker is able to perfectly guess this bit B, so he knows

Â exactly [inaudible] given the encryption of M0, or the encryption of M1. And as a

Â result, his advantage in winning this game is one. Meaning that the system cannot

Â possibly be CPA secure. One is not a negligible number. So this shows that the

Â deterministic encryption schemes cannot possibly be CPA-secure, but you might

Â wonder well, what does this mean in practice? Well in practice this means

Â again that every message is always encrypted to the same cipher text. What

Â this means is if you're encrypting files on disk, and you happen to be encrypting

Â two files that happen to be the same, they will result in the same cipher text and

Â then the attacker by looking at the encrypted disk, will learn that these two

Â files actually contain the same content. The attacker might not learn what the

Â content is, but he will learn that these two encrypted files are an encryption of

Â the same content and he shouldn't be able to learn that. Similarly, if you send two

Â encrypted packets on the network that happen to be the same, the attacker will

Â not learn the content of those packets, but he will learn that those two packets

Â actually contain the same information. Think for example of an encrypted voice

Â conversation. Every time there's quiet on the line, the system will be sending

Â encryptions of zero. But since encryption of zero are always mapped to the same

Â cypher text. An attacker looking at the network will be able to identify exactly

Â the points in the conversation where there's quiet because he will always see

Â those exact same cypher text every time. So these are examples where deterministic

Â encryption cannot possibly be secure. And as I say formerly we say that the

Â terministic encryption can not be semantically secure under a chosen plain

Â text attack. So what do we do, well the lesson here is if the secret keys gonna be

Â used to encrypt multiple messages, it had better be the case that given the same

Â plain text to encrypt twice. The encryption algorithm must produce

Â different cipher texts. And so there are two ways to do that. The first method is

Â what's called randomized encryption. Here, the encryption algorithm itself is going

Â to choose some random string during the encryption process and it is going to

Â encrypt the message M using that random string. So what this means is that a

Â particular message, M0 for example, isn't just going to be mapped to one cipher text

Â but it's going to be mapped to a whole ball of cipher texts. Whereon every

Â encryption, basically, we output one point in this ball. So every time we encrypt, the

Â encryption algorithm chooses a random string, and that random string leads to

Â one point in this ball. Of course, the decryption algorithm, when it takes any

Â point in this ball, will always map the result to M zero. Similarly cipher text M

Â one will be mapped to a ball, and every time we encrypt M one, we basically output

Â one point in this ball. And these balls have to be disjoint, so that the

Â encryption algorithm, when it obtains a point in the ball corresponding to M one,

Â will always output the message M one. In this way, since the encryption algorithm

Â uses randomness, if we encrypt the same message twice, with high probability we'll

Â get different cipher texts. Unfortunately this means that the cipher text

Â necessarily has to be longer than the plain text because somehow the randomness

Â that was used to generate the cipher text is now encoded somehow in the cipher text.

Â So the cipher text takes more space. And roughly speaking, the cipher text size is

Â going to be larger than the plain text. By basically the number of random bits that

Â were used during encryption. So if the plain texts are very big, if the plain

Â texts are gigabytes long, the number of random bits is going to be on the order of

Â 128. So maybe this extra space doesn't really matter. But if the plain texts are

Â very short, maybe they themselves are 128 bits, then adding an extra 128 bits to

Â every cipher text is going to double the total cipher text size. And that could be

Â quite expensive. So as I say randomized encryption is a fine solution but in some

Â cases it actually introduces quite a bit of costs. So let's look at a simple example.

Â So imagine we have a pseudorandom function that takes inputs in a certain

Â space r which is gonna be called a nonce space. And outputs, outputs in the message

Â space. And, now, let's define the following randomize encryption scheme

Â where we want to encrypt the message m with the encryption of whatever it's gonna

Â do is first it's gonna generate a random r in this nonce space R. And then it's going

Â to open a cypher text that consist of two components, the first component is going

Â to be this value R and the second component is going to be an evaluation of

Â the pseudo-random function at the point R XOR with the message M. And my question to

Â you is, is this encryption system semantically secure under a chosen plain

Â text attack. So the correct answer is yes. But only if the nonce space R is large

Â enough so that little R never repeats with very, very high probability. And let's

Â quickly argue why that's true. So first of all, because F is a secure pseudo-random

Â function, we might as well replace it with a truly random function. In other words,

Â this is indistinguishable from the case where we encrypt the message M, using the

Â truly random function little F, evaluated to point R, and then XOR with M.

Â But since this little r never repeats every cypher text uses a different little r what

Â this means is that the values of F(r) are random uniform independent strings

Â every time. So every time we encrypt a message, we encrypt it essentially using a

Â new uniform random one time pad. And since XORing a uniform string with any string

Â simply generates a new uniform string, the resulting cipher text is distributed as

Â simply two random uniform strings. I'll call them r and r prime. And so both in

Â experiment zero and in experiment one, all the attacker gets to see are truly uniform

Â random strings r, r', and since in both experiments the attacker is seeing the same

Â distribution, he cannot distinguish the two distributions. And so since security

Â holds completely when we're using a truly random function it's also gonna hold when

Â we're using a pseudorandom function. Okay, so this is a nice example of how we use

Â the fact that the pseudo random function behaves like a random function to argue

Â security of this particular encryption scheme. Okay, so now we have a nice

Â example of randomized encryption. The other approach to building chosen plain

Â text secure encryption schemes is what's called a nonce based encryption. Now, in

Â a non-spaced encryption system, the encryption algorithm actually takes three

Â inputs rather than two. As usual it takes the key and the message. But it also takes

Â an additional input called a nonce. And similarly, the decryption algorithm also

Â takes the nonce as input, and then produces the resulting decrypted plain text. And

Â what is this nonce value n. This nonce is a public value. It does not need to be

Â hidden from the adversary but the only requirement is that the pair (k,n)

Â is only used to encrypt a single message. In other words, this pair (k,n)

Â must change from message to message. And there are two ways to change it. One way

Â to change it is by choosing a new random key for every message. And the other way

Â is to keep using the same key all the time but then we must choose a new nonce for

Â every message. And, and as I said, I wanna emphasize again, this nonce need not

Â be secret, and it need not be random. The only requirement is the nonce is unique.

Â And in fact, we're gonna use this term throughout the course. A nonce

Â for us, means a unique value that doesn't repeat. It does not have to be random. So

Â let's look at some examples of choosing an nonce, well the simplest option is

Â simply to make the nonce of the accounter so for example the networking

Â protocol you can imagine the nonce being a packet counter that's incremented

Â every time a packet is sent by a sender or received by the receiver this means that

Â the encrypter has to keep state from message to message mainly that he has to

Â keep this counter around and increment it after every message is transmitted.

Â Interestingly, if the decrypter actually has the same state then there is no need

Â to include the nuance in the cipher text since the nuance is implicit. Let's look

Â at an example. The https protocol is run over a reliable transport mechanism which

Â means that packets sent by the sender are assumed to be received in order at a

Â recipient. So if the sender sends packet #5 and then packet #6, the recipient

Â will receive packet #5 and then packet #6 in that order. This

Â means that if the sender maintains a packet counter, the recipient can also

Â maintain a packet counter and two counters basically increment in sync. In this case

Â there is no reason to include the nonce in the packets because the

Â nonce is implicit between the two sides. However, in other protocols, for

Â example, in IPsec, IPsec has a protocol designed to encrypt the IP layer. The IP

Â layer does not guarantee in order delivery. And so the sender might send

Â packet #5 and then packet #6, but those will be received in reverse order at

Â the recipient. In this case it's still fine to use a packet counter as a nonce

Â but now the nonce has to be included in the packet so that the recipient knows

Â which nonce to use to decrypt the received packet. So as I say, nonce based

Â encryption is a very efficient way to achieve CPA security. In particular if the

Â nonce is implicit, it doesn't even increase the cipher text length. Of course

Â another method to generate a unique nonce is simply to pick the nonce at random

Â assuming the nonce space is sufficiently large so that with high probability the

Â nonce will never repeat for the life of the key. Now in this case, nonce

Â based encryption simply reduces to randomized encryption. However, the

Â benefit here is that the sender does not need to maintain any state from message to

Â message. So this is very useful for example if encryption happens to take

Â place on multiple devices. For example, I might have both a laptop and a smart

Â phone. They might both use the same key. But in this case if I require state full

Â encryption, then my laptop and the smartphone would have to coordinate to

Â make sure that they never reuse the same nonces. Whereas if both of them simply take

Â nonces at random, they don't need to coordinate because it was very high

Â probability they'll simply never choose the same nonce. Again assuming the nonce

Â space is big enough. So there are some cases where stateless encryption is quite

Â important, in particular where the same key is used by multiple machines. So I

Â wanted to find, more precisely, what security means for nonce based

Â encryption. And in particular, I want to emphasize that the system must remain

Â secure when the nonce are chosen by the adversary. The reason it's important

Â to allow the adversary to choose the nonces is because the adversary can

Â choose which cipher text it wants to attack. So imagine the nonce happens to

Â be a counter and it so happens that when the couter hits the value fifteen, maybe

Â at that point it's easy for the adversary to break semantic security. So the

Â adversary will wait until the fifteenth packet is sent and only then he will ask

Â to break semantic security. So when we talk about nonce based encryption, we

Â generally allow the adversary to choose the nonce and the system should remain

Â secure even under those settings. So let's define the CPA game in this case and it's

Â actually very similar to the game before. Basically the attacker gets to submit

Â pairs of messages MI, MI0, and MI1. Obviously they both have to be of the same

Â length. And he gets to supply the nonce. And in response, the adversary is given

Â the encryption of either MI0, or MI1. But using the nonce that the adversary

Â chose. And of course, as usual, the adversary's goal is to tell whether he was

Â given the encryption of the left plain text or the right plain text. And as

Â before the adversary gets to iterate these queries and he can issue as, as many

Â queries as he wants, we usually let q denote the number of queries that the

Â adversary issues. Now the only restriction of course, which is crucial, is that

Â although the adversary gets to choose the nonces, he's restricted to choosing

Â distinct nonces. The reason we force him to choose distinct nonces is because

Â that's the requirement in practice. Even if the adversary fools Alice into

Â encrypting multiple messages for him, Alice will never use the same nonce

Â again. As a result, the adversary will never see messages encrypted using the

Â same nonce and therefore, even in the game, we require that all nonce be

Â distinct. And then as usual we say that the system is a nonce based encryption

Â system that's, semantically secure under a chosen plain text attack if the adversary

Â cannot distinguish experiment zero where he's given encryptions of the left

Â messages from experiment one where he's given encryptions of the right messages.

Â So let's look at an example of a nonce based encryption system. As before, we

Â have a secure PRF that takes inputs in the nonce space R and outputs strings in the

Â message space M. Now when a new key is chosen, we're going to reset our counter R

Â to be zero. And now we encrypt the particular message M, what we will do is

Â we will increment our counter R, and then encrypt the message M using the

Â pseudorandom function applied to this value R. And as before, the cipher text is

Â going to contain two components, our current value of the counter and then the

Â one time pad encryption of the message M. And so my question to you is whether this

Â is a secure, non-spaced encryption system. So the answer as before is yes, but only

Â if the nuance space is large enough. So as we increment the counter R, it will never

Â cycle back to zero so that the nuances will always, always be unique. We argue

Â security the same way as before. Because the PRF is secure, we know that this

Â encryption system is indistinguishable from using a truly random function. In

Â other words, if we apply a truly random function to the counter and XOR the

Â results with, the plain text M. But now since the nuance R never repeats, every

Â time we compute this F of R, we get a truly random uniform and independent

Â string so that we're actually encrypting every message using the one time pad. And

Â as a result, all the adversary gets to see in both experiments are basically just a

Â pair of random strings. So both the experiment zero and experiment one the

Â adversary get's to see exactly the same distribution, namely, the responses to all

Â this chosen plain text queries are just pairs of strings that are just uniformly

Â distributed and this is basically the same in experiment zero and experiment one and,

Â therefore, the attacker cannot distinguish the two experiments. And since he cannot

Â win the semantic security game with a truly random function he, also, cannot win

Â the semantics security game with the secure PRF, and, therefore, the scheme is

Â secure. So now we understand what it means for a symmetric system to be secure when

Â the keys used to encrypt multiple messages the requirement is that it be secure under

Â a chosen plan of attack. And we said that basically, the only way to be secure under

Â a chosen plain text attack is either to use randomized encryption, or to use, use

Â nonce spaced encryption where the nonce never repeats. And then in the

Â next two segments, we're gonna build two classic encryption systems that are secure

Â when the key is used multiple times.

Â