0:00

My goal for the next few segments is to show you that if we use a secure PRG, we'll get a secure stream cipher. The first thing we have to do is define what it means for a stream cipher to be secure. Whenever we define security, we always define it in terms of two things: what can the attacker do, and what is the attacker trying to do? In the context of stream ciphers, remember, these are only used with a one-time key, and as a result the most the attacker is ever going to see is just one ciphertext encrypted under the key we're using. So we're going to limit the attacker's ability to obtaining just one ciphertext. In fact, later on we're going to allow the attacker to do much, much more, but for now we're just going to give him one ciphertext. And we want to define what it means for the cipher to be secure.

The first definition that comes to mind is to say: maybe we want to require that the attacker can't recover the secret key. So given a ciphertext, you shouldn't be able to recover the secret key. But that's a terrible definition. Think about the following "brilliant" cipher: to encrypt a message under a key K, we just output the message. Of course this cipher doesn't do anything — given a message, it outputs the message itself as the ciphertext. This is not a particularly good encryption scheme; however, given the ciphertext, namely the message, the poor attacker cannot recover the key, because the key is never used at all. And so this cipher, which is clearly insecure, would be considered secure under this requirement. So this definition is no good: recovering the secret key is the wrong way to define security.

The next thing we can attempt is to say: maybe the attacker doesn't care about the secret key; what he really cares about is the plaintext. So maybe it should be hard for the attacker to recover the entire plaintext. But even that doesn't work. Think about the following encryption scheme, which takes two messages — I'm going to use two vertical lines to denote concatenation, so M0 || M1 means the concatenation of M0 and M1. The scheme outputs M0 in the clear, concatenated with an encryption of M1, perhaps even using the one-time pad. Here the attacker is given one ciphertext, and his goal is to recover the entire plaintext. But the poor attacker can't do that, because M1 was encrypted with the one-time pad, and we know the one-time pad is secure given just one ciphertext. So this construction satisfies the definition, but clearly it is not a secure encryption scheme, because we just leaked half of the plaintext: M0 is completely available to the attacker. Even though he can't recover all of the plaintext, he can recover most of it, and that's clearly insecure.

Of course, we already know the solution to this: we talked about Shannon's definition of security, perfect secrecy, where Shannon's idea was that when the attacker intercepts a ciphertext, he should learn absolutely no information about the plaintext. He shouldn't learn even one bit about the plaintext; he shouldn't even be able to predict a single bit of the plaintext. Absolutely no information about the plaintext.
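This "no information" requirement can be stated precisely. Writing E(k, m) for the encryption of m under key k, perfect secrecy says that for every pair of equal-length messages, encrypting either one under a random key yields exactly the same ciphertext distribution:

```latex
\forall\, m_0, m_1 \in \mathcal{M},\ |m_0| = |m_1| :\qquad
\{\, E(k, m_0) \;:\; k \xleftarrow{\$} \mathcal{K} \,\}
\;=\;
\{\, E(k, m_1) \;:\; k \xleftarrow{\$} \mathcal{K} \,\}
```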

So let's recall very briefly Shannon's concept of perfect secrecy. We said that a cipher has perfect secrecy if, given two messages of the same length, the distributions of ciphertexts coincide: if we pick a random key and look at the distribution of ciphertexts when we encrypt M0, we get exactly the same distribution as when we encrypt M1. The intuition is that if the adversary observes a ciphertext, he can't tell whether it came from the distribution resulting from encrypting M0 or from the distribution resulting from encrypting M1, and as a result he can't tell whether we encrypted M0 or M1. Since that's true for all messages of the same length, the poor attacker doesn't really know what message was encrypted. But we already said that this definition is too strong, in the sense that it requires really long keys: a cipher with short keys can't possibly satisfy it, and in particular stream ciphers cannot satisfy it.

So let's try to weaken the definition a little bit, thinking back to the previous segment. Instead of requiring the two distributions to be absolutely identical, we can require that the two distributions just be computationally indistinguishable. In other words, a poor, efficient attacker cannot distinguish the two distributions, even though the distributions might be very, very different: given a sample from one distribution or a sample from the other, the attacker can't tell which distribution his sample came from. It turns out this definition is almost right, but it's still a little too strong and still cannot be satisfied. We have to add one more constraint: instead of requiring the definition to hold for all M0 and M1, it only has to hold for pairs M0, M1 that the attacker can actually exhibit.

This leads us to the definition of semantic security — again, semantic security for a one-time key, in other words for an attacker who is given only one ciphertext. We define semantic security using two experiments: experiment 0 and experiment 1. More generally, we'll think of these as experiment(b), where b can be 0 or 1. The experiment is defined as follows. We have an adversary A that's trying to break the system; A is the analog of a statistical test in the world of pseudorandom generators. And then we have a challenger — really two challengers, but they are so similar that we can describe them as a single challenger that takes an input bit b, set to 0 in one case and 1 in the other. Here's what the challenger does. First, it picks a random key. Then the adversary outputs two messages, M0 and M1: an explicit pair of messages that the attacker wants to be challenged on. As usual, we're not trying to hide the length of the messages, so we require the messages to be of equal length. Then the challenger outputs either the encryption of M0 or the encryption of M1: in experiment 0 the challenger outputs the encryption of M0, and in experiment 1 it outputs the encryption of M1. That's the only difference between the two experiments. The adversary then tries to guess whether he was given the encryption of M0 or the encryption of M1.

Now, a little bit of notation. Let's define the event W_b to be the event that, in experiment b, the adversary outputs 1. So W0 is the event that the adversary outputs 1 in experiment 0, and W1 is the event that the adversary outputs 1 in experiment 1. Now we can define the advantage of the adversary — the semantic security advantage of adversary A against the scheme E — to be the distance between the probabilities of these two events: Adv_SS[A, E] = | Pr[W0] − Pr[W1] |. In other words, we're looking at whether the adversary behaves differently when he's given the encryption of M0 from when he's given the encryption of M1. I want to make sure this is clear, so I'll say it one more time: in experiment 0 he's given the encryption of M0, in experiment 1 he's given the encryption of M1, and we're just interested in whether the adversary outputs 1 or not.
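As a sketch, the two experiments and the advantage can be mocked up in code. All the names here (run_experiment, advantage, GuessingAdversary) are made up for illustration, the cipher is the one-time pad, and the probabilities Pr[W0], Pr[W1] are estimated by sampling:

```python
import secrets

def otp_encrypt(key: bytes, msg: bytes) -> bytes:
    """One-time pad: XOR the message with an equal-length random key."""
    return bytes(k ^ m for k, m in zip(key, msg))

def run_experiment(b: int, adversary, n_bytes: int = 4) -> int:
    """EXP(b): challenger picks a random key, adversary submits (m0, m1),
    challenger returns E(k, m_b), adversary outputs a bit b'."""
    key = secrets.token_bytes(n_bytes)
    m0, m1 = adversary.choose_messages(n_bytes)
    assert len(m0) == len(m1) == n_bytes   # equal-length messages required
    ct = otp_encrypt(key, (m0, m1)[b])
    return adversary.guess(ct)             # the adversary's output bit

def advantage(adversary, trials: int = 2000) -> float:
    """Adv_SS[A, E] = | Pr[W0] - Pr[W1] |, estimated over many trials."""
    w0 = sum(run_experiment(0, adversary) for _ in range(trials)) / trials
    w1 = sum(run_experiment(1, adversary) for _ in range(trials)) / trials
    return abs(w0 - w1)

class GuessingAdversary:
    """A trivial adversary that ignores the ciphertext and flips a coin."""
    def choose_messages(self, n):
        return bytes(n), bytes([0xFF] * n)
    def guess(self, ct):
        return secrets.randbelow(2)
```

Since this adversary never looks at the ciphertext, it outputs 1 with the same probability in both experiments, so its estimated advantage comes out near zero; a real attack would replace GuessingAdversary with an adversary that actually extracts information from the ciphertext.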

If in both experiments the adversary outputs 1 with the same probability, that means the adversary wasn't able to distinguish the two experiments: experiment 0 looks to the adversary the same as experiment 1, because in both cases he outputs 1 with the same probability. However, if the adversary outputs 1 in one experiment with significantly different probability than in the other, then the adversary was able to distinguish the two experiments. To say this more formally: because it's the distance between two probabilities, the advantage is a number between 0 and 1. If the advantage is close to 0, the adversary was not able to distinguish experiment 0 from experiment 1. However, if the advantage is close to 1, the adversary was very much able to distinguish experiment 0 from experiment 1 — which really means he was able to distinguish an encryption of M0 from an encryption of M1.

So that's the definition of the advantage, and the definition of security is just what you would expect: we say that a symmetric encryption scheme is semantically secure if for all efficient adversaries — again, I'll put this in quotes, "for all efficient adversaries" — the advantage is negligible. In other words, no efficient adversary can distinguish the encryption of M0 from the encryption of M1: for any pair of messages the adversary is able to exhibit, he cannot distinguish the two corresponding distributions.

Now I want to show you that this is actually a very elegant definition. It might not seem so right away, but I want to show you some implications of it, and you'll see exactly why the definition is the way it is. So let's look at some examples. For the first example, suppose we have a broken encryption scheme; in other words, suppose we have an adversary A that, given the ciphertext, is somehow always able to deduce the least significant bit of the plaintext. So given the encryption of M0, this adversary can deduce the least significant bit of M0. That's a terrible encryption scheme, because it leaks the least significant bit of the plaintext just given the ciphertext. I want to show you that such a scheme is therefore not semantically secure, which shows that if a system is semantically secure, then there is no attacker of this type.

So let's see why the system is not semantically secure. What we're going to do is use our adversary A, who is able to learn the least significant bit, to break semantic security — we're going to use him to distinguish experiment 0 from experiment 1. Here's what we do: we are algorithm B, and algorithm B is going to use algorithm A in its belly. The first thing that happens, of course, is that the challenger chooses a random key. The next thing that happens is that we need to output two messages, and the messages we output are going to have different least significant bits: one message ends in 0 and the other ends in 1. Now what does the challenger do? The challenger gives us the encryption of either M0 or M1, depending on whether we're in experiment 0 or experiment 1, and we simply forward this ciphertext to adversary A. Now, what is the property of adversary A? Given the ciphertext, A tells us the least significant bit of the plaintext. In other words, A outputs the least significant bit of M0 or of M1 — but lo and behold, that's exactly the bit b. So we just output A's answer as our guess; let's call it b'. This describes our semantic security adversary B.

And now you tell me: what is the semantic security advantage of this adversary? Well, let's see. In experiment 0, what is the probability that adversary B outputs 1? In experiment 0, B is always given the encryption of M0, so adversary A always outputs the least significant bit of M0, which happens to be 0. So in experiment 0, B always outputs 0, and the probability that it outputs 1 is 0. However, in experiment 1 we're given the encryption of M1, so how likely is adversary B to output 1 in experiment 1? It always outputs 1, again by the property of algorithm A. Therefore the advantage is |0 − 1| = 1.
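As a concrete sketch of this reduction (all names here are hypothetical, and the leaky scheme is invented for illustration): given any predictor that recovers the least significant bit of the plaintext from a ciphertext, adversary B achieves advantage exactly 1:

```python
import secrets

def broken_encrypt(key: bytes, msg: bytes) -> bytes:
    """A deliberately broken scheme: OTP-encrypt everything except the
    last byte, which is sent in the clear -- so it leaks msg's last byte."""
    body = bytes(k ^ m for k, m in zip(key, msg[:-1]))
    return body + msg[-1:]

def lsb_predictor(ct: bytes) -> int:
    """The assumed adversary A: recovers the least significant bit of the
    plaintext from the ciphertext (trivially, for this broken scheme)."""
    return ct[-1] & 1

def adversary_B(b: int, n_bytes: int = 4) -> int:
    """Semantic-security adversary B running A 'in its belly': it submits
    m0 ending in bit 0 and m1 ending in bit 1, forwards the challenge
    ciphertext to A, and outputs A's answer as its guess b'."""
    m0 = bytes(n_bytes)                    # last byte 0x00: LSB is 0
    m1 = bytes(n_bytes - 1) + b"\x01"      # last byte 0x01: LSB is 1
    key = secrets.token_bytes(n_bytes)     # challenger's random key
    ct = broken_encrypt(key, (m0, m1)[b])  # challenger's response in EXP(b)
    return lsb_predictor(ct)

# B outputs 0 in EXP(0) and 1 in EXP(1), every single time:
w0 = sum(adversary_B(0) for _ in range(200)) / 200   # Pr[B outputs 1 in EXP(0)] = 0
w1 = sum(adversary_B(1) for _ in range(200)) / 200   # Pr[B outputs 1 in EXP(1)] = 1
adv = abs(w0 - w1)                                   # advantage = 1
```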

So this is a huge advantage — it's as big as it can get — which means this adversary completely broke the system. In other words, under the semantic security definition, just deducing the least significant bit is enough to completely break the system. Now, the interesting thing here, of course, is that there's nothing special about the least significant bit: the same holds for the most significant bit, for bit number seven, for the XOR of all the bits in the message, and so on and so forth. Any kind of information — any bit about the plaintext that can be learned — would mean that the system is not semantically secure. All the adversary has to do is come up with two messages M0 and M1 such that the thing being learned has value 0 for one message and value 1 for the other. For example, if A were able to learn the XOR of all the bits of the message, then M0 and M1 would just be chosen to have different XORs of their bits, and this adversary A would again be sufficient to break semantic security. So, basically, if a cipher is semantically secure, then no bit of information is revealed to an efficient adversary. This is exactly the concept of perfect secrecy, only applied to efficient adversaries rather than to all adversaries.

The next thing I want to show you is that the one-time pad is in fact semantically secure — it had better be, because it's more than that: it's perfectly secure. So let's see why that's true. Again, we set up one of these experiments. Suppose we have an adversary that claims to break semantic security of the one-time pad. The first thing the adversary does is output two messages, M0 and M1, of the same length. Now, what does he get back as a challenge? He gets either the encryption of M0 or the encryption of M1 under the one-time pad, and he's trying to distinguish between those two possible ciphertexts: in experiment 0 he gets the encryption of M0, and in experiment 1 he gets the encryption of M1. So let me ask you: what is the advantage of adversary A against the one-time pad? Remember that the property of the one-time pad is that K XOR M0 is distributed identically to K XOR M1.
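This identical-distribution claim can be checked exhaustively in a toy one-byte setting (the particular messages chosen here are arbitrary):

```python
from collections import Counter

m0, m1 = 0x00, 0xAB   # any two fixed one-byte messages

# Enumerate every possible one-byte key K and tally the ciphertexts.
dist0 = Counter(k ^ m0 for k in range(256))   # distribution of K XOR m0
dist1 = Counter(k ^ m1 for k in range(256))   # distribution of K XOR m1

# XORing with a fixed message just permutes the key space, so each of the
# 256 ciphertext values occurs exactly once -- both distributions are
# uniform, and therefore identical.
```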

In other words, these two distributions are absolutely identical. This is a property of XOR: if we XOR the random pad K with anything — either M0 or M1 — we get the uniform distribution. So in both cases, algorithm A is given as input exactly the same distribution, namely the uniform distribution on ciphertexts, and therefore it behaves exactly the same in both cases, because it was given the exact same distribution as input. As a result, its advantage is 0, which means that the one-time pad is semantically secure. The interesting thing here is that it's not only semantically secure, it's semantically secure for all adversaries — we don't even have to restrict the adversaries to be efficient. No adversary, no matter how smart, will be able to distinguish K XOR M0 from K XOR M1, because the two distributions are identical. Okay, so the one-time pad is semantically secure.

So that completes our definition of semantic security. The next thing we're going to do is prove that a secure PRG in fact implies that the stream cipher is semantically secure.
