0:00

Now that we know about the one-time pad, let's talk about making the one-time pad

Â more practical using something called the stream cypher. But before we do that,

Â let's do a quick review of where we were. So let me just remind you that a cypher is

Â defined over a triple of sets called a key space, a message space, and a cypher text

Â bare space. And a cypher is a pair of efficient algorithms called E and D; E

Â stands for encryption and D stands for decryption. And the only property. That we

Â need to satisfy is that decryption is the opposite of encryption. In other words if

Â I encrypt a message M using a particular key. And I de-crypt using the same key. I

Â get back the original message. Last time we looked at a couple of weak cyphers like

Â the substitution cypher and the vigonaire cypher. We showed that all of them can be

Â easily broken so you should never ever, ever use those cyphers. Those were just

Â for historical reference. And then we looked at our first exam.ple of a good

Â cypher, namely the one-time pad. So let me just remind you how the one-time pad is

Â defined. Basically the message space is the set of all bit end bit strings. The

Â cypher text is a set of all bit end bit strings. And similarly, the key. Is the

Â set of all N bit strings and the way we encrypt is by a simple exor to encrypt the

Â message we just exor the message in the key that gives us the cypher text. And

Â then to decrypt a cypher text, we just do this x over again and it's easier to show

Â by properties of x over that in fact decryption is the opposite of encryption.

Â And then we talked about this lemma, in fact, we proved it, that says that the

Â one-time pad has perfect secrecy, which means that if you're just an eavesdropper

Â and you just get to see a single cypher text, you're not going to be able to

Â deduce any information about the encrypted plain text. Unfortunately. We also said

Â that Shannon proved this lema, we called it the bad news lema, that basically says

Â that any cypher that has perfect secrecy must have really long keys. In other

Â words, the key length must be at least as long as the length of the message, which

Â means that the cypher is not particularly useful. Because if two parties have a way

Â to agree on really long keys that are as long as the message, they, in some sense,

Â might as well use that mechanism to already transmit the message itself. So in

Â this lecture we're going to take the idea of the one time pad and try to make it

Â into a practical encryption scheme. So this is called what's called a stream

Â cypher. So the idea in this dream cypher is rather than using a totally random key,

Â we're actually going to use a pseudo-random key. And to explain how that

Â works, I need to define what is a pseudo-random generator, PRG. So a PRG,

Â really, all it is, is just a function, I'll call it g for generator, that takes a

Â seed, so I'm going to use zero one to the s to denote all strings of length s, so

Â this we'll call the seed space. So he takes an s bit seed and maps it to a much

Â larger string which will denote by zero one to the n. And the property is that n

Â must be much larger than s. So in other words, we take a seed that might be only

Â 128 bits and we expand it into a much, much larger output string that could be

Â gigabytes long. That's what this pseudo-random generator does. And of

Â course, the goal is that first of all, the generator is efficiently computable, so

Â the function g. There should be some sort of an efficient algorithm that computes

Â it. So, efficiently computable by a deterministic algorithm. It's important to

Â understand that the function g itself has no more randomness, in it, it's a totally

Â deterministic. The only thing that's random here is the random seed, that's

Â given as input to the function g. And the other property, of course, is that the

Â output. Should look random and the question is what does it mean to look

Â random, and that's something we'll define later on in the lecture. Okay so suppose

Â we have such a generator, how do we use that, to build a stream cipher? We the

Â idea is that we're gonna use the seed, as our key, so our short seed is gonna be our

Â secret key. And then we're gonna use the generator to basically expand the seed

Â into a much, much larger random looking sequence, or pseudo random sequence, as

Â it's known, so this would be G of K. And then we are going to X over it just like in the

Â one time pad we're going to X over it the student random sequence with the message

Â and that's going to give us the cypher text. Or if we want to write this in math,

Â we'll write C equals the encryption of the message M with a key K, which is simply

Â defined as M XOR G of K. And then when we want to decrypt, basically we do exactly

Â the same thing. It's basically the cypher text XOR G of K, just like in the one-time

Â pad except then instead of XOR-ing with K, we XOR with the output of the generator

Â applied to K. So the first question to ask is why is it secure? So, basically you

Â now, we only have one notion of security so far, which we called perfect secrecy.

Â And so let's just quickly ask can a stream cipher have perfect secrecy. Remember in

Â the stream cipher the key is much, much shorter than the message. And so, never

Â the less, can it have perfect secrecy. So I hope everybody said the answer is, no.

Â The key is much shorter than the message. And we said that in a, in a perfectly

Â secure cypher, the key must be as long as the mesage. And therefore, it's not

Â possible that a, that a stream cypher actually has perfect secrecy. So the

Â question is, then, well, why is it secure? First of all we would need a different

Â definition of security to argue that Stream Safe is, is secure and in

Â particular, the security property is going to depend on the specific generator that

Â we used. So in fact the definition of privacy that we'll need to argue the

Â security of Stream Cipher is we'll see in the next lecture, but for now let me show

Â you one particular property. That a generator must have a minimum property

Â needed for security. This property is called unpredictability. So let's we

Â suppose for one second that in fact a stream cycle is predictable. So, what does

Â that mean? Both the PRG is predictable. What that means is essentially that there

Â is some pi. Such that if I give you the first I bits of the outputs. This notation

Â Bar one to I means look at the first I-bits of the output of the function.

Â Okay, so I give you the first I bits of the stream. There is some sort of an

Â algorithm, there's an efficient algorithm that will compute the rest of the string.

Â Okay, so given the first I bits, you can compute the remainder of the bits. I claim

Â that if this is the case, then the stream cypher would not be secure. So let's see

Â why. Suppose an attacker actually intercepts a particular cypher text, let's

Â call it c. If this is the case, then in fact, we have a problem. Because suppose

Â that just by some prior knowledge, the attacker actually knows that the initial

Â part of the message happens to be some known value. For example, we know that in

Â the mail protocol, smtp, the standard mail call used in the internet, you know that

Â every message starts with a word from colon. Well, that would be a prefix that

Â the adversary knows. That the site, that the message must begin with from a colon.

Â What it could do is it could [inaudible] the cipher text with the words from colon,

Â with the little prefix of the message that it actually knows. And what that would

Â give it is a prefix of. Of the pseudo random sequence. And I would learn as a

Â result of this, it would learn a prefix of the pseudo random sequence but then we

Â know that once it has a prefix of the pseudo random sequence it can predict the

Â remainder of the pseudo random sequence and that would allow it to then predict

Â the rest of the message end. Okay, so for example if the pseudo random generator was

Â predictable given five bits of the pad. Then every email encrypted using the

Â stream cypher would be decryptable because, again, the attacker knows the

Â prefix of the message from which he deduces the prefix of the pad, which then

Â allows him to compute the rest of the pad, which then allows him to recover the

Â entire plain text. Okay, so this is an example that shows that in fact if a PRG

Â is predictable then already there are security problems. Because a small prefix

Â would reveal the entire message. As it turns out, even if I could predict just

Â one bit of the outputs. Even if given, you know, the first I bits, I can predict the

Â next bits, the I plus first bits. Already, this is a problem. Because that would say

Â that given again, the first couple of letters in a message can predict, can

Â decrypt essentially, and recover the next bit of the message, or the next layer of

Â the message, and so on. So this predictability property shows that, you

Â know, if we use a PRG that's going to be used in a stream cypher, it had better be

Â unpredictable. So what does it mean that a PRG is unpredictable? So let's define more

Â precisely what it means for a PRG to be unpredictable. Well first we'll define

Â more precisely what it means for a PRG to be predictable. So we say that G is

Â predictable, if there exists an efficient algorithm. Let's call it A and there is

Â some position. There's a position I between one and N minus one such that if

Â we look at the probability over a random key. Probability if I generate a random

Â key. You remember, this notation means choose a random key from the set k. So

Â this arrow with r just means choose a random key from the set k. Basically, if I

Â give this algorithm the prefix of the output, if I give it the first I bytes of

Â the output, the probability that it's able to predict the next bit of the outputs,

Â this probability is greater than half plus epislon. For some non-negotiable. For some

Â non-negligible. >> [inaudible]. >> A non [inaudible], for example, would be

Â epsilon, which is great than one over two to the 30. One over a billion, for

Â example, we would consider it not negligible. So these terms, negligible and

Â non negligible will come back at the end of the lecture, and define them more

Â precisely. But for now, let's just, stick with the intuitive notion of what non

Â negligible means. And so this is what it means, for an algorithm, for a generator

Â to be, predictable. Basically, there is some algorithm that is able to predict the

Â I plus first bits, even the mutual prefix, okay? And then we say that an algorithm,

Â that a P or G is unpredictable. If in fact, well, if it's doesn't satisfy the

Â property that we just defined. In other words, it is not predictable. But what

Â does it mean, more precisely for it not to be predictable. It means that, in fact,

Â for all positions, for all i there is no efficient adversary no efficient

Â algorithm A that can predict the i + 1 bit with non negligible probability.

Â Excellent. Okay and this has to be true for all I. So no matter which prefix I

Â give you, you're not gonna be able to predict the next bit that follows the

Â prefix. Okay, so let's look at some examples. Here is a silly, silly example.

Â Suppose I give you a generator, and I ask you, is it predictable? Well, this

Â generator happens to have the property, that if I XOR all the bits of the

Â outputs, I always happen to get one. Okay, so I x over all the bits. Bit number one,

Â x over bit number two, x over bit number three. If I x over all those bits, I

Â happen to get one. The question is, is that a predictable generator? And again, I

Â hope everybody answered yes, that essentially given the first n minus one

Â bits of the outputs, I can predict the nth bit because it would just be the bits

Â that's needed to make the x over all the bits be one. In other words, I give you

Â all but one of the bits of the generator, you can actually predict the last bit of

Â the generator. Now that we've seen that PRGs have to be unpredictable, I just want

Â to mention a couple of weak PRGs that should never ever be used for crypto. This

Â is a very common mistake and I just want to make sure none of you guys make this

Â mistake. So a very common PRG that should actually never be used for crypto is

Â called a linear congruential generator. So let me explain what a linear congruential

Â generator is. Basically it has parameters. It has three parameters. I'll call them A,

Â B and P. A and B are just integers and P is a prime. And the generator is defined

Â as follows, essentially I'll say R zero is the seed of the generator. And then the

Â way you generate randomness is basically you [inaudible] run through the following

Â steps. You compute a times r of I minus one plus b modular p. Then you output a

Â few bits of the current states, output few bits of our i. Then, of course, you

Â increment I and you iterate this again and again and again. Okay? So you can see how

Â this generator proceeds. It starts with a particular seed. At every step there is

Â this leaner transformation that's being applied to the seed. And then you output a

Â few bits of the current state and then you do that again and again and again and

Â again. Unfortunately even though this generator has good statistical properties

Â in the sense that, for example, the number of zeroes it outputs is likely going to be

Â similar to the number of ones and so on; it has, you can actually argue all sorts

Â of nice statistical properties about this, nevertheless it is a very easy generator

Â to predict. And in fact should never ever be used. In fact, just given a few

Â outputs, a few output samples, it's easy to predict the remainder of the sequence.

Â And as a result, this generator should never ever be used. Another example is a

Â random number generator that is very closely related to the linear congruential

Â generator. This is a random number generator implemented in glibc, very

Â common library. That you can see. I just wrote down the definition here. You can

Â see that is basically outputs a few bits at every ideration and it just does this

Â simple linear transformation at every step. Again, this is a very easy generator

Â to predict and should never ever be used for crypto. And so the lesson I want to

Â emphasize here is never ever use the built-in glibc function random. For crypto,

Â because it doesn't reduce, cryptographic randomness, in the sense

Â that it is easy to predict. And, in fact, systems like Kerberos version four have

Â used random and, have been bitten by that. So, please don't make that mistake

Â yourself. We will talk about how to do secure random number generation actually in

Â the next lecture. Before we conclude this lecture I just want to give a little bit

Â more detail about these concepts of negligible and non-negligible values, so

Â different communities in crypto actually define these concepts differently, for

Â practitioners basically these the term negligible and non-negligible,

Â are just, particular scalers that are used in the definition. So, for example, a

Â practitioner would say, that if a value is more than one over, one over a billion,

Â one over two to the 30, we say that the value is non-negligible. The reason is,

Â the reason that's so, is, because if you happen to use a key, for example, for, for

Â encrypting a gigabyte of data, a gigabyte of data is about two to the 30 or maybe

Â even two to the 32 bytes. Then an event that happens with the probability one over

Â two to the thirty will likely happen after about a gigabyte of data. So since a

Â gigabyte of data is within reason for a particular key, this event is likely to

Â happen. Therefore one over two to the thirty is non-negligible. On the other

Â hand, we'll say that one over two to the eighty. Which is much, much, much smaller

Â is an event, an event that happens with this probability is an event that's

Â actually not going to happen over the life of the key. And therefore we'll say that

Â that's a negligible event. As it turns out, these practical definitions of

Â negligible and non-negligible are quite problematic and we'll see examples of why

Â they're problematic later on. So in fact in the more rigerous theory of

Â cryptography, the definition of negligible and non-negligible are somewhat different.

Â And in fact, when we talk about the probability of an event, we don't talk

Â about these probabilities as scalers, but rather we talk about them as functions of

Â a security parameter. So let me explain what that means. So these functions,

Â essentially, are functions that map, that outputs, positive real values. So, are non

Â negative real values that are supposedly probabilities. But they're functions that

Â act on non negative integers, okay? So, what does it mean for a function to be

Â non-negligible? What it means is that the function is bigger than some polynomial

Â infinitely often. In other words, for many, for infinitely many values, the

Â function is bigger than some, one over polynomial, okay? So I wrote the exact

Â definition here, and we'll see an example, in just a minute. Okay? So if something is

Â bigger, is often bigger than one of that polynomial, we'll say that it's non-negligible.

Â However, if something is smaller than all polynomials, then we'll

Â say that it's negligible. So, what this says here, basically, for any degree

Â polynomial, for all d, there exists some lower bound lambda-d such as, for all

Â lambda bigger than this lambda-d, the function is smaller than one over the

Â polynomial. So all this says is that the function is negligible if it's less than

Â all polynomial fractions. In other words, is less than one over lambda-d for

Â sufficiently large lambda. So let's look at some examples. And we'll see

Â applications of these negligible and non-negligible concepts later on. But I

Â just want to make, wanted to make it clear that this is how you would rigorously find

Â these concepts. Basically either smaller than one over poly or bigger than one over

Â poly, one would be negligible, the other would be non-negligible. Let's look

Â