In the last lecture, we defined formally the notion of perfect secrecy. In this lecture, we'll see a construction that achieves that definition. Just to remind you of our definition: we say that an encryption scheme, defined by the three algorithms Gen, Enc, and Dec, and with message space M, is perfectly secret if for every probability distribution over the message space, every message m in that message space, and every ciphertext c, the a posteriori probability that the message is equal to m, conditioned on the observed ciphertext being equal to c, is exactly equal to the a priori probability that the message is equal to m, where this a priori probability comes from the given distribution that we start with. The scheme we'll see that achieves this definition is known as the one-time pad. It actually has an interesting history: it was patented in 1917 by Vernam, although recent historical research indicates that it was actually invented at least 35 years earlier. The scheme was patented in 1917, but there was no notion of perfect secrecy until the later work of Shannon in the 1940s. In addition to defining the notion of perfect secrecy as we've defined it here, Shannon also proved that the one-time pad scheme does indeed achieve that definition, i.e., that the one-time pad is indeed perfectly secret. So here's the one-time pad encryption scheme. First of all, we let the message space be equal to {0,1}^n. This is notation that I'm going to use frequently in this course, so I want to highlight it: {0,1}^n means the set of all binary strings of length exactly n, that is, the set of all n-bit strings. The key-generation algorithm chooses a uniform key k, also from the set of all n-bit strings. That is to say, the key is a uniform n-bit string: each possible n-bit string is chosen with probability 1 over 2 to the n.
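To restate the definition just recalled in symbols (this rendering is my own, matching the prose above), perfect secrecy requires that for every distribution over the message space M, every message m, and every ciphertext c:

```latex
\Pr[M = m \mid C = c] \;=\; \Pr[M = m].
```

In words: seeing the ciphertext teaches the adversary nothing about the message beyond what was known a priori.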
Right, there are 2 to the n different strings of length n. To encrypt a message m using the key k, what we do is simply XOR the key with the message bitwise. This means the i-th bit of the ciphertext is equal to the XOR of the i-th bit of the key with the i-th bit of the message. Decryption just reverses the process. To decrypt a ciphertext c, which is going to be a string of length n, using a key k, also a string of length n, we simply XOR the two again using bitwise XOR, so the message we output is simply the key XORed with the ciphertext. You can go through the exercise of checking that this scheme is indeed correct: when we decrypt the encryption of a message m, using the same key for both encryption and decryption, we recover the original message, and I've gone through that here. The decryption, using key k, of the encryption of m using key k: the inner portion, the encryption of m using k, is just k XOR m. The decryption of that using the same key is obtained by simply XORing k with it. Because XOR is associative, we can rewrite that as the quantity (k XOR k) XORed with m. And then we have the nice property that any string XORed with itself is equal to the all-0 string. That comes from the fact that 0 XOR 0 is equal to 0, and also 1 XOR 1 is equal to 0. And the 0 string XORed with m simply gives back m itself. So again, this just shows that decryption does indeed recover the original message. In pictures, we have this diagram here where the message and the key are both n-bit strings, n-bit blocks if you like, and what we do is simply XOR them together, resulting in an n-bit ciphertext that's output by this process. What we want to do is prove that the one-time pad is perfectly secret. So remember that in order to do this, we need to show that no matter what distribution over the messages we start with, for any message and any ciphertext,
the probability that the message was equal to m, conditioned on observing that ciphertext, is exactly equal to the a priori probability that the message was equal to m. So let's just fix some arbitrary distribution over the message space, and fix some arbitrary message m and ciphertext c. The message space and the ciphertext space here are both the set of all n-bit strings. What we need to do, again, is to prove that regardless of our choice of distribution, and regardless of our choices of m and c, the probability that the message was equal to m, conditioned on the ciphertext being equal to c, is exactly the a priori probability that the message was equal to m. Now, it's a little bit difficult to think about this, because we don't know what that a priori distribution is, but let's keep going, work with this expression, and see what we come up with. The first thing we're going to do is rewrite this expression using Bayes' law. Using Bayes' law, we get that the probability that the message was equal to m, conditioned on the ciphertext being equal to c, is equal to the probability that the ciphertext is c, conditioned on the message being equal to m, times the probability that the message was equal to m, divided by the probability that the ciphertext was equal to c. Let's compute the probability that the ciphertext takes on the fixed value, lowercase c. Well, we can use the law of total probability to express it in the following way: it's simply the summation, over all possible messages m prime, of the probability that the ciphertext is equal to c conditioned on the message being equal to m prime, times the probability that the message is equal to m prime. The events "M equals m prime" do indeed partition the space, because the message has to take on some value, and the only possible values are the messages in the message space.
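Written out, the two steps just described, Bayes' law followed by the law of total probability, read (again, this rendering is my own):

```latex
\Pr[M = m \mid C = c] \;=\; \frac{\Pr[C = c \mid M = m]\cdot \Pr[M = m]}{\Pr[C = c]},
\qquad
\Pr[C = c] \;=\; \sum_{m' \in \mathcal{M}} \Pr[C = c \mid M = m'] \cdot \Pr[M = m'].
```

The rest of the proof consists of evaluating the conditional probabilities on the right-hand side using the specifics of the one-time pad.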
If we keep going, we can rewrite that conditional probability in the following way. The probability that the ciphertext is equal to c, conditioned on the message being equal to some value m prime, is exactly equal to the probability that the key (capital K is the random variable denoting the key) takes on the value m prime XOR c. This is the crucial point, and this is actually the first point in the proof where we're relying on the specifics of the one-time pad scheme. Everything up until now was generic and applied to any possible scheme. Here's the first time we're using something specific about the one-time pad, so I want to go over this carefully and make sure you understand why it's true. So again, we claim that the probability that the ciphertext is equal to c, conditioned on the fact that the message is equal to m prime, is equal to the probability that the key is equal to the value m prime XOR c. Why is that the case? Well, if we condition on the fact that the message is equal to m prime, then the only possible way the ciphertext can be equal to the fixed string c is if the key is equal to m prime XOR c. You can check this: if you XOR the message m prime with the key m prime XOR c, you get the ciphertext c, and if you write it as an algebraic equation, you'll see that the only possible value of the key for which that happens is the key m prime XOR c. What's the probability that the key is equal to the fixed string m prime XOR c? Well, m prime XOR c is just some fixed n-bit string. We said that the one-time pad encryption scheme chooses the key uniformly from the set of all possible n-bit keys. So the probability that the key takes on any particular value is exactly 2 to the minus n.
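The crucial step above, that conditioned on M = m prime there is exactly one key mapping m prime to c, and hence the conditional probability is 2 to the minus n, can be checked by brute force for a small n. Here is a sketch in Python (the choice of n = 3 and the variable names are mine, purely for illustration):

```python
from itertools import product
from fractions import Fraction

n = 3  # small block length so we can enumerate everything
strings = [bits for bits in product((0, 1), repeat=n)]

def xor(a, b):
    # Bitwise XOR of two equal-length bit tuples.
    return tuple(x ^ y for x, y in zip(a, b))

for m_prime in strings:
    for c in strings:
        # Keys k for which Enc_k(m') = k XOR m' equals c.
        keys = [k for k in strings if xor(k, m_prime) == c]
        # Exactly one such key exists, namely k = m' XOR c.
        assert keys == [xor(m_prime, c)]
        # Hence Pr[C = c | M = m'] = Pr[K = m' XOR c] = 2^{-n}.
        assert Fraction(len(keys), 2**n) == Fraction(1, 2**n)
```

Every (message, ciphertext) pair passes both assertions, confirming that the map from keys to ciphertexts is a bijection for each fixed message.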
So we've reduced the probability that the ciphertext takes on the value c to the summation, over all messages m prime in the message space, of 2 to the minus n times the probability that the message is equal to m prime. Now we might look stuck, right? We don't know anything about this probability distribution. It's an arbitrary distribution, so we don't know the probability that the message takes on any particular value in the message space. However, we do know that, regardless of what the distribution is, it's a valid probability distribution, and so, summing over all possible messages m prime in the message space, the probability that the message takes on that value must be one. Remember that probabilities must sum to one. So this expression reduces simply to 2 to the minus n, and what we have is that the probability that the ciphertext takes on any particular value c is exactly equal to 2 to the minus n. Coming back to the expression we want to evaluate, we have that the probability that the ciphertext is equal to c, given that M is equal to m, is again the probability that the key takes on the value m XOR c. This is exactly like before; I've just replaced m prime with the particular message m that we're interested in. We carry along the probability that M is equal to m, and on the previous slide we calculated that the probability that the ciphertext is equal to c is exactly 2 to the minus n. Now, again, the probability that the key is equal to the particular string m XOR c is exactly 2 to the minus n, and now we see that the 2 to the minus n in the numerator and denominator cancel. We're left simply with the probability that M is equal to m, that is, the probability that the message is equal to some particular message, lowercase m. And now we're done. Right, what we've shown, for any distribution over the message space, is that the a posteriori probability that the message was equal to m,
conditioned on observing the ciphertext being equal to c, is exactly equal to the a priori probability that the message was equal to m. This completes the proof that the one-time pad achieves perfect secrecy. Just to summarize: we did it, right? We came up with a definition of perfect secrecy, we formalized it mathematically, we showed a particular encryption scheme, the one-time pad, and then proved that that encryption scheme satisfies our definition. This achieves the goals that we set out for ourselves a few lectures back. In the next lecture, we'll talk about implementing the one-time pad. There's nothing particularly difficult here, but it's interesting to explore some of the implementation-level details that arise when trying to code it up, and it also gives an alternative way of looking at and understanding the scheme. So again, we came up with a definition of perfect secrecy; that was in the last few lectures. In this lecture, we showed a scheme achieving that definition. What we'll see later on is that we're not done. Right? The one-time pad has a number of drawbacks that we'd like to address. After we talk about implementing the one-time pad, we'll come back, revisit those drawbacks, and explore to what extent we can avoid them.
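As a closing sanity check, both the scheme and the theorem can be verified in code. This is only a sketch, not the implementation discussed in the next lecture: the function names, the use of Python's secrets module, the byte-level (rather than bit-level) key, and the particular non-uniform prior are all my own illustrative choices.

```python
import secrets
from fractions import Fraction
from itertools import product

def keygen(n: int) -> bytes:
    # Gen: choose a uniform key (here n bytes rather than n bits).
    return secrets.token_bytes(n)

def otp(key: bytes, data: bytes) -> bytes:
    # Enc and Dec are the same operation: bitwise XOR.
    assert len(key) == len(data)
    return bytes(k ^ d for k, d in zip(key, data))

# Correctness: Dec_k(Enc_k(m)) = k XOR (k XOR m) = m.
k = keygen(16)
m = b"attack at dawn!!"
assert otp(k, otp(k, m)) == m

# Perfect secrecy, checked exhaustively for 3-bit strings with an
# arbitrary (here deliberately non-uniform) prior over messages.
n = 3
space = list(product((0, 1), repeat=n))
total = sum(range(1, 2**n + 1))
prior = {msg: Fraction(i + 1, total) for i, msg in enumerate(space)}
assert sum(prior.values()) == 1  # a valid probability distribution

for m_ in space:
    for c in space:
        # Pr[C=c] = sum over m' of Pr[K = m' XOR c] * Pr[M=m'],
        # and Pr[K = anything] = 2^{-n} for a uniform key.
        pr_c = sum(Fraction(1, 2**n) * prior[mp] for mp in space)
        # Bayes: posterior = Pr[C=c | M=m] * Pr[M=m] / Pr[C=c].
        posterior = Fraction(1, 2**n) * prior[m_] / pr_c
        assert posterior == prior[m_]  # posterior equals prior
```

The final assertion holds for every message and ciphertext, which is exactly the statement of perfect secrecy proved in this lecture.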