In this segment, we're going to continue with a few more tools from discrete probability, and I want to remind everyone that if you want to read more about this, there's more information in the Wikibooks article that is linked over here.

So first, let's do a quick recap of where we are. We said that discrete probability is always defined over a finite set, which we're going to denote by U, and typically for us, U is going to be the set of all n-bit binary strings, which we denote by {0,1}^n. Now, a probability distribution P over this universe U is basically a function that assigns to every element in the universe a weight in the interval [0,1], such that the sum of the weights of all these elements is one. We also said that a subset of the universe is what's called an event, and that the probability of an event is the sum of the weights of all the elements in the event; the probability of an event is a real number in the interval [0,1]. And I'd remind everyone that the probability of the entire universe is one, by the fact that the sum of all the weights is one.

Then we defined what a random variable is. Formally, a random variable is a function from the universe to some other set. But the thing I want you to remember is that a random variable takes values in some set V, and in fact the random variable defines a distribution on this set V.

The next concept we need is what's called independence, and I'm going to define it very briefly; if you want to read more about independence, please go ahead and look at the Wikibooks article. Essentially, we say that two events A and B are independent of one another if knowing that event A happened tells you nothing about whether event B happened or not. Formally, the way we define independence is to say that the probability of A and B, namely that both events happened, is equal to the probability of event A times the probability of event B. This multiplication, in some sense, the fact that probabilities multiply under conjunction, captures the fact that the events are independent. As I said, if you want to read more about that, please take a look at the background material.
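As a quick sanity check of this multiplication rule, here is a small Python sketch; the fair-die example is my own, not from the lecture:

```python
from fractions import Fraction

# A toy example (mine, not from the lecture): a fair six-sided die.
U = {1, 2, 3, 4, 5, 6}
P = {u: Fraction(1, 6) for u in U}          # uniform weights summing to 1

def prob(event):
    # Probability of an event (a subset of U): sum of its elements' weights.
    return sum(P[u] for u in event)

A = {u for u in U if u % 2 == 0}            # "the roll is even"
B = {u for u in U if u <= 4}                # "the roll is at most 4"

# Independence: Pr[A and B] equals Pr[A] * Pr[B].
print(prob(A & B) == prob(A) * prob(B))     # True (both sides are 1/3)
```

Here Pr[A] = 1/2 and Pr[B] = 2/3, and the intersection {2, 4} has probability 1/3, so the two events are independent under this distribution.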

Now, the same thing can be said for random variables. Suppose we have two random variables X and Y that take values in some set V. Then we say that these random variables are independent if the probability that X = a and Y = b is equal to the product Pr[X = a] · Pr[Y = b]. Basically, what this means is that even if you know that X = a, that tells you nothing about the value of Y; that's what the multiplication means. And again, this needs to hold for all a and b in the space of values of these random variables.

So, just to jog your memory if you've seen this before, a very quick example. Suppose we look at the set of two-bit strings, namely 00, 01, 10, and 11, and suppose we choose a random r from this set; in other words, we randomly choose one of these four elements with equal probability. Now let's define two random variables: X is going to be the least significant bit of r, and Y is going to be the most significant bit of r. I claim that these random variables X and Y are independent of one another. The way to see that intuitively is to realize that choosing r uniformly from the set of four elements is basically the same as flipping an unbiased coin twice: the first bit corresponds to the outcome of the first flip, and the second bit corresponds to the outcome of the second flip. Of course, there are four possible outcomes, all equally likely, which is why we get the uniform distribution over two-bit strings.

Now, why are X and Y independent? Basically, if I tell you the result of the first flip, namely the least significant bit of r, so you know whether the first coin fell on its head or on its tail, that tells you nothing about the result of the second flip. That is why, intuitively, you might expect these random variables to be independent of one another. But formally, we would have to prove that for all pairs of bits the probabilities multiply: for Pr[X = 0 and Y = 0], Pr[X = 1 and Y = 1], and so on. Let's just do it for one of these pairs: the probability that X = 0 and Y = 0. Well, the probability that X = 0 and Y = 0 is just the probability that r = 00. And what is the probability that r = 00? By the uniform distribution, it's one over the size of the set, which is one fourth in this case. And lo and behold, the probabilities do in fact multiply, because the probability that X = 0, that is, that the least significant bit of r equals 0, is exactly one half, since there are exactly two elements that have their least significant bit equal to zero.

Two out of four elements gives a probability of one half. Similarly, the probability that Y = 0 is also one half, so the probabilities do multiply. Okay, so that's the concept of independence, and the reason I wanted to show it to you is that we're going to look at an important property of XOR that we're going to use again and again.

Before we talk about that property, let me do a very quick review of what XOR is. Of course, the XOR of two bits is just the addition of those bits modulo 2. So, just to make sure everybody is on the same page: for the two-bit inputs 00, 01, 10, and 11, the truth table of XOR is simply addition modulo 2. As you can see, 1 + 1 = 2, and 2 modulo 2 is 0. So that is the truth table for XOR, and I'm always going to denote XOR by a circle with a plus inside, ⊕. When I want to apply XOR to bit strings, I apply the addition-modulo-2 operation bitwise. So, for example, the XOR of these two strings starts 1, 1, 0, and I'll let you fill out the rest of the XORs just to make sure we're all on the same page; of course, it comes out to 1101.

Now, we're going to be doing a lot of XORing in this class. In fact, there's a classic joke that the only thing cryptographers know how to do is XOR things together. But I want to explain why we see XOR so frequently in cryptography. Basically, XOR has a very important property, and it is the following. Suppose we have a random variable Y that is distributed arbitrarily over {0,1}^n, so we know nothing about the distribution of Y. Now suppose we have an independent random variable X that happens to be uniformly distributed, also over {0,1}^n. It's very important that X is uniform and independent of Y. Let's define the random variable Z to be the XOR of X and Y. Then I claim that no matter what distribution Y started with, Z is always going to be a uniform random variable. In other words, if I take an arbitrarily malicious distribution and XOR it with an independent uniform random variable, what I end up with is a uniform random variable. This, again, is a key property that makes XOR very useful for crypto.

This is actually a very simple fact to prove, so let's go ahead and do it, just for one bit, so for n = 1. The way we'll do it is to write out the probability distributions for the various random variables. The random variable Y can be either 0 or 1; let's say p0 is the probability that it equals 0, and p1 is the probability that it equals 1. Okay, so that's one of our tables. Similarly, we're gonna have a table for the variable x.

Well, the variable X is much easier: it's a uniform random variable, so the probability that it equals 0 is exactly one half, and the probability that it equals 1 is also exactly one half. Now let's write out the probabilities for the joint distribution; in other words, let's see how likely each joint value of Y and X is: Y = 0 and X = 0, Y = 0 and X = 1, Y = 1 and X = 0, and Y = 1 and X = 1. Because we assumed the variables are independent, all we have to do is multiply the probabilities. The probability that Y = 0 is p0, and the probability that X = 0 is one half, so the probability that we get (0,0) is exactly p0/2. Similarly, for (0,1) we get p0/2, and for (1,0) we get p1/2. And for (1,1), the probability that Y = 1 times the probability that X = 1 is p1 times one half, which is p1/2. Okay, so those are the four probabilities for the various options for X and Y.

Now let's analyze: what is the probability that Z = 0? Well, the probability that Z = 0 is the probability that, let's write it this way, (X,Y) = (0,0) or (X,Y) = (1,1); those are the two possible cases in which Z = 0, because Z is the XOR of X and Y. Now, these events are disjoint, so this expression can simply be written as the sum of the two: the probability that (X,Y) = (0,0) plus the probability that (X,Y) = (1,1). We can simply look these up in our table: the probability that (X,Y) = (0,0) is p0/2, and the probability that (X,Y) = (1,1) is p1/2, so we get p0/2 + p1/2. But what do we know about p0 and p1? They form a probability distribution, so p0 + p1 must equal 1, and therefore the sum of these two terms must equal one half. And we're done: we proved that the probability that Z = 0 is one half, and therefore the probability that Z = 1 is also one half, so Z is a uniform random variable. This simple theorem is the main reason why XOR is so useful in cryptography.

The last thing I want to show you about discrete probability is what's called the birthday paradox, and I'm going to go through it really fast here, because we're going to come back later on and talk about it in more detail. The birthday paradox says the following. Suppose I choose n random variables r1, ..., rn from our universe U, and suppose these variables are independent of one another. They don't actually have to be uniform; all we need to assume is that they are identically distributed. The most important property, though, is that they are independent of one another. The theorem says that if you choose roughly the square root of the size of U elements (there's a constant of about 1.2 in front, but it doesn't really matter), then there's a good chance that two of the elements are the same. In other words, if you sample about the square root of |U| times, then it's likely that two of your samples will be equal to each other. And by the way, I should point out that the inverted E symbol, ∃, just means "there exists": there exist indices i and j such that r_i = r_j. So here's a concrete example that we'll actually see many, many times.
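Before that concrete example, the theorem is easy to check empirically; here is a minimal Python sketch, where the universe size and trial count are my own toy choices rather than anything from the lecture:

```python
import math
import random

random.seed(7)

# Toy parameters (my choice, not from the lecture): a small universe
# of size N, sampled about 1.2 * sqrt(N) times per trial.
N = 10_000
n = int(1.2 * math.sqrt(N))     # 120 samples

def has_collision(num_samples, universe_size):
    # Draw independent uniform samples; report whether any two are equal.
    seen = set()
    for _ in range(num_samples):
        r = random.randrange(universe_size)
        if r in seen:
            return True
        seen.add(r)
    return False

trials = 2000
rate = sum(has_collision(n, N) for _ in range(trials)) / trials
print(round(rate, 2))           # close to 1/2, as the theorem predicts
```

The measured collision rate lands near one half at about 1.2 · √N samples, matching the theorem's rule of thumb.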

Suppose our universe consists of all strings of length 128 bits. The size of U is gigantic: it's 2^128, a very, very large set. But it so happens that if you sample around 2^64 times from this set, which is about the square root of |U|, then it's very likely that two of your sampled messages will actually be the same.
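These numbers can be sanity-checked with the standard birthday approximation, Pr[collision] ≈ 1 - exp(-n^2 / (2N)); a quick sketch (mine, not from the lecture):

```python
import math

def collision_prob(n, N):
    # Standard birthday approximation for n independent uniform samples
    # from a set of size N: Pr[some two collide] ~ 1 - exp(-n^2 / (2N)).
    return 1 - math.exp(-n * n / (2 * N))

N = 2**128                                            # 128-bit strings
print(round(collision_prob(2**64, N), 2))             # 0.39
print(round(collision_prob(int(1.2 * 2**64), N), 2))  # 0.51, past one half
```

With n = 2^64 the approximation gives about 0.39, and at roughly 1.2 · 2^64 samples it crosses one half, which is where the 1.2 constant comes from.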

So why is this called a paradox? Well, traditionally it's described in terms of people's birthdays: you can think of each of these samples as someone's birthday, and the question is how many people need to get together so that two of them have the same birthday. As a simple calculation, there are 365 days in the year, so you would need about 1.2 times the square root of 365 people until the probability that two of them have the same birthday is more than a half. That works out to about 23, which means that if 23 random people get together in a room, it's quite likely that two of them will have the same birthday. This is why it's called a paradox: 23 is a smaller number than you would expect. Interestingly, people's birthdays are not actually uniform across all 365 days of the year; there's actually a bias towards December. But I guess that's not relevant to the discussion here.

The last thing I wanted to do is show you the birthday paradox a bit more concretely. Suppose we have a universe of about a million elements. Then you can see that when we sample roughly 1,200 times, the probability that we sample the same element twice is roughly one half. But the probability of sampling the same element twice converges very quickly to one: if we sample about 2,200 items, the probability that two of those items are the same is already 90 percent, and by around 3,000 samples it's basically one. So this probability converges to one very quickly once the number of samples goes beyond the square root of the size of the universe. We're going to come back and study the birthday paradox in more detail later on, but for now I just wanted you to know what it is.

So that's the end of this segment, and in the next segment we'll start with our first example of encryption systems.
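As a closing check, the XOR fact from earlier in this segment (an arbitrary Y XORed with an independent uniform X yields a uniform Z) can also be verified numerically beyond one bit; here is a sketch for two-bit strings, where the skewed distribution for Y is my own choice:

```python
from fractions import Fraction
from itertools import product

# An arbitrary, deliberately skewed distribution for Y over {0,1}^2
# (the weights are my own choice for illustration).
PY = {0b00: Fraction(1, 2), 0b01: Fraction(1, 4),
      0b10: Fraction(1, 8), 0b11: Fraction(1, 8)}

# X is uniform over {0,1}^2 and independent of Y.
PX = {x: Fraction(1, 4) for x in range(4)}

# Distribution of Z = X xor Y; independence lets us multiply probabilities.
PZ = {z: Fraction(0) for z in range(4)}
for x, y in product(PX, PY):
    PZ[x ^ y] += PX[x] * PY[y]

print(all(p == Fraction(1, 4) for p in PZ.values()))  # True: Z is uniform
```

No matter how the weights in PY are chosen (as long as they sum to one), every value of Z ends up with probability exactly 1/4.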
