Say that I have two random variables, X and Y. These are random variables just like B and F, my box and my fruit; just two random variables. The first rule that we'll use is called the sum rule, and I'll express it using these two variables X and Y. The sum rule tells us that the marginal probability of X, one of the random variables, is the sum of the joint probabilities of X and Y over Y: P(X) = Σ_Y P(X, Y). That's the sum rule. The second rule is the product rule, and the product rule gives us the joint probability of X and Y. It tells us that the joint probability of X and Y is the conditional probability of Y given X times the marginal probability of X: P(X, Y) = P(Y | X) P(X). These two rules are very important, and they allow us to solve all kinds of problems.
So, let's take an example. Back to my boxes and fruits, I might ask, for example, what is the joint probability of B = r and F = a? I'm doing these experiments: I select a box at random, reach into the box, and pick a fruit at random. What is the probability, in my experiments, of ending up with an apple from the red box? I can immediately spot that I can use the product rule, because it gives me joint probabilities. So, I'll plug this in, and I get that the joint probability of the box being red and the fruit being an apple is the conditional probability of the fruit being an apple given that the box is red, times the marginal probability of the box being red: P(B = r, F = a) = P(F = a | B = r) P(B = r). Now, the first factor we already know. The probability of selecting an apple from the red box is just the fraction of apples among the fruits in the red box, and we have it right here: that's 1/4. So, we say 1/4 times the second factor, the marginal probability of selecting the red box. Well, that's just the probability of selecting the red box, which is 40 percent, or 4/10. So, we multiply 1/4 by 4/10 and end up with a probability of 1/10.
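The product-rule calculation above can be checked with a few lines of Python. This is only a minimal sketch: the variable names are made up for illustration, and the numbers are the ones quoted in the example (1/4 for an apple given the red box, 4/10 for the red box).

```python
from fractions import Fraction

# Values quoted in the example (variable names are illustrative only).
p_red = Fraction(4, 10)             # P(B = r)
p_apple_given_red = Fraction(1, 4)  # P(F = a | B = r)

# Product rule: P(B = r, F = a) = P(F = a | B = r) * P(B = r)
p_apple_and_red = p_apple_given_red * p_red

print(p_apple_and_red)  # 1/10
```

Using exact fractions instead of floats keeps the result in the same form as the hand calculation.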
Let's do another example. This time I would like to ask, in this setup, what is the probability of picking an apple? I don't care which box it comes from; I just want to know what the probability of getting an apple is. So I write: what's the probability of the fruit being an apple? That's the marginal probability of F = a. And immediately I can spot that, since I'm working with marginal probabilities, the sum rule can help me. So I plug this in, and I sum over the variable that I'm not interested in, which is the box. The box B is either red or blue, so this is the sum, over B in {r, b}, of the joint probability of F = a and B: P(F = a) = P(F = a, B = r) + P(F = a, B = b).
How do I calculate this? Well, I have these joint probabilities here, which I don't know, but I can go back to the product rule and replace each of them. First I'll do the red box. The joint probability of the fruit being an apple and the box being red is the conditional probability of selecting an apple given that the box is red, times the marginal probability of the box being red. Plus, now I have to deal with the blue box. Using the product rule again, that's the probability of the fruit being an apple given that the box is blue, times the marginal probability of the box being blue. So I end up with the fraction of apples in the red box, which is 1/4, times the marginal probability of the red box, which is 4/10, plus the fraction of apples in the blue box, which is 3/4, times the marginal probability of the blue box. And I don't even have to go back to the previous slide, because I know the two box probabilities have to add up to one: if red is 4/10, blue is 6/10. So we get 1/10 plus 9/20. Over a common denominator of 20, that's 2/20 plus 9/20, so 11/20 is the probability of picking an apple.
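The same marginalization can be sketched in Python. The dictionary names below are hypothetical; the probabilities are the ones used in the example (4/10 and 6/10 for the boxes, 1/4 and 3/4 for an apple given each box).

```python
from fractions import Fraction

# Marginal probability of each box, P(B), as quoted in the example.
p_box = {"r": Fraction(4, 10), "b": Fraction(6, 10)}

# Conditional probability of an apple given each box, P(F = a | B).
p_apple_given_box = {"r": Fraction(1, 4), "b": Fraction(3, 4)}

# Sum rule over B, with each joint term expanded by the product rule:
# P(F = a) = sum over B of P(F = a | B) * P(B)
p_apple = sum(p_apple_given_box[box] * p_box[box] for box in p_box)

print(p_apple)  # 11/20
```

The two terms of the sum are 1/10 for the red box and 9/20 for the blue box, matching the hand calculation.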
The final thing I would like to discuss around probabilities is Bayes' rule. It's a very important rule that's at the core of Naive Bayes. So, we have the sum rule and the product rule here, and I will start by tweaking the product rule a bit. I will use this equation to express the conditional probability of Y given X. Using the product rule, I can write that the probability of Y given X is what? It is the joint probability of X and Y over the marginal probability of X: P(Y | X) = P(X, Y) / P(X). So far, so good. Now, there is something in probability that we call the symmetry property. I will write it up here: the joint probability of X and Y is the same as the joint probability of Y and X, that is, P(X, Y) = P(Y, X). They're equivalent. So, what I will do here is rewrite the numerator, which is the joint probability of X and Y, and reverse the variables. I will say that it's the joint probability of Y and X over P(X). I'm just swapping the variables, and because of the symmetry property, the expression is still valid.
Now, the next thing I will do is grab this joint probability P(Y, X) and apply the product rule again to rewrite it using a conditional probability. We have to be careful, because the variables are now reversed: this will be the conditional probability of X given Y times the marginal probability of Y, P(Y, X) = P(X | Y) P(Y). I plug this in, and I get the conditional probability of X given Y times the marginal probability of Y, over the marginal probability of X. Let me rewrite this here. At the end, we get that the conditional probability of Y given X is the conditional probability of X given Y times the marginal probability of Y, over the marginal probability of X: P(Y | X) = P(X | Y) P(Y) / P(X). And this is Bayes' rule.
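To see Bayes' rule at work on the boxes and fruits, here is a small illustrative sketch. The question it answers, the probability that the box was red given that the fruit is an apple, is not worked through in the lecture; it is added here only as an example. The helper name `bayes` is made up, and the inputs are the values from the earlier examples (1/4, 4/10, and 11/20).

```python
from fractions import Fraction


def bayes(p_x_given_y, p_y, p_x):
    """Bayes' rule: P(Y | X) = P(X | Y) * P(Y) / P(X)."""
    return p_x_given_y * p_y / p_x


# Here X is "F = a" and Y is "B = r" (values from the earlier examples).
p_apple_given_red = Fraction(1, 4)  # P(F = a | B = r)
p_red = Fraction(4, 10)             # P(B = r)
p_apple = Fraction(11, 20)          # P(F = a), from the sum rule

p_red_given_apple = bayes(p_apple_given_red, p_red, p_apple)

print(p_red_given_apple)  # 2/11
```

The inputs are conditioned on the box, and the output is conditioned on the fruit.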
If you think about it, Bayes' rule allows us to reverse the direction of a conditional probability: we start with Y given X, and we can express it using X given Y. Another way to think about this rule is that it helps us describe the probability of an event based on prior knowledge of other events that might be related to it. This rule is very important, and we will use it in Naive Bayes. However, I would like to park the probabilities for the time being and talk about something else.