So, here is a data set. A very simple, very basic data set; it comes from the Bureau of Labor Statistics in the US. It gives us measurements for individuals, and we have attributes like their height and weight, where the height is in inches and the weight is in pounds. We also have the sex of the individual, where zero stands for male and one stands for female. All right. So, that is the data set, and I have plotted it here on the right-hand side. You see that we have some invalid data here, people with a weight of zero. Maybe they didn't share their weight, and we have some outliers here and there. I don't really care about the quality of this data because I just want to give you an example of how we can use Naive Bayes. So, I'm not wrangling the data, I'm not doing any data cleansing and so on. I'll just use it as it is.
The bottom line here is that I want to take one attribute, let's say for simplicity I just focus on the height, and I have the sex as my target variable. So, talking in machine learning terms, this is my class. My class is either M or F, male or female, and I want to be able to answer questions like: what is the probability of the height being, I don't know, 69 inches, given that the class is male? What is the probability of the height being 69 inches given that the class is female? If I can answer these two questions, then I can make a prediction. Of course, if it's more likely that someone with a height of 69 inches is a male, then my prediction for this individual will be that their sex is male.
If you look at the data set, there is no clear decision boundary. You see here that the two classes overlap, but again, this is just a demo and I want to keep it very simple. What I want to show you now is how we can use Bayesian inference to actually model these probabilities and compute likelihoods, priors, and so on, using this data. So, I start by splitting my data set into two subsets.
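
The split described here can be sketched in a few lines of Python. This is only an illustration: the rows below are made-up values, not rows from the actual data set, and the names `records` and `split_heights_by_class` are hypothetical. The only things taken from the lecture are the column meanings (height in inches, weight in pounds, sex coded as 0 for male and 1 for female).

```python
# Hypothetical rows: (height in inches, weight in pounds, sex).
# Sex is coded as in the lecture: 0 = male, 1 = female.
# The numbers are invented for illustration only.
records = [
    (67, 150, 0),
    (69, 172, 0),
    (66, 140, 0),
    (67, 130, 1),
    (64, 118, 1),
]


def split_heights_by_class(rows):
    """Collect the height of every row into one list per sex label."""
    heights = {0: [], 1: []}
    for height, _weight, sex in rows:
        heights[sex].append(height)
    return heights


heights_by_class = split_heights_by_class(records)
male_heights = heights_by_class[0]
female_heights = heights_by_class[1]

print("male heights:  ", male_heights)
print("female heights:", female_heights)
```

Only the height is kept in each subset because the example uses a single attribute; the weight column is ignored.
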
I get one for the males and one for the females. So, these are the males, these are the females. The data set is much larger; I've just sampled five examples for each class to keep the calculations simple. So, my first step in the process is to split the two classes and then compute some statistics for the two samples.
So, first of all, let's compute the prior, my initial hypothesis. I'm, again, focusing on the height in the two subsets. So, what is the prior, which is the probability of the class being M, so, male? Well, that's fairly simple. This will be the number of observations that I have, 1, 2, 3, 4, 5, over the total number of data points. I have another five here, 1, 2, 3, 4, 5, so the size of my data set is 10. So, this will be five over 10, and my prior is 0.5. Of course, it's identical for the other class. We can either, again, count five and divide, or, because these two classes are mutually exclusive and cover the whole data set, just take one minus 0.5 and get 0.5 again. All right. Now, these are my priors.
Okay. The next thing I will do is compute the mean for the two subsets. So, what is the mean? I'll use Mu m for the males. Well, this will be the sum of the observations over n, which is the number of data points in this subset. So, I have here, what? 67 plus 69 plus and so on, plus 66, over five, because I have five observations here, and this gives me the mean height for the males, Mu m. We do the same calculation for the other class, for the females, which will be, again, 67 plus 67 plus and so on, until the last one, 64, over five again, and this will give me 65.6.
Now, I need the standard deviation for both of the classes. So, Sigma m for the males will be the sum over i of x_i, which are the individual samples in this subset, minus the mean, which I have just computed over here, squared, all over n, and I need to take the square root of this. The value here is 1.74.
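
The statistics computed so far (prior, mean, and standard deviation with n in the denominator) can be sketched as below. One loud assumption: the lecture only reads out some of the ten heights, so the remaining values in `male_heights` and `female_heights` are hypothetical fill-ins, picked so that the female mean comes out at 65.6 and the male standard deviation rounds to 1.74 as quoted above. They are not the lecturer's actual sample, and the male mean this prints is a property of the stand-in values only.

```python
import math

# Five heights (inches) per class. Values not read out in the lecture are
# HYPOTHETICAL fill-ins, chosen only to reproduce the quoted statistics.
male_heights = [67, 69, 71, 69, 66]
female_heights = [67, 67, 62, 68, 64]


def prior(class_count, total_count):
    """Fraction of all observations that belong to one class."""
    return class_count / total_count


def mean(values):
    """Sum of the observations divided by their count n."""
    return sum(values) / len(values)


def population_std(values):
    """Square root of the mean squared deviation (divides by n, not n - 1)."""
    mu = mean(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))


total = len(male_heights) + len(female_heights)
prior_m = prior(len(male_heights), total)
prior_f = 1 - prior_m  # valid because there are exactly two classes

mu_m, mu_f = mean(male_heights), mean(female_heights)
sigma_m, sigma_f = population_std(male_heights), population_std(female_heights)

print(f"priors: M={prior_m}, F={prior_f}")
print(f"means:  M={mu_m:.2f}, F={mu_f:.2f}")
print(f"stds:   M={sigma_m:.2f}, F={sigma_f:.2f}")
```

Dividing by n follows the formula as it is spoken in the lecture; a sample standard deviation would divide by n - 1 instead and give slightly larger values.
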
Then, I do the same for the females: Sigma f is 2.24.
Now, say that new data comes in, and I'm running out of space, but I can use this area here. Say that new data comes in, and my new evidence is a height of 69. What I will do next is use the Gaussian, and we have the formula right here. So, let's compute the probability using the Gaussian. The probability, or the likelihood in this case, of x being 69 given that the class is male will be, and this is where we plug in the Gaussian formula, one over the square root of two pi Sigma m squared, times e to the minus 69 minus Mu m, squared, over two Sigma m squared. We will compute this and we will end up with some number. We do the same for the females. The probability of x equals 69, given that the class is female, will be one over the square root of two pi Sigma f squared, times e to the minus 69 minus Mu f, squared, over two Sigma f squared. Now, I'll get another number.
Say that, I don't know, I don't want to compute this. Say that this is A_1 and this is A_2. So, what do these numbers represent? These are the likelihoods of the new data coming from this class and this class respectively, A_1 and A_2. Now, we can use this to make a prediction.
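
For readers who want actual numbers for A_1 and A_2, here is a minimal sketch of the Gaussian likelihood as it is spoken above. The female mean (65.6) and both standard deviations (1.74 and 2.24) are the values quoted in the lecture. The male mean `mu_m` is an assumption: it is a placeholder value used only so the code runs, and the function name `gaussian_likelihood` is likewise hypothetical.

```python
import math


def gaussian_likelihood(x, mu, sigma):
    """Gaussian density: 1 / sqrt(2*pi*sigma^2) * exp(-(x - mu)^2 / (2*sigma^2))."""
    coefficient = 1.0 / math.sqrt(2.0 * math.pi * sigma ** 2)
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


# Values quoted in the lecture.
mu_f, sigma_f = 65.6, 2.24
sigma_m = 1.74
# ASSUMPTION: placeholder male mean, not a figure taken from the lecture.
mu_m = 68.4

new_height = 69  # the new evidence, in inches

a_1 = gaussian_likelihood(new_height, mu_m, sigma_m)  # likelihood under class M
a_2 = gaussian_likelihood(new_height, mu_f, sigma_f)  # likelihood under class F

print(f"A_1 = p(x=69 | M) = {a_1:.4f}")
print(f"A_2 = p(x=69 | F) = {a_2:.4f}")
```

These are density values rather than probabilities of an exact height, which is why the lecture calls them likelihoods; what matters for the prediction is how A_1 and A_2 compare.
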