So let's apply it. Once again, because this algorithm is not available in Rattle, I have to take you through R, but it's fairly straightforward. Let me show you how it works. We're going to use a dataset from the rsample package, and in it we are trying to predict whether an employee will leave the company or not. That's the objective. So you've got all kinds of variables: first of all the attrition level, yes or no, leaving the job or not, right? What's the job level, whether they travel for business or not, the department they work in, education, marital status, et cetera. Many organizations are interested in predicting attrition levels so that they can at least plan how many people they need to hire. You can't just wait until people have left to hire; especially in large organizations, you need to be able to predict the total attrition level in advance. Sometimes it may be high because people are retiring, or maybe you can't hold onto your salespeople and they're all leaving, right?

So here's what we're going to do. Let me very quickly show you. The first part is just a nice way to load your data: pacman loads the required packages, and one of the loaded packages, rsample, also provides the data. There are 31 variables. We will use only nine: one target variable, which is attrition, and eight input variables. The variables I'm going to use are business travel, department, education, education field, marital status, training times, and stock option level. We take that subset of the data and set a random seed, as usual, to fix the random numbers. Then there is a little trick we're going to use: the variables which are numerical, I'm going to convert into categorical levels. That is done automatically here.
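The loading and preparation steps described above might look like the following sketch. The variable names come from the rsample attrition dataset; the seed value is illustrative, and depending on your package version the dataset may live in the modeldata package instead of rsample. The transcript mentions eight input variables but names only seven, so only the named ones appear here.

```r
# pacman loads (and, if needed, installs) the required packages;
# rsample ships the attrition data in older versions
pacman::p_load(rsample, e1071)

data(attrition, package = "rsample")  # 1470 employees, 31 variables

# Keep the target (Attrition) plus the inputs named in the lecture
vars <- c("Attrition", "BusinessTravel", "Department", "Education",
          "EducationField", "MaritalStatus", "TrainingTimesLastYear",
          "StockOptionLevel")
hr <- attrition[, vars]

set.seed(123)  # illustrative seed; fixes the random numbers

# The little trick: convert the numeric inputs into categorical levels,
# since Naive Bayes here treats every input as a category
hr$TrainingTimesLastYear <- factor(hr$TrainingTimesLastYear)
hr$StockOptionLevel      <- factor(hr$StockOptionLevel)
```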
We will now do what you have become very good at: creating a sample and using it to partition your data, with the dependent and the independent variables. Then we will do the prediction. First we will predict based on only four of these variables. You will see the accuracy is about 68.59 percent on the training set and about 68.2 percent on the test set. Remember, we have split the data.

Now, one little twist I have to tell you about. In the predict step, we set a probability cutoff level, and we will see this properly only in session eight. How do you know somebody would leave? What Naive Bayes gives you is a probability of attrition. It is then your choice at what probability you predict that a person will actually leave. For example, you could set the cutoff at 50 percent; then if the model says this person will leave with 40 percent probability, we predict this person will not leave. In this particular case there were so few people leaving that we kept the cutoff at 20 percent, so that there are some predicted leavers; otherwise it would predict that everybody is staying. In session eight we will see how to fine-tune a model by changing these probabilities. So once again, there is something called a probability cutoff, which you use to divide the cases into a group which will leave and a group which will not leave. The cutoff in this case is kept at 20 percent because there are very few data points with attrition. In general, you will have to play around with it to decide what the right value should be, and you'll have to wait two sessions to learn that part. Next, as before, I'm going to add more variables to the model and look at the accuracy improvement it produces.
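The split, fit, and cutoff steps above can be sketched as follows. This is an illustrative continuation, assuming a data frame `hr` prepared from the attrition data, a 70/30 split proportion, and a four-variable model; the exact accuracy you get will vary with the seed and variable choice.

```r
library(rsample)  # initial_split / training / testing
library(e1071)    # naiveBayes

# Partition the data into training and test sets (70/30 is an assumption)
split <- initial_split(hr, prop = 0.7)
train <- training(split)
test  <- testing(split)

# First model: Naive Bayes on only four of the input variables
fit <- naiveBayes(Attrition ~ BusinessTravel + Department +
                    Education + MaritalStatus, data = train)

# type = "raw" returns class probabilities; the "Yes" column is
# the predicted probability of attrition
p_yes <- predict(fit, test, type = "raw")[, "Yes"]

# Probability cutoff: predict "Yes" whenever P(leave) > 0.20,
# because so few people in the data actually leave
pred <- ifelse(p_yes > 0.20, "Yes", "No")

mean(pred == test$Attrition)  # accuracy on the test set
```

With the default 0.5 cutoff almost every case would be predicted "No"; lowering it to 0.2 trades some overall accuracy for actually flagging likely leavers.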