In this video, we'll be discussing our second algorithm of the course, K Nearest Neighbors. The learning goals for our section on KNN will be to, first, identify the K nearest neighbors approach for classification. We'll then discuss the KNN decision boundary and how we can adjust that boundary by our choice of K. We'll discuss distance measurement and the importance of feature scaling; here, we'll make clear how important it will be to first scale the data before using a model such as KNN. Then finally, we'll show the implementation of KNN for classification in sklearn, as well as showing you how KNN can be used for regression. Now, as in our previous example, when we did logistic regression, we want to predict telecom customer churn. We have our historical data that includes each customer's phone usage in minutes as well as their data usage in gigabytes, and we can display this historical data on a plot as we see here, where the phone usage in minutes is shown on the X-axis and the data usage in gigabytes is shown on the Y-axis. We can then color the points according to whether or not that customer churned, where blue means the customer churned and red means that they did not. Now, let's say we examine an existing customer, and we want to predict the likelihood that that customer will churn. This purple dot signifies such a customer, and we can plot it alongside the other data. Now, I want you to think: when you look at how this customer compares to others for which we have their churn outcomes, can we predict the customer's churn likelihood? How would you intuitively predict this result? What many of you hopefully did, or are doing, is looking at nearby customers, those with similar features, and using that to guess the result. It stands to reason that the churn outcomes of new customers are going to be similar to those of nearby customers.
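The nearest-neighbor intuition described above can be sketched in a few lines of Python. Note that the customer values below are made up for illustration; they are not the actual points from the plot in the video:

```python
import numpy as np

# Hypothetical historical data: [phone minutes, data GB] for each customer.
X = np.array([
    [120.0, 2.0],
    [300.0, 8.0],
    [150.0, 2.5],
    [280.0, 7.5],
])
y = np.array(["churned", "not churned", "churned", "not churned"])

# A new customer whose churn outcome we want to predict (the purple dot).
new_customer = np.array([140.0, 2.2])

# Euclidean distance from the new customer to every labeled customer.
distances = np.linalg.norm(X - new_customer, axis=1)

# With K = 1, the single nearest labeled customer supplies the prediction.
nearest = np.argmin(distances)
print(y[nearest])  # the label of the closest historical customer
```

This is the whole idea in miniature: measure distance in feature space, then borrow the label of whoever is closest.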
As long as each one of these features is, of course, actually important for predicting churn. Looking at how similar new data is to our labeled data is essentially what the K nearest neighbors algorithm does for us. Again, we're trying to predict what this new value will be, so think about what happens if the nearby results are in conflict: we see both churned and not churned nearby. With this in mind, how do we decide which one is correct? In the case that we have here, we're setting K equal to 1, which means we're looking at the one nearest neighbor, and we see with this one nearest neighbor, which we just highlighted in pink, that we are going to end up predicting not churned, because that is the closest labeled customer to the customer we're trying to predict. Now, what if we look at the two closest neighbors? Does this value change? We see here, now, we're looking at the two closest neighbors, and we have a bit of a conflict: one churned and one did not. How can we decide which is correct? One solution is to only use an odd value for the number of neighbors. But think: what else can we do to break this tie? We can weight the points according to their distance, we can pick the class the model is more certain about, or we can just randomly choose one. What we want to keep in mind here is that there will be different options available to us for creating this tiebreaker, but a clean way to ensure that you don't have to create a tiebreaker is just to choose an odd value of K. Now, let's say we are using the three closest neighbors. We're increasing K to three, and we see now we have the three closest neighbors. This time, we have two customers that churned and one that did not. Here, we would predict churn for this new customer, and that's a different result than K equals 1.
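To make the K = 1 versus K = 3 behavior concrete, here is a small sketch using scikit-learn's KNeighborsClassifier. The toy coordinates are invented, chosen so that the two choices of K disagree, just as in the example above:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy stand-in for the plot (labels: 0 = not churned, 1 = churned).
# The nearest point has label 0, but two of the three nearest have label 1.
X = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 3.0], [10.0, 0.0]])
y = np.array([0, 1, 1, 0])
new_customer = [[0.0, 0.0]]

for k in (1, 3):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    print(k, knn.predict(new_customer)[0])
# K = 1 looks only at the closest point (label 0 -> not churned), while
# K = 3 takes a majority vote among the three closest, two of which are
# labeled 1 -> churned.
```

Note that KNeighborsClassifier also supports weights="distance", which is one of the tiebreaking options mentioned above: closer neighbors count for more in the vote.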
This odd value of K eliminates the issue of ties, and the value we choose for K is something that we're going to continue to explore as we go through. It will be one of those hyperparameters that we tune in order to come up with the best-fit model. Now, for our last example, let's take the four closest neighbors, so we're adding on one more. This time, our prediction stays the same, because we have three neighbors that churned and one that did not. Thus, our prediction of churn seems even more certain as we add on more neighbors. Also, we don't have a tie here, so we are able to use four neighbors, but again, we'd suggest moving up to five or down to three when choosing your value of K in general. In the next video, we will continue with this example and dive deeper into how to actually set up your K Nearest Neighbors algorithm.
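As a preview of the scaling and regression goals mentioned at the start, here is a hedged sketch of how both might look in sklearn. All the numbers, including the monthly-bill target used for the regression, are invented for illustration:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Minutes sit on a much larger numeric scale than gigabytes, so without
# scaling, the distance calculation is dominated by minutes. Standardizing
# first puts both features on an equal footing. (Data is hypothetical.)
X = np.array([[120.0, 2.0], [300.0, 8.0], [150.0, 2.5], [280.0, 7.5]])
y = np.array([1, 0, 1, 0])  # 1 = churned, 0 = not churned

clf = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
clf.fit(X, y)
print(clf.predict([[140.0, 2.2]]))  # majority vote of the 3 scaled-nearest

# KNN for regression: average the target values of the K nearest neighbors
# instead of taking a majority vote. Here the target is a made-up bill.
monthly_bill = np.array([45.0, 90.0, 50.0, 85.0])
reg = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=3))
reg.fit(X, monthly_bill)
print(reg.predict([[140.0, 2.2]]))  # mean bill of the 3 nearest neighbors
```

The pipeline guarantees the scaler is fit only on the training data and applied consistently at prediction time, which is the clean way to handle the scaling step we'll emphasize later.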