[MUSIC] We've seen how regression can be used to predict numeric values in general, and you've seen some of the shifts we need to make when we're using regression-based approaches for classification. In this video, we're going to describe a major family of classification algorithms known as logistic regression. By the end of the video, you'll understand how logistic regression works and how to implement it in Python using scikit-learn. Contrary to its name, logistic regression doesn't actually perform a regression; that is, it doesn't answer questions with a real-valued number. It answers with a binary category like cat or not cat. But it's called logistic regression because it does that classification thanks to a regression operation and a transfer function. Remember, a transfer function is what we call a function whose job is to translate the output of one function into some other space. In this case, the transfer function takes the number reported by our regression model and translates it into a class label. When we use linear regression to find a line, the model we get actually gives us the ability to predict beyond the exact values present in our training examples. Linear regression can predict values outside the range of the values it used to fit the line. In cases where we want to classify data instances into one of two categories, we don't want this unlimited range. One approach is to bound the output in the space of proper probabilities between 0 and 1, and logistic regression takes this approach. The value it finds via the regression step can be thought of as the probability that a data point belongs to a particular class. For example, in the case of binary classification, if the probability is above 0.5, then we can simplify and say it belongs to class 1; if it's below 0.5, we say it belongs to class 0.
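The transfer-and-threshold idea above can be sketched in a few lines of Python. This is just an illustration, not the lecture's own code; the function names `logistic` and `predict_class` are mine, and the 0.5 threshold is the default division point discussed here:

```python
import numpy as np

def logistic(z):
    """Map any real-valued regression output into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_class(z, threshold=0.5):
    """Translate the probability into a binary class label."""
    return int(logistic(z) >= threshold)

# A large positive regression output maps close to 1 (class 1),
# and a large negative one maps close to 0 (class 0).
print(logistic(4.0))        # ≈ 0.982
print(predict_class(4.0))   # 1
print(predict_class(-4.0))  # 0
```

Note that `threshold` is a parameter precisely because, as discussed next, the division point can be moved to suit the domain.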
And we're not stuck with that division point; the threshold can be adjusted based on the domain and whatever bias you need for each class. But how do we reach these values between 0 and 1? This is where the logistic transfer function comes into play. The logistic transfer function, or simply the logistic function, maps the entire real line into the interval between 0 and 1. So the function our model is finding converts all real numbers, the entire space of possible values, into something we can consider a probability. But remember convexity: a convex function is a continuous function where, when you draw a line between any two points on the function, all the points on that line lie on or above the function. Tie a string between two points and pull it tight. And look at that: if we're using squared loss to evaluate the parameters of our logistic function, the resulting loss is not convex. There's a line we can draw connecting two points on that function, and the function peaks above that line. Since it's not convex, we can't guarantee that when we get to the bottom of a hill, it's actually the lowest possible point. We could get stuck in a local minimum. Fortunately, and thanks to the power of mathematics, for particular transfer functions you can actually pick a complementary loss so that you do get a convex loss landscape. These are called matching losses. If you have a matching loss for your particular transfer function, then the minimum point is the global minimum, which means you can find the best line. So once you solve the linear regression problem, you can use that solution point, and it will be the best solution after applying the logistic transfer as well. You can now see the logistic regression loss on your screen. I'm not going to read it out, because we'd be here all week. But it uses the log of the estimates times the correct labels as the basis. If you use this matching loss with the logistic transfer function, you get a convex and smooth landscape.
Huzzah, gradient descent is there to find your optimal model. So, how do we implement logistic regression using scikit-learn? It's very simple: import the linear model package, then use the LogisticRegression constructor to make your model builder. Then you can fit the data and predict the respective classes, just like we have for our other classifiers. There you have it. Not only do you understand how logistic regression works to classify examples and how to use it in Python, you also understand why it works the way it does. You're ready for neural networks next.
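Those steps can be sketched end to end. The small synthetic dataset from `make_classification` stands in for whatever data you're working with; everything else is the standard scikit-learn fit/predict pattern described above:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# A small synthetic binary dataset stands in for your own data.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression()          # the model builder
clf.fit(X_train, y_train)           # fit the data
labels = clf.predict(X_test)        # predicted class labels (0 or 1)
probs = clf.predict_proba(X_test)   # the probabilities before thresholding
print(clf.score(X_test, y_test))    # accuracy on held-out data
```

Note that `predict` applies the 0.5 threshold for you; if you want a different division point, threshold the `predict_proba` output yourself.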