So now let's move to decision trees with an example. We have data related to customer turnout at a tennis facility, and for our upcoming discussion, suppose you are the administrator of this facility. In addition to predicting the number of customers, you want to evaluate the impact of various drivers on player turnout, such as the weather outlook, an ordinal variable where sunny is greater than overcast, which is greater than rainy; temperature, with the ordinal values cool, mild, and hot, hot being the greatest and cool the least; humidity level; and strong or weak wind. Using these features, we want to predict whether or not customers will play tennis at our courts.

A decision tree seeks to split the dataset into two subsets at every step, so that the decision becomes easier within each subset, and then it continues to iterate. So this purple node asks the question: is temperature greater than or equal to mild? Remember, we're working with an ordinal variable, so mild or greater means mild or hot. That question divides the original training dataset into two subsets, and hopefully those two sub-datasets contain more information about the target variable we want to predict. Then we can say: if temperature is less than mild, it's most likely that customers will not play tennis, and if temperature is greater than or equal to mild, customers will be more likely to play tennis. That's the idea: we split up our data and hopefully get a cleaner separation between those that play tennis and those that do not. Now, the circles where we ask each of our questions are called nodes, and the circles where we reach a decision, yes or no, are called leaves; those are the leaves here at the bottom. There's no reason to ask just one question. We can keep asking question after question to further segment our dataset. The depth of this tree is now two, since we've split our tree twice. This decision tree has categories in its leaves, the categories here being will or will not play tennis, and it uses the majority class left at each final leaf to predict the class for any observation that follows those steps down the tree. So it's a classification algorithm, classifying according to the majority class: either played tennis or did not play tennis.

However, this idea can be used to predict quantities instead of classes as well, and then, rather than decision trees, they're called regression trees. Here's an example of a regression tree. The inputs are numerical values for slope and elevation within the Himalayas, and we want to predict the average precipitation, which is a continuous value. So we're no longer doing classification but regression, trying to predict some continuous value. We'll use the same idea as we did with classification. At the node, we ask a yes or no question; here the question is: is the elevation less than 7,900 feet? We use that to split our original dataset into two smaller datasets, true and false. And we can again ask another question or make a decision; here we have slope less than or equal to 2.5. Then, at each one of these leaves, we'll have the average value of the target for the observations remaining in that subset.
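To make the classification version concrete before we continue with the regression tree, here is a minimal sketch using scikit-learn. The handful of rows, column names, and ordinal encodings below are assumptions made up to mirror the tennis example, not the actual course dataset.

```python
# A minimal sketch (toy data, assumed encodings) of a shallow decision tree
# classifier on a "play tennis" style dataset.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical sample of the facility's records; the real dataset would be larger.
data = pd.DataFrame({
    "outlook":     ["sunny", "sunny", "overcast", "rainy",  "rainy", "overcast"],
    "temperature": ["hot",   "mild",  "hot",      "cool",   "mild",  "cool"],
    "humidity":    ["high",  "high",  "normal",   "normal", "high",  "normal"],
    "wind":        ["weak",  "strong","weak",     "strong", "weak",  "weak"],
    "play":        ["no",    "no",    "yes",      "yes",    "yes",   "yes"],
})

# Encode the ordinal features so questions like "temperature >= mild" make sense.
outlook_order = {"rainy": 0, "overcast": 1, "sunny": 2}
temp_order = {"cool": 0, "mild": 1, "hot": 2}
X = pd.DataFrame({
    "outlook": data["outlook"].map(outlook_order),
    "temperature": data["temperature"].map(temp_order),
    "humidity": (data["humidity"] == "high").astype(int),
    "wind": (data["wind"] == "strong").astype(int),
})
y = data["play"]

# Depth 2 mirrors the tree in the example: at most two yes/no questions per prediction.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))  # the question asked at each node
print(tree.predict(X))  # the majority class of the leaf each row lands in
```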
So, continuing with the regression tree: we keep splitting up our dataset into smaller subsets. Each of these subsets contains outcome values, which are continuous. Within each subset, we can average the precipitation values, and we use that average as our prediction.

To see how this looks on a two-dimensional graph, suppose we have just one input feature. Here our x runs from 0 through 5 and our y from -2 to 2, and we're trying to predict, using the continuous values on the x axis, what the values will be on the y axis. Because binary splits can only produce a limited number of leaves, depending on the depth of the tree (we saw the depth increase from 1 to 2 before), the possible outputs of a regression tree are bounded by the number of leaves. So we won't get a linear function that can output essentially any value; instead we're limited to as many distinct outputs as there are leaves at the end of the tree. Here, a regression tree of depth 2 would output four different values; projecting the prediction onto the y axis and setting aside the jumps at each split, those are the four values we predict across the different values of x. Increasing the depth of the tree allows for more possible values: the deeper the tree, the more subsets you're working with and the more distinct averages you can output. That can be a good thing or a bad thing. This new tree of depth 5 seems overfit to each one of our data points, so we may be overfitting here. We want to find the right depth so we don't overfit our data; the short sketch at the end of this section illustrates that depth trade-off.

In the next video, we'll begin to dive under the hood into how a decision tree is actually built. All right, I'll see you there.
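Here is that sketch: a minimal example assuming scikit-learn and a synthetic one-feature dataset (the sine-shaped target, the noise level, and the depths 2 and 5 are illustrative assumptions, not the course's actual data).

```python
# A minimal sketch (synthetic data) of how tree depth bounds the number of distinct
# predictions a regression tree can make, and how extra depth can overfit.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 5, 80)).reshape(-1, 1)    # one feature, x between 0 and 5
y = np.sin(2 * x).ravel() + rng.normal(0, 0.3, 80)   # noisy target roughly between -2 and 2

shallow = DecisionTreeRegressor(max_depth=2).fit(x, y)  # at most 4 leaves -> at most 4 output values
deep = DecisionTreeRegressor(max_depth=5).fit(x, y)     # up to 32 leaves -> prediction hugs the noise

# Each leaf predicts the average y of the training points that fall into it.
print(np.unique(shallow.predict(x)))     # only a handful of distinct step values
print(np.unique(deep.predict(x)).size)   # many more distinct values, tracking individual points
```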