In this next set of videos, we'll begin to home in on best practices when performing classification on unbalanced classes. Now let's discuss some learning goals for this section. In this section, we'll quickly discuss some common issues that arise when we're working with unbalanced classes. We'll then discuss, at a higher level, some approaches to dealing with unbalanced data. With this in mind, we'll mainly be discussing the pros and cons of upsampling, downsampling, and resampling to get a balanced dataset. And we'll discuss what each of these terms means in just a bit.

So we want to start off by coming back and thinking deeper about our cases with highly unbalanced classes. We see here an example where 99% of our cases fall into a single class. With this example in mind, we want to keep in consideration that although we have different metrics to help measure error on unbalanced classes, and can even run a grid search on different hyperparameters with a scoring function other than accuracy, the classifiers themselves, when learning their actual parameters and decision rules, are built to optimize accuracy specifically. They are built to get as many predictions correct as possible, no matter the class, and hence they'll often perform poorly on underrepresented classes. So this is the problem that we're going to be trying to solve.

So what are some actual solutions to dealing with this problem of unbalanced datasets? The idea is to find a way to actually balance our dataset before fitting our model. One option is to downsample. Downsampling here means keeping only as many rows of our larger class, our majority class, as there are available in our smaller class, our minority class. You see here we have 18 rows of the majority class and only six of the minority class. So we randomly select only six from our majority class, so that we are now working with a balanced dataset.

Now, upsampling is essentially creating copies of the rows of the smaller class until we have a balanced sample. So rather than trying to bring that majority class down to six, we try to bring that minority class up to 18. We create duplicates of the minority class rows until we have a balance between the majority and minority classes. And then we end up with a balanced dataset that we can use later on to fit our model.

And then finally, we can even do a mix of the two. In this example, we are going to be resampling. So we've talked about upsampling and downsampling; now we're talking about resampling. We have six rows of the minority class and 18 of the majority class, and let's say we choose here to try to bring each of those classes to ten. We will then shrink the larger class by randomly sampling ten of its rows, and grow the minority class by creating duplicates of its rows. And then we have a balanced dataset where the size of both the majority and minority classes is now equal to ten.

With this in mind, in the next video, we will home in a bit further on the pros and cons of each one of these different methods.
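To make the three strategies concrete before we move on, here is a minimal sketch in Python, assuming pandas and NumPy are available. The DataFrame, its column names ("feature", "label"), and the helper variables are illustrative assumptions, not code from the lecture; only the class counts (18 majority, six minority, a mixed target of ten) mirror the example above.

```python
import numpy as np
import pandas as pd

# Toy unbalanced dataset: 18 majority-class rows (label 0) and
# 6 minority-class rows (label 1), matching the lecture's example.
# The "feature" column is just filler so the rows are distinguishable.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "feature": rng.normal(size=24),
    "label": [0] * 18 + [1] * 6,
})

majority = df[df["label"] == 0]
minority = df[df["label"] == 1]

# Downsampling: randomly keep only as many majority rows as there
# are minority rows (six of each).
down = pd.concat([
    majority.sample(n=len(minority), random_state=0),
    minority,
])

# Upsampling: duplicate minority rows (sampling with replacement)
# until the minority class matches the majority class (18 of each).
up = pd.concat([
    majority,
    minority.sample(n=len(majority), replace=True, random_state=0),
])

# Mixed resampling: pick an in-between target size (ten here),
# downsample the majority class to it and upsample the minority
# class up to it.
target = 10
mixed = pd.concat([
    majority.sample(n=target, random_state=0),
    minority.sample(n=target, replace=True, random_state=0),
])

for name, balanced in [("down", down), ("up", up), ("mixed", mixed)]:
    print(name, balanced["label"].value_counts().to_dict())
```

Note that `replace=True` is needed whenever the target count exceeds the class's actual size, since upsampling necessarily produces duplicate rows.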