Now, this is a new data set that I created for learning opportunities. It's freely available, and it consists of about 3,000 images. They've all been generated using CGI with a diverse array of models, male and female, and lots of different skin tones, and here are some examples. If you want to download the data set, you can find it at this URL. It contains a training set, a validation set, and some extra images that you can download to test the network for yourself.

Once your directory is set up, you need to set up your image generator. Here's the code that you used earlier, but note that the class mode was set to binary. For multiple classes, you'll have to change this to categorical, like this.

The next change comes in your model definition, where you'll need to change the output layer. For a binary classifier, it was more efficient to have just one neuron and use a sigmoid function to activate it. That meant it would output close to zero for one class and close to one for the other. Now, that doesn't fit for multi-class, so we need to change it, but it's pretty simple. Now we have an output layer that has three neurons, one for each of the classes rock, paper, and scissors, and it's activated by softmax, which turns all the values into probabilities that sum to one.

So what does that really mean? Consider a hand like this one. It's most likely paper, but because she has her first two fingers open and the rest joined, it could also be mistaken for scissors. The output of a neural network with three neurons and a softmax would reflect that, and might look like this: a very low probability for rock, a really high one for paper, and a decent one for scissors. All three probabilities would still add up to one.

The final change comes when you compile your network. If you recall the earlier examples, your loss function was binary cross entropy. Now you'll change it to categorical cross entropy, like this.
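As a sketch of the generator change, assuming you unzipped the training set to a directory like `rps/` (that path is a placeholder, not the lesson's exact path):

```python
import os

from tensorflow.keras.preprocessing.image import ImageDataGenerator

TRAINING_DIR = "rps/"  # placeholder: wherever you unzipped the training set

# Rescale pixel values to [0, 1], as in the earlier binary example.
train_datagen = ImageDataGenerator(rescale=1. / 255)

# Only class_mode changes: 'categorical' instead of 'binary', so the
# generator yields one-hot labels for the three classes.
if os.path.isdir(TRAINING_DIR):
    train_generator = train_datagen.flow_from_directory(
        TRAINING_DIR,
        target_size=(150, 150),
        class_mode='categorical')
```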
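A minimal sketch of the model and compile changes. The convolutional stack here is illustrative only, an assumption standing in for whatever architecture you built earlier; the parts that matter are the three-neuron softmax output and the categorical loss:

```python
import tensorflow as tf

model = tf.keras.models.Sequential([
    # Illustrative feature-extraction layers; your earlier conv stack goes here.
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu',
                           input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    # Three neurons, one per class (rock, paper, scissors), activated by
    # softmax so the outputs are probabilities that sum to one.
    tf.keras.layers.Dense(3, activation='softmax')
])

# binary_crossentropy from the two-class example becomes
# categorical_crossentropy for one-hot multi-class labels.
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
```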
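To see why the three probabilities always sum to one, here's a small pure-Python softmax. The raw scores are made-up numbers for that ambiguous paper/scissors hand, not real network outputs:

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability, then normalize exponentials.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for [rock, paper, scissors]:
probs = softmax([0.1, 3.0, 1.5])
# Rock gets a very low probability, paper a high one, scissors a decent one,
# and the three values still sum to one.
```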
There are other categorical loss functions, including sparse categorical cross entropy, which you used in the fashion example, and you can of course also use those. I ran this for 100 epochs and got this chart. It shows that training accuracy hits a maximum at about 25 epochs, so I'd recommend not using many more than that. And that's really all you have to do, so let's take a look at it in the workbook.
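As a sketch of that sparse variant (the tiny model here is just a stand-in, not the lesson's architecture): sparse categorical cross entropy expects labels as plain integer class indices rather than the one-hot vectors that categorical cross entropy expects, but is otherwise equivalent.

```python
import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(150, 150, 3)),
    tf.keras.layers.Dense(3, activation='softmax')
])

# With this loss, labels are plain integers (class indices) instead of
# one-hot vectors like [0, 1, 0]; nothing else about training changes.
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
```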