So here's the notebook for lesson three. A few things to check before you begin: go to Runtime, click 'Change runtime type', and make sure you're running Python 3 with a GPU. Then run this cell. If it does not say 2.0.0-alpha, come into this cell, uncomment this line, and execute it. As I already have 2.0.0-alpha, I'm not going to do that. Once you've done it, you'll see a button to restart; click it, the page will reset, and then you can run this cell again and just make sure you have 2.0.0-alpha before continuing. Otherwise, try to reinstall, or reset the runtime with 'Reset all runtimes'.

The next thing to do is to start loading your dataset. We're using TFDS for that, and this is the code to install TFDS if that fails. If I run it now, we'll see that it worked and I didn't get any problems. Again, if you see issues with TFDS, you can just pip install it with this code: uncomment the line and run it. I can now create my training and test datasets, create my tokenizer, and print out the tokenizer's subwords, and we can see there are lots of them, including punctuation. If I take a sample string like 'TensorFlow, from basics to mastery' and tokenize it, my tokens are these. If I take a look at how those tokens map back, 6307 is 'Ten', 2327 is 'sor', and so on. So I now have a tokenized string based on subword tokenization, and I built that using the subwords8k vocabulary.

So now let's create our model. In our model I'm just going to do another embedding, with 64 dimensions, and I'm going to pass the tokenizer's vocabulary size into the embedding so we can learn from that. I'm going to train for 10 epochs, and the epochs are pretty slow. I'm going to compile the model with binary crossentropy loss and an Adam optimizer, and I'm going to collect accuracy metrics, and then my model.fit is pretty simple. You'll see I haven't really done a whole lot of pre-processing; it's very nice that the data's already pre-processed for me. So I'm just going to pass it the training data, pass it the number of epochs, and then say that the test data is my validation data.

If I start training, we'll see that it is pretty slow. This number here is going to build up to 25,000 records; it has to build that up because it doesn't know the record count on the first epoch, so that's going to take a few moments. We can see it's now approaching 25,000, and since we know there are 25,000 records, the epoch will end there. This epoch took about four minutes. Now we can see it's moved on; epoch two has just kicked in. On epoch one we took about four and a half minutes and the accuracy was only 52 percent, and we'll see over time that this isn't going to improve. I'm going to pause the recording now and come back when epoch 10 hits.

Now we can see epoch 10 coming to a close, and as it finishes, we can see our accuracy went from 52.3 percent to 54.2 percent, pretty flat throughout the training. Validation accuracy is similar: it started out unremarkable and then went from 53.18 to 53.49 percent. So it's pretty clear nothing much is really happening despite all the time we're spending on this. When I plot this out, my accuracy curve might look really good, but when you look at the scale, the change is very small, and the validation accuracy is the same.
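The notebook cells themselves aren't shown in the transcript, but a minimal sketch of the flow described above might look something like this, assuming the TF 2.x / TFDS API of that era. The pooling and dense layers, the batch sizes, and the padded batching are my assumptions (the transcript only mentions the embedding and the compile/fit settings), so treat this as illustrative rather than the exact lesson code:

```python
import tensorflow as tf
import tensorflow_datasets as tfds
import matplotlib.pyplot as plt

# !pip install -q tensorflow==2.0.0-alpha0   # uncomment if the version check fails
# !pip install -q tensorflow-datasets        # uncomment if the TFDS import fails
print(tf.__version__)  # should show 2.0.0-alpha0 (or a later 2.x release)

# Load the IMDB reviews dataset with the pre-built 8k subword vocabulary
dataset, info = tfds.load('imdb_reviews/subwords8k', with_info=True, as_supervised=True)
train_data, test_data = dataset['train'], dataset['test']

# The subword tokenizer ships with the dataset
tokenizer = info.features['text'].encoder
print(tokenizer.subwords[:20])               # a peek at the vocabulary

sample_string = 'TensorFlow, from basics to mastery'
tokenized_string = tokenizer.encode(sample_string)
print(tokenized_string)

# Pad and batch the variable-length sequences (batch size is an assumption)
BUFFER_SIZE, BATCH_SIZE = 10000, 64
train_batches = train_data.shuffle(BUFFER_SIZE).padded_batch(
    BATCH_SIZE, padded_shapes=([None], []))
test_batches = test_data.padded_batch(BATCH_SIZE, padded_shapes=([None], []))

# Embedding with 64 dimensions, vocab size taken from the tokenizer
embedding_dim = 64
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(tokenizer.vocab_size, embedding_dim),
    tf.keras.layers.GlobalAveragePooling1D(),   # pool across the sequence (assumption)
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train for 10 epochs, using the test set as validation data
num_epochs = 10
history = model.fit(train_batches, epochs=num_epochs, validation_data=test_batches)

# Plot accuracy and validation accuracy; note how little the y-axis actually moves
# (in some TF versions the history keys are 'acc' / 'val_acc')
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.legend()
plt.show()
```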
Now, the reason this is happening, of course, is that we're working on subwords. We're training on tokens that it's very hard to pull semantics and meaning out of, and the results we're getting are little better than 50 percent. But if you think about it, in a binary classifier a random guess would be 50 percent. So this leaves us having taken a little bit of a step back, but that's okay. Sometimes you take one step back to take two steps forward, and that's what we'll be learning with RNNs next week: you start learning things when they're put in sequence. So now 'sor' coming after 'Ten' actually has some meaning, whereas 'Ten' on its own could be misunderstood as the number, and 'sor' doesn't really mean anything by itself. When we start putting this together with RNNs and learning from data where sequences are important, we'll see how this step back will hopefully help us make a step forward.
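To see concretely how little each subword carries on its own, you can decode the tokens one at a time. This continues from the hypothetical `tokenizer` and `tokenized_string` in the sketch above, so it's only illustrative:

```python
# Decode each subword token individually: 'Ten' on its own reads like a number
# and 'sor' reads like nothing at all; only the sequence recovers 'TensorFlow'.
for token in tokenized_string:
    print('{} ----> {}'.format(token, tokenizer.decode([token])))

# Decoding the whole sequence round-trips back to the original string
print(tokenizer.decode(tokenized_string))
```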