Here you can see the Tokenizer from Keras' preprocessing library. The tokenizer is your friend when it comes to doing natural language processing: it does all the heavy lifting of managing tokens, turning your text into streams of tokens, and so on. The reason you need this is that training neural networks involves a lot of math, and math deals with numbers. So instead of feeding words into a neural network, you can feed in the numbers that represent those words, which makes your life a lot easier.

So here I have a body of text with the sentences "I love my dog" and "I love my cat," and I'm going to tokenize those using the tokenizer. You create the tokenizer with the num_words parameter. What this does is take the 100 most common words in the body of text being tokenized, or whatever value you actually put in here. I have far fewer than 100 unique words here, so it's not really going to have any effect. What fit_on_texts will then do is go through the entire body of text and create a dictionary, with the key being the word and the value being the token for that word. If I run this, we'll actually see that in action.

So here you can see it has created a word index for me: "i" is number 1, "love" is number 2, "my" is number 3, "dog" is number 4, and "cat" is number 5. Those are the unique words that are actually in this corpus of text. A few things to take note of. Number one, punctuation like spaces and the comma has been removed, so it cleans up my text for me in that way too, pulling out just the words. Number two, you may have noticed that I have a lowercase "i" in one sentence and an uppercase "I" in the other. The tokenizer is case-insensitive, so it gives the same token to both of these.

Now, if I change this a little by adding a new sentence, "You love my dog!" — notice that "You" is capitalized and "dog" has an exclamation mark after it, but the tokenizer is not going to confuse that with the previous "dog." So if I run it, we'll see that I now have a whole new set of tokens: six instead of five. That's because the word "you" is the only new unique word in this corpus; "love," "my," and "dog" were there previously, and you'll see the exclamation mark after "dog" was removed.

So that's a basic introduction to how the tokenizer actually works, and you'll be using it a lot in this course.
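Putting the walkthrough together, the code being described looks roughly like this. This is a minimal sketch using TensorFlow's Keras Tokenizer; the variable names are my own, and the exact ordering of the word index is by word frequency, so the second output shown in the comments may vary slightly from what appears on screen:

```python
from tensorflow.keras.preprocessing.text import Tokenizer

sentences = [
    'I love my dog',
    'I love my cat'
]

# num_words=100 keeps the 100 most common words; with far fewer
# unique words in this corpus, it has no practical effect here.
tokenizer = Tokenizer(num_words=100)

# fit_on_texts builds the word -> token dictionary from the corpus.
tokenizer.fit_on_texts(sentences)
print(tokenizer.word_index)
# {'i': 1, 'love': 2, 'my': 3, 'dog': 4, 'cat': 5}

# Add a sentence with new capitalization and punctuation:
sentences.append('You love my dog!')
tokenizer = Tokenizer(num_words=100)
tokenizer.fit_on_texts(sentences)
print(tokenizer.word_index)
# 'you' is the only new token; '!' is stripped and 'You' is
# lowercased, so 'dog!' maps to the same token as 'dog'.
# Tokens are ordered by frequency, e.g.:
# {'love': 1, 'my': 2, 'i': 3, 'dog': 4, 'cat': 5, 'you': 6}
```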