So let's start looking at it. There are a couple of things you need to take into account before you start working with this week's code in TensorFlow. The first is the version of TensorFlow you're using; you can check it with a single line of code, as in the sketches that follow this walkthrough. Also, do note that all the code I'm using here is in Python 3, and there are some differences if you use Python 2. So if you're using Colab, you can set the environment to Python 3; if you're doing this in your own environment, you may need to make some changes. If the version check gave you TensorFlow 1.x, you'll need one extra line of code before you can go any further. If it gave you 2.x, then you won't need anything, because eager execution is enabled by default in TensorFlow 2.0.

If you're using Google Colab, then you should have TensorFlow Datasets already installed. Should you not have it, it's easily installed with a single pip command. Now, you can import TensorFlow Datasets, and in this case I call it tfds. For the IMDB reviews, I can now call tfds.load, pass it the string imdb_reviews, and it will return the data from IMDB, and metadata about it. The data is split into 25,000 samples for training and 25,000 samples for testing, and I can split those out into two variables. Each of these is an iterable containing the 25,000 respective sentences and labels as tensors.

Up to this point, we've been using the Keras tokenizer and padding tools on arrays of sentences, so we need to do a little converting. First of all, let's define the lists containing the sentences and labels for both the training and testing data. Now, I can iterate over the training data, extracting the sentences and the labels. The values for s and l are tensors, so by calling their numpy method, I'll actually extract their values. Then I'll do the same for the test set. Here's an example of a review. I've truncated it to fit on this slide, but you can see how it is stored as a tf.Tensor. Similarly, here's a batch of labels, also stored as tensors: the value 1 indicates a positive review and 0 a negative one. When training, my labels are expected to be NumPy arrays, so I'll turn the lists of labels that I've just created into NumPy arrays.

Next up, we'll tokenize our sentences. I've put the hyperparameters at the top because that makes them easier to change and edit, instead of fishing through the function calls for the literals and then changing those. Now, as before, we import the Tokenizer and pad_sequences. We'll create an instance of the Tokenizer, giving it our vocab size and our desired out-of-vocabulary token, and then fit the tokenizer on our training set of data. Once we have our word index, we can replace the strings containing the words with the token values we created for them. This will be the list called sequences. As before, the sentences will have varying lengths, so we'll pad and/or truncate the sequenced sentences until they're all the same length, determined by the max_length parameter. Then we'll do the same for the testing sequences. Note that the word index is derived from the training set, so you should expect to see a lot more out-of-vocabulary tokens in the test set.

Now it's time to define our neural network. This should look very familiar by now, except maybe for one line: the embedding. This is the key to text sentiment analysis in TensorFlow, and this is where the magic really happens.
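The code referred to above lives on slides that aren't reproduced in this transcript, so what follows is a set of sketches rather than the exact slide code. First, the version check and the eager-execution line for TensorFlow 1.x:

```python
import tensorflow as tf

# Check which version of TensorFlow is installed.
print(tf.__version__)

# Only needed for TensorFlow 1.x; eager execution is enabled
# by default in TensorFlow 2.x, so this is guarded.
if tf.__version__.startswith('1.'):
    tf.enable_eager_execution()
```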
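Next, a sketch of installing TensorFlow Datasets and loading imdb_reviews. The split into train and test follows the transcript; the variable names are my own:

```python
# In Colab, TensorFlow Datasets is usually preinstalled; otherwise:
# !pip install -q tensorflow-datasets

import tensorflow_datasets as tfds

# with_info=True returns the dataset's metadata alongside the data;
# as_supervised=True yields (sentence, label) pairs.
imdb, info = tfds.load("imdb_reviews", with_info=True, as_supervised=True)

# 25,000 samples for training and 25,000 for testing.
train_data, test_data = imdb['train'], imdb['test']
```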
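The conversion step, turning tensors into plain lists and the labels into NumPy arrays, might look like this (again, the names are illustrative):

```python
import numpy as np

training_sentences = []
training_labels = []
testing_sentences = []
testing_labels = []

# s and l are tf.Tensors; .numpy() extracts their values.
# The sentence tensor holds bytes, so decode it to a string.
for s, l in train_data:
    training_sentences.append(s.numpy().decode('utf8'))
    training_labels.append(l.numpy())

# Same extraction for the test set.
for s, l in test_data:
    testing_sentences.append(s.numpy().decode('utf8'))
    testing_labels.append(l.numpy())

# Keras expects the labels as NumPy arrays when training.
training_labels_final = np.array(training_labels)
testing_labels_final = np.array(testing_labels)
```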
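For the tokenizing step, a sketch along these lines matches the description. The specific hyperparameter values here (vocab size, sequence length, and so on) are placeholders you'd tune, which is exactly why they sit at the top:

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Hyperparameters up top so they're easy to find and change.
vocab_size = 10000
embedding_dim = 16
max_length = 120
trunc_type = 'post'
oov_tok = "<OOV>"

# Build the word index from the training sentences only.
tokenizer = Tokenizer(num_words=vocab_size, oov_token=oov_tok)
tokenizer.fit_on_texts(training_sentences)
word_index = tokenizer.word_index

# Replace each word with its token value, then pad/truncate
# so every sequence has length max_length.
sequences = tokenizer.texts_to_sequences(training_sentences)
padded = pad_sequences(sequences, maxlen=max_length, truncating=trunc_type)

# Same for the test set; since the word index was derived from the
# training data, expect many more <OOV> tokens here.
testing_sequences = tokenizer.texts_to_sequences(testing_sentences)
testing_padded = pad_sequences(testing_sequences, maxlen=max_length,
                               truncating=trunc_type)
```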
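Finally, a sketch of the kind of model the transcript is pointing at. The Embedding layer is the piece being highlighted; the layers after it are one plausible choice for a binary sentiment classifier, not necessarily the exact ones on the slide:

```python
model = tf.keras.Sequential([
    # Maps each token to a dense embedding_dim-dimensional vector;
    # this is the line where the "magic" happens.
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              input_length=max_length),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(6, activation='relu'),
    # Single sigmoid output: 1 = positive review, 0 = negative.
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.summary()
```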