I previously mentioned that you need context words and center words to train your continuous bag-of-words model. The question now is, how do you actually get these words? Let's dive in.

Previously, you cleaned and tokenized the corpus, and you now have this clean corpus as an array of words, or tokens. Now, I will show you how to extract center words and their context words, which will serve as examples to train the continuous bag-of-words model. Here's the code to do this in Python.

The get_windows function takes two arguments: words, which is an array of words, or tokens, but I'll stick with the term words here, and the context half-size, stored in the variable C, which is the number of words to be taken on each side of the center word. This was 2 in the previous video, for a total window size of 5.

The function initializes a counter with the index of the first word that has enough words before it. In the working example, "I am happy because I am learning", the tokenized array would be [I, am, happy, because, I, am, learning], where "I" has index 0, "am" has index 1, and so on. For a context half-size of 2, where the context words are the two words before and the two words after the center word, the first center word that can be used is "happy". Its index is 2, which is equal to the context half-size.

I then start a loop at this index, which will run until it reaches the last possible center word, that is, the last word that has two words after it, stopping just before it reaches the index corresponding to the number of words in the array minus the context half-size. In each iteration of the loop, I extract the center word, which is the word at the current index. Next, I create an array with the C words before the center word and the C words after it. I can now return the context words and the center word. I'm using a special way of returning values in Python: as you can see, I'm using the yield keyword instead of return.
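The function narrated above might look like the following sketch. This is a reconstruction from the description, not the exact code shown in the video; the tokens are assumed to be lowercased by the earlier cleaning step.

```python
def get_windows(words, C):
    """Yield (context_words, center_word) pairs from a list of tokens.

    C is the context half-size: the number of words taken on each
    side of the center word.
    """
    # Index of the first word that has C words before it.
    i = C
    # Stop before the index equal to the number of words minus C,
    # so the center word always has C words after it.
    while i < len(words) - C:
        center_word = words[i]
        # The C words before the center word plus the C words after it.
        context_words = words[(i - C):i] + words[(i + 1):(i + C + 1)]
        # yield pauses the function and hands back one pair at a time.
        yield context_words, center_word
        # Slide the window one word to the right.
        i += 1
```

For the example array with C = 2, the loop runs for indices 2, 3, and 4, so "happy", "because", and the second "i" are the only valid center words.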
Without going into too much detail: where return would immediately exit the function, yield returns the values and simply pauses the get_windows function. A function that uses yield is known as a generator function, and it will continue running if more values are needed. Simply put, with yield, I can return values from a function several times, which is what I'm doing at each iteration of the while loop. Finally, I increase my index by one to move my sliding window one word to the right.

Just to recap: the get_windows function takes in a corpus and the context half-size, and returns the context words and center word for each successive window.

Here's how to use this function. I'm using a loop to get the successive tuples of context words and center words into x and y, then displaying these words. You will notice that I'm using the usual machine learning notation for features and targets, as this is what the context words and center words are for the continuous bag-of-words model. If I run the code on the array [I, am, happy, because, I, am, learning], which I got by tokenizing the sentence "I'm happy because I'm learning", with a context half-size of 2, this is the output.

Up next, you'll convert these sets of words into sets of vectors that will be consumed by the continuous bag-of-words model. You have seen how to extract center words and context words using a sliding window in Python. This is useful for the programming assignments. You also saw how you can use the yield functionality in Python, which is commonly used for data generators; you can think of them as functions that keep giving you data in small batches.
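The usage loop described above might look like the following sketch. The get_windows definition is repeated here so the snippet is self-contained; the lowercase tokens and the variable names x and y are taken from the transcript's example, though the exact display format in the video may differ.

```python
def get_windows(words, C):
    """Yield (context_words, center_word) pairs; C is the context half-size."""
    i = C
    while i < len(words) - C:
        yield words[(i - C):i] + words[(i + 1):(i + C + 1)], words[i]
        i += 1

# Tokenized form of "I'm happy because I'm learning".
words = ['i', 'am', 'happy', 'because', 'i', 'am', 'learning']

# x holds the context words (features), y holds the center word (target).
for x, y in get_windows(words, 2):
    print(f'{x}\t{y}')

# Output:
# ['i', 'am', 'because', 'i']	happy
# ['am', 'happy', 'i', 'am']	because
# ['happy', 'because', 'am', 'learning']	i
```

Because get_windows is a generator, the pairs are produced one at a time as the for loop asks for them, rather than being built into a full list up front.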