Sometimes I find it easier to understand how models work by looking at their dimensions at each step. I personally find it easier to follow along with the equations while keeping the dimensions in mind, so I thought this would help you too. Let's take a look at the dimensions of the layers at each step of the model. You just saw the architecture of the continuous bag-of-words model and the vectors and matrices involved in its calculations. Now I'll introduce you to the dimensions of these vectors and matrices.

Here is the architecture of the neural network again. First, the input layer is represented by lowercase x, a column vector with V rows, where capital V is the size of the vocabulary. To get the values that will be stored in the hidden layer, you first calculate the weighted sum of the values from the input layer and add the bias, so W1x + b1, and I'll call the result z1. Lowercase h refers to the column of values stored in the hidden layer. To get h, you pass the values of z1 into a ReLU activation.

So in terms of dimensions, W1, the weighting matrix between the input layer and the hidden layer, has n rows and V columns, where n is the size of the word embeddings. b1, the bias vector for the hidden layer, has one row for each neuron in the hidden layer, so n rows in total. So when you multiply W1 by x and add the bias vector b1, you get a column vector with n rows, and passing it through ReLU preserves the dimensions. As expected, the hidden layer is represented by a column vector with n rows.

Next, to get the values of the output layer, you first need to calculate the weighted sum of the values from the hidden layer and add the bias, so W2h + b2, which I'll call z2. The values in z2 are sometimes referred to as logits. To get the output y hat, you pass the values of the z2 logits through a softmax activation function. Again, you'll see details of the activation functions later in the lecture. W2, the weighting matrix between the hidden layer and the output layer, has V rows and n columns. b2, the bias vector for the output layer, has one row for each output neuron, so V rows. When you multiply W2 by the vector for the hidden layer h and add the bias vector b2, you get a column vector with V rows, and again, there is no change in dimensions when you pass it through the softmax activation. So finally, you get the output column vector y hat with V rows (the first sketch after this transcript traces these shapes step by step).

If you have a situation where, instead of column vectors, you're working with row vectors, then you'll need to perform the calculations with transposed matrices and with the order of the terms in the matrix multiplication reversed. So for instance, instead of calculating z1 = W1x + b1 as you did with the column vectors, if x and b1 are row vectors, then you would calculate z1 = xW1 transposed + b1 to get z1, and then h, as a row vector (the second sketch below shows this variant).

Next, I'll show you how to use the neural network with batches of several examples instead of just one example at a time. Now that you know the dimensions, this will help you in your programming assignments. Hopefully, you won't get the famous dimension mismatch error anymore. Let's look into this model a little bit more in the next video.
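To make the shapes above concrete, here is a minimal NumPy sketch of the forward pass as just described. The sizes V = 5000 and n = 50 are placeholder values chosen only for illustration, and the random matrices stand in for learned parameters; this is a sketch of the dimension bookkeeping, not the course's assignment code.

```python
import numpy as np

V = 5000   # vocabulary size (example value)
n = 50     # word embedding size (example value)

# Random parameters just to illustrate shapes; in practice these are learned.
W1 = np.random.rand(n, V)      # (n, V)
b1 = np.random.rand(n, 1)      # (n, 1)
W2 = np.random.rand(V, n)      # (V, n)
b2 = np.random.rand(V, 1)      # (V, 1)

# Input column vector, e.g. the average of one-hot context-word vectors
# (here a single one-hot entry, just for illustration).
x = np.zeros((V, 1))           # (V, 1)
x[42] = 1.0

def relu(z):
    return np.maximum(0, z)

def softmax(z):
    e = np.exp(z - np.max(z))          # subtract max for numerical stability
    return e / e.sum(axis=0, keepdims=True)

z1 = W1 @ x + b1               # (n, V) @ (V, 1) + (n, 1) -> (n, 1)
h = relu(z1)                   # ReLU preserves the shape: (n, 1)
z2 = W2 @ h + b2               # (V, n) @ (n, 1) + (V, 1) -> (V, 1)
y_hat = softmax(z2)            # softmax preserves the shape: (V, 1)

print(z1.shape, h.shape, z2.shape, y_hat.shape)
# (50, 1) (50, 1) (5000, 1) (5000, 1)
```

Each printed shape matches the dimensions worked out in the transcript: n rows for the hidden layer and V rows for the output.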
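And here is a small sketch of the row-vector variant, under the same placeholder assumptions: the same first-layer equation, but with x and b1 as row vectors, W1 transposed, and the order of the multiplication reversed.

```python
import numpy as np

V, n = 5000, 50                # example sizes, as before

W1 = np.random.rand(n, V)      # same (n, V) weighting matrix
b1_row = np.random.rand(1, n)  # bias as a ROW vector now: (1, n)

x_row = np.zeros((1, V))       # input as a ROW vector: (1, V)
x_row[0, 42] = 1.0

# Transposed weights, reversed multiplication order: z1 = x W1^T + b1
z1_row = x_row @ W1.T + b1_row # (1, V) @ (V, n) + (1, n) -> (1, n)
h_row = np.maximum(0, z1_row)  # ReLU preserves the shape: (1, n)

print(z1_row.shape, h_row.shape)
# (1, 50) (1, 50)
```

The hidden layer comes out as a row vector with n entries, matching the column-vector version after a transpose.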