All right, so now it's time to take a closer look at how your intermediate noise vector is actually integrated into the network. And that means looking at adaptive instance normalization, or AdaIN for short. So first, I'll talk about instance normalization and compare it to batch normalization, which you're more familiar with. Then I'll talk about what the adaptive in adaptive instance normalization means, and also where and why AdaIN, or adaptive instance normalization, is used.

So first, AdaIN in the context of StyleGAN. You already learned about progressive growing in this intermediate block, and you also learned about the noise mapping network over here, which injects w into these different blocks that progressively grow. Within each of these blocks, you see that there's an upsampling layer, a convolution, and another convolution to help learn additional features, with the upsampling layer doubling the image in size. But this is actually not all: adaptive instance normalization, AdaIN, actually comes in after each convolutional layer. And you'll see how it comes in, and how the w that's coming into each of these blocks actually goes in here as well.

So first, let's focus on just this one block here, and on the first step of adaptive instance normalization, which is the IN part, the instance normalization part. What happens here is, well, if you remember, normalization takes the outputs from your convolutional layers, x, and puts them at a mean of 0 and a standard deviation of 1. And it does this by getting the mean of those values, and also the standard deviation, to then center them around 0 with a standard deviation of 1. But that's not all, because it's actually not based on the batch, which you might be more familiar with from batch norm. With batch norm, you look across the height and width of the image, which is along this axis that you see highlighted in blue here. You look at one channel, so among RGB you only look at R, for example, and you look at all N examples in the mini-batch. Then you get the mean and standard deviation based on all of these highlighted blue cells, for one channel across one whole batch. And then you also do it for the next batch, and also for the next channel.

But instance normalization is a little bit different. Instance normalization, comparing it using the same graph here, actually only looks at one example, or one instance; an example is also known as an instance. So it doesn't look across statistics of the entire batch, it only looks at one example, and again only one channel of that example. So if you had an image with channels RGB, that would just be looking at, let's say, the B here, which is just the blue channel, and getting the mean and standard deviation only from that blue channel. Nothing else, no additional images at all, just getting the statistics from that one channel of one instance, and normalizing the values in there based on its mean and its standard deviation. So to represent that in this equation, we actually denote the instance i here, so it'll be x_i for that instance, and the mean, mu, here is over that instance, as is the standard deviation, sigma. So the instance normalization step computes (x_i - mu(x_i)) / sigma(x_i), and here it just means every value in that instance will be normalized to a mean of 0 and a standard deviation of 1. So that's the first step of adaptive instance normalization; that's the instance normalization part. And where the adaptive part comes in is to apply adaptive styles to this now normalized set of values.
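To make the batch norm versus instance norm comparison concrete, here's a minimal sketch in PyTorch. This is my own illustration, not code from the lecture; the tensor shapes are assumptions. It just shows which axes each normalization computes its statistics over:

```python
import torch

# A batch of feature maps shaped (N, C, H, W):
# 8 examples, 3 channels (e.g. RGB), 32x32 spatial.
x = torch.randn(8, 3, 32, 32)

# Batch norm: one mean/std per channel, computed across ALL N examples
# and all spatial positions (the highlighted blue cells span the batch).
bn_mean = x.mean(dim=(0, 2, 3), keepdim=True)   # shape (1, 3, 1, 1)
bn_std = x.std(dim=(0, 2, 3), keepdim=True)

# Instance norm: one mean/std per channel PER EXAMPLE, computed only
# over that single instance's spatial positions.
in_mean = x.mean(dim=(2, 3), keepdim=True)      # shape (8, 3, 1, 1)
in_std = x.std(dim=(2, 3), keepdim=True)

# Each instance-channel ends up with mean 0 and standard deviation 1.
x_normalized = (x - in_mean) / (in_std + 1e-5)
```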
And the instance normalization here probably makes a little more sense than batch normalization, because it really is about every single sample you are generating, as opposed to normalizing across a batch, for example.

Okay, so the adaptive styles are coming from your intermediate noise vector w, which you heard about being input into multiple areas of the network. And adaptive instance normalization is where w will come in. So you have your original vector, which was sampled from a normal distribution for each of its values. It was sent through this noise mapping network that you learned about earlier, which is a multilayer perceptron, to get your intermediate noise vector w. So w informs styles, which are then fed into AdaIN, but how that happens is actually not by inputting w there directly. Instead, w goes through learned parameters, such as two fully connected layers, which produce two parameters for us. One is ys, which stands for scale, and the other is yb, which stands for bias; you might be able to guess what these terms are for. [LAUGH] So ys is the scaling parameter, and yb, the bias, is the shift parameter. These statistics are then fed into the AdaIN layers, so one set of these values is put into one AdaIN layer, and another set of these values will be put into the other one.

And exactly how ys and yb are put in is that after you do this instance normalization step, which is this middle part, you take your values, which are now normalized to a mean of 0 and a standard deviation of 1, and, similar to batch norm by the way, you reshift and rescale them based on these statistics extracted from the intermediate noise vector: ys * (x_i - mu(x_i)) / sigma(x_i) + yb. So the second step is getting these adaptive styles, and they're adaptive because your w can change, so the values extracted from it will change. And looking at this, style really comes down to just rescaling and reshifting values to a certain range, mean, and standard deviation. So you can think of it as an image that has some kind of content, and with different ys and yb values, it will look like Picasso drew it, or like Monet drew that image, but it will have the same content, right? It'll still be a face or a puppy, for example, but in a different style. So shifting your values to different ranges, means, and standard deviations will actually get you those different styles. And so that's pretty cool. And note that there is again an i here for the instance, because this is instance normalization still, so you have this ys and yb value just for that instance. And note that before you apply these styles, this middle instance normalization part is essentially trying to undo any style-related information that was originally there, leaving just some kind of content, so that you can effectively apply these styles next in the second step.

All right, so zooming out a bit, where does AdaIN fit in again? Well, the generator is made up of lots of blocks, where earlier blocks roughly align with coarser features and later blocks with finer details. And this is pretty consistent across all neural networks. And so here, what's interesting is that AdaIN is used at every single block in the generator to essentially get the style information from w into those feature maps. So w is added into all of these blocks, and when it's added into these earlier blocks, it'll be those coarser details that are changed or affected by this w style.
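Putting the two steps together, here's a minimal sketch of an AdaIN layer in PyTorch, following the lecture's description: instance normalization first, then two fully connected layers mapping w to ys and yb. The dimensions and names are my own assumptions, not from the lecture:

```python
import torch
import torch.nn as nn

class AdaIN(nn.Module):
    def __init__(self, channels, w_dim):
        super().__init__()
        self.instance_norm = nn.InstanceNorm2d(channels)  # the "IN" step
        self.style_scale = nn.Linear(w_dim, channels)     # learned layer -> ys
        self.style_bias = nn.Linear(w_dim, channels)      # learned layer -> yb

    def forward(self, x, w):
        # Step 1: normalize each instance-channel to mean 0, std 1,
        # undoing whatever style statistics were there before.
        normalized = self.instance_norm(x)
        # Step 2: extract the adaptive styles ys and yb from w, then
        # rescale and reshift the normalized feature map with them.
        ys = self.style_scale(w)[:, :, None, None]  # shape (N, C, 1, 1)
        yb = self.style_bias(w)[:, :, None, None]
        return ys * normalized + yb

# Usage: one AdaIN after each convolutional layer in a block.
adain = AdaIN(channels=512, w_dim=512)
x = torch.randn(4, 512, 8, 8)   # feature maps from a conv layer
w = torch.randn(4, 512)         # intermediate noise vector, one per example
out = adain(x, w)               # same shape as x, restyled by w
```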
Whereas in these later blocks, it'll be finer details that are informed by w. And because the normalization step in AdaIN, adaptive instance normalization, renormalizes the statistics back to a mean of 0 and a standard deviation of 1, and because this occurs at every single one of these blocks, every block only controls the styles at that block. At the next block, those styles get overwritten by the next AdaIN, which normalizes the previous outputs, and that even happens within the block you see here. And this allows for control over the model in terms of what's generated, and in terms of what kind of style is being generated. Imagine injecting different styles at different blocks to allow for either coarser or finer-grained control; that will be in the upcoming video, so w might not just be a single w here.

So in summary, adaptive instance normalization is what transfers the style information from the intermediate noise vector w onto your generated image. Instance normalization essentially normalizes each instance, while the adaptive part in adaptive instance normalization applies the different styles from the intermediate noise vector w onto that image, or onto that intermediate feature map. In the following video, you'll see exactly how the style part comes in, and maybe how to adapt your w a little bit more so that you can control those styles.
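To tie this back to the generator's structure, here's a hedged sketch of one block as the lecture describes it: upsample, then conv, AdaIN, conv, AdaIN, with w injected after each convolution. It reuses the AdaIN class from the earlier sketch, and the channel sizes are assumptions for illustration:

```python
import torch
import torch.nn as nn

# Assumes the AdaIN class defined in the previous sketch is in scope.
class GeneratorBlock(nn.Module):
    def __init__(self, in_ch, out_ch, w_dim):
        super().__init__()
        self.upsample = nn.Upsample(scale_factor=2)  # doubles H and W
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.adain1 = AdaIN(out_ch, w_dim)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1)
        self.adain2 = AdaIN(out_ch, w_dim)

    def forward(self, x, w):
        x = self.upsample(x)
        x = self.adain1(self.conv1(x), w)  # styles applied at this resolution...
        x = self.adain2(self.conv2(x), w)  # ...then renormalized and restyled again
        return x

# Usage: earlier blocks (low resolution) get coarse control from w,
# later blocks (high resolution) get fine control.
block = GeneratorBlock(in_ch=512, out_ch=256, w_dim=512)
x = torch.randn(4, 512, 8, 8)
w = torch.randn(4, 512)
out = block(x, w)  # shape (4, 256, 16, 16)
```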