0:00

For this final video for this week, let's talk a little bit about why convolutions are so useful when you include them in your neural networks. And then finally, let's briefly talk about how to put this all together and how you could train a convolutional neural network when you have a labeled training set.

0:21

I think there are two main advantages of convolutional layers over just using fully connected layers. The advantages are parameter sharing and sparsity of connections. Let me illustrate this with an example. Let's say you have a 32 by 32 by 3 dimensional image. This actually comes from the example in the previous video, but let's say you use a 5 by 5 filter, and you have 6 filters.

Â 0:56

And so this gives you a 28 by 28 by 6 dimensional output. So 32 by 32 by 3 is 3,072, and 28 by 28 by 6, if you multiply out those numbers, is 4,704. If you were to create a neural network with 3,072 units in one layer and 4,704 units in the next layer, and you were to connect every one of these neurons, then the weight matrix, the number of parameters in the weight matrix, would be 3,072 times 4,704, which is about 14 million. So that's just a lot of parameters to train. Today you can train neural networks with even more parameters than 14 million, but considering that this is just a pretty small image, this is a lot of parameters to train. And of course, if this were a 1,000 by 1,000 image, then this weight matrix would just become infeasibly large.

Â 2:05

But if you look at the number of parameters in this convolutional layer, each filter is 5 by 5. So each filter has 25 parameters, plus the bias parameter, which means you have 26 parameters per filter. And you have 6 filters, so the total number of parameters is 6 times 26, which is equal to 156 parameters. And so the number of parameters in this ConvLayer remains quite small.
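To double-check the arithmetic, here is a quick sketch in Python that reproduces both counts, following the lecture's per-filter count of 5 by 5 weights plus one bias:

```python
# Fully connected: every one of the 32*32*3 inputs connects to
# every one of the 28*28*6 outputs.
fc_inputs = 32 * 32 * 3        # 3,072
fc_outputs = 28 * 28 * 6       # 4,704
fc_params = fc_inputs * fc_outputs
print(fc_params)               # 14,450,688 -- about 14 million

# Convolutional layer: 6 filters, each with 5*5 weights plus 1 bias,
# counted as in the lecture.
conv_params = 6 * (5 * 5 + 1)
print(conv_params)             # 156
```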

And the reason that a ConvNet can get away with so few parameters comes down to two things: one is parameter sharing.

Â 2:40

And parameter sharing is motivated by the observation that

Â feature detector such as vertical edge detector that's useful in one part of

Â the image is probably useful in another part of the image.

Â And what that means is that if you've figured out say a 3 by 3 filter for

Â detecting vertical edges, you can then apply the same 3 by 3 filter over here,

Â and then the next position over, and in the next position over and so on.

Â And so each of these feature detectors, each of these outputs

Â can use the same parameters in lots of different positions in your input image

Â in order to detect say a vertical edge or some other feature.

Â 3:22

And I think this is true for low-level features like edges, as well as for higher-level features, like maybe detecting the eye that indicates a face or a cat or something is there. But being able to share, in this case, the same nine parameters to compute all 16 of these outputs is one of the ways the number of parameters is reduced. It also just seems intuitive that a feature detector, like a vertical edge detector, computed for the upper left-hand corner of the image seems like it will probably be useful, or at least has a good chance of being useful, for the lower right-hand corner of the image as well. So maybe you don't need to learn separate feature detectors for the upper left-hand and lower right-hand corners of the image. Maybe you do have a data set where the upper left-hand corner and the lower right-hand corner have somewhat different distributions, so they look a little bit different, but they might be similar enough that sharing feature detectors all across the image works just fine.

The second way that ConvNets get away with having relatively few parameters is by having sparse connections. Here's what I mean. If you look at this 0, it is computed via a 3 by 3 convolution, and so it depends only on this 3 by 3 grid of input cells. So it is as if this output unit on the right is connected only to 9 out of these 6 by 6, or 36, input features. And in particular, the rest of these pixel values do not have any effect on that output. So that's what I mean by sparsity of connections. As another example, this output depends only on these nine input features, and so it is as if only those nine input features are connected to this output, and the other pixels just don't affect this output at all. And so, through these two mechanisms, a neural network has a lot fewer parameters, which allows it to be trained with smaller training sets, and it's less prone to overfitting.
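Here is a minimal sketch of that sparsity in plain Python (the 6 by 6 input values and the edge-detecting kernel are just illustrative): each output of a valid 3 by 3 convolution depends on only 9 of the 36 inputs, so changing a pixel outside that 3 by 3 patch leaves the output untouched.

```python
def conv2d_valid(image, kernel):
    """Valid 2D cross-correlation: each output sums one k-by-k patch times the kernel."""
    n, k = len(image), len(kernel)
    out = n - k + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(out)]
            for i in range(out)]

image = [[(i * 6 + j) % 7 for j in range(6)] for i in range(6)]  # any 6x6 input
kernel = [[1, 0, -1]] * 3  # a simple vertical edge detector

out = conv2d_valid(image, kernel)  # 4x4 output
top_left = out[0][0]

# Changing a pixel far from the top-left 3x3 patch...
image[5][5] += 100
# ...leaves the top-left output unchanged: it "sees" only 9 of the 36 inputs.
assert conv2d_valid(image, kernel)[0][0] == top_left
```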

And sometimes you also hear about convolutional neural networks being very good at capturing translation invariance. That's the observation that a picture of a cat shifted a couple of pixels to the right is still pretty clearly a cat.

Â 5:51

And the convolutional structure helps the neural network encode the fact that an image shifted a few pixels should result in pretty similar features, and should probably be assigned the same output label. And the fact that you're applying the same filter across all positions of the image, in both the early layers and the later layers, helps your neural network automatically learn to be more robust, so as to better capture this desirable property of translation invariance.
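A small sketch of why applying the same filter everywhere helps with shifts, reduced to one dimension for brevity (the signal and filter values here are just illustrative): if you shift the input, the convolution's output shifts with it, so the same feature is detected at the new position.

```python
def conv1d_valid(x, w):
    """Valid 1D cross-correlation with filter w."""
    k = len(w)
    return [sum(x[i + a] * w[a] for a in range(k))
            for i in range(len(x) - k + 1)]

x = [3, 1, 4, 1, 5, 9, 2, 6]  # a toy input signal
w = [1, 0, -1]                # a simple edge-detecting filter

shifted = [0] + x[:-1]        # shift the signal one step to the right

y = conv1d_valid(x, w)
y_shifted = conv1d_valid(shifted, w)

# The features just move with the input: same filter, shifted response.
assert y_shifted[1:] == y[:-1]
```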

Â 6:58

And let's say you've chosen a convolutional neural network structure: maybe it starts with the image, then has convolutional and pooling layers, and then some fully connected layers, followed by a softmax output that outputs y hat. The ConvLayers and the fully connected layers will have various parameters, w, as well as biases b. Any setting of the parameters therefore lets you define a cost function similar to what we have seen in the previous courses. Having randomly initialized the parameters w and b, you can compute the cost J as the sum of the losses of the neural network's predictions on your entire training set, maybe divided by m.
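As a sketch of that cost, assuming a softmax output with the cross-entropy loss (one common choice; the lecture doesn't pin down the loss here): J is the average of the per-example losses over the m training examples.

```python
import math

def cross_entropy(y_hat, y):
    """Loss for one example: -log of the probability assigned to the true class."""
    return -math.log(y_hat[y])

def cost(predictions, labels):
    """J = (1/m) * sum of per-example losses over the training set."""
    m = len(labels)
    return sum(cross_entropy(p, y) for p, y in zip(predictions, labels)) / m

# Hypothetical softmax outputs for m = 2 examples over 3 classes.
preds = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
labels = [0, 1]
J = cost(preds, labels)  # average of -log(0.7) and -log(0.8)
```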

Â 7:50

So to train this neural network, all you need to do is use gradient descent, or some other algorithm like gradient descent with momentum, or RMSprop, or Adam, or something else, in order to optimize all the parameters of the neural network to try to reduce the cost function J. And you'll find that if you do this, you can build a very effective cat detector or some other detector.
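At the heart of all of those optimizers is the basic gradient descent update, w := w - learning_rate * dJ/dw. Here is a minimal sketch on a stand-in one-parameter cost (a simple quadratic, not a real ConvNet's J), just to show the update driving the cost down:

```python
def grad_descent(dJ, w0, lr=0.1, steps=100):
    """Repeatedly apply w := w - lr * dJ/dw to drive the cost down."""
    w = w0
    for _ in range(steps):
        w = w - lr * dJ(w)
    return w

# Stand-in cost J(w) = (w - 3)^2, with gradient dJ/dw = 2 * (w - 3).
dJ = lambda w: 2 * (w - 3)
w = grad_descent(dJ, w0=0.0)
# w converges toward the minimizer w = 3.
```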

Â 8:19

So congratulations on finishing this week's videos. You've now seen all the basic building blocks of a convolutional neural network, and how to put them together into an effective image recognition system. In this week's programming exercises, I think all of these things will become more concrete, and you'll get a chance to practice implementing them yourself and seeing them work for yourself.

Next week, we'll continue to go deeper into convolutional neural networks. I mentioned earlier that there are just a lot of hyperparameters in convolutional neural networks. So what I want to do next week is show you a few concrete examples of some of the most effective convolutional neural networks, so you can start to recognize the patterns of what types of network architectures are effective. One thing that people will often do is just take an architecture that someone else has found and published in a research paper, and use that for their own application. And so by seeing some more concrete examples next week, you'll also learn how to do that better.

And beyond that, next week we'll also try to build more intuition about what makes ConvNets work well. Then, in the rest of the course, we'll see a variety of other computer vision applications, such as object detection and neural style transfer, which is how to create new forms of artwork using these types of algorithms. So that's it for this week. Best of luck with the homeworks, and I look forward to seeing you next week.
