So let's now see how Newton's method may be applied to multivariate nonlinear programs. So our f now has n input variables and is still twice differentiable. Then all we need to do is generalize the idea of single-dimensional derivatives to gradients and Hessians. I'm not going to go through all the details, but you may understand that we are trying to build the quadratic approximation of f at any given point. Okay, so we still want to do the Taylor-expansion thing; it's just that now we are doing it in an n-dimensional space. You still have f, the first-order derivative of f, and the second-order derivative of f. It's just that the first-order derivative now is the gradient, and your x − x_k is a vector now, so to multiply two vectors you really need to take the transpose of one of them. Your second-order derivative now is the Hessian, which is a matrix, and it is multiplied by a vector on each side, so with some transposing you eventually get just one value. The quadratic approximation is f(x) ≈ f(x_k) + ∇f(x_k)^T (x − x_k) + (1/2)(x − x_k)^T ∇²f(x_k) (x − x_k). Okay, so I hope you have some linear algebra background and you may verify that each term here is a scalar, and that this somehow represents the quadratic approximation of f. Okay, so once we have that, we are ready to derive the formula for Newton's method: we move from x_k to x_{k+1} by moving to the global minimum of the quadratic approximation. And don't forget again what we are trying to do: for this particular function, suppose it has an upward curvature. Then we go to the global minimum by looking for the point where the gradient of the approximation is zero. Okay, and if it has a downward curvature instead, of course the result would just be a weird solution. So if we want the gradient to be zero here, then pretty much we are getting this particular equation. And don't forget to draw the analogy between this multivariate case and the previous single-variable one. For the single-variable one, your x_{k+1} is x_k minus a ratio.
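As a hedged sketch (my own Python, not from the lecture), the Newton update x_{k+1} = x_k − [∇²f(x_k)]⁻¹ ∇f(x_k) might be written out for two variables like this, with the 2×2 inverse done by hand so nothing beyond plain Python is needed:

```python
# A minimal sketch of one multivariate Newton step for n = 2 variables:
#   x_{k+1} = x_k - [Hessian f(x_k)]^{-1} * gradient f(x_k).
# The function names and calling conventions here are my own choices.

def newton_step_2d(grad, hess, x):
    """grad(x) -> (g1, g2); hess(x) -> ((h11, h12), (h21, h22))."""
    g1, g2 = grad(x)
    (h11, h12), (h21, h22) = hess(x)
    det = h11 * h22 - h12 * h21      # we assume the Hessian is invertible
    # Multiply the inverse of the 2x2 Hessian by the gradient vector,
    # then subtract that step from the current point.
    s1 = (h22 * g1 - h12 * g2) / det
    s2 = (-h21 * g1 + h11 * g2) / det
    return (x[0] - s1, x[1] - s2)
```

For instance, for f(x1, x2) = x1² + x2², whose gradient is (2x1, 2x2) and whose Hessian is diag(2, 2), a single step from (3, 4) lands exactly on the minimizer (0, 0).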
You have the second-order derivative at the bottom and the first-order derivative at the top. So here you still have the first-order derivative on top, but to somehow move your second-order derivative to the denominator, for a matrix, what you do is take the inverse: x_{k+1} = x_k − [∇²f(x_k)]⁻¹ ∇f(x_k). Okay, and this may also be observed if you look at the equation and solve for x_{k+1}: indeed, you need to move something to the right-hand side, and then you also need to take the inverse of this particular Hessian matrix. So you may naturally want to ask: well, how do I know the Hessian matrix is invertible? That's beyond the scope of this course. As long as your function has a nice behavior, I mean your objective function, then your Hessian matrix will be invertible, but I'm going to ignore the details here. Okay, so now let's again try a numerical example. Let's try to minimize this function, which is a fourth-order function. Say we want to solve this problem; maybe you still have some memory of this function, so we may still do some analysis first. The optimal solution is (0, 0). Then we have our gradient, and now we don't want to use just the gradient, we also want to use the Hessian, so we first get our Hessian matrix. Let's say we start at a point (b, b) for some positive b. Then you plug (b, b) into your gradient to evaluate it, and you plug it into your Hessian matrix to evaluate that as well. Then you may apply the formula, and from (b, b) you are going to move to (2/5 b, 2/5 b). And if you try this again, you are going to see it becomes ((2/5)² b, (2/5)² b) after you do one more iteration, right? So for this particular example, our x_k may actually be expressed in closed form as a function of k and b directly: x_k = ((2/5)^k b, (2/5)^k b). So this is nothing but showing you: if you want to apply Newton's method numerically, what do you do?
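Since the slide's exact quartic isn't reproduced in the transcript, here is a hedged sketch of the same iteration pattern on a quartic of my own choosing, f(x1, x2) = x1⁴ + x2⁴; starting from (b, b), its iterates contract by a factor of 2/3 per step rather than the 2/5 of the lecture's function, but the closed-form behavior x_k = c^k · (b, b) is the same idea:

```python
# Hypothetical example quartic -- NOT the function from the lecture slide.
# f(x1, x2) = x1**4 + x2**4, minimized at (0, 0).
def grad(x):
    return (4 * x[0] ** 3, 4 * x[1] ** 3)

def hess(x):
    return ((12 * x[0] ** 2, 0.0), (0.0, 12 * x[1] ** 2))

def newton(x, iters):
    """Run `iters` Newton updates, inverting the 2x2 Hessian by hand."""
    for _ in range(iters):
        g1, g2 = grad(x)
        (h11, h12), (h21, h22) = hess(x)
        det = h11 * h22 - h12 * h21
        x = (x[0] - (h22 * g1 - h12 * g2) / det,
             x[1] - (-h21 * g1 + h11 * g2) / det)
    return x
```

For this f, each step maps x to (2/3)x, so from (b, b) the k-th iterate is ((2/3)^k b, (2/3)^k b); for example, one iteration from (1.5, 1.5) gives (1.0, 1.0).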
Get the gradient, get the Hessian, get the inverse of the Hessian, get the formula, and then you're done. All you need to do with Newton's method is get this formula and then keep updating, updating, updating until you converge. So finally, we have some remarks on Newton's method. Newton's method does not have the step-size issue, so that's one thing that is good. And in many cases it may be fast: for example, Newton's method finds an optimal solution in one iteration if your function is already quadratic and has a minimum point. But unfortunately, it may fail to converge for some functions; as in the previous example we've shown, the objective may actually increase during the iterations, which may seem somewhat weird. So a short general statement for Newton's method is this: if your function looks nice, if you know it has an upward curvature and it's just that you don't have a closed-form solution, then Newton's method may actually get to your optimal solution faster, because it uses more information, the second-order information. But because there is no step size, it is harder to control the process of Newton's method. Okay, so there are some things that are good and some that are bad; for different situations you may use different algorithms. There are certainly more issues in general. Nonlinear programming algorithms by themselves could be offered as a full-semester course, and we don't have the time, so we only stop here. What other issues do people want to study in general? First, how may we guarantee that an algorithm converges? What's the condition for us to guarantee convergence? Maybe we also need to do some analysis of convergence speed, because as long as you are talking about algorithms, you care about efficiency. Okay, but that's also too much, and we are unable to cover it. Still, we gave you some intuition about how to compare the speed of gradient descent and Newton's method.
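As a quick check of the remark that Newton's method solves a quadratic with a minimum point in one iteration, here is a small sketch on a convex quadratic of my own choosing (not from the lecture):

```python
# For a convex quadratic, one Newton step lands exactly on the minimizer.
# Take f(x1, x2) = x1**2 + 3*x2**2 - 4*x1 + 6*x2, whose minimizer is (2, -1).
def newton_step(x):
    g = (2 * x[0] - 4, 6 * x[1] + 6)   # gradient of f
    h11, h22 = 2.0, 6.0                # Hessian is constant: diag(2, 6)
    return (x[0] - g[0] / h11, x[1] - g[1] / h22)
```

From any starting point, say (10, 10), a single call returns (2, −1): the quadratic approximation that Newton minimizes *is* the function itself, so the step is exact.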
And sometimes you face non-differentiable functions, so you don't have a gradient. Okay, then you need something else; for some problems, people use subgradients. So that's again something that we don't have time to talk about in this course, but anyway, there are ways to deal with non-differentiable functions. And finally, of course, we need some way to deal with constrained optimization. But still, all I wanted to do today is give you some foundations of nonlinear programming algorithms for unconstrained problems. For constrained optimization, in this particular course we are unable to give you any algorithms, but there is still some analysis we may do and some insights we may obtain; we need to ask you to wait for further lectures. So for this lecture, we're pretty much done. Gradient descent and Newton's method may be used to solve unconstrained problems, and there are still a lot of things we could teach, but we want to stop here to give you just the main focus and the main ideas. So that's pretty much all I have. Thank you.
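As a final hedged sketch of the subgradient idea mentioned above (my own illustration, not covered in the lecture): for f(x) = |x|, which is not differentiable at 0, any value in [−1, 1] is a subgradient at 0, and stepping along a subgradient with a diminishing step size drives the iterates toward the minimizer.

```python
# Subgradient method on f(x) = |x|, minimized at x = 0.
# A subgradient is sign(x) away from 0, and any value in [-1, 1] at 0;
# with step sizes 1/k the iterates approach 0 (though they oscillate).
def subgradient_method(x0, iters):
    x = x0
    for k in range(1, iters + 1):
        g = 1.0 if x > 0 else (-1.0 if x < 0 else 0.0)  # a subgradient of |x|
        x = x - g / k                                   # diminishing step size
    return x
```

Unlike gradient descent on a smooth function, the iterates here never settle exactly at 0; they cross back and forth with ever-smaller amplitude, which is why diminishing (rather than fixed) step sizes are the standard choice for subgradient methods.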