Now we are ready to solve the nonlinear least squares problem. The typical problem setting is: we want to minimize the norm of f(x) minus b, where f(x) is a vector of dimension m, one entry for every constraint we have. b is a corresponding vector of dimension m as well. x is also a vector; it's the solution we are seeking. If we're trying to figure out the position of a camera, we need to find the three-dimensional position in x, y, z, as well as the orientation, represented as three rotation angles. So, that's x. First, we can take the vector form of f(x) and b and expand the squared norm out, as shown in the equation here. Since the term bᵀb is a constant that does not depend on x, we obtain a slightly simpler form of the problem, where we want to minimize f(x)ᵀf(x) minus 2bᵀf(x). Note that f(x) is a nonlinear function, and that's where our difficulty lies. We'll look at a simple problem here: a two-dimensional nonlinear least squares problem, where we want to localize ourselves in a two-dimensional world relative to a set of beacons. We have here five beacons, and we imagine we can measure the distance from ourselves to each of the beacons, as illustrated here. So, we're given the distances to the five beacons, and we want to localize ourselves in x, y. That gives us five constraints, so f(x) is a vector of dimension five, and the x we're looking for is a two-dimensional vector; x is of dimension two. So, we can minimize the cost function E defined as f(x)ᵀf(x) minus 2bᵀf(x). That's illustrated here on the right as well. We enumerate every possible x location in this space: we scan through every possible location, and for every x we compute the error function E. It's color coded here: red is a large value, blue is a small value.
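The beacon setup described above can be sketched in a few lines of NumPy. The beacon coordinates and the true position below are made-up values for illustration; the sketch just checks that the expanded cost f(x)ᵀf(x) − 2bᵀf(x) differs from the squared residual ‖f(x) − b‖² only by the constant bᵀb.

```python
import numpy as np

# Hypothetical beacon positions (five beacons in a 2D world) and a true pose.
beacons = np.array([[0.0, 0.0], [4.0, 1.0], [1.0, 5.0], [5.0, 4.0], [2.5, 2.5]])
x_true = np.array([3.0, 2.0])

def f(x):
    """f(x): vector of distances from position x to each beacon (dimension m = 5)."""
    return np.linalg.norm(beacons - x, axis=1)

b = f(x_true)  # measured distances (noise-free here)

def E(x):
    """Expanded cost f(x)^T f(x) - 2 b^T f(x); differs from ||f(x)-b||^2 only by b^T b."""
    fx = f(x)
    return fx @ fx - 2.0 * b @ fx

x = np.array([1.0, 4.0])                       # some arbitrary query location
residual_sq = np.sum((f(x) - b) ** 2)
assert np.isclose(E(x) + b @ b, residual_sq)   # the two cost forms agree up to b^T b
```

Minimizing E is therefore equivalent to minimizing the squared residual, which is why the constant term can be dropped.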
So, if we had infinite compute power, we could always find the solution to this problem by enumerating all possible solutions: for every x and y, we compute the cost function E and plot it out. As you can see, this cost surface is nonlinear. It has places where it goes down, and places where it goes up and then down again. We can also see where the local as well as the global solutions are. The global solution is shown in the lower left; this is the deepest valley of the cost function. At the top right, there's also a valley, which is a local solution. So, imagine little ants crawling on this energy function; we need to find the local as well as the global minima of this cost function. How do we do that? The two problems we want to solve are the following. The first problem is to know when we are at a local or global minimum. The second problem is: if I'm at a location that is not a local or global minimum, which direction should I go to decrease the cost function? So again, imagine you're an ant crawling on this energy function. Everywhere you go, you know the height where you are, and you need to figure out whether you have reached a minimum, and therefore should stop the search, or if not, which direction to go. The first problem is relatively easy to solve. We know we are at a local or global minimum if the gradient at that point is zero, meaning that whichever way I go, I can't decrease my cost function anymore. If I'm on a slope of the cost function, the gradient will be nonzero, so I should at least follow the gradient and walk down. But at some point, I can't go down anywhere; the gradient is zero, a flat point. And this is exactly a point where we could be at a global minimum or a local minimum. Mathematically, this is defined as the derivative of the cost function E with respect to the variable x, and that gradient vector being equal to zero.
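The brute-force enumeration described above can be sketched as a simple grid scan, using the same made-up beacon layout as before. Every grid cell is evaluated and the cell with the smallest cost is kept; the grid spacing (0.05 here) is an arbitrary choice.

```python
import numpy as np

# Brute-force enumeration of the cost surface: scan every (x, y) on a grid,
# evaluate E, and keep the smallest value found. Beacon layout is hypothetical.
beacons = np.array([[0.0, 0.0], [4.0, 1.0], [1.0, 5.0], [5.0, 4.0], [2.5, 2.5]])
b = np.linalg.norm(beacons - np.array([3.0, 2.0]), axis=1)  # distances at the true pose

def E(x):
    fx = np.linalg.norm(beacons - x, axis=1)
    return fx @ fx - 2.0 * b @ fx

xs = np.linspace(-1.0, 6.0, 141)   # 0.05 spacing
ys = np.linspace(-1.0, 6.0, 141)
best, best_xy = np.inf, None
for gx in xs:
    for gy in ys:
        e = E(np.array([gx, gy]))
        if e < best:
            best, best_xy = e, (gx, gy)

print(best_xy)  # the grid cell at the true pose (3, 2)
```

Even on this tiny 2D problem, the scan touches almost 20,000 candidate locations; in higher dimensions this enumeration becomes hopeless, which is the motivation for the gradient-based method that follows.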
So, we can take the derivative of the cost function E with respect to x and expand the formula further out. We can see this formula consists of three quantities. The first is what we call the Jacobian matrix, the matrix of partial derivatives of f with respect to x. f, recall, is a vector of dimension m; in this case, with five beacons, a five-dimensional vector. And x is a vector of dimension two, in this case (x, y). The Jacobian matrix is, in this case, a five-by-two matrix. It tells me, as I move in x space, how much the function, with respect to each of the beacons, is changing. Every column of the Jacobian matrix tells you, for that particular variable, how much the error function increases or decreases with respect to each of the constraints I have. So, that's the Jacobian matrix. The other two quantities are simply f(x), which, if I'm an ant crawling in this space, is like the height of the ant relative to the ground, and b, the constraint vector. Now, the second question we have is: if I'm crawling in this space and I'm not at a local or global minimum, which direction should I go? The direction is fairly easy to figure out: we should go down the slope, in the direction that changes the cost function maximally, meaning that I need to follow the gradient direction. The harder question is how far I should go along the gradient direction. Should I just move a little bit, or should I move a lot? This question can be answered by doing the following Taylor expansion at the point x we're currently at. Recall the Taylor expansion: f(x + Δx) can be written as a linear function, f(x), which is the current value, plus the Jacobian matrix times Δx.
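For the beacon distances, each row of the Jacobian can be written in closed form: the derivative of ‖x − pᵢ‖ with respect to x is the unit vector (x − pᵢ)/‖x − pᵢ‖. A small sketch, again assuming the hypothetical beacon layout, computes this five-by-two matrix and checks it against finite differences:

```python
import numpy as np

# Analytic Jacobian of the beacon-distance function f at a point x (5x2 matrix),
# verified against a central finite-difference approximation.
beacons = np.array([[0.0, 0.0], [4.0, 1.0], [1.0, 5.0], [5.0, 4.0], [2.5, 2.5]])

def f(x):
    return np.linalg.norm(beacons - x, axis=1)

def jacobian(x):
    diff = x - beacons              # row i: x - p_i
    return diff / f(x)[:, None]     # row i: (x - p_i) / ||x - p_i||

x = np.array([1.0, 4.0])
J = jacobian(x)
print(J.shape)  # (5, 2): five constraints, two variables

# Finite-difference check of each column.
eps = 1e-6
J_num = np.column_stack([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                         for e in np.eye(2)])
assert np.allclose(J, J_num, atol=1e-5)
```

Each row being a unit vector pointing away from its beacon matches the intuition above: moving directly away from beacon i increases that distance at rate one, while moving perpendicular to it changes the distance not at all.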
Δx, again, is a vector along the direction we move. That's the Taylor expansion, and the beauty of the Taylor expansion is that we no longer have the nonlinear function f(x + Δx); it becomes a linear approximation: the current value plus the Jacobian times Δx. Now, we take this Taylor expansion and substitute it back into the previous equation. We obtain this set of equations here: the Jacobian transpose multiplied by the Taylor expansion, minus the Jacobian transpose times the constraint vector b. That's the derivative we want to set to 0, indicating a local or global minimum. Reshuffling the terms one more time, we obtain the following equation. On the left-hand side, we have the Jacobian matrix transposed and multiplied by itself, JᵀJ, times Δx, where Δx is how far I should move and in which direction, to decrease my cost function. On the right-hand side, we have the Jacobian transpose times the difference between the constraint vector and the current value, Jᵀ(b − f(x)). You can think of that difference as the error function itself, not squared. The unknown we seek here is Δx: it tells me which direction to go and how far to go in that direction. This is very similar to the linear least squares problem that we saw earlier, and the solution is, in fact, the same. With the Δx computed, we then move ourselves in that direction, by exactly that length, arriving at a new point, and we repeat the procedure again. So, this slide shows the Δx we need to move, in terms of the Jacobian matrix and the error we have at the current location. Again, for this beacon example, f(x) is simply the Euclidean distance to each beacon. It's nonlinear because we have squared terms under a square root. And b is simply the measured distance to each of the beacons. We can also compute the Jacobian analytically; it's again a matrix, in this case five by two.
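One step of this update, solving JᵀJ Δx = Jᵀ(b − f(x)), can be sketched as follows, again with the hypothetical beacon layout and an arbitrary starting guess:

```python
import numpy as np

# One Gauss-Newton step for the beacon problem: solve J^T J dx = J^T (b - f(x)).
beacons = np.array([[0.0, 0.0], [4.0, 1.0], [1.0, 5.0], [5.0, 4.0], [2.5, 2.5]])
b = np.linalg.norm(beacons - np.array([3.0, 2.0]), axis=1)  # measured distances

def f(x):
    return np.linalg.norm(beacons - x, axis=1)

def jacobian(x):
    return (x - beacons) / f(x)[:, None]

x = np.array([1.0, 4.0])                          # current guess
J = jacobian(x)
dx = np.linalg.solve(J.T @ J, J.T @ (b - f(x)))   # direction AND length of the move
x_new = x + dx

# The step is large and lands much closer to the true pose (3, 2).
print(np.sum((f(x) - b) ** 2), "->", np.sum((f(x_new) - b) ** 2))
```

Note that, just as described in the lecture, a single step covers most of the distance to the solution; the squared residual drops by an order of magnitude.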
And we can plug in the current value of (x, y) to get a numerical value of this Jacobian. Once it's computed, we can move ourselves in the direction Δx indicated in the slide here. In fact, it's a very large movement, not a tiny step. The gradient tells me that I can go far to the left. And notice, we're not going straight to the global minimum, but we're going fairly close to it. With one more step, we get very close to the global minimum, and with one more step, we actually reach it. So, with this illustration, you can see that we are crawling the space, but we're not crawling blindly. In fact, this ant has a fairly good idea of how to go from one point to another, and it also has an idea of when to stop. The alternative is to search the entire space, sampling hundreds of millions of points before figuring out where the solution is. This method allows me to precisely locate a global or local minimum in a finite number of steps. However, the caveat is that we started in the right part of the cost function. The cost function here is highly nonlinear; if I had started in a different valley, I would have converged to a different point, and that point is what I call a local minimum. In practice, we can start from multiple points, run this gradient method from each, find the local minima, and compare them. So, we see this is a very powerful method, and it's going to be used in a method called bundle adjustment. In bundle adjustment, the variables we are trying to seek are the three-dimensional locations of the points, x, y, z, as well as the camera pose, the rotation and translation. All of them form the vector we seek. For every image feature point we can measure, we have one constraint. So, we can have a thousand constraints for on the order of ten or so variables.
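The full procedure, iterated Gauss-Newton with multiple random restarts to guard against local minima, can be sketched as below. The number of starts, iteration cap, and stopping tolerance are arbitrary choices, and the beacon layout is the same hypothetical one as before.

```python
import numpy as np

# Iterated Gauss-Newton with multiple random starts: a single start can fall
# into a local minimum of the nonlinear cost, so we run from several starts
# and keep the best converged answer.
rng = np.random.default_rng(0)
beacons = np.array([[0.0, 0.0], [4.0, 1.0], [1.0, 5.0], [5.0, 4.0], [2.5, 2.5]])
b = np.linalg.norm(beacons - np.array([3.0, 2.0]), axis=1)

def f(x):
    return np.linalg.norm(beacons - x, axis=1)

def gauss_newton(x, iters=20):
    for _ in range(iters):
        J = (x - beacons) / f(x)[:, None]
        dx = np.linalg.solve(J.T @ J, J.T @ (b - f(x)))
        x = x + dx
        if np.linalg.norm(dx) < 1e-10:   # step vanished: we are at a minimum
            break
    return x

starts = rng.uniform(-1.0, 6.0, size=(8, 2))
solutions = [gauss_newton(s) for s in starts]
costs = [np.sum((f(x) - b) ** 2) for x in solutions]
best = solutions[int(np.nanargmin(costs))]   # ignore any run that diverged
print(best)  # near the true pose (3, 2)
```

Each run converges in a handful of steps, so even with eight restarts this evaluates the cost a few dozen times, versus the tens of thousands of evaluations the grid scan needed.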
By applying this method, we can fine-tune the estimates of rotation and translation such that the reprojection error in the image is minimized. And this allows us to obtain much more precise estimates of the camera pose as well as the three-dimensional point positions.
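A toy sketch of this bundle-adjustment flavor of the same machinery is given below. To keep it short, it refines only the camera translation (rotation fixed to the identity, a deliberate simplification of the full 6-DoF problem), and all points, intrinsics, and poses are made-up values. The Jacobian is computed numerically here rather than analytically.

```python
import numpy as np

# Refine a camera translation t so that the reprojection error of known 3D
# points is minimized, using the same Gauss-Newton update as before.
points = np.array([[0.0, 0.0, 5.0], [1.0, -1.0, 6.0], [-1.0, 2.0, 7.0],
                   [2.0, 1.0, 5.5], [-2.0, -1.0, 6.5]])   # hypothetical 3D points
fx = fy = 500.0        # made-up pinhole intrinsics
cx = cy = 320.0
t_true = np.array([0.3, -0.2, 0.1])

def project(t):
    """Pinhole projection of all points for camera translation t (R = I)."""
    pc = points - t                      # camera-frame coordinates
    u = fx * pc[:, 0] / pc[:, 2] + cx
    v = fy * pc[:, 1] / pc[:, 2] + cy
    return np.column_stack([u, v]).ravel()   # 10-vector: one (u, v) per point

b = project(t_true)                      # observed image measurements (noise-free)

def gauss_newton(t, iters=20, eps=1e-6):
    for _ in range(iters):
        r = b - project(t)
        J = np.column_stack([(project(t + eps * e) - project(t - eps * e)) / (2 * eps)
                             for e in np.eye(3)])     # numerical Jacobian, 10 x 3
        dt = np.linalg.solve(J.T @ J, J.T @ r)
        t = t + dt
        if np.linalg.norm(dt) < 1e-10:
            break
    return t

t_est = gauss_newton(np.zeros(3))
print(t_est)  # converges to t_true = (0.3, -0.2, 0.1)
```

Real bundle adjustment stacks the point coordinates and all six pose parameters of every camera into one large x and exploits the sparsity of JᵀJ, but the update being solved at each step is exactly the one derived in this lecture.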