In the last lecture, we talked about the camera projection model, the model of perspective projection. Today, we're going to talk about camera calibration: how to characterize a camera and how to find its parameters, whether it be the focal length, the image center, the image distortion, or even the position of the camera with respect to a fixed reference frame.

Let us look again at the projection matrix. The projection matrix is a 3 by 4 matrix P which, if we multiply it with the world coordinates, gives us lambda times (u, v, 1), where u and v are the pixel position. We have to think about this the following way. Suppose somebody gives us the world coordinates of some point. If P were known, we would immediately get u and v. Now suppose somebody gives us u and v instead. Then, even if P were known, we would not be able to immediately get Xw, Yw, Zw, because there is one unknown left: lambda, the unknown depth. Instead, we get a ray in space through the center of projection and the pixel. But in both cases, we need P, the 3 by 4 projection matrix. This 3 by 4 projection matrix depends on the camera parameters, the f, the u0 and v0, and on the parameters with respect to the world, what we call the extrinsic parameters, R and t.

In addition to those parameters, we might encounter some other parameters which cannot be modeled linearly, not with the nice linear algebraic equation we saw in the last lecture. This happens when we have lenses like fisheye lenses, lenses which have a very big field of view. You might have already observed in cameras like the GoPro that straight lines in the world no longer appear as straight lines in your picture; you get really curved lines. But these curved lines have a particular characteristic: all of them are distorted with respect to the center. We call this distortion the radial distortion, which means that a pixel is displaced proportionally to its radius from the center. We model this radial distortion, as opposed to the perspective distortion that is natural to perspective projection, with a polynomial: 1 + k1 r + k2 r^2 + k3 r^3, and so on. These k1, k2, k3 are unknown parameters, which we have to find with a procedure called calibration. Here r is the radius, which can easily be found as the square root of u^2 + v^2, with u and v measured relative to the image center.

Calibration estimates the intrinsic parameters of a camera. In the past, when people didn't have computers, they would try to obtain these parameters just by looking at the specifications of the camera. For example, the focal length would be given in millimeters; one would know that the film has a dimension of 35 millimeters, and one would use this focal length directly to compute the projection. The image center you could find with some adjustment procedure, and k1, k2 again by measuring something directly on the film or on a print from the camera. These days, having computers and a lot of software for geometric transformations and optimization, we can do all of this on a computer, so we no longer rely on the camera specifications. Instead, we apply a procedure called calibration to estimate the focal length f, which is measured in pixels; the image center, in pixels; and k1, k2, etc., the unitless parameters which describe the radial distortion.
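To make the forward mapping and the distortion model concrete, here is a minimal sketch in Python with NumPy. It is not from the lecture: the matrix P = K [R | t] and all numeric values (f, u0, v0, the k coefficients, the world point) are invented for illustration.

```python
import numpy as np

# Hypothetical intrinsics: f in pixels on the diagonal, image center (u0, v0).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                            # camera aligned with the world frame
t = np.array([[0.0], [0.0], [5.0]])      # world origin 5 units in front

P = K @ np.hstack([R, t])                # the 3x4 projection matrix

Xw = np.array([0.5, -0.2, 2.0, 1.0])     # a world point in homogeneous form
uvw = P @ Xw                             # this is lambda * (u, v, 1)
u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]  # divide by lambda to get pixels

# Radial distortion about the image center, using the polynomial above;
# the coefficients are made-up example values.
k1, k2, k3 = 1e-7, 0.0, 0.0
du, dv = u - K[0, 2], v - K[1, 2]
r = np.hypot(du, dv)                     # radius from the image center
scale = 1 + k1*r + k2*r**2 + k3*r**3
u_d = K[0, 2] + scale * du
v_d = K[1, 2] + scale * dv
print((u, v), (u_d, v_d))
```

Note that the reverse direction is not a point: fixing (u, v) only constrains the point to lie somewhere on the ray through the center of projection, exactly as described above.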
These are the intrinsic parameters, as opposed to the extrinsics, which depend on where the camera is with respect to a fixed frame.

Let's see what the result of such a calibration looks like. On the left, we see a picture taken with a very wide field of view camera; all the lines in the picture appear really curved. On the right, you see the result after removing the radial distortion: all the lines appear straight. This is what we want in many applications, because the radial distortion distracts us from the linearity of our equations. It is a nonlinear effect which we try to remove before we make any geometric computations.

Let's see now how exactly we perform this calibration. To perform it, we need to compute the intrinsic and extrinsic parameters given the world coordinates of some points and the corresponding coordinates in images. Just to remind you: after we find the rotation, the translation, and the intrinsics, that is R, t, and K, and after we have removed the radial distortion, we can find the projection of rays in the world and do any computation we want, for example compute the motion of the camera or triangulate a point.

Let's see how we calibrate in Matlab. We are going to use what is called the Calibration Toolbox. After we have downloaded the calibration toolbox, and while in the directory where it lies, we call a GUI, a graphical user interface. Then we have a menu where we select to load our images. These are images of a checkerboard, and there can be, for example, 8, 10, 12, or 20 of them. We have saved them in a directory, and we load them all into memory.

What exactly do we want to find? Remember that we want to find the matrix parameters and the radial distortion parameters. We know the world coordinates, because we have measured our checkerboard, and we know the pixel coordinates if we click on them. Now, this procedure saves us some clicking in the step called extracting corners. When we are extracting corners, every image is presented to us, and we use the pointer to select four characteristic points. We are going to see later, in the lecture about projective transformations, why we select these particular four points. The upper left point, the first one that we click, always has to be at the same position and visible. The other points can be, for example, closer or deeper in the image. The images have to be taken from several viewpoints, so that the checkerboard appears quite oblique and varies in appearance across all the images: if possible, not only rolling it around the optical axis, but also tilting and panning it.

After we have extracted the corners of the checkerboard, what this procedure really does internally is compute all the corner pixels, all the intersections of the checkerboard, in the images. So instead of having us click on all 64 positions, we click only 4, and it computes the 64 by itself. In the next step, we click Calibrate. What really happens at this point is that, given the world coordinates and the pixel coordinates, the software computes the camera parameters. You will see these camera parameters as a vector: the focal length, the image center c, the radial distortion coefficients, and then a rotation matrix and a translation.
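The lecture uses the Matlab Calibration Toolbox GUI; as a rough, self-contained analogue, here is a sketch of the same workflow with OpenCV in Python. The checkerboard dimensions, square size, and image directory are assumptions, and OpenCV detects all corners automatically rather than asking for four clicks. Note also that OpenCV parameterizes the radial distortion in even powers of r (1 + k1 r^2 + k2 r^4 + k3 r^6, plus tangential terms), a slightly different convention from the polynomial written above.

```python
import glob
import numpy as np
import cv2

# Hypothetical setup: a board with 9x6 inner corners and 25 mm squares,
# photographed from several oblique viewpoints.
pattern = (9, 6)
square = 25.0  # mm

# World coordinates of the corners: the board itself defines the plane Z = 0.
board = np.zeros((pattern[0] * pattern[1], 3), np.float32)
board[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square

world_pts, image_pts = [], []
for fname in glob.glob("calib_images/*.png"):   # hypothetical directory
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        # Refine the detected corners to sub-pixel accuracy.
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        world_pts.append(board)
        image_pts.append(corners)

# Given world and pixel coordinates, estimate K, the distortion
# coefficients, and one (R, t) pose per image.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    world_pts, image_pts, gray.shape[::-1], None, None)
print("RMS reprojection error (pixels):", rms)
print("K =\n", K)
```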
A small detail here which we have not presented: for the focal length, we actually see two parameters, a different focal length for x and a different focal length for y. This stems from the time when pixels were not really square. In old TV, the pixels were rectangular, and that resulted in two different scaling factors for x and y. So we still keep both, and usually these two values, the way you see them here, are almost equal up to some decimal places.

Another interesting phenomenon you are going to see is that the (u0, v0) shown there is not the exact center of the image. Indeed, it might be true that the optical axis does not exactly intersect the chip in the middle, because the chip might not have been positioned exactly. But it might also just be an effect of the optimization that happens here, where all parameters are estimated in a way that minimizes the reprojection error. The reprojection error, the last two values given in pixels, is something you really want to be very small. This is the optimization criterion for finding the parameters: if we take the points in the world and apply the estimated parameters, we should fall very, very close to the original pixel positions.

As a result of this calibration, you can also see all the estimated poses of the camera. You might have taken the images by waving the calibration board, the checkerboard, in front of you, or by moving the camera in front of a fixed checkerboard; these two relative motions are exactly the same. So what we really see in the visualization are just the calibration poses with respect to a fixed coordinate system. The R and t you find from this procedure is the pose of the camera at the very first image, where you were holding it when you started taking the images.

So this is what you get out of the calibration procedure, and you save these parameters. You will use them in many other problems, like absolute pose estimation, structure from motion, triangulation, or stereo. For verification, what you can do is take all the parameters you have found and press Reproject Coordinates, which is going to show the undistorted image, without the radial distortion, and you will see how close the points in the world are projected to the original points in the image. You can do this also on other pictures and see how close this reprojection is to the original projection in your images.
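As a rough illustration of this verification step, here is a continuation of the Python/OpenCV sketch above (not the Matlab toolbox itself). It reprojects the measured world points with the estimated parameters to check the reprojection error, and undistorts one image; the file names are hypothetical.

```python
# Continuing from the calibration sketch: reproject the world points with
# the estimated parameters and compare against the clicked/detected pixels.
total_err, total_pts = 0.0, 0
for board_pts, img_pts, rvec, tvec in zip(world_pts, image_pts, rvecs, tvecs):
    proj, _ = cv2.projectPoints(board_pts, rvec, tvec, K, dist)
    total_err += np.sum((proj - img_pts) ** 2)
    total_pts += len(proj)
print("RMS reprojection error:", np.sqrt(total_err / total_pts), "pixels")

# Remove the radial distortion from one image: straight lines in the
# world should now appear straight in the result.
img = cv2.imread("calib_images/img01.png")        # hypothetical file
undistorted = cv2.undistort(img, K, dist)
cv2.imwrite("img01_undistorted.png", undistorted)
```

If the calibration is good, the reprojected points land within a fraction of a pixel of the originals, which is exactly the criterion the optimizer minimized.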