A common task in computer vision is measuring changes in perspective, or motion, between images. Applications include image registration and stitching, video stabilization, and motion estimation, just to name a few. In this video, you will learn to describe the changes in perspective between images by estimating the geometric transformation.

Geometric transformations for 2D images are based on five types of perspective changes, which we'll illustrate using this image of a stop sign. The types are translation, rotation, scale, shear, and tilt. There are four geometric transformations, each encompassing a different subset of these five types of changes in perspective.

The first and simplest is the rigid transformation. If two images differ only by rotation and translation, then the transformation relating the two is rigid. Note that the size and shape of an object remain constant after a rigid transformation; there should not be any changes in scale, shear, or tilt. Next, there is the similarity transformation, which adds uniform scaling, the same scale factor in the x and y directions, on top of translation and rotation changes. The affine transformation additionally includes shearing, so parallel lines are maintained even though the object may look quite different. Finally, projective is the broadest type of 2D geometric transformation, with tilt, or out-of-plane rotation, included as well.

To perform geometric transformation estimation, you need the locations of matching points in the images, also sometimes known as control points. For example, a matching point pair here could be the bottom-left corner of the letter T in "STOP". Depending on the type of transformation, you will need a different number of matched pairs to compute an estimate. A simpler transformation like rigid needs only two matched pairs, but a complex one like projective needs at least four pairs.
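The four transformation types can be constructed directly in MATLAB. The sketch below assumes the geometric transformation objects introduced in R2022b (rigidtform2d, simtform2d, affinetform2d, projtform2d); on earlier releases, affine2d and projective2d are the closest equivalents, and the example values here are arbitrary illustrations.

```matlab
% Each 2-D transformation type as a MATLAB object (R2022b+ names).
tformRigid = rigidtform2d(30, [10 5]);                   % rotate 30 deg, translate by [10 5]
tformSim   = simtform2d(1.5, 30, [10 5]);                % adds uniform scaling (1.5x)
tformAff   = affinetform2d([1 0.4 10; 0.2 1 5; 0 0 1]);  % adds shear
tformProj  = projtform2d([1 0 10; 0 1 5; 0.001 0 1]);    % adds tilt (perspective)

% Apply a transformation to a point to see where it maps.
[xNew, yNew] = transformPointsForward(tformSim, 100, 50);
```

Each matched point pair contributes two equations, which is why rigid and similarity (3 and 4 unknowns) need two pairs, while projective (8 unknowns) needs at least four.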
However, you usually want many more than the minimal number of matched pairs, because it's often difficult to identify exactly the same location in multiple images. Feature-based matching is a great way to get a lot of matched points, but it usually produces some incorrect matches. Take these matched pairs of points, for example. Because features are matched based on the pixel values in their local neighborhood, there are some matched points where the local areas look very similar but clearly do not refer to the same position on the object.

To make the transformation estimation more robust, an algorithm known as random sample consensus, or RANSAC, is often used. For simplicity, imagine we had only four matched point pairs. The algorithm randomly selects a subset of the matched points, estimates a geometric transformation from those pairs, applies this transformation to all the matched points in the first image to get a new set of locations, and calculates the distances between these transformed points and the actual points. By repeating this process many times, the algorithm identifies and returns the transformation with the least amount of error. It also identifies the matched points that fit the transformation, known as inliers, and the outliers that do not.

Let's now estimate the geometric transformation in MATLAB. Earlier, you saw how to match features between images. Here, we read in the two stop sign images; convert them to grayscale; and detect, extract, and match SURF features. The matched points' coordinates are saved as two variables. To estimate the geometric transformation, use this function. Pass in the matched points from the second image, the corresponding matched points from the first image, and the transformation type as inputs. The stop signs in the images only seem to vary in translation and scale, so similarity is the transformation type here.
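The workflow just described might look like the sketch below. The image filenames and variable names are placeholders, and the estimation function is assumed to be estgeotform2d (introduced in R2022b; estimateGeometricTransform2D is the older equivalent), since the transcript doesn't name the function shown on screen.

```matlab
% Read the two stop sign images and convert them to grayscale
% (filenames are hypothetical placeholders).
original  = rgb2gray(imread("stopSign1.jpg"));
distorted = rgb2gray(imread("stopSign2.jpg"));

% Detect, extract, and match SURF features.
ptsOriginal  = detectSURFFeatures(original);
ptsDistorted = detectSURFFeatures(distorted);
[featOriginal,  validOriginal]  = extractFeatures(original,  ptsOriginal);
[featDistorted, validDistorted] = extractFeatures(distorted, ptsDistorted);
indexPairs = matchFeatures(featOriginal, featDistorted);
matchedOriginal  = validOriginal(indexPairs(:,1));
matchedDistorted = validDistorted(indexPairs(:,2));

% Estimate the similarity transformation mapping points in the second
% image to points in the first. RANSAC runs under the hood, and
% inlierIdx flags the matched pairs that fit the transformation.
[tform, inlierIdx] = estgeotform2d(matchedDistorted, matchedOriginal, "similarity");
```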
This creates a transformation that maps points in the second stop sign image to their corresponding locations in the first. The function produces two outputs: the geometric transformation object and a logical array indicating the matched point pairs that fit this transformation. This vector of inlier indices can be used to remove the erroneous matches found during feature matching.

Using this geometric transformation object, you can warp the second image so it's aligned with the first image using the imwarp function. To compare aligned images, it's helpful to ensure that the output sizes are the same, so use the OutputView option to set the size of the warped image. Now, use the imshowpair function to see the original two images overlaid, and compare that to the result with the warped second image. The images are clearly aligned.

In this video, you learned about the four types of 2D geometric transformations, rigid, similarity, affine, and projective, and how to estimate them in MATLAB given two images with matching point pairs. Next, you get to practice estimating geometric transformations on your own.
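The warping and comparison steps described above can be sketched as follows, continuing from the two grayscale images and the estimated transformation object; the variable names original, distorted, and tform are assumptions, not names given in the video.

```matlab
% Warp the second image into the first image's coordinate frame.
% The OutputView option forces the warped result to have the same
% size as the first image, which makes the overlay easy to compare.
outputView = imref2d(size(original));
recovered  = imwarp(distorted, tform, "OutputView", outputView);

% Overlay the images before and after alignment.
figure, imshowpair(original, distorted)   % misaligned overlay
figure, imshowpair(original, recovered)   % aligned overlay
```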