Linear regression is a very common algorithm to build regression models. After this video you will be able to describe how linear regression works, discuss how least squares is used in linear regression, define simple and multiple linear regression. A linear regression model captures the relationship between a numerical output and the input variables. The relationship is modeled as a linear relationship hence the linear in linear regression. To see how linear regression works, let's take a look at an example from the Iris flower dataset, which is a commonly used dataset for machine learning. This dataset has samples of different species of iris flowers along with measurements such as petal width and petal length. Here we have a plot with petal width measurements in centimeters on the x axis, and petal length measurements on the y axis. Let's say that we want to predict petal length based on petal width. Then the regression task is this, given a measurement for petal width predict the petal length. We can build a linear regression model to capture this linear relationship between the input petal width and the output petal length. The linear relationship for this samples is shown as the red line on the plot. From this example we see that linear regression works by finding the best fitting straight line, through the samples. This is called the regression line. In the simple case with just one input variable, the regression line is simply a line. The equation for a line is y = mx + b, where m determines the slope of the line and b is the y intercept or where the line crosses the y axis. M and b are the parameters of the model. Training a linear regression model means adjusting these parameters to fit the regression line to the samples. The regression line can be determined using what's referred to as the least squares method. This plot illustrates how the least squares method works. The yellow dots are the data samples. The red line is the regression line, the straight line that goes through the samples. This line represents the model's prediction of the output given the input. Each green line indicates the distance of each sample from the regression line. So the green line represents the error between the prediction, which is the value of the red regression line and the actual value of the sample. The square of this distance is referred to as the residual associated with that sample. The least squares method finds the regression line that makes the sum of the residuals as small as possible. In other words, we want to find the line that minimizes the sum of the squared errors of prediction. The goal of linear regression then is to find the best fitting straight line through the samples using the least squares method. Once the regression model is built, we can use it to make predictions. For example, given a measurement of 1.5 centimeters for petal width, the model will predict a value of 4.5 centimeters for petal length base on the regression line that it has constructed. In linear regression, if there is only one input variable then the task is referred to as simple linear regression. In cases with more than one input variables, then it is referred to as multiple linear regression. To summarize, linear regression captures the linear relationship between a numerical output and the input variables. The least squares method can be used to build a linear regression model by finding the best fitting line through the samples.