Hello everyone. Welcome to our lecture, applying the method of least squares. In this video I'm going to apply the method of least squares to the ice cream data. I already downloaded the dataset and saved it to my local machine, and I want to be able to predict the total sales when the temperature is 89, 92, and 101 degrees Fahrenheit.

First I need to convert our CSV file to a pandas data frame and look at the dataset. It has 12 rows. Now, just by looking at it, the sales are fluctuating. The first number is not the smallest, because 228 is smaller. It's not the biggest either. What I'm going to do is sort it by sales, so we go from smallest to biggest. The smallest is 229, the biggest sale is 556.

Now we'll choose the temperature to be the independent variable and the sales to be the dependent variable, because the temperature does not depend on how much ice cream you will be selling. X is the temperature, and Y is the dependent variable, because the sales depend on the temperature. If it's freezing outside, everything is closed. Nobody will come out and buy your ice cream. At least, fewer people will show up than before.

Now we know our dependent and independent variables. The model A is equal to the inverse of X transpose times X, multiplied by X transpose times Y, and then this is the formula for the error. Now, to find A we need to do it step by step. I take X transpose times X and save it to a variable called P1, and I take the inverse of P1 and save it to P. So P is the inverse of X transpose times X. Now, Q is just X transpose times Y, and A is just P times Q. If you want to see A, you do this algebra. The first number divided by that is 35 something, and the second number divided by 9,599 is 5.96. A is equal to this. That means your model for this ice cream problem is 5.96 X plus 35.56. X is the temperature here, so the sales depend on the temperature.
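The step-by-step computation described above can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the temperatures and sales below are made-up numbers, not the ice cream data from the video, and the column of ones in X is an assumption so that the model has an intercept term. The names P1, P, Q, and A follow the ones used in the lecture.

```python
import numpy as np

# Hypothetical data for illustration only, NOT the ice cream dataset.
# The sales are built as exactly 6 * temperature + 30, so the fit is easy to check.
temperature = np.array([60.0, 70.0, 80.0, 90.0])   # independent variable
sales = np.array([390.0, 450.0, 510.0, 570.0])     # dependent variable

# Design matrix: a column of ones (intercept) next to the temperatures.
# Including the ones column is an assumption about how X was built.
X = np.column_stack([np.ones_like(temperature), temperature])
Y = sales

P1 = X.T @ X            # X transpose times X
P = np.linalg.inv(P1)   # inverse of X transpose times X
Q = X.T @ Y             # X transpose times Y
A = P @ Q               # least-squares coefficients: [intercept, slope]

intercept, slope = A
print(f"sales = {slope:.2f} * temperature + {intercept:.2f}")
```

On this made-up data the fit should recover a slope near 6 and an intercept near 30. Taking the explicit inverse mirrors the lecture; in practice `np.linalg.solve(P1, Q)` or `np.linalg.lstsq(X, Y, rcond=None)` is usually the more numerically stable choice.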
This is our true data set. What we need to do here is plot the true data set against our model. The model is the dashed blue line, and the true dataset is the red dots. The y-axis is the sales, the x-axis is the temperature. As you can see, our model is not too far above and not too far below the real data points.

Next we need to predict the sales when the temperature is 89, when the temperature is 92, and lastly when the temperature is 101. You just plug in 101 for X. That answers the question being asked.

Now the next thing we need to do is get the error that we are making by approximating the true sales data with our model. SSE, the sum of squared errors, is built from the differences between the true data and the model. Starting with the true value 228 minus 278, you square that, plus the next one, 292 minus 256, you square that, and so on. We did that down here, and that's the error we're making by approximating the true sales data with our model.

In this video, we learned how to convert our CSV dataset to a pandas data frame. We sorted by sales so that the values are in increasing order. We identified our independent variable and our dependent variable. The independent variable here is the temperature, because it doesn't depend on how much ice cream you are able to sell. However, the sales depend on the temperature. When it is cold outside, fewer people will come to buy your ice cream. Once we know the dependent and the independent variable, we are able to use the normal equation to solve our ice cream problem. Then, once we get a model for the ice cream problem, we visualize both the data points and the model on the same graph. Lastly, we're able to predict the sales when the temperature is 89, when the temperature is 92, and 101, as well as the error that we're making by approximating the total sales of ice cream as the temperature fluctuates. Thanks everybody. I'll see you in the next video.
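The prediction and error steps recapped above can be sketched the same way. Again, this is a minimal sketch under assumptions: the four data points are made up and are not the ice cream data, so the numbers it prints will not match the ones in the video. Only the temperatures 89, 92, and 101 come from the lecture.

```python
import numpy as np

# Hypothetical data for illustration only, NOT the ice cream dataset.
temperature = np.array([60.0, 70.0, 80.0, 90.0])
sales = np.array([385.0, 455.0, 505.0, 575.0])

# Fit with the normal equation, assuming an intercept column of ones.
X = np.column_stack([np.ones_like(temperature), temperature])
A = np.linalg.inv(X.T @ X) @ (X.T @ sales)   # [intercept, slope]

# Predict sales at the temperatures asked about in the lecture.
new_temps = np.array([89.0, 92.0, 101.0])
X_new = np.column_stack([np.ones_like(new_temps), new_temps])
predictions = X_new @ A

# Sum of squared errors: (true sale - model value) squared, summed over all rows.
fitted = X @ A
sse = np.sum((sales - fitted) ** 2)

for t, p in zip(new_temps, predictions):
    print(f"temperature {t:.0f} F -> predicted sales {p:.1f}")
print(f"SSE = {sse:.2f}")
```

Plugging a new temperature into the model is just one more row in the design matrix, which is why the prediction reuses the same ones-plus-temperature layout as the fit.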