In this lesson, we introduce hypothesis testing in the regression context. We pose a question based on our data and answer it using a hypothesis test. We introduce the first of the three approaches to doing a hypothesis test using regression output. Let us begin by posing a question based on our data file from previous lessons, namely toysales.xlsx. In the previous lessons, we estimated a regression model with unit sales as the dependent, or y, variable, and price, advertisement expenditure, and promotional expenditure as the independent, or x, variables.
Let us quickly re-estimate this regression model. The dependent variable here is unit sales, and the independent variables are price, advertisement expenditure, and promotional expenditure. To run the regression, you go to Data, Data Analysis, Regression. We select our y range, which is the unit sales. I check the Labels box, because I'm using labels. My x range is price, advertisement expenditure, and promotion expenditure. I select the three variables, and I select an empty cell in my spreadsheet to put the output in. The other boxes I will keep unchecked. Clicking OK, this is my regression output.
The estimated regression model is as shown. Unit sales is equal to negative 25096.83 minus 5055.27 times Price plus 648.61 times AdExp plus 1802.61 times PromExp. The interpretation of the coefficient of AdExp is that for every $1,000 increase in advertisement spending, sales increase by 648.61 units, or, if you round it off, by 649 units, with all other variables remaining at the same level. That is, if you keep the price of the toy and the promotion expenditure fixed at whatever level they are, and only increase the advertisement expenditure by $1,000, you would expect a sales increase of 649 units. Remember, the unit of measurement for advertisement expenditure is $1,000.
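The "all other variables held fixed" interpretation can be checked with a small Python sketch. The coefficients are the ones quoted from the regression output; the input levels for price, advertisement expenditure, and promotion expenditure are hypothetical values chosen only for illustration, not figures from toysales.xlsx.

```python
# Estimated coefficients as quoted from the regression output in this lesson.
INTERCEPT = -25096.83
B_PRICE = -5055.27
B_ADEXP = 648.61     # AdExp is measured in units of $1,000
B_PROMEXP = 1802.61


def predicted_unit_sales(price, ad_exp, prom_exp):
    """Predicted unit sales from the estimated regression equation."""
    return INTERCEPT + B_PRICE * price + B_ADEXP * ad_exp + B_PROMEXP * prom_exp


# Hypothetical input levels (assumptions for illustration only).
base = predicted_unit_sales(price=9.0, ad_exp=50.0, prom_exp=30.0)
# Raise advertisement expenditure by one unit ($1,000); keep everything else fixed.
bumped = predicted_unit_sales(price=9.0, ad_exp=51.0, prom_exp=30.0)

# The difference equals the AdExp coefficient, whatever the fixed levels are.
print(round(bumped - base, 2))
```

Whatever levels you pick for the other two variables, the difference between the two predictions is the AdExp coefficient, which is exactly what the interpretation says.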
However, there is a common belief among the salespeople in the company that for every additional $100 spent on advertising, unit sales increase by 50. In other words, for every $1,000 increase in advertising expenditure, unit sales increase by 500 units. So, there is an apparent dichotomy here. Your regression model is telling you that sales increase by 649 units for every additional $1,000, while salespeople tend to believe that the increase is 500 units for every additional $1,000 of advertisement expenditure. Translating this belief into the regression model implies that the salespeople believe that the true value of the beta two coefficient is 500. Can you use your regression results to reject the belief held by salespeople at your company?
This is where the concept of hypothesis testing comes in, in the regression context. Remember that the estimates you get from the regression model are based on a sample of data. They are only sample estimates of the true betas of the model. As we saw in previous lessons, the true beta values are fixed but unknown, and when you estimate a regression model, you are obtaining, based on the particular sample, the estimates for those fixed true betas. In other words, had you used a slightly different sample, say added a few more months or included some more geographic regions where the toy is sold, you would have gotten a somewhat different value of the coefficient on advertisement expenditure. That is, a different estimate for beta two than what you have currently obtained. There are some inherent uncertainties in the sampling process, which lead to these different values for the estimates of the same true beta. So, to reject the belief held by salespeople, you need to take into account the uncertainties behind your results, and then make a judgement on whether a value of 500 for the true value of beta two is sustainable or not, based on the data.
In other words, you need to set up a hypothesis test to test whether beta two is equal to 500 or not. The result of this hypothesis test will tell you whether 500 is an admissible value for the true value of the beta two coefficient. Remember, the true value of the coefficient is never known. What we are trying to do in the hypothesis test is to see whether, given the uncertainties surrounding our estimation, it could be possible that the true value of beta two is 500. Note that in course three of this specialization on business statistics and analysis, we introduced the topic of hypothesis testing. We will now apply that in the regression context.
The steps to be followed in doing this hypothesis test are as follows. Step one is to formulate your hypotheses. The claim or belief that you wish to test is called the null hypothesis, denoted by H subscript zero. Whatever is not the null hypothesis then becomes the alternate hypothesis, denoted by H subscript A. In our example, the null hypothesis is beta two equal to 500, and the alternate hypothesis is beta two not equal to 500. Remember, the null and alternate hypotheses are always with respect to the unknown true value of beta two. This is a two-tailed test, because there is a strict equality in the null hypothesis. A two-tailed test implies that there will be two rejection regions, one on either side of the t-distribution, when the problem is translated onto a t-distribution.
Step two is to translate the problem onto a t-distribution by calculating the t-statistic. It is calculated as shown: b2 minus beta two, divided by sb2. Here b2 is our estimated coefficient from the regression, which is 648.61. Beta two is the claim around which we are doing the hypothesis test; in our case it is 500. And sb2 is the standard error of the estimated coefficient b2; it is provided in the regression results, next to the coefficient estimate, as shown. Let us calculate this t-statistic in Excel.
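Written out in symbols, steps one and two look as follows. This is only a restatement of the spoken formula in LaTeX; the standard error is left as the symbol s_{b_2}, since its numeric value is read from the regression output.

```latex
H_0:\ \beta_2 = 500
\qquad\text{versus}\qquad
H_A:\ \beta_2 \neq 500

t \;=\; \frac{b_2 - \beta_2}{s_{b_2}} \;=\; \frac{648.61 - 500}{s_{b_2}}
```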
To calculate the t-statistic, we use the formula as shown. The t-statistic is calculated as b2, which is the estimated coefficient for ad expenditure, and we pick it up from the regression output, minus beta two. Beta two is the claim against which we're doing the hypothesis test, which, in our case, is 500. That difference is divided by sb2, the standard error of the coefficient on ad expenditure. Again, we pick it up from the regression output; it is produced next to the coefficient estimate. And then we press Enter, so that is our calculated t-statistic. If you round it off to three decimal places, it is 0.711.
Given this calculated t-statistic of 0.711, step three is to calculate the rejection region for the t-statistic. This is a two-tailed test, with a rejection region on either side of the t-distribution. The formula to calculate the cutoff value for the rejection region is plus or minus the absolute value of T.INV, open parenthesis, alpha/2, residual df, close parenthesis. The residual df is also produced in the regression output and has the value of n minus k minus one, n being the number of observations and k being the number of x variables in the model. Since no value for alpha is given, we will choose alpha to be the industry-standard value, which is 0.05.
Using the formula, let us calculate the value of the t-cutoff. We'll calculate the absolute value, and then we can put a positive and a negative sign in front of it for the rejection regions on the right-hand side and the left-hand side. So, we type an equals sign and, for the absolute value, use the ABS function, open parenthesis, T.INV, another open parenthesis, alpha by two. Alpha, in my case, is 0.05, so 0.05 divided by two. The next input is the residual degrees of freedom. The residual degrees of freedom are produced in the regression output, in the table there, and we pick up the value from there: the value of 20. Close parenthesis, and another close parenthesis to close the absolute value, and Enter.
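For readers who want to check the cutoff outside Excel, here is a minimal Python sketch using only the standard library. It mirrors ABS(T.INV(alpha/2, df)) by numerically integrating the t density and solving for the cutoff by bisection. The helper names (t_pdf, t_cdf, t_cutoff) are made up for this sketch and are not part of Excel or the course material. In practice, a statistics library such as SciPy (scipy.stats.t.ppf) would do the same job in one call.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF via Simpson's rule on [0, |x|], using symmetry about zero."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_cutoff(alpha, df):
    """Positive two-tailed cutoff: the t value with upper-tail area alpha/2."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):  # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


alpha = 0.05      # significance level chosen in the lesson
residual_df = 20  # residual degrees of freedom from the regression output
cutoff = t_cutoff(alpha, residual_df)
print(round(cutoff, 3))  # the rejection regions lie beyond -cutoff and +cutoff
```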
So, the value of the t-cutoff, if I round it off to three decimal places, is 2.086. The two rejection regions are to the left of negative 2.086 and to the right of plus 2.086. The final step, step four, is to check whether our calculated t-statistic falls in this rejection region. It does not. Thus, we do not reject the null hypothesis. And since we cannot reject the null hypothesis, we cannot falsify or reject the belief held by salespeople. The true value of beta two could be 500. We reach this conclusion at the 95% confidence level, because we have performed this hypothesis test with an alpha value of 0.05, or 5%. So, our regression model estimates the true value to be 648.61. However, after accounting for the uncertainties in the sampling process, 500 could also be the true value of beta two. We cannot rule that out.
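The decision rule in step four can be sketched in a few lines of Python. The t-statistic and cutoff are the rounded values quoted in this lesson; the function name two_tailed_decision is a hypothetical helper introduced only for this illustration.

```python
def two_tailed_decision(t_stat, t_cutoff):
    """Return True if the null hypothesis is rejected in a two-tailed test.

    The rejection regions are t < -t_cutoff and t > +t_cutoff.
    """
    return abs(t_stat) > t_cutoff


t_stat = 0.711    # calculated t-statistic from step two
t_cutoff = 2.086  # cutoff from step three (alpha = 0.05, residual df = 20)

reject = two_tailed_decision(t_stat, t_cutoff)
print("Reject H0" if reject else "Do not reject H0")
```

Since 0.711 lies between negative 2.086 and plus 2.086, the sketch reports that the null hypothesis is not rejected, in line with the conclusion above.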