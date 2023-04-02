In this video, we look at performing hypothesis tests on means when there are two samples to be considered. Suppose we have two populations, Population A and Population B. Population A has a mean of three units, and Population B has a mean of six units. If we choose samples from both A and B, can we use those samples to tell the means of the two populations apart? Here we see the sampling distributions of the sample means from both populations. If we superimpose these two distributions, we get this. Our question is: if we get one such sample from Population A and another such sample from Population B, could we find out that the two populations have different means by looking at the samples?
Now the problem is easy if the two sampling distributions have relatively small standard errors compared to the difference in the means. Here, for example, most of the sample means from Population A would be in this range, and most of the sample means from Population B would be in this range, and so it should be relatively easy to guess that there is a difference between the two means. If, however, the standard errors are large compared to the difference in the means, then the difference may be difficult to see because the overlap is significant. Two-sample tests formalize the problem of finding out whether the difference in means between two populations is significant. We also have two-sample tests for proportions, but we do not cover them in this video since the treatment is very similar to that for means.
Let's take an example. We have seen this example earlier. We choose a random sample of 50 coders working from home. They wrote 124.76 lines of code per day on average, with a standard deviation of 19.037. Working from office, they wrote 132.68 lines of code per day on average, with a standard deviation of 21.656.
Now, do you think that the coders' performance depends on the place of work, that is, whether they work from home or from office? We use the same hypothesis testing procedure that we normally use. That is, we first create the null and alternate hypotheses. We then choose an estimator and a significance level. We then collect the data and obtain estimates from those estimators. Finally, we take a decision based on the estimates. Let us understand how each of these steps works for two-sample tests.
In the first step, we create a null hypothesis and an alternate hypothesis about the population. Let Mu_h and Mu_o be the average number of lines of code written by coders working from home and from office, respectively. Our null hypothesis would be that Mu_h - Mu_o = 0. We will compare it with the alternate hypothesis, which says that Mu_h - Mu_o is not equal to zero.
In the second step, we decide on a statistic that measures the difference in means, and we also decide on a significance level. We estimate the population means Mu_h and Mu_o with the sample means x_h bar and x_o bar. So we will use x_h bar - x_o bar to estimate Mu_h - Mu_o. Let us also set a significance level of 0.05. Now let the size of each sample be n, and let the sample standard deviations of the number of lines of code written per day at home and at office be s_h and s_o, respectively. These sample standard deviations are bias-corrected estimators of the population standard deviations; to get them, we multiply the standard deviations of the observations in the sample by the square root of n / (n - 1), where n is the sample size. The sampling distribution of x_h bar - x_o bar is a t-distribution centered at Mu_h - Mu_o. The standard error is given by the formula that you see on the slide. These terms, the sample mean and the standard error, should look familiar. They are just like the terms in a one-sample test.
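
For readers following along without the slide, the quantities described above can be written out as below. This is a sketch under an assumption: the slide's formula is taken to be the usual unpooled standard error for two samples of equal size n. The symbols \hat{\sigma}_h and \hat{\sigma}_o are names introduced here for the standard deviations of the raw observations in each sample; they do not appear in the video.

```latex
s_h = \hat{\sigma}_h \sqrt{\frac{n}{n-1}}, \qquad
s_o = \hat{\sigma}_o \sqrt{\frac{n}{n-1}}, \qquad
\mathrm{SE}(\bar{x}_h - \bar{x}_o) = \sqrt{\frac{s_h^2}{n} + \frac{s_o^2}{n}}
```
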
In two-sample tests, we only replace the parameter with the difference in parameters, and the statistic with the difference in statistics.
We now obtain estimates from the samples drawn. From the samples we have n = 50, x_h bar = 124.76, and x_o bar = 132.68. The values of s_h and s_o are 19.230 and 21.876. Note that these values are not the standard deviations of the values that we observed; they are bias-corrected estimates of the population standard deviations for working from home and working from office. The estimate of the statistic is 124.76 - 132.68, that is, -7.92. Since we have estimated the population standard deviations from the samples themselves, the estimator follows a t-distribution with a standard error of 4.110. The degrees of freedom for the t-distribution are computed using a complicated formula. You do not need to worry about the formula at this time; just know that for our problem, it works out to 96 degrees of freedom.
Now, we have to do the test. The sampling distribution, as we saw, is a t-distribution with a standard error of 4.110 and 96 degrees of freedom. We have a two-tailed test, and so we have two equal regions of rejection. From the t-distribution, we obtain the critical values as +8.156 and -8.156. The sample statistic, -7.92, is here. Since the sample statistic is not in either region of rejection, we cannot reject the null hypothesis at Alpha equals 0.05, and we conclude that there is no significant difference between the average number of lines of code written at home and at office. This is a two-tailed two-sample test.
But what happens if we can connect observations in the two samples? For example, suppose we can connect each data point in the work-from-home sample with a data point in the work-from-office sample as coming from the same coder. For instance, Coder 1 wrote 87 lines of code per day when working from home, and 95 lines of code per day when working from office. She wrote eight fewer lines of code per day when she worked from home.
Along the same lines, we see that Coder 3 wrote two more lines of code per day at home than at office. We can find the average of these differences as -7.92 and the standard deviation of these differences as 5.892. To check if the average difference is significant, we simply apply a two-tailed one-sample test.
We again follow the testing procedure. Our null hypothesis is that the population mean difference Delta is equal to zero, and our alternate hypothesis is that this difference is not zero. We use the sample average difference, d bar, to estimate Delta and choose a significance level of Alpha equals 0.05. For our sample, we found that d bar is -7.92. The sample standard deviation is 5.951. The standard error of the distribution is 0.841. Remember that the sample standard deviation is a bias-corrected estimator of the population standard deviation, and it is not simply the standard deviation of the sample values obtained. Based on the estimate, our region of acceptance is from -1.692 to +1.692. Since d bar lies outside this region, we reject the null hypothesis and conclude that there is a significant difference between productivity at office and productivity at home.
This test tells us that the difference is significant, while the previous test could not reject the claim that there is no difference. Why do these tests show different results? Here we see the data points for the two samples. The red points in each sample give us the sample mean values. Compared with the spread of these data points, the difference in means does not look significant. But what happens if we pair these data points? Some employees code more at office, while others code more at home. An increase was observed in 44 employees, and a decrease was observed in only six employees. The quantum of each decrease was about the same as the quantum of each increase. This caused the average value of the change to be significantly different from zero.
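
The paired calculation above can be reproduced from the three summary numbers quoted in the video (n = 50, mean difference -7.92, standard deviation of the differences 5.892). The sketch below is only an illustration, not the video's own code: the helper functions `t_pdf`, `t_cdf`, and `t_critical` are names introduced here, and they approximate the t-distribution numerically with the standard library so that no statistics package is assumed.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: Simpson's rule on [0, x] plus the 0.5 mass below zero."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(alpha, df):
    """Two-tailed critical value, found by bisection on the CDF."""
    lo, hi = 0.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Summary numbers quoted in the video for the paired differences.
n = 50
d_bar = -7.92      # average of (home - office) differences
sd_obs = 5.892     # standard deviation of the observed differences
alpha = 0.05

# Bias-corrected sample standard deviation and standard error of d_bar.
s_d = sd_obs * math.sqrt(n / (n - 1))
se = s_d / math.sqrt(n)

# Region of acceptance is [-half_width, +half_width] around zero.
half_width = t_critical(alpha, n - 1) * se
reject = abs(d_bar) > half_width

print(f"s_d = {s_d:.3f}, SE = {se:.3f}, acceptance = +/-{half_width:.3f}")
print("reject null hypothesis" if reject else "cannot reject null hypothesis")
```

Because d bar is many standard errors away from zero, the decision does not depend on the exact rounding of the critical value.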
While doing the previous test, we had no information about individual increases and individual decreases. We were just looking at the relatively small difference in means compared to the individual spreads.
In this video, we saw how to perform a hypothesis test by comparing two samples. If the sample points do not have a one-to-one correspondence, we can only do a two-sample test to check for a difference in the means of the two populations from which the samples are drawn. If we can pair the observations of the samples, we can find the individual changes and do a paired-sample test for the significance of these changes. When the paired observations move together, as they do here, the paired-sample test is a much more powerful test.
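
The unpaired two-sample test summarized here can also be checked from the summary numbers quoted in the video. The sketch below makes two assumptions that the video does not state: the standard error is the usual unpooled form, and the "complicated formula" for the degrees of freedom is the Welch-Satterthwaite approximation, which does give about 96 for these inputs. The critical value of 8.156 is taken directly from the video rather than recomputed. With these inputs the standard error comes out near 4.12, slightly different from the 4.110 quoted, which may reflect rounding in the quoted values.

```python
import math

# Summary numbers quoted in the video.
n = 50
mean_home, mean_office = 124.76, 132.68
sd_home_obs, sd_office_obs = 19.037, 21.656   # standard deviations of the observations
critical_diff = 8.156                         # two-tailed critical value quoted in the video

# Bias-corrected sample standard deviations.
correction = math.sqrt(n / (n - 1))
s_home = sd_home_obs * correction
s_office = sd_office_obs * correction

# Estimate of Mu_h - Mu_o and its standard error (unpooled form, assumed).
diff = mean_home - mean_office
var_home = s_home ** 2 / n
var_office = s_office ** 2 / n
se = math.sqrt(var_home + var_office)

# Welch-Satterthwaite degrees of freedom (assumed to be the video's formula).
df = (var_home + var_office) ** 2 / (
    var_home ** 2 / (n - 1) + var_office ** 2 / (n - 1)
)

# Reject only if the estimate falls in either region of rejection.
reject = abs(diff) > critical_diff

print(f"s_h = {s_home:.3f}, s_o = {s_office:.3f}")
print(f"difference = {diff:.2f}, SE = {se:.3f}, df = {df:.1f}")
print("reject null hypothesis" if reject else "cannot reject null hypothesis")
```

The same difference of -7.92 that is clearly significant in the paired test sits just inside the acceptance region here, because the unpaired standard error is several times larger.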