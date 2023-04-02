In this video, we do hypothesis testing for means when the population standard deviation is unknown. As usual, we will start by looking at an example. Suppose you make a part with a design diameter of 10 millimeters. You do not know the standard deviation of the population from which these parts come. You take a random sample of 30 parts and discover that the average diameter of the parts in the sample is 9.97 millimeters, with a standard deviation of 0.05 millimeters. Since you do not know the population standard deviation, you can use the sample standard deviation s, computed with n-1 in the denominator, as an estimate of the population standard deviation. Based on our sample, s is 0.051 millimeters. Knowing s, we can compute the standard error of the sampling distribution as s over the square root of n, which evaluates to about 0.009 millimeters. Because the standard error is estimated from the sample, the standardized sample mean does not follow a normal distribution; it follows a t-distribution with n-1 degrees of freedom.
But what is a t-distribution? A t-distribution looks like a flattened normal distribution, with a lower peak and heavier tails. For instance, if the sample size is 5, the t-distribution looks like the one shown in green, while the normal distribution looks like the one shown in blue. As the sample size increases, the t-distribution gets closer and closer to the normal distribution. So for large sample sizes, the normal distribution can be a reasonably good approximation for a t-distribution.
Now, let us see how we will perform the test. We set up the null and alternate hypotheses as usual, choose the significance level as 0.05, obtain the sample mean, and use the sample standard deviation and the sample size to compute the standard error of the sampling distribution. Then we do the test.
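The standard error and the way the t-distribution approaches the normal distribution can be checked numerically. The following is a minimal sketch, not the method used in the video: the helper names `t_pdf`, `t_cdf`, and `t_ppf` are hypothetical, and they approximate the t-distribution with Simpson's rule and bisection so that only the Python standard library is needed. In practice a library routine such as SciPy's `scipy.stats.t.ppf` would normally replace them.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=1000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_ppf(p, df):
    """Quantile of the t-distribution, found by bisection on t_cdf."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Values from the part-diameter example in the transcript.
n, s = 30, 0.051
standard_error = s / math.sqrt(n)

# Two-tailed critical values at a 0.05 significance level.
z_crit = NormalDist().inv_cdf(0.975)
t_crit = {size: t_ppf(0.975, size - 1) for size in (5, 30, 50, 1000)}

print(f"standard error: {standard_error:.4f}")
print(f"normal critical value: {z_crit:.3f}")
for size, crit in t_crit.items():
    print(f"n = {size:4d}: t critical value = {crit:.3f}")
```

The sample sizes 5 and 30 come from the transcript; 50 and 1000 are added here only to show the trend. The t critical value shrinks toward the normal value of about 1.96 as the sample size grows.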
In a critical value approach, if we use a t-distribution, the region of acceptance is the interval from 10 - 2.045 × 0.009 to 10 + 2.045 × 0.009. The 2.045 is the two-tailed critical value, at the 0.05 significance level, of a t-distribution with 29 degrees of freedom. The region of acceptance is therefore from 9.9816 millimeters to 10.0184 millimeters. The sample mean is 9.97 millimeters, so it lies outside the region of acceptance, and hence we reject the null hypothesis. If a normal distribution were used instead of a t-distribution, the critical value would be 1.96 and the interval would be from 9.9824 millimeters to 10.0176 millimeters. That is slightly narrower than what we have, but our conclusion would remain unchanged.
In a p-value approach, we see that the sample mean is 0.03 millimeters away from the hypothesized mean. Using a t-distribution with 29 degrees of freedom, the probability that the sample mean falls at least 0.03 millimeters below the hypothesized mean is 0.0012. Because this is a two-tailed test, the same probability sits in the upper tail, so the p-value is about 0.0024. Since this value is less than 0.05, we reject the null hypothesis.
The hypothesis test that we have performed so far is a two-tailed test. Let us take a look at an example of a one-tailed test. Suppose you manufacture, pack, and sell flour in 500 gram packs. The average weight of the contents in a randomly selected sample of 50 packets is 505 grams, with a standard deviation of 6.6 grams. Should you be worried? This is clearly a one-tailed test, with the alternate hypothesis that the average weight is more than 500 grams, and the steps involved in the hypothesis test are given here. How do you perform this test? Remember that since the standard error is estimated from the sample, the sampling distribution of the sample mean is a t-distribution with 49 degrees of freedom. Based on the alternate hypothesis, the region of rejection is in the right tail of the sampling distribution. Using the t-distribution, we can compute that the critical value is 1.581 grams above 500 grams. That is, the critical value is 501.581 grams. Now the sample mean is 505 grams. It is clearly in the rejection region.
We reject the null hypothesis and conclude that the average weight of packets is significantly more than 500 grams.
In the p-value approach, based on the null hypothesis, the sampling distribution has a mean of 500 grams and a standard error of 0.943 grams. The sample mean is five grams more than the hypothesized mean. Since the sample size is 50, the sampling distribution is a t-distribution with 49 degrees of freedom. Given this distribution, the probability p that the sample mean will be at least five grams more than the hypothesized mean of 500 grams is 1.36 × 10^-6. Since this value of p is less than 0.05, we reject the null hypothesis and conclude that the average weight of the packets being produced is significantly more than 500 grams.
In this video, we have seen how to conduct a one-sample test for means when the population standard deviation is unknown. We have seen two approaches for performing this test. In the first approach, we set the boundaries for the region of acceptance. If the sample mean lies outside these boundaries, we reject the null hypothesis. In the second approach, we find the probability that the sample mean will be at least as far from the hypothesized mean as the observed value, in the direction of the region of rejection. If this probability is less than the significance level, we reject the null hypothesis.
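The p-value approach for both examples can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure used in the video: the function names `t_pdf`, `t_upper_tail`, and `p_value` are hypothetical, the tail area is approximated with Simpson's rule instead of a statistics library, the standard errors 0.009 and 0.943 are taken as quoted in the transcript, and the two-tailed p-value is assumed to be twice the one-tail probability.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=1000):
    """P(T >= |t|): one half minus the Simpson's-rule area over [0, |t|]."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def p_value(sample_mean, hypothesized_mean, standard_error, df, tails):
    """p-value for a one-sample t-test; tails is 1 or 2.

    For tails=1 the alternate hypothesis is assumed to point in the
    direction of the observed difference.
    """
    t_stat = (sample_mean - hypothesized_mean) / standard_error
    return tails * t_upper_tail(t_stat, df)


alpha = 0.05  # significance level used in both examples

# Part diameter: two-tailed test, standard error as quoted (0.009 mm).
p_parts = p_value(9.97, 10.0, 0.009, df=29, tails=2)

# Flour packs: one-tailed (right) test, standard error as quoted (0.943 g).
p_flour = p_value(505.0, 500.0, 0.943, df=49, tails=1)

for name, p in (("parts", p_parts), ("flour", p_flour)):
    decision = "reject" if p < alpha else "fail to reject"
    print(f"{name}: p = {p:.3g} -> {decision} the null hypothesis")
```

Both p-values fall below the 0.05 significance level, so the sketch reaches the same decisions as the transcript: reject the null hypothesis in each example.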