How confident are we about the intervals that we create? Before we answer this question, let us recall some properties of the sampling distribution. If we talk about the sampling distribution of the sample mean, then we know that if the sample size is not too small, the sample mean is approximately normally distributed around the population mean. We also know that the standard deviation of the sampling distribution, which we call the standard error, has a value of sigma over the square root of n, where sigma is the population standard deviation and n is the sample size. The interval estimation question that we had was this: if we are given a sample mean, what is the minimum width of an interval so that the population mean will lie inside that interval with a pre-specified probability, regardless of the value of the sample mean? Here we change the question a little bit. The question that we ask is: given an interval of a specified width, what is the probability that, if a random sample is chosen and the interval is constructed around the sample mean, the population mean will lie inside this interval? Remember, this is not the interval estimation question. This is a modification where we specify the interval width and try to find out the probability that the population mean will lie within the interval. In the experiments that we did in the last video, we saw that when the interval width was 20, the probability that the population mean would lie within that interval was somewhere around 30%. When the interval width increased to 40, the chance increased to 58%, or around 60%. When the interval width was 60, the chance increased further to about 76%. These are the probabilities that we want to find out. Let us suppose that we are talking about a population with a mean of 274.2 and a standard deviation of 122.65. We have chosen a sample of size 25 and we have decided that we will be dealing with an interval of width 40.
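The kind of experiment referred to above can be mimicked with a small simulation. This is only a sketch: the function name `simulate_coverage`, the number of trials, and the seed are illustrative choices, and the population is assumed to be normal with the stated mean and standard deviation, which the lecture does not say. Because of that assumption, the fractions it prints need not match the earlier experiment exactly.

```python
import random
from statistics import fmean

def simulate_coverage(mu, sigma, n, width, trials=20_000, seed=0):
    """Fraction of random samples whose interval (sample mean +/- width/2)
    contains the population mean mu.

    Assumption: the population is normal(mu, sigma); this is for illustration only.
    """
    rng = random.Random(seed)
    half_width = width / 2
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean.
        sample_mean = fmean(rng.gauss(mu, sigma) for _ in range(n))
        # Check whether the interval around the sample mean covers mu.
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / trials

# Population and sample size from the lecture; widths from the earlier experiment.
for width in (20, 40, 60):
    fraction = simulate_coverage(274.2, 122.65, 25, width)
    print(f"width {width}: population mean inside in {fraction:.1%} of samples")
```

Wider intervals should capture the population mean more often, which is the pattern described above.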
The question we ask is, what is the probability that the population mean lies inside such an interval? We will answer a slightly different question first: what is the probability that the population mean lies outside that interval? Remember, a sample mean can either underestimate or overestimate the population mean, so the answer to this question is divided into two parts: one in which the sample mean underestimates the population mean, and another in which the sample mean overestimates it. Let's answer the first part. What is the probability that the sample mean underestimates the population mean and the population mean is outside that interval? If we draw a number line and mark the population mean on it, the sample mean underestimates the population mean if it lies to the left of the population mean. Since we have chosen an interval of width 40, we can draw a symmetric interval around the sample mean, and the interval looks like this. In this case, you see that the population mean lies inside that interval. For the population mean to lie outside the interval, the sample mean has to be this far away, which means that the sample mean needs to be at least 20 units to the left of the population mean. The question is, how do we find the probability that, if we draw a sample of size 25, the sample mean would be this far or even further to the left of the population mean? We know that if we draw a sample of size 25, the sample mean is normally distributed around the population mean, and we also know that the standard error of this distribution is 122.65 divided by the square root of 25, which is 24.53. The required probability is this shaded area. How do we find this probability? Since the distribution is a normal distribution, the area of the shaded region equals the area to the left of -20/24.53 under the standard normal distribution.
This -20 is the negative of the minimum distance that we want the sample mean to be away from the population mean, and 24.53 is the standard error of this distribution. This probability can be obtained as 20.74%, that is, 0.2074. So the chance that the sample mean underestimates the population mean by so much that the population mean does not lie within the interval is 0.2074. Next, if we want to find the probability that the sample mean overestimates the population mean and the population mean is outside that interval, we do an exactly analogous calculation, where the sample mean has to be at least 20 units to the right of the population mean. Because the normal distribution is symmetric, the probability of this happening is also 20.74%, that is, 0.2074. Therefore, we answer the original question by saying that the probability that the population mean is outside that interval is 0.2074 on the left plus 0.2074 on the right, a total of 0.4148. The probability that the population mean is inside that interval is 1 - 0.4148, that is, 0.5852. In other words, the probability that the population mean is inside such an interval is 58.52%. We conclude that if a sample of size 25 is chosen from a population with a standard deviation of 122.65, and a symmetric interval of width 40 is constructed around the sample mean, the probability that the population mean will be inside this interval is 0.5852. The interval from the sample mean - 20 to the sample mean + 20, which has width 40, is a 58.52% confidence interval for the population mean when the sample size is 25. We can also see this in the result of our experiment, where, when the interval width was chosen to be 40, the percentage of times that the population mean was inside that interval was approximately 58.52%.
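The calculation above can be reproduced in a few lines of Python. This is a minimal sketch using the standard library's `statistics.NormalDist`; the function name `coverage_probability` is an illustrative choice, not something from the lecture.

```python
from math import sqrt
from statistics import NormalDist

def coverage_probability(sigma, n, width):
    """Probability that a symmetric interval of the given width, centred on
    the sample mean, contains the population mean (normal sampling distribution)."""
    standard_error = sigma / sqrt(n)        # 122.65 / 5 = 24.53 in the lecture
    half_width = width / 2                  # 20 units on either side
    z = half_width / standard_error         # distance in standard-error units
    one_tail = NormalDist().cdf(-z)         # area to the left of -20/24.53
    return 1 - 2 * one_tail                 # remove the left and right tails

sigma, n = 122.65, 25
print(f"standard error: {sigma / sqrt(n):.2f}")
for width in (20, 40, 60):
    p = coverage_probability(sigma, n, width)
    print(f"width {width}: probability inside = {p:.3f}")
```

For a width of 40 this gives about 0.585, which agrees with the 0.5852 obtained above up to rounding of the tail probability. The same function can be applied to the other widths from the experiment.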