Okay. So in our last class, we talked about testing these hypotheses about our Portland cement mortar, and we concluded that we should reject the null hypothesis at the five percent level of significance. In other words, we have pretty strong evidence that the alternative hypothesis is true. We used a procedure that I call a fixed significance level test, because we proposed a critical value of 1.96 that gives me a five percent chance of being wrong if I conclude that the means are different when the value of the test statistic lies outside the range from minus 1.96 up to plus 1.96. This fixed significance level approach is very common, very widely used. But there's another approach that is also very popular, and it's actually become popular because of the use of computer software to do these tests. That's called the P-value approach, and for a Z-test, it's very easy to find the P-value. Here's how you do it. Here is your standard normal table again that I showed you last time. We want to find the probability that the standard normal variable is greater than 2.09. Now, if you think about this for a moment, you might say, "Wait a minute. Didn't we calculate that Z_0 was equal to minus 2.09?" Yes, we did, but our table only contains positive values of Z. So we need to take the absolute value, enter the table, and find the probability of being greater than positive 2.09. So we go into the table and we look for the entry at 2.09. There it is, 0.98169. That is the area to the left of 2.09. We need the area to the right of 2.09, and that is 0.01831; just subtract 0.98169 from one. The P-value will be twice this probability. Why twice? Well, it's a two-sided test, and so you want half of the risk of being wrong to be on one side of zero and the other half on the other. So the P-value for this test is actually twice this computed probability.
Well, twice that probability is 0.03662. So we would reject this null hypothesis at any level of significance that is greater than or equal to 0.03662. Typically, in most science and engineering applications, 0.05 is used as the cut-off, although frankly, there's nothing magic about 0.05; you could use 0.01 or 0.02 or really any value you want. This value of 0.05 is basically a risk measure: it's the risk of being wrong when you conclude that the means are different. Depending on the consequences of that, you may choose larger or smaller values of the cut-off, depending on the context of the problem. I believe that in the early stages of experimental work, where you're really doing a lot of discovery and you're trying to find out which factors in a system might be important, you can be a lot more liberal with your choice of cut-off. You could use 0.1, or even 0.15 in some cases. But the problem is, if you wrongly conclude that a factor isn't important early on in research work, quite frequently what happens is that factor is then ignored and we don't pay any attention to it for the rest of the work. If the factor really turns out to be important, that could have negative consequences for our work. So making some type I errors, that is, concluding that factors are important when they really aren't, in the early stages of research work is typically not that big a problem, because ultimately we will figure out whether those factors are important or not; but you don't want to throw away a potentially useful factor too early. Now, the Z-test which we've just described works great if you know what the two population variances are, but typically we don't. If we knew them, we'd be in great shape. But what if you just plugged in the sample variances instead? Instead of Sigma_1 squared in your Z-statistic, plug in s_1 squared, and instead of Sigma_2 squared, plug in s_2 squared. Well, if the sample sizes are large enough, this works okay.
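The P-value arithmetic just described can be sketched in a few lines of Python. This is only an illustration of the lookup-and-double step: it uses scipy's standard normal CDF in place of the printed table, and the value z_0 = -2.09 is the test statistic quoted in the lecture.

```python
# Two-sided P-value for the Z-test, as described above.
# z_0 = -2.09 is the computed test statistic from the Portland cement example.
from scipy.stats import norm

z0 = -2.09
# Area to the right of |z_0| under the standard normal curve (the table
# gives the area to the left, so we subtract from one).
tail = 1 - norm.cdf(abs(z0))   # about 0.01831
p_value = 2 * tail             # two-sided test: double the tail area
print(round(tail, 5), round(p_value, 5))
```

Running this reproduces the tail area of about 0.01831 and the P-value of about 0.03662 discussed in the lecture.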
By large, I mean that the sample sizes for both of your samples would have to be at least about 30; some people say 40. In other words, the Z-test is a very good large-sample test for the difference in means. So if the sample size is big, whether you know the variances or not is not that big a deal. But many times that isn't possible because your sample size is small. In fact, Gosset actually addressed that. He wrote a paper on the probable error of a mean, and in that paper he asked, "But what if the sample size is small?" Well, it turns out that if the sample size is small, you can't use this normal (0, 1) distribution as your reference distribution anymore. So let's talk about using s_1 squared and s_2 squared to estimate the two variances. Well, now your previous ratio, your Z-statistic, changes. It looks like this: instead of Sigmas, it's got Ss. But remember, we're talking about the case where these variances are assumed to be equal. So let's combine, or pool, the individual sample variances to get a single number. What you see down at the bottom of this slide is the pooled estimate of variance, S_p squared. The way this is done is a weighted average: we combine the two sample variances, s_1 squared and s_2 squared, each weighted in proportion to its degrees of freedom, n_1 minus 1 and n_2 minus 1. So this is a pooled estimate of variance, and when we plug that in, we get the test statistic for the two-sample t-test, or what some people call the pooled t-test, because we've used this pooled estimate of variance. It works a lot like the Z-test that we described earlier. Values of t_0 that are close to zero are consistent with the null hypothesis. Values of t_0 that are very different from zero are consistent with the alternative. So t_0 is a distance measure, just like the Z-statistic was. It measures how far apart the averages are in standard deviation units. You can interpret t_0 as a signal-to-noise ratio.
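The pooling and test-statistic formulas can be written out as a short Python sketch. The function names are mine, and the sample statistics in the usage lines (means of 16.76 and 17.04, sample variances of 0.100 and 0.061, with n = 10 in each group) are assumed illustration values chosen so that they reproduce the numbers quoted in this lecture; they are not taken verbatim from the slides.

```python
import math

def pooled_variance(s1_sq, s2_sq, n1, n2):
    # Weighted average of the two sample variances, each weighted by
    # its degrees of freedom (n_i - 1).
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

def pooled_t_statistic(ybar1, ybar2, s1_sq, s2_sq, n1, n2):
    # t_0 = (ybar1 - ybar2) / (s_p * sqrt(1/n1 + 1/n2))
    sp = math.sqrt(pooled_variance(s1_sq, s2_sq, n1, n2))
    return (ybar1 - ybar2) / (sp * math.sqrt(1 / n1 + 1 / n2))

# Assumed illustration values: with these inputs the sketch gives
# s_p^2 of about 0.081 and t_0 of about -2.2, matching the lecture.
sp2 = pooled_variance(0.100, 0.061, 10, 10)
t0 = pooled_t_statistic(16.76, 17.04, 0.100, 0.061, 10, 10)
print(sp2, t0)
```

Note that the weights are the degrees of freedom n_i - 1, so with equal sample sizes the pooled variance is just the plain average of the two sample variances.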
The numerator is the signal that's being generated by your sample data from your experiment, and the quantity down in the denominator is a measure of variability, scatter, spread, or noise. So you can think of t_0 as a signal-to-noise ratio. Here's how we perform the two-sample or pooled t-test for the Portland cement problem. First of all, we have to calculate S_p squared. That's straightforward, and we get a calculated value of 0.081; the square root of that is 0.284. Now we substitute that into our test statistic t_0, and we get minus 2.20 as the computed value of our test statistic. So the two sample means are a little more than two standard deviations apart. Is this a large difference? In other words, how unusual is this value if the means are really equal? Well, that's the question, of course, that our friend Gosset answered. Gosset developed the t-test specifically to answer this question. Here's a picture of a t-distribution. The t-distribution looks a lot like the normal distribution: it's symmetric around zero, but it has a little more spread in the tails than the normal distribution. The spread in the t-distribution is controlled by something called the number of degrees of freedom. The number of degrees of freedom here would be the sum of the two sample sizes minus two, n_1 plus n_2 minus 2, so it'd be 18. We can use a table of the t-distribution to find, let's say, the two-and-a-half percent point of t with 18 degrees of freedom, and that value turns out to be 2.101. So minus 2.101 and plus 2.101 would be the boundaries of what we call the critical region for our test. t_0, the computed value of our test statistic, falls into that lower critical region. So we would end up rejecting the null hypothesis. Here's the t-distribution table. The rows are the number of degrees of freedom, and the tail areas are the column headings. We had 18 degrees of freedom and we want the 0.025 level.
So that's two-and-a-half percent area in the upper tail, and the t-value there is shown to be 2.101. That's where that value came from. In other words, a value of t_0 from your sample data that lies between minus 2.101 and plus 2.101 would be consistent with equality of means. It is of course possible that the means are equal and t_0 lies outside that range, but that's a rare event. So typically, when we find a value of t_0 that falls in that prescribed critical region, we reject the null hypothesis. By the way, you can also use a P-value approach to doing this, and we'll get into that in the not-too-distant future. Okay, thanks for listening, and we'll resume next time.