In the last class, we looked at a simple comparative experiment involving two different recipes, or two different formulations, of Portland cement mortar. And we saw that there might be an indication in the sample data that the tension bond strength was different in the two recipes. I mentioned that we're going to use statistical methods to investigate this conjecture. The framework that we typically use for solving these kinds of problems is called statistical hypothesis testing. The origins of this methodology go back to the early 1900s, and actually even before that. But it's a well-established technique, and one that's highly useful in looking at data from planned experiments. Now, we're going to initially use a procedure called the two-sample t-test. But as we go through this material from chapter 2, or module 2, of the course, we're going to look at a couple of variations of this two-sample t-test procedure.

Here's a picture that tries to illustrate the hypothesis testing framework visually. The two diagrams that you see at the top of the page are probability distributions. The probability distribution on the left represents the population of measurements from factor level 1, or treatment 1. And on the right, that's the population of measurements from factor level 2, or treatment 2. In our problem, each of these represents a different formulation of the Portland cement mortar. Now, we're going to assume that these populations are normal random variables; they're normally distributed observations. The mean of population 1 is mu1, and the variance of that distribution is sigma 1 squared. On the right, those observations are also normally distributed, with mean mu2 and variance sigma 2 squared. So this is the sampling situation, the situation that we assume exists as we study the problem. The key thing here is that we're sampling from normal distributions.

What we want to investigate is the claim that the means of these two populations are the same. The way we structure that is in terms of a pair of statistical hypotheses. H-naught is called the null hypothesis, and that's the statement that says the two means are indeed equal. So H-naught: mu1 = mu2 is the null hypothesis. H1 is the alternative hypothesis, and that's the other state of nature; in this case, it would be that the two means are not the same, H1: mu1 not equal to mu2.

How do we estimate these parameters? We have a mean mu in each population, and we have a variance sigma squared in each population. The way we do this is by using the sample average y-bar to estimate the population mean. Now, the way you calculate the sample average is easy: you simply add all the observations in the sample together and divide by the sample size n. And the sample variance is also very easy to calculate: one simply computes the difference between each observation in the sample and the sample average y-bar, squares those differences, adds them up, and then divides that sum by n - 1, and that estimates the variance sigma squared. These are straightforward calculations. You can do them on many pocket calculators automatically, and of course statistical software does this for you without any real difficulty.
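To make those two formulas concrete, here's a minimal sketch in Python. The calculations are exactly the ones just described; the data values in the example call are made up for illustration and are not the mortar measurements.

```python
def sample_mean(y):
    """Sample average y-bar: add the observations, divide by the sample size n."""
    return sum(y) / len(y)

def sample_variance(y):
    """Sample variance: squared deviations from y-bar, summed, divided by n - 1."""
    ybar = sample_mean(y)
    return sum((yi - ybar) ** 2 for yi in y) / (len(y) - 1)

# Hypothetical sample, just to show the calls; not the actual mortar data.
y = [16.8, 16.4, 17.2, 16.9, 16.6]
print(sample_mean(y))      # about 16.78
print(sample_variance(y))  # about 0.092
```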
Here are the results, and these calculations are done in the book if you want to look at them. For the new recipe, the modified mortar, the average bond strength y-bar 1 is 16.76 and the sample variance is 0.100. The sample standard deviation is 0.316, and of course the sample size was 10. For the unmodified mortar, the original recipe, the sample average y-bar 2 is 17.04 and the sample variance is 0.061. The sample standard deviation is 0.248, and again the sample size is 10. Now, you'll notice that those two standard deviations are not exactly the same, but they're fairly close together, and that's consistent with what we saw in the dot diagrams and in the stem-and-leaf plots for these two samples. We saw that there was a noticeable difference in the averages, or in the means, but the spread or variability in the samples was pretty similar. And that's what we're seeing reflected in the summary statistics in this display.

So how does the two-sample t-test work? We're going to use the two-sample t-test to test the null hypothesis that says the two means mu1 and mu2 are equal. How does this procedure work? Well, it uses the sample means to draw conclusions, or draw inferences, about the population means. And specifically, it uses the difference in those two sample means, y-bar 1 minus y-bar 2. If we plug in the sample data here, the difference in the sample averages, y-bar 1 minus y-bar 2, turns out to be -0.28. So that's the difference in the sample means. The way the t-test works is that we then divide that difference in the sample means by the standard deviation of the difference in sample means. So this ratio becomes a measure of how different the sample means are in standard deviation units. That's how this works.

Well, we know that the variance of an average, sigma squared of y-bar, is sigma squared, the variance of an individual observation, divided by n, the sample size. That's basic statistics that we've probably seen before. The variance of the difference in averages, sigma squared of y-bar 1 minus y-bar 2, is the sum of those variances, sigma 1 squared over n1 plus sigma 2 squared over n2, as long as the two averages y-bar 1 and y-bar 2 are independent. And I think we can comfortably assume independence here, because these are two completely different samples that were generated at different times. They're random samples, and the treatments were applied essentially in random sequence. So independence is probably a very reasonable assumption here.

This suggests a statistic of the form you see here: z-naught is y-bar 1 minus y-bar 2, the difference in sample averages, divided by the square root of sigma 1 squared over n1 plus sigma 2 squared over n2, and that denominator is the standard deviation of the difference in sample means. Now, how do we use this information? Well, if the variances were actually known, if we actually knew sigma 1 squared and sigma 2 squared, it turns out that this ratio z-naught follows a normal distribution. And in fact, if the two means are equal, if mu1 is equal to mu2, this ratio would have a standard normal distribution, that is, a normal distribution with mean 0 and variance 1. And we could use that as the basis of a statistical test. We're going to see how that works right now.

How do we do it? Here's the way it works. Now, we don't know the variances or standard deviations, but let's assume we do. Let's just make up a number: let sigma 1 and sigma 2 both be equal to 0.3, just for purposes of illustration. As soon as we know those two numbers, we can plug them into z-naught; z-naught is what we call a test statistic.
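Here's that plug-in step as a minimal sketch in Python, using the sample averages from the summary statistics above and the assumed value 0.3 for both standard deviations:

```python
import math

ybar1, ybar2 = 16.76, 17.04  # sample averages, modified and unmodified mortar
sigma1, sigma2 = 0.3, 0.3    # population standard deviations, assumed known here
n1, n2 = 10, 10              # sample sizes

# z0 = (ybar1 - ybar2) / sqrt(sigma1^2/n1 + sigma2^2/n2)
z0 = (ybar1 - ybar2) / math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
print(z0)  # about -2.09
```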
Okay, we plug in the numbers, we do the arithmetic, and that value of z-naught turns out to be -2.09. Now, here's how we use that information. How unusual is this value of z-naught = -2.09 if the two population means are really equal? Well, remember that if the means are equal, z-naught has a normal (0, 1) distribution. And in a normal (0, 1) distribution, it turns out that 95% of the probability, or area under the normal curve, falls between the values -1.96 and +1.96. The value 1.96 is called the upper 2 1/2 percent point of the standard normal distribution, denoted z sub 0.025, and -1.96 is the lower 2 1/2 percent point. So if the means are equal, 95% of the time you would expect to see an observed value of z-naught in that interval, -1.96 up to +1.96. So what about the value that we just calculated, -2.09? That's pretty unusual, isn't it, if the means are equal? This is a value that would occur less than 5% of the time if the population means were equal. So this is a fairly strong indication that those means are not equal.

You can find these z values in any standard normal table. This is a standard normal cumulative distribution table, and it tabulates values of z from 0 up to about 3.99. Most standard normal tables are organized like this one: they only give you areas to the left of positive z scores, or positive z values. Now, that isn't really much of a problem, because the normal distribution is symmetric, and so the area to the left of a negative z is the same as the area to the right of the corresponding positive z. So it's very easy to use these tables. To show you how I got that value of 1.96, simply scan the table until you find 1.96. Well, here's 1.96 right there, in the 1.9 row and the .06 column. And if you look at the entry in the body of the table, it's 0.975; that is the probability, or area, to the left of 1.96 on the standard normal curve. So the area in the upper tail, to the right of 1.96, is 1 - 0.975 = 0.025, which is why 1.96 is the upper 2 1/2 percent point, z sub 0.025, of the standard normal distribution. You can use the normal table to find these probabilities, or these z values, very easily.

So if the variances were known, what would we conclude? We would conclude that we should reject this null hypothesis, and a statistician would say we reject this null hypothesis at the 5% level of significance, because the calculated value of -2.09 is outside the range -1.96 to +1.96 that corresponds to 5% significance. This is called a fixed significance level test, because we compared the value of the test statistic to a critical value, in this case 1.96, that we typically select in advance, before we run the experiment. And the standard normal distribution is called the reference distribution for this test.

Now, there's another way to do this. It's very popular, and it's called the P-value approach. The P-value is basically the observed, or actual, significance level. And for the Z-test, it's really easy to find the P-value. I'll show you how to do that next time.
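Before we leave the fixed significance level version, here's that decision rule as a minimal sketch in Python, with the standard library's NormalDist standing in for the printed normal table (Python 3.8 or later):

```python
from statistics import NormalDist

alpha = 0.05                                  # significance level, chosen in advance
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper 2.5% point, about 1.96
z0 = -2.09                                    # calculated test statistic

# Reject H0 if the test statistic falls outside the interval (-z_crit, +z_crit).
if abs(z0) > z_crit:
    print("reject H0 at the 5% level of significance")
else:
    print("fail to reject H0")
```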