This is module 4 of our experimental design course, and in this module we're going to talk about the blocking principle. The text reference for this material is chapter 4 in the textbook. I really encourage you to dig into that chapter because it's got a lot of really useful information. We're going to talk about the highlights of this chapter. In particular, we're going to focus on one or two particular types of designs that deal with blocking. Blocking, of course, as we talked about earlier in the introduction to the course, is a technique for dealing with nuisance variables. Nuisance variables occur in a lot of real experiments. Typically, a nuisance factor is a factor that potentially has some effect on the outcome, on the response variable, but you're not particularly interested in studying that factor. What you really want to do is make the variability that's transmitted from that nuisance factor essentially go away, so that it doesn't affect the results of your experiment and, in particular, so that it doesn't inflate your experimental error. The type of design that is very often used to do this is something called a randomized complete block design. It's usually known by its initials, the RCBD. We're going to study the RCBD and see how it works. We're also going to see how the analysis of variance can be extended to the situation of an RCBD. Then we'll briefly discuss some other blocking scenarios; the other type of block design that we'll introduce briefly is the Latin square design. So blocking, as I've said, is a technique for dealing with nuisance factors. These typically have to be nuisance factors that you can control. A nuisance factor typically has some effect on the response. As I say, it's not of direct interest to the experimenter, but the variability it transmits needs to be reduced or minimized.
Typical nuisance factors in industrial research and development type experiments involve things like batches of raw material, which can differ from supplier to supplier or from batch to batch. Operators; people are always a source of nuisance variability. Pieces of test equipment. Time. Time is a very common source of nuisance variability because the system you're studying may not work the same way on the second day of the experiment as it does on the first, or even between the morning and the afternoon of a particular day, or even hour to hour. Many experiments involve blocking. Over the years, when I've looked at experiments where people have felt that they haven't really been successful, what I've very often found is that they probably should have blocked and they didn't. So failure to block is a very common flaw in the design of a lot of experiments. It can have really disastrous consequences, because what happens if you don't block when you should is that the variability from that nuisance factor ends up in error. So there's a lot of inflation of the error term, which makes it difficult to see the real factor effects that you're interested in. As I say, if your nuisance variable is known and controllable, then the blocking techniques that we're going to describe in this module work very well. If the nuisance factor is known but not controllable, then sometimes we can use a technique called the analysis of covariance. That will be discussed later, in module 15 of the course; chapter 15 of the book talks about that. On the other hand, if the nuisance variable is unknown and uncontrollable, well, there's not much you can do about that. So what we rely on is randomization to balance out the impact of that nuisance variable across the time period in which you're running the experiment.
Sometimes we can combine several nuisance sources of variability into a single block, so that the block becomes essentially an aggregate variable that summarizes or contains the effect of several nuisance variables at once. Of course, when you do that, you can't really separate one nuisance variable from another, but that really isn't typically an objective in an experiment that involves blocking. So here's an example, a very simple example involving hardness testing. There is some discussion of this little example in the text. We have a situation where we have four different tips that can be used on a hardness tester, say, a Rockwell hardness tester. The way these testers work is that the tip is depressed with a known force into a metal specimen. Then some physical characteristic of the depression that's made is used to measure the relative hardness of the specimen. We're interested in seeing whether these different tips produce the same hardness readings on this tester. This is a typical example of what I would call a gauge or measurement systems capability study. These studies are done a lot in the science and engineering world. Now, assignment of the tips is made to an experimental unit, and that experimental unit is a test coupon. Well, suppose we were to run this as a completely randomized experiment. In other words, we have a set of randomly selected coupons and we randomly assign a tip to each coupon. This could be a problem because the coupons may come from material that was produced in different heats, and they may have inherently different hardnesses. So the differences in hardness between the test coupons would then be transmitted into the error term in your experiment. That would inflate the error and it would make it more difficult to detect any differences that exist between the tips. The test coupons here are the source of nuisance variability.
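To get a feel for how much the coupon-to-coupon differences can hurt, here is a small simulation sketch in Python. It is only an illustration under assumptions that are not from the lecture: the function name `simulate_diff`, the standard deviations, and the choice to simulate just two tips with no true difference between them are all hypothetical. It compares the noise in the estimated difference between two tip means when every run gets its own coupon versus when both tips are tested on the same coupons.

```python
import random
import statistics

def simulate_diff(design, n_coupons=4, sd_coupon=1.0, sd_error=0.2, rng=None):
    """Simulate one experiment and return (mean of tip 1) - (mean of tip 2).

    Both tips are assumed to have the same true mean, so any nonzero
    difference returned here is pure noise. All numbers are hypothetical.
    """
    rng = rng or random.Random()
    # In the blocked design the same coupons are shared by both tips.
    shared = [rng.gauss(0, sd_coupon) for _ in range(n_coupons)]
    tip_means = []
    for _tip in range(2):
        readings = []
        for j in range(n_coupons):
            if design == "blocked":
                coupon = shared[j]                # same coupon for both tips
            else:
                coupon = rng.gauss(0, sd_coupon)  # a fresh coupon for every run
            readings.append(coupon + rng.gauss(0, sd_error))
        tip_means.append(statistics.mean(readings))
    return tip_means[0] - tip_means[1]

rng = random.Random(0)
crd = [simulate_diff("crd", rng=rng) for _ in range(5000)]
blocked = [simulate_diff("blocked", rng=rng) for _ in range(5000)]

var_crd = statistics.variance(crd)
var_blocked = statistics.variance(blocked)
print(f"variance of tip difference, completely randomized: {var_crd:.4f}")
print(f"variance of tip difference, same coupons for both tips: {var_blocked:.4f}")
```

In the completely randomized version the coupon variability sits in every reading, so it shows up in the tip comparison. When both tips share a coupon, the coupon effect cancels out of the difference and only the test-to-test error is left.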
Now the way you may want to run this experiment is to take each test coupon and, assuming that coupon is large enough, test all of the tips on a single coupon. That would be a way that we could eliminate the variability between the coupons. Now sometimes the experimenter may want to test the tips across coupons of known different hardness levels. If that's the case, you can also use the same randomized complete block design structure to do that. Generally, this is a really simple example of the need for blocking because the experimental units could be different. If we don't find a way to minimize that variability, it's all going to end up in error. The randomized complete block design for this experiment would consist of assigning all four tips to each coupon. We're going to assume now that the coupons are large enough that we can make four tests on each one. In the language of experimental design, each coupon is referred to as a block. It's a more homogeneous experimental unit on which to test the tips. Variability between the blocks could be large. The coupons could come from different heats, for example, but variability within the block should be relatively small. That is, we want to be able to assume that the blocks are relatively homogeneous. In general, the term block is used to represent a specific level of your nuisance factor. So a complete replicate of the basic design is then conducted in each block. Remember, we test each one of the four tips once in each block. So the way we run a randomized complete block design, in general, is we do a complete replicate of the basic experiment within each block. The blocks are generally considered to be a restriction on randomization because we're not completely randomizing the coupons and the tips. We're restricting the randomization by treating the coupons as a block. Now, all the runs within the block should be randomized. The position within the block should be chosen at random for each tip.
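The restricted randomization described here can be sketched in a few lines of Python. This is a minimal sketch, not a prescribed procedure: the function name `rcbd_layout`, the tip labels, and the seed are hypothetical choices made for illustration. Every coupon gets all four tips exactly once, and the shuffling is done separately inside each coupon.

```python
import random

def rcbd_layout(treatments, n_blocks, seed=None):
    """Return one randomized run order per block.

    Each block contains every treatment exactly once (a complete
    replicate); only the order within the block is randomized.
    """
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)  # complete replicate in this block
        rng.shuffle(order)        # randomize within the block only
        layout.append(order)
    return layout

tips = ["tip 1", "tip 2", "tip 3", "tip 4"]
layout = rcbd_layout(tips, n_blocks=4, seed=1)
for coupon, order in enumerate(layout, start=1):
    print(f"coupon {coupon}: {order}")
```

Compare this with a completely randomized design, where you would shuffle all sixteen runs together and a coupon could end up with the same tip more than once or not at all.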
The order in which the tips are tested within each block should also be randomized. Here's an example of what this experiment might look like. Assuming that we used four blocks, the columns here represent the blocks, and you see the tips being tested within each block, and I assigned those tips to the positions within each block in random order. Notice the two-way structure of the data. The treatment factor represents, say, the rows in this table and the blocking factor represents the columns. Once again, as I said earlier, we're interested in testing the equality of the treatment means, but we have to find a way to remove the variability associated with the blocks in order to be able to do that.
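One way to see how the block variability gets removed is the usual sum of squares partition for a two-way table with one observation per cell: total = treatments + blocks + error. The sketch below is a hedged illustration, not the course's worked analysis. The function name `rcbd_anova` and the simulated readings (tip effects, coupon effects, noise level) are hypothetical and are not data from the lecture or the text.

```python
import random

def rcbd_anova(y):
    """Partition the variability in y, where y[i][j] is the response
    for treatment i (row) in block j (column), one observation per cell."""
    a, b = len(y), len(y[0])
    grand = sum(map(sum, y)) / (a * b)
    trt = [sum(row) / b for row in y]                              # row means
    blk = [sum(y[i][j] for i in range(a)) / a for j in range(b)]   # column means
    ss_trt = b * sum((m - grand) ** 2 for m in trt)
    ss_blk = a * sum((m - grand) ** 2 for m in blk)
    # Error is what is left after removing both row and column effects.
    ss_err = sum((y[i][j] - trt[i] - blk[j] + grand) ** 2
                 for i in range(a) for j in range(b))
    ss_tot = sum((y[i][j] - grand) ** 2 for i in range(a) for j in range(b))
    df_trt, df_blk, df_err = a - 1, b - 1, (a - 1) * (b - 1)
    ms_trt, ms_err = ss_trt / df_trt, ss_err / df_err
    return {
        "ss_treatments": ss_trt, "ss_blocks": ss_blk,
        "ss_error": ss_err, "ss_total": ss_tot,
        "df_treatments": df_trt, "df_blocks": df_blk, "df_error": df_err,
        "ms_error": ms_err,
        "f_ratio": ms_trt / ms_err,
        # Error mean square if the blocks were ignored: the block sum of
        # squares and its degrees of freedom get pooled into error.
        "ms_error_ignoring_blocks": (ss_err + ss_blk) / (a * (b - 1)),
    }

# Hypothetical simulated readings: 4 tips (rows) by 4 coupons (columns).
rng = random.Random(42)
tip_effect = [0.0, 0.0, -0.1, 0.3]
coupon_effect = [-0.3, -0.1, 0.1, 0.3]
y = [[9.6 + t + c + rng.gauss(0, 0.1) for c in coupon_effect]
     for t in tip_effect]

table = rcbd_anova(y)
for name, value in table.items():
    print(f"{name}: {value:.4f}")
```

The F ratio would be compared with an F distribution on (a - 1) and (a - 1)(b - 1) degrees of freedom. The last entry shows what the error mean square would have been had the same data been analyzed without the blocks, which is where the coupon variability would have ended up.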