In today's class, my goal is to show you how we analyze data from factorial experiments when three factors are used. This example is based on one from the textbook by Box, Hunter and Hunter, called Statistics for Experimenters. The experiments described in that example were run to find the combination of settings that would reduce the amount of pollution discharged from a water treatment facility. This is clearly a case where we would like to minimize the amount of pollutant, so minimizing our outcome variable is the objective.

Three factors were considered. The first one, factor C, was the chemical compound used. Let's call the two options compound P and compound Q; we don't really know their names. Factor T was the temperature of the treatment: whether we were treating the water at 72 or 100 Fahrenheit. And factor S was the stirring speed, either a slow speed of 200 revolutions per minute or a high speed of 400.

Notice that every factor has two levels. Going back to that mathematical idea that two to the power k is the total number of experiments: k is equal to three in this example, so we get a total of eight possible combinations. Here's a short quiz to test that knowledge.

So let's take a look at the results. We will always present our data and analyze it using what we call standard order. Standard order requires that we create a column for each of our factors: C, T and S. Note that I could have used A, B and C for the three factors, but very often we'll switch to letters that actually match our factor names. You don't have to.

So back to the standard order table. The rule is that we vary the first factor, C, the fastest: minus, plus, minus, plus, minus, plus, minus, plus. The second factor, temperature, is varied the next fastest between its low and high levels: two minuses, two pluses, two minuses, two pluses. And then the last factor, S, is varied the slowest: four low levels and then four high levels. Those make up our entire table.
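
The standard order rule just described can be sketched in a few lines of Python. This is only an illustration, not something shown in the lecture; the variable names are assumptions, and -1 and +1 stand for the minus and plus levels.

```python
from itertools import product

factors = ["C", "T", "S"]  # the first factor listed is varied the fastest

# product() varies its LAST position fastest, so each row is reversed
# to get the order (C, T, S) with C changing on every run.
standard_order = [row[::-1] for row in product([-1, +1], repeat=len(factors))]

print("run  " + "  ".join(factors))
for run, levels in enumerate(standard_order, start=1):
    signs = "  ".join("+" if x > 0 else "-" for x in levels)
    print(f"{run:>3}  {signs}")
```

With three factors this gives the eight rows of the table: C alternates on every row, T in pairs, and S in blocks of four.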

Never run the experiments in the order of this table. The order must be randomly selected. So what we will do is add a column to our table to keep track of the order in which we actually run the experiments. Also add a column over here for the outcome variable. In this case, the outcome was the pollutant amount, measured in pounds.
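
One way to pick the random run order is sketched below. The seed value is a hypothetical choice, used only so the sketch is repeatable; the lecture does not say how its run order was generated.

```python
import random

n_runs = 8  # the 2**3 rows of the standard order table

rng = random.Random(2024)  # hypothetical seed, for repeatability only
run_order = rng.sample(range(1, n_runs + 1), k=n_runs)  # random permutation of 1..8

# Row i of the standard order table is performed as experiment run_order[i - 1].
for std_row, order in enumerate(run_order, start=1):
    print(f"standard order row {std_row} -> run number {order}")
```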

One thing that's so great about the standard order table is that we can get a quick sense of each factor's influence on the outcome variable. Take a look, for example, at how the pollution amount changes when we change the chemical compound, factor C. That factor goes low, high, low, high, low, high, low, high. We see that same pattern in the pollution amounts.

Take a look at the effect of factor S. The first four experiments have a very high level of pollution on average, while the last four experiments have a low level of pollution. That also matches the pattern of factor S. We can already tell, just from this table, that factor C and factor S are going to be really important to understanding the results.

Let's go back to our cube plot, and this time our cube plot actually is a cube. We can draw it by showing the first factor, C, along the horizontal axis, the next factor, T, on the vertical axis, and the final factor, S, going in and out of the page in this diagonal way. Next, we transcribe the values onto this cube. This is really easy when we follow the standard order sequence. Take a look: 5, 30, 6, 33, then 4, 3, 5, 4. I love this visual representation of the experimental data. It really helps us achieve our objective so quickly.

Take a few seconds and answer this question. At what levels should we set our three factors in order to achieve the lowest pollution amount? That's right. It's very clear we need to use chemical Q, operate at low temperature, and use the high stirring speed of 400 revolutions per minute.

Later on in the course, we're going to start examining what happens when you move outside this cube, and I want you to already start thinking along those lines. But let's come back to the data we have right here and analyze the main effects and the interactions.

Start with the first factor, C: the choice of either chemical P, at the low level, or chemical Q, at the high level. If we look at the cube, we actually have four estimates of that main effect, one along each of the four horizontal edges. At high temperature and high stirring speed, in other words high T and high S, that effect is equal to 4 - 5. At high temperature and low speed, it's 33 - 6. At T minus and high speed, in other words S plus, it is 3 - 4. And finally, at low temperature and low speed, it's 30 - 5. So we have four estimates of the effect of the chemical, and the average of these four is equal to 50/4 = 12.5.

Let's pause here. I always tell my students it's no good just calculating numbers. What does this value of 12 and a half really mean in plain language? How would you describe this value to your manager, who doesn't really understand any statistics? What it says is that, on average, we expect an increase of 12.5 pounds in the pollution amount when we go from using chemical P to using chemical Q. And remember, by convention we report half of that amount, so report a value of 6.25.
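
The four edge estimates and their average can be checked with a short sketch. The dictionary layout, keyed by coded levels (C, T, S), is an assumption made for this illustration; the pollution values are the ones read off the cube.

```python
# Pollution in pounds at each corner of the cube, keyed by (C, T, S).
y = {
    (-1, -1, -1): 5, (+1, -1, -1): 30,
    (-1, +1, -1): 6, (+1, +1, -1): 33,
    (-1, -1, +1): 4, (+1, -1, +1): 3,
    (-1, +1, +1): 5, (+1, +1, +1): 4,
}

# One estimate per horizontal edge: chemical Q (C=+1) minus chemical P (C=-1),
# holding T and S fixed. Same order as in the lecture.
estimates = [y[(+1, t, s)] - y[(-1, t, s)] for t in (+1, -1) for s in (+1, -1)]

effect_C = sum(estimates) / len(estimates)  # average change going from P to Q
half_effect_C = effect_C / 2                # the value reported by convention

print(estimates)                  # [-1, 27, -1, 25]
print(effect_C, half_effect_C)    # 12.5 6.25
```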

One further thing to notice is the discrepancy in that chemical effect between high S and low S. Notice the very large difference there. From the prior class, number 2C, this should alert you that there's an interaction between factor C and factor S. But before we get to that, let's take a look at temperature.

When we examined the table earlier, we didn't really notice anything special about temperature, and we should be able to confirm that numerically. We have four estimates of the temperature effect, along the four vertical edges: 4 - 3 here, 33 - 30 up here, 5 - 4 back there, and 6 - 5 here up at the front. So on average, we get a value of 1.5 as our difference. Or, if we report half of it, that's an effect of 0.75.

Lastly, let's take a look at the effect of stirring speed, S. Along the four diagonal edges, we have 4 - 33 up here, 3 - 30 down here, 5 - 6 here, and 4 - 5 over there. The average of those differences is -14.5, and if we report half of it, that's -7.25. The -14.5 tells us that we expect, on average, a reduction of 14 and a half pounds of pollution when we go from a low stirring speed to a high stirring speed. So clearly, it's in our favor to use high stirring speeds in order to get that reduced pollution.
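
All three main effects can be computed in one pass from the standard order table. Averaging the four edge differences gives the same number as the mean outcome at the high level minus the mean outcome at the low level, and that is what this sketch does. The function and variable names are assumptions for illustration.

```python
pollution = [5, 30, 6, 33, 4, 3, 5, 4]  # pounds, in standard order

# Coded sign columns in standard order: C fastest, then T, then S.
columns = {
    "C": [-1, +1, -1, +1, -1, +1, -1, +1],
    "T": [-1, -1, +1, +1, -1, -1, +1, +1],
    "S": [-1, -1, -1, -1, +1, +1, +1, +1],
}

def main_effect(signs, y):
    """Mean outcome at the high level minus mean outcome at the low level."""
    high = [yi for xi, yi in zip(signs, y) if xi > 0]
    low = [yi for xi, yi in zip(signs, y) if xi < 0]
    return sum(high) / len(high) - sum(low) / len(low)

effects = {name: main_effect(signs, pollution) for name, signs in columns.items()}
for name, value in effects.items():
    print(f"{name}: effect = {value}, reported half-effect = {value / 2}")
```

The half-effects come out as 6.25 for C, 0.75 for T and -7.25 for S, matching the values worked out by hand.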

You should always step back at this point and make sure these results make sense. Horizontally, we see that going from chemical P to Q increases the pollution amount. That value of 6.25 looks about right. The small value of 0.75 for temperature also looks right, because temperature really has a very small effect. And finally, increasing the stirring speed gives the largest reduction in pollution: a decrease of 7.25 units.

You noticed that while I was reviewing these results, I started to build up a numeric representation for you on the screen. We did that in the class where we considered the ginger biscuits, and I've just followed the same idea here. The y represents the prediction of the pollution. The 11.25 value here is the baseline. It is the average of all eight of the outcome values: (5 + 30 + 6 + 33 + 4 + 3 + 5 + 4)/8. The other three terms are the separate effects of each factor. Those are the main effects.
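
Written out, the model built up on the screen would look roughly like this, using the baseline and the three reported half-effects from above. The symbols are a notational assumption: x_C, x_T and x_S are the coded levels, each -1 or +1.

```latex
\hat{y} = 11.25 + 6.25\,x_C + 0.75\,x_T - 7.25\,x_S
```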

Let's see how we can use this model to make some predictions. Consider the situation where we're using chemical Q; in other words, x_C is coded as a value of plus one. Let's use low temperature, so x_T is coded as minus one. And let's also use low stirring speed, so x_S is minus one. The predicted value is 11.25 + 6.25 - 0.75 + 7.25. That's a value of 24. That's quite a bit different from the value of 30 pounds that was actually recorded. There's something we haven't accounted for.
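
As a small sketch, the same prediction can be made by coding the main-effects model as a function. The function name is a hypothetical choice; the coefficients are the baseline and half-effects from the lecture.

```python
def predict_main_effects(x_c, x_t, x_s):
    """Main-effects-only model; each input is a coded level, -1 or +1."""
    return 11.25 + 6.25 * x_c + 0.75 * x_t - 7.25 * x_s

# Chemical Q (+1), low temperature (-1), low stirring speed (-1).
prediction = predict_main_effects(x_c=+1, x_t=-1, x_s=-1)
actual = 30  # pounds, recorded at that corner of the cube

print(prediction)           # 24.0
print(actual - prediction)  # 6.0
```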

And that's the interaction between C and S. An interaction is when you have one variable behaving very differently depending on the level of another variable. We noticed earlier that the chemical effect is 30 - 5 and 33 - 6 over here on the front face, at low stirring speed. Yet on the back face of the cube, at high speed, the effect is almost zero: 3 - 4 and 4 - 5, very small amounts. It's very clear that the stirring speed modifies the effect of the chemical. There's an interaction between S and C.

How do we quantify this? Well, like we did in class 2C, we have to add a new term to our prediction model, and that term, b_CS, is multiplied by x_C and x_S. But how do we calculate that b_CS value? Let's follow the same idea as we did in class 2C. We have two chances to calculate it: one instance at high temperature and one instance at low temperature. We will calculate both and then average the answers. And then, as we've always done, report half the value.

So at high temperature, the difference due to C at high speed is 4 - 5. The difference due to C at low speed is much greater: 33 - 6. As you remember, interactions are always reported as half the difference, taking the high level minus the low level. In other words, that's -1 - 27, which equals -28, and half of that is -14.

Let's repeat that at the lower temperature. The difference due to C at high speed is 3 - 4. The difference due to C at low speed is much greater: 30 - 5. Again take half the difference, high minus low: that is -1 - 25, which equals -26, and dividing that by 2 gives -13.

So now we have two estimates of the interaction effect. One estimate is -14; the other estimate is -13. The average of those two numbers is minus 13 and a half. And when we report it, let's put in our model a value of half that amount, in other words -6.75. That's the value for b_CS: -6.75.
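
The by-hand interaction calculation can be retraced step by step in code. This is a sketch under the same assumptions as the hand calculation; the helper name c_effect and the dictionary layout are illustrative only.

```python
# Pollution in pounds at each corner of the cube, keyed by (C, T, S).
y = {
    (-1, -1, -1): 5, (+1, -1, -1): 30,
    (-1, +1, -1): 6, (+1, +1, -1): 33,
    (-1, -1, +1): 4, (+1, -1, +1): 3,
    (-1, +1, +1): 5, (+1, +1, +1): 4,
}

def c_effect(t, s):
    """Change in pollution going from chemical P to Q at fixed T and S."""
    return y[(+1, t, s)] - y[(-1, t, s)]

# Half of (C effect at high speed minus C effect at low speed),
# once at high temperature and once at low temperature.
estimates = [(c_effect(t, +1) - c_effect(t, -1)) / 2 for t in (+1, -1)]

interaction_CS = sum(estimates) / len(estimates)  # average of the two estimates
b_CS = interaction_CS / 2                         # coefficient used in the model

print(estimates)             # [-14.0, -13.0]
print(interaction_CS, b_CS)  # -13.5 -6.75
```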

Now, let me just pause here for a second and emphasize that this is all very tedious if you do it by hand. We're going to show some computerized ways to do this faster in the next few classes. But I always recommend starting by hand, and then you'll see the advantage later on when we go to computers.

So let's revisit our prediction from the previous example and see if it has improved. The predicted value earlier was 11.25 + 6.25 - 0.75 + 7.25. But now, with this interaction term, we have an additional part: -6.75 × (+1) × (-1). What that means is that we actually get an additional amount of 6.75 due to the interaction, giving us a prediction of 30.75, much, much closer to the actual value. Notice here that the interaction actually works against us: it has increased the amount of pollution.
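
Extending the model with the CS term is a one-line change. In this sketch the function name is again hypothetical, and only the CS interaction is included, as in the lecture.

```python
def predict(x_c, x_t, x_s):
    """Main effects plus the C-by-S interaction; inputs are coded -1 or +1."""
    return (11.25 + 6.25 * x_c + 0.75 * x_t - 7.25 * x_s
            - 6.75 * x_c * x_s)

# Chemical Q (+1), low temperature (-1), low stirring speed (-1).
prediction = predict(x_c=+1, x_t=-1, x_s=-1)
print(prediction)  # 30.75, against a recorded value of 30 pounds
```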

We could also calculate the CT interaction and the TS interaction; I've only shown you the CS interaction. In fact, there's even a three-factor interaction, the CTS interaction. But all of this gets very tedious and error prone.
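
For the curious, the remaining interactions can be sketched with one common shortcut: multiply the sign columns together, multiply by the outcomes, and divide the sum by the number of runs. This is not necessarily the method the next module uses. The CT, TS and CTS values it produces are computed here from the data and are not quoted in the lecture.

```python
from itertools import combinations
from math import prod

pollution = [5, 30, 6, 33, 4, 3, 5, 4]  # pounds, in standard order
columns = {
    "C": [-1, +1, -1, +1, -1, +1, -1, +1],
    "T": [-1, -1, +1, +1, -1, -1, +1, +1],
    "S": [-1, -1, -1, -1, +1, +1, +1, +1],
}

def coefficient(names):
    """Model coefficient (half-effect) for the product of the named columns."""
    signs = [prod(row) for row in zip(*(columns[n] for n in names))]
    return sum(s * y for s, y in zip(signs, pollution)) / len(pollution)

# All two-factor terms, then the single three-factor term.
terms = [combo for size in (2, 3) for combo in combinations("CTS", size)]
coefficients = {"".join(t): coefficient(t) for t in terms}

for name, value in coefficients.items():
    print(name, value)
```

The CS value reproduces the -6.75 found by hand, and the other three terms come out at a quarter of a pound or less in size.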

Coming up in the next module, we can't wait to show you some computerized shortcuts that will take care of all of this work for you.

Now, at the risk of this course going on a little bit too long, I want you to sit back and just think about that interaction for a second. Don't just see it as a number; let's try to interpret what's really going on here. Why does chemical Q appear to be less effective at low speed, while at high speed it works really well? Maybe chemical Q just takes a little bit longer to dissolve in water than chemical P does. At low stirring speeds, chemical Q is not effective, but at high speeds both chemicals are about equally effective.

Now here's where experiments can be really powerful. We saw that the lowest pollution was over here in this corner, when we used chemical Q with high speed and low temperature. But what if the government requirement was that pollution had to be below 10? And imagine also that chemical Q costs you double what chemical P costs. You can see where this is going. We can see that any operating point on this plane would be acceptable, as long as it's not the point with low speed and chemical Q. In fact, it might be a whole lot more profitable to operate at this point over here, producing five pounds of pollution. We still meet the requirement for safe operation, because we're below the government level of 10 units, and we use less energy for stirring and the cheaper chemical P.

Actually, what we've done here is consider an additional outcome in our mind: profit. Recognize that profits or costs often play a role in any system, so you should always be aware of the economic impact of every corner in your cube.
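
The screening step in this thought experiment can be sketched as a simple filter over the corners of the cube. The limit of 10 is the hypothetical government requirement posed above; the variable names are assumptions, and no actual cost figures are used because the lecture gives none.

```python
# Pollution in pounds at each corner of the cube, keyed by (C, T, S).
y = {
    (-1, -1, -1): 5, (+1, -1, -1): 30,
    (-1, +1, -1): 6, (+1, +1, -1): 33,
    (-1, -1, +1): 4, (+1, -1, +1): 3,
    (-1, +1, +1): 5, (+1, +1, +1): 4,
}
LIMIT = 10  # pounds; hypothetical requirement from the lecture's what-if

# Corners that meet the requirement.
feasible = {corner: amount for corner, amount in y.items() if amount < LIMIT}

# Among those, the corners using the cheaper chemical P (C=-1)
# and the lower stirring speed (S=-1).
low_cost = {corner: amount for corner, amount in feasible.items()
            if corner[0] == -1 and corner[2] == -1}

print(len(feasible), "of", len(y), "corners meet the limit")
print(low_cost)
```

Only the two corners with chemical Q at low speed fail the limit. The corner with chemical P, low temperature and low speed is the five-pound point singled out above.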

To end this class and this example, I want you to consider this: does the fact that temperature has a small effect imply that temperature is meaningless? The answer is no. It is important to recognize that even effects with small numbers have an important interpretation. It means that over the range of temperature selected, in this case between 72 and 100 Fahrenheit, temperature has a small to negligible effect on the pollution amount. Now, this is a key insight, because the engineer or operator can take this and select the operating conditions that are the most economically advantageous. Again, this comes down to profits. It is conceivable that by using a lower temperature we will save energy. And because temperature has such a small effect on the system overall, we will not significantly affect the pollution level when we operate at low temperature. That's a great result.

So I want to thank you for staying with me during these examples. I know that they've been longer than normal, but I hope they have been insightful. In the module coming up next, we're going to start looking at how we can do fewer experiments but still get a good amount of information about our process. Hope to see you over there.