This video is all about learning some new notation and the use of fractional factorials. At the start of this section, I described how half-fractions work. The next logical question is: what about quarter fractions, or one-eighth fractions, or even fewer experiments? Every time we do less and less work, what other information are we losing out on? So to answer all of this, we are going to introduce some fun and easy-to-use notation. The one point I want to make here, right at the start, with the hope that you watch these videos all the way to the end, is that the techniques we're going to investigate have been very well established for the last 80 years. But this field is evolving, and some interesting new fractional designs have emerged. I will give some pointers to them in the last video for this section. Now, people in the forums for this course have already hinted at the problem we are going to discuss today. Imagine you're running a system in which you can create bacteria. These cells are grown to generate valuable nutrients, which are then used to create drugs, food products, and other items. These systems can operate for a long period of time, and they're expensive. A scientist or engineer who is investigating the system with five factors could take well over a year to collect all the data necessary to run 32 experiments in a two to the five factorial. In most situations, we cannot wait this long for results. Think about your own case study. You might be working with a system that is expensive or takes a long time. In the cell culture example, imagine we had three months; that was our budget available, and that corresponds to about nine experiments. Now an inexperienced experimenter will go tell the manager that they can only investigate three factors, because that requires eight experiments, which can be done in the three months. The experimenter actually does not have to eliminate other factors from consideration.
They can go investigate all five factors. That's what this trade-off table shows us. It tells us that if we can run 8 experiments, then we could investigate 3 factors in a full factorial, 4 factors in a half fraction, 5 factors in a quarter fraction, and so on. In fact, we can go all the way up to 7 factors in 8 experiments. That's pretty incredible! The scientist or engineer, at their next meeting, can in fact ask their colleagues for suggestions on two extra factors, so they can go from five up to seven: factors that they think might impact the outcome, but they're not quite sure about. Those are the perfect candidates for moving along the row. Once you have a budget for a certain number of experiments, usually try to go as far across to the right as you can, so you can include as many factors as possible. When we do that, we are generating what is called a screening design. We are screening to see which of the factors are important. We know some of them will be; we just aren't sure which ones yet. We don't need a detailed model of their effects at this point; we just need to know whether they're important or not. I'm going to show you how we can deal with a case of 5 factors in this example, and you can practice with a case with 7 factors for homework. And you'll see that a bit in the next video too. So back to this trade-off table. Because we have five factors and a budget for 8 experiments, we know that we are in this entry of the table. A full factorial would have required 32 experiments. We're doing 8. So in fact, this is a quarter fraction. It is 2 to the power of 5 minus 2, which is 2 to the power of 3, or 8 experiments. Notice that all entries in the table have this general format: 2 to the power of "k" minus "p". The "k" is the number of factors. The "p" refers to the reduction in work: each increase of 1 in "p" halves the number of experiments. Let's focus on these two other items in the entry: D equals AB, and E equals AC. We call these two entries the generators, because they tell us how to create, or generate, the D and E factors in our experiment.
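The 2 to the power of "k" minus "p" arithmetic behind one row of the trade-off table can be checked with a short sketch. This is only an illustration in Python; the function name `tradeoff_row` is made up for this example, and it does not reproduce the generators printed in the actual table.

```python
def tradeoff_row(n_runs, max_factors):
    """Return (k, p, fraction) for every design with n_runs = 2**(k - p).

    Hypothetical helper for illustration; n_runs must be a power of 2.
    """
    if n_runs < 1 or n_runs & (n_runs - 1) != 0:
        raise ValueError("n_runs must be a power of 2")
    base = n_runs.bit_length() - 1  # factors a full factorial can handle
    row = []
    for k in range(base, max_factors + 1):
        p = k - base  # each extra factor costs one more halving
        fraction = "full" if p == 0 else f"1/{2 ** p}"
        row.append((k, p, fraction))
    return row


# The row for a budget of 8 experiments, going across to 7 factors.
for k, p, fraction in tradeoff_row(8, 7):
    print(f"{k} factors: 2^({k}-{p}) = {2 ** (k - p)} runs, fraction: {fraction}")
```

Reading across the printed row, 5 factors lands on p = 2, the quarter fraction used in the cell culture example.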
If you're being observant, you will notice that "p" coincides with the number of generators. For half fractions, "p" is always equal to 1 because we halve the work, and half fractions always have one generator. If you have a quarter fraction, "p" is equal to 2, and then we have two generators. That pattern continues in a logical way. So in our case, we have "p" equals 2, we're doing one quarter of the work, and we have two generators. Why do we have to generate these factors D and E? Well, once we've established that our budget is for eight experiments, we can immediately write three columns for factors A, B and C in a full factorial. You've done this many times by now in the course. Here's the table, and notice that it is missing columns D and E, but we can quickly generate them from these two generators. Now, notice that A, B, and C each refer to a column of plus and minus signs. So a product of them, such as A times C, which equals E, refers to the element-by-element multiplication of the entries in columns A and C to give column E. See how quick that was? There are my five factors to use in my eight experiments. All done. So what about that potential 9th experiment? I often recommend starting with the center point, or some sort of baseline experiment, as your first experiment. Put all the factors at their centers, their zero value. Now, categorical variables don't have a natural zero. Simply choose an arbitrary low or high value for that categorical variable. That first experiment is a great way to iron out any problems in your experimental protocol. There are always unexpected issues to deal with in the very first experiment, so rather do it on one that you're willing to throw away. However, if that first experiment does work out, you get to keep it, and it improves the estimates of some of your model parameters. Well, that seemed almost too good to be true. You've got these eight runs, you go ahead and do them, record your outcome values, and then you're finished.
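The column generation just described can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual standard order (A changes fastest) and coding minus as -1 and plus as +1; the function name `quarter_fraction_design` is invented for this example.

```python
from itertools import product


def quarter_fraction_design():
    """Build the 8 runs for 5 factors using D = AB and E = AC."""
    runs = []
    # product() varies its last position fastest, so unpacking as
    # (c, b, a) makes A change fastest: the standard order.
    for c, b, a in product((-1, +1), repeat=3):
        d = a * b  # generator D = AB, element by element
        e = a * c  # generator E = AC, element by element
        runs.append((a, b, c, d, e))
    return runs


design = quarter_fraction_design()
print("run  A B C D E")
for number, run in enumerate(design, start=1):
    signs = " ".join("+" if value > 0 else "-" for value in run)
    print(f"{number:>3}  {signs}")
```

Each printed row is one experiment: the first three signs come from the full factorial in A, B and C, and the last two are computed from them.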
Surely there's got to be a bit more to these fractional factorials? And there is. We're going to have to figure out what the aliases are. Remember when we were looking at half fractions, and we introduced the word "alias"? If that's not something you're comfortable with yet, please review the prior videos again and make sure you understand what that is. Aliases are easy to figure out for half fractions by hand. But they're particularly messy for heavily fractionated designs. We need a system to help figure out what these aliases are, and I'm going to take a few minutes to show you that process. So we've already seen generators. These are the expressions we read from the trade-off table. Now, if we take a generator, we can multiply both the left and right side by the same single symbol that appears on the left-hand side. So here, for example, we multiply by D on both sides. So we get D*D = ABD. This creates a desirable simplification that I'll quickly demonstrate. Now we introduce a quick rule. Any time you see two of the same letter side by side, you can instantly eliminate them and replace them with a letter, capital I. This I corresponds to the intercept. Or another way of seeing it is as the identity, the number "1". The reason is that two identical columns multiplied by each other will always result in a column of plus ones. If that column contains a minus entry, multiplied by itself, you get a plus. If the column contains a plus entry, multiplied by itself, it is still a plus. So two columns that are the same, multiplied together, always equal I, and can in fact be eliminated. Okay, so we have taken our generator and slightly transformed it so that it has an identity I on the left and the rest of the generator on the right: I = ABD. <QUIZ> That's right. You should have found that E times E, which equals I, equals ACE. It is just another way of expressing that generator.
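If you would like to convince yourself of that rule numerically, here is a small Python sketch, assuming the same -1 and +1 coding as before. The helper name `times` is made up for this example; it simply multiplies columns element by element.

```python
from itertools import product
from math import prod

# Full factorial in A, B and C, in standard order (A changes fastest).
runs = [(a, b, c) for c, b, a in product((-1, +1), repeat=3)]
A = [run[0] for run in runs]
B = [run[1] for run in runs]
C = [run[2] for run in runs]


def times(*columns):
    """Multiply any number of columns element by element."""
    return [prod(entries) for entries in zip(*columns)]


D = times(A, B)      # generator D = AB
E = times(A, C)      # generator E = AC
I = [1] * len(runs)  # the identity: a column of plus ones

print(times(D, D) == I)     # a column times itself gives I
print(times(A, B, D) == I)  # rearranged generator I = ABD
print(times(A, C, E) == I)  # rearranged generator I = ACE
```

All three comparisons print `True`, which is the column-level reason the rearranged generators can be written with I on the left.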
So to summarize our progress, we have two of these generators, I equals ABD and I equals ACE, and we learned the rule for rearranging them after reading them from the table. I quickly want to introduce another term. Any collection of sequential letters is called a "word". ABD is a word, ACE is a word, even I is a word. The last piece of terminology we need is what is known as the defining relationship. The defining relationship is a sequence of words that are equal to each other. The defining relationship always has a length of 2 to the power of "p" words, and the first word in the relationship is always the identity, I. So, remember, in our example we have "p" equals 2. So we should have 2 to the 2, in other words four, words in our defining relationship. And the first one is I, so where are the other three? We find them by taking all possible combinations of the rearranged generators. The simplest combination is to take the words on their own. So my second word is ABD, and my third word in the defining relationship is ACE. The next combination is to combine two of the words together. Well, since we only have two words, we can do this by combining ABD with ACE, which gives ABDACE. Now let's rearrange that and regroup our letters: AABCDE. Next we can use the rule that two of the same letter side by side are equal to the identity. So that, in fact, becomes IBCDE. Now remember that I is just a column of plus 1 entries. So it's kind of redundant when it's multiplied with other letters. So we can drop that and simplify it to BCDE. So here's my complete defining relationship with 4 words: I = ABD = ACE = BCDE. That simple set of words holds the key to figuring out the aliases. We're going to see exactly how to do that in the next video. But before we wrap up, are you brave enough to try this yourself on a different example? Practice makes perfect! So consider a system with six factors, running 16 experiments. Your task is to write out the complete defining relationship for that case.
First, write out the rearranged generators from the trade-off table. Then take all the combinations. Are you sure you have the correct number of words in your defining relationship? Prepare to pause the video; the solution is going to be shown shortly.
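If you want to check your hand calculation afterwards, the word-combining procedure can be sketched in Python. This is a rough illustration, not a standard library routine: the names `multiply_words` and `defining_relationship` are invented here, and the sketch is shown on the five-factor example from this video rather than on the six-factor exercise.

```python
from itertools import combinations


def multiply_words(*words):
    """Multiply words, cancelling any letter that appears twice."""
    letters = set()
    for word in words:
        for letter in word:
            if letter != "I":
                letters ^= {letter}  # a repeated letter cancels to I
    return "".join(sorted(letters)) or "I"


def defining_relationship(rearranged_generators):
    """Combine the words 1 at a time, 2 at a time, and so on."""
    words = ["I"]
    for size in range(1, len(rearranged_generators) + 1):
        for combo in combinations(rearranged_generators, size):
            words.append(multiply_words(*combo))
    return words


# Worked example from this video: I = ABD and I = ACE.
words = defining_relationship(["ABD", "ACE"])
print(" = ".join(words))  # I = ABD = ACE = BCDE
print(len(words))         # 4, which is 2 to the power of p with p = 2
```

To check the exercise, pass in the rearranged generators you read from the trade-off table and confirm that the number of words comes out to 2 to the power of "p".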