A critical building block in a lot of what we'll do, both in terms of defining probability distributions and in terms of manipulating them for inference, is the notion of a factor. So let's define what a factor is and the kinds of operations that you can do on factors. A factor really is a function, or a table. It takes a bunch of arguments, in this case a set of random variables X1 up to Xk, and just like any function it gives us a value for every assignment to those random variables. So it takes all possible assignments in the cross-product space of X1 up to Xk, that is, all possible combinations of assignments, and in this case it gives me a real value for each such combination. The set of variables X1 up to Xk is called the scope of the factor; that is, it's the set of arguments that the factor takes.

Let's look at some examples of factors. We've already seen a joint distribution, and a joint distribution is a factor: for every combination of values of the variables I, D, and G, for example, it gives me a number. As it happens, this number is a probability, and it happens that the numbers sum to one, but that doesn't matter. What's important is that for every combination of values of I, D, and G, I get a number. That's why it's a factor.

Here's a different kind of factor: an unnormalized measure is a factor also. In this case we have a factor such as the probability of I, D, and g1, and notice that here the scope of the factor is actually I and D, because there is no dependence of the factor on the value of the variable G; the variable G in this case is held constant. So this is a factor whose scope is I and D.

Finally, a type of factor that we will use extensively is what's called a conditional probability distribution, typically abbreviated CPD. This, as it happens, is a CPD that's written as a table, although that's not necessarily the case. This CPD gives us the conditional probability of the variable G given I and D. So what does that mean? It means that for every combination of values of the variables I and D, we have a probability distribution over G. So, for example, if I have an intelligent student in a difficult class, which is this last line over here, this tells us that the probability of getting an A is 0.5, a B is 0.3, and a C is 0.2. And as we can see, these numbers sum to 1, as they should, because this is a probability distribution over G for this particular conditioning context. You can easily verify that this is true for all of the other lines in this table. So this is again a particular type of factor, one that satisfies certain constraints, in this case that each row sums to 1.

Now, the factors that we're dealing with will not always correspond to probabilities. Here is an example of a general factor that really doesn't correspond in any way to a probability, because the numbers aren't even in the range [0, 1]. As we'll see, these kinds of factors are nonetheless useful. This is a factor whose scope is the set of variables {A, B}, and it still gives me a real-valued number for each assignment to A and B.
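Since a factor is nothing more than a table mapping each joint assignment of its scope to a real number, it is easy to sketch one in code. Below is a minimal Python sketch, not from the lecture itself; the dictionary-based representation, the variable names, and the numbers in the tables are all illustrative assumptions (real libraries typically use multidimensional arrays instead). The two factors phi1 and phi2 built here, with scopes {A, B} and {B, C}, are reused in the operations sketch later on.

```python
class Factor:
    """A factor: a table assigning a real number to every joint
    assignment of the variables in its scope."""
    def __init__(self, scope, table):
        self.scope = tuple(scope)   # ordered variable names, e.g. ('A', 'B')
        self.table = dict(table)    # maps assignment tuples -> real values

# Two illustrative factors; the numbers are made up for this sketch.
# phi1 has scope {A, B}; phi2 has scope {B, C}.
phi1 = Factor(('A', 'B'), {(0, 0): 0.5, (0, 1): 0.8,
                           (1, 0): 0.1, (1, 1): 0.3})
phi2 = Factor(('B', 'C'), {(0, 0): 0.5, (0, 1): 0.7,
                           (1, 0): 0.1, (1, 1): 0.2})
```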
Now, some operations that we're going to do on factors. One of the most common operations is what's called the factor product: taking two factors, say phi1 and phi2, and multiplying them together. So let's think about what that means. Here we have a factor phi1 whose scope is {A, B}, and phi2 whose scope is {B, C}. What we're doing is a lot like multiplying a function f(x, y) times a function g(y, z): you get a function of all three arguments, x, y, and z. So in this case we get a factor whose scope is {A, B, C}. If we want to figure out, for example, the value of the row a1, b1, c1, it comes from taking the a1, b1 row from the first factor and the b1, c1 row from the second, and multiplying them together, so we get 0.25. This is effectively taking the functions, or the tables, and just multiplying them together.

Another important operation is factor marginalization. Factor marginalization is very similar to, in fact identical to, the marginalization of probability distributions, except that here the factor doesn't have to be a probability distribution. So, for example, if we have a factor whose scope is {A, B, C} and we want to marginalize out B to get a factor whose scope is {A, C}, what we do is take all possible values of B (in this case, because B is binary, there are only two) and add them up. So the entry for a1, c1 is 0.25 + 0.08 = 0.33. All the other rows in this table are computed in exactly the same way, from the corresponding rows in the original, larger factor.

Finally, factor reduction. Again, this is very similar to the corresponding operation on probability distributions. We want to reduce, for example, to the context C = c1, so we focus only on the rows that have the value C = c1, and that gives us a reduced factor. Once again, the scope of this reduced factor is {A, B}, because there is no longer any dependence on C. That's the final operation.

Now, why factors? It turns out that factors are the fundamental building block for defining distributions in high-dimensional spaces. That is, the way in which we're going to define an exponentially large probability distribution over n random variables is by taking a bunch of little pieces and putting them together, by multiplying factors, in order to define these high-dimensional probability distributions. It turns out also that the same set of basic operations that we use to define probability distributions in these high-dimensional spaces is also what we use to manipulate them, which gives us a set of basic inference algorithms.
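Continuing the Factor sketch from above, here is one way the three operations could look in code. This is a sketch under the same assumptions as before (dictionary tables, made-up numbers), not a definitive implementation; the helper names factor_product, marginalize, reduce_factor, and domains are my own.

```python
from itertools import product as assignments

def domains(f):
    """Recover each variable's set of values from the factor's table."""
    return {v: sorted({key[i] for key in f.table}) for i, v in enumerate(f.scope)}

def factor_product(f, g):
    """Factor product: the result's scope is the union of the two scopes,
    and each row is the product of the matching rows of f and g."""
    scope = f.scope + tuple(v for v in g.scope if v not in f.scope)
    doms = {**domains(f), **domains(g)}
    table = {}
    for key in assignments(*(doms[v] for v in scope)):
        val = dict(zip(scope, key))
        table[key] = (f.table[tuple(val[v] for v in f.scope)] *
                      g.table[tuple(val[v] for v in g.scope)])
    return Factor(scope, table)

def marginalize(f, var):
    """Sum out `var`: rows that agree on the remaining variables are added up."""
    keep = [i for i, v in enumerate(f.scope) if v != var]
    table = {}
    for key, value in f.table.items():
        short = tuple(key[i] for i in keep)
        table[short] = table.get(short, 0.0) + value
    return Factor(tuple(f.scope[i] for i in keep), table)

def reduce_factor(f, var, value):
    """Reduce to the context var = value: keep only the matching rows
    and drop `var` from the scope."""
    idx = f.scope.index(var)
    scope = tuple(v for i, v in enumerate(f.scope) if i != idx)
    table = {tuple(k for i, k in enumerate(key) if i != idx): v
             for key, v in f.table.items() if key[idx] == value}
    return Factor(scope, table)

# With the sketch's numbers, the first product entry matches the lecture's 0.25:
phi3 = factor_product(phi1, phi2)   # scope ('A', 'B', 'C')
print(phi3.table[(0, 0, 0)])        # 0.5 * 0.5 = 0.25
psi = marginalize(phi3, 'B')        # scope ('A', 'C'): B summed out
tau = reduce_factor(phi3, 'C', 0)   # scope ('A', 'B'): only rows with C = 0
```

Notice that these three functions never assume the tables are probabilities; they work on arbitrary real-valued factors, which is exactly the point made above about general factors.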