So we've argued that table CPDs are problematic because of their exponential growth in the number of parents. One of the most useful classes of structured CPDs is the class of CPDs that encodes a dependence of a child on a parent, but a dependence that only happens in certain contexts. One method for encoding that is the class of what are called tree-structured CPDs. To understand what tree-structured CPDs are, let's look at this simple example. Imagine that we have a student, and the student is applying for a job. The student's prospects of getting the job depend on three variables: the quality of the recommendation letter that they get from a faculty member, their SAT score, and whether the student chooses to apply for the job in the first place. So let's think about one possible CPD for this model. Here we have a tree structure that you can think of as a branching process, where the distribution over Job looks at some variables and then decides what the distribution might look like. For example, the initial dependence is on whether the student chooses to apply for the job or not. What happens if the student doesn't apply for the job? Well, you might say that in that case the student doesn't get the job, but that turns out not to be the case: in the heyday of Silicon Valley, for example, during the various internet bubbles, students were getting job offers without ever applying for jobs. So it might actually happen; the student's probability of getting a job is non-zero even in this case. And notice that the student, not having applied for the job, didn't submit either a recommendation letter or an SAT score, which means that the student's job prospects don't depend, in this scenario, on either of these two variables.
And so in all possible configurations of the S and L variables, the probability of the student getting a job is 0.2. What if the student did choose to apply for the job? Well, in this case we can imagine a recruiter whose primary interest is in the student's SAT score; they don't really believe recommendation letters all that much. And so the recruiter first looks at the student's SAT score. If the student got a good score on the SAT, s1, then regardless of the recommendation letter, which the recruiter doesn't even choose to look at, the student's probability of getting a job is 0.9. Only in the case where the student's SAT score is not as strong does the recruiter go back and look at the letter, in which case there is a certain probability, say 60%, of getting a job if the letter is strong, and 10% if the letter is weak. So we can see that we have a CPD that in this case depends on three binary variables, and so in principle we would need to represent eight different probability distributions over the J variable. But we've only represented four, because in certain contexts some of the variables don't matter. In fact, this notion of a variable not mattering is related to the notion of context-specific independence, which we've defined previously. So one can formalize this as context-specific independence. Let's look at this tree and think about which context-specific independencies arise in the context of this tree-structured CPD. Looking at the first one: does J, the variable that we care about, depend on L in the context a1, s1? Well, we can see that in the context a1, s1, the recruiter never looks at the letter, so in fact J is independent of L in this context, and the answer to this one is yes. Okay, what about the next one: is J independent of L given a1 alone?
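The branching process just described can be sketched as a small lookup function. This is my own encoding, not from the course materials; the variable names and probability values follow the example above (a for apply, s for SAT, l for letter, each 1 for the favorable value).

```python
def p_job(a, s, l):
    """Return P(J = 1 | A=a, S=s, L=l) by walking the tree CPD.

    a: 1 if the student applies, 0 otherwise
    s: 1 if the SAT score is strong, 0 otherwise
    l: 1 if the letter is strong, 0 otherwise
    """
    if a == 0:                        # didn't apply: S and L are ignored
        return 0.2
    if s == 1:                        # strong SAT: letter is never examined
        return 0.9
    return 0.6 if l == 1 else 0.1     # weak SAT: fall back on the letter
```

A full table CPD over the three binary parents would need eight distributions over J; this tree has only the four leaves.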
Well, in this case we're going down here, and now there are two scenarios: one in which S is equal to s1, but in the other S is equal to s0, and in that case the recruiter does look at the letter. So this one is in fact not true. What about the next one: is J independent of L and S given a0? Let's look at the a0 case. And sure enough, in the a0 case there's no dependence on either L or S, so this one is also true. The last one is a little bit interesting, because it's a mix of context-specific and non-context-specific independence. We're asking whether J is independent of L in the context s1, for both values of the variable A. To answer this question, we actually need to do a case analysis, because this reduces to two different independence statements: J is independent of L given s1 and a1, and J is independent of L given s1 and a0. So let's evaluate each of these two separately. J is independent of L given s1, a1 is exactly the first assertion, so this one is true. J is independent of L given s1 and a0 is a special case of the a0 scenario. And so both of these are, in fact, true independence statements, and since both cases hold, we have another conditional independence statement that holds here. Let's look at another example that turns out to be representative of a large class of examples in this context. Here the student, when applying for the job, needs to submit a recommendation letter, but has a choice between two letters that they might elect to provide: one from one course and another from a second course. So, letter one and letter two. Now, the student's job prospects depend on the quality of the letter that's actually provided, because of course the recruiter does not have access to the letter that was not provided. So if we look at this in the context of the tree CPD, it would look like this. The first variable at the top corresponds to the student's choice.
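These case analyses can also be checked mechanically: a context-specific independence (J independent of L in some context) holds exactly when P(J | a, s, l) does not change as l varies anywhere within that context. A minimal sketch, reusing the tree CPD values from the example (the encoding is my own):

```python
def p_job(a, s, l):
    """P(J = 1 | A=a, S=s, L=l), read off the tree CPD in the example."""
    if a == 0:
        return 0.2                    # didn't apply: S and L ignored
    if s == 1:
        return 0.9                    # strong SAT: letter ignored
    return 0.6 if l == 1 else 0.1     # weak SAT: look at the letter

def j_indep_l(contexts):
    """True iff P(J | a, s, l) is constant in l for every (a, s) given."""
    return all(p_job(a, s, 0) == p_job(a, s, 1) for a, s in contexts)

print(j_indep_l([(1, 1)]))           # (J indep. of L | a1, s1): True
print(j_indep_l([(1, 1), (1, 0)]))   # (J indep. of L | a1):     False
print(j_indep_l([(1, 1), (0, 1)]))   # (J indep. of L | s1):     True for both values of A
# (J indep. of L, S | a0): P(J) is identical across all four (s, l) settings.
print(len({p_job(0, s, l) for s in (0, 1) for l in (0, 1)}) == 1)  # True
```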
And it has two branches, c1 and c2. In the c1 case, there is a dependence only on the quality of letter one, and in the c2 case, there is a dependence only on the quality of letter two. So this is an example of, well, something related to what's called a multiplexer CPD, because effectively the choice variable determines the dependence on one set of circumstances or another. Now, it turns out that this example has some interesting ramifications, because not only do we have context-specific independencies that arise because of the tree structure; it turns out that this also implies non-context-specific independencies that we will see are useful later in the course. Specifically, we have that letter one is independent of letter two given J and C. Now, if you think about this purely from the perspective of the d-separation structure, the flow of influence in this graph, we can see that conditioning on Job actually activates the v-structure between letter one and letter two. So you wouldn't expect letter one and letter two to be conditionally independent; that is, we'd expect a flow of influence because of intercausal reasoning. But now let's think about this in more detail, and do a case analysis just like we did before. We're asking whether letter one is independent of letter two given J and C. What happens in the context c1? Well, in this case there's no longer a dependence between Job and letter two, because the recruiter is never given the second letter. And so in the context c1, the graph really looks like this, where there's no edge from L2 to J. Conversely, looking at the other case, where C equals c2, the other edge, from L1 to J, is going to disappear. Once again there's no v-structure, and so there's no active trail between the two variables L1 and L2. So effectively, in both of these cases the active trail disappears, and that implies the independence assertion.
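This can also be confirmed numerically by building the joint distribution and checking that P(L1, L2 | j, c) factorizes for every value of J and C. Only the structure here (J copies its dependence from the letter that C selects) comes from the example; all the priors and CPD values below are made-up numbers of my own.

```python
import itertools

# Hypothetical numbers: only the structure comes from the example.
P_C = {1: 0.5, 2: 0.5}     # student's choice of which letter to submit
P_L = {0: 0.4, 1: 0.6}     # prior over letter quality, same for both letters

def p_j1(c, l1, l2):
    """Tree CPD: P(J = 1) depends only on the letter that C selects."""
    letter = l1 if c == 1 else l2
    return 0.8 if letter == 1 else 0.3

def joint(c, l1, l2, j):
    """P(C=c, L1=l1, L2=l2, J=j) under the factorization of the network."""
    pj = p_j1(c, l1, l2)
    return P_C[c] * P_L[l1] * P_L[l2] * (pj if j == 1 else 1.0 - pj)

# Despite the v-structure L1 -> J <- L2, conditioning on C removes one of
# the two edges, so P(L1, L2 | j, c) = P(L1 | j, c) * P(L2 | j, c).
for j, c in itertools.product((0, 1), (1, 2)):
    z = sum(joint(c, l1, l2, j) for l1 in (0, 1) for l2 in (0, 1))
    for l1, l2 in itertools.product((0, 1), repeat=2):
        p12 = joint(c, l1, l2, j) / z
        p1 = sum(joint(c, l1, x, j) for x in (0, 1)) / z
        p2 = sum(joint(c, x, l2, j) for x in (0, 1)) / z
        assert abs(p12 - p1 * p2) < 1e-12
print("L1 and L2 are conditionally independent given J and C")
```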
I mentioned that this example is related to a more general class of models called multiplexer CPDs. The multiplexer CPD has the following structure. We have a set of random variables Z1 up to Zk, all of which take on values in some particular space, and the variable Y is a copy of one of the Z's. The variable A over here is the multiplexer, the selector variable. The selector variable takes on values in the space {1, ..., k}, and it selects which of the Zi's Y copies. And notice that Y here is deterministic, as we can see by the fact that we have these two lines surrounding it, which is our way of indicating deterministic dependencies. So what is the CPD of the variable Y given the selector A and the parents Z1 up to Zk? Remember that we need to specify a probability distribution. This probability distribution is one if y is equal to z sub a, and zero otherwise. So what does that mean? It means that if A takes the value little a, then deterministically Y is equal to Z sub little a with probability one; that's just a formal way of saying that A tells us which of the variables Z the variable Y needs to copy. This turns out to be an extremely useful concept in a variety of applications. For example, we have perceptual uncertainty when we have noisy sensors: say we have a sensor observation of one of several airplanes, but we don't know which airplane it is that we're observing. The position of the observation represents the position of the airplane that we're observing, but the variable A here is the one that tells us which airplane it is, which we might also be uncertain about. And this gives rise to a whole set of problems known as registration, correspondence, or data association problems, which are very common in many applications.
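The definition just given can be written down directly as code: the conditional distribution puts all of its mass on y = z sub a. A minimal sketch; the function name and the 1-based indexing convention are my own.

```python
def multiplexer_cpd(y, a, z):
    """P(Y = y | A = a, Z_1..Z_k = z): deterministic, all mass on y = z_a.

    a is the selector, taking values in {1, ..., k} (1-based);
    z is the tuple (z_1, ..., z_k).
    """
    return 1.0 if y == z[a - 1] else 0.0

# The selector A picks which Z the variable Y copies:
print(multiplexer_cpd(5, 2, (3, 5, 7)))  # 1.0: Y copies Z_2 = 5
print(multiplexer_cpd(3, 2, (3, 5, 7)))  # 0.0: Y = 3 differs from Z_2 = 5
```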
A different type of application for this type of structured CPD comes up in physical hardware configuration settings. This is an actual example from a troubleshooter for printers used at Microsoft; it turns out that all of the troubleshooters that are part of the Microsoft operating system are built on top of Bayesian network technology. Here the task is to try and figure out why a printer isn't printing. So we have a variable that tells us whether the printer is producing output, and that depends on a variety of factors, but one of the factors it depends on is where the printer input is coming from: is it coming from a local transport or a network transport? Depending on which of those it's coming from, there's a different set of failures that might occur. So the variable here that serves the role of the selector variable is this variable, print data out, and that's the root of the tree that's used here. Depending on whether the print location is local or not, you depend either on properties of the local transport or on properties of the network transport. And it turns out that even in this very simple network, the use of tree CPDs reduces the number of parameters from 145 to about 55, and makes the elicitation process much easier. So, to summarize: tree CPDs provide us with a compact representation that effectively captures this notion of dependence in a context-specific way. As we've mentioned, this is relevant in a broad range of applications, of which we've only given some examples: hardware configuration; medical settings, where depending on the kind of situation you're in, you might depend on one set of predisposing factors or another; and dependence on an agent's action, as we've seen, for example, in the student's decision of whether to apply for a job, or which letter to submit.
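The parameter saving is easy to see back-of-the-envelope with the earlier job example: a full table needs one distribution per parent assignment, while a tree needs one per leaf. A small sketch of my own, using only the counts stated above:

```python
def table_distributions(n_parents, cardinality=2):
    """Number of parent configurations a full table CPD must cover."""
    return cardinality ** n_parents

# The job CPD has three binary parents (A, S, L) but only four tree leaves.
full_table = table_distributions(3)   # 8 distributions over J
tree_leaves = 4                       # leaves of the tree CPD in the example
print(full_table, tree_leaves)        # 8 4
```

The gap widens exponentially as parents are added, which is why the savings in the printer network (145 parameters down to about 55) show up even in a small model.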
And we've also discussed perceptual ambiguity, where the value of a particular sensed observation depends on which real-world object that observation comes from.