So we've argued that table CPDs are problematic, because of their exponential

growth in the number of parents. One of the classes of structured CPDs

that is most useful is the class of CPDs that encodes a dependence

of a child on a parent, but a dependence that only occurs

in certain contexts. One method for encoding that is using the

class of what are called tree-structured CPDs.

So, to understand what tree-structured CPDs are, let's look at this simple

example. Imagine that we have a student, and the student is applying for a job.

The prospects of the student to get the job depend on three variables:

the quality of the recommendation letter that they get from the faculty member,

their SAT scores, and whether the student chooses to apply for the job

in the first place.

So let's think about one possible CPD for this model.

Here we have a tree structure that you can think of as a

branching process, where the distribution over Job looks at

some variables and then decides what the distribution might look like.

So, for example, initially the dependence is on whether the student

chooses to apply for the job or not. What happens if the student doesn't apply

for the job? Well, you might say that in that case the student doesn't get the job,

but it turns out not to be the case. In the heyday of Silicon Valley, for example,

during the different internet bubbles, students were getting job offers without

ever applying for jobs. So it might actually happen, and the student's

probability of getting a job is nonzero even in this case.

And notice that the student, not having applied for the job, didn't submit either

a recommendation letter or the SAT scores, which means that the student's job prospects

don't depend, in this scenario, on either of these two variables.

And so in all possible configurations of the S and L variables,

the probability of the student getting a job is 0.2.

What if the student did choose to apply for the job?

Well, in this case we can imagine a recruiter whose primary interest is in the

student's SAT scores. They don't really believe recommendation

letters all that much. And so the recruiter

first looks at the student's SAT score. And if the student got a good score on

the SAT, s1, then regardless of the recommendation letter, which the recruiter

doesn't even choose to look at, the student's probability of getting a job is

0.9. Only in the case where the student's SAT

scores are not as strong does the recruiter go back and look at the letter,

in which case there is a certain probability, say 60%, of getting a job if

the letter is strong, and 10% if the letter is weak.
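
The branching process just described can be written down as nested tests. The following is only a sketch: the function name `p_job` and the 0/1 encoding (1 = applied, strong SAT, strong letter) are assumptions made for illustration, while the four probabilities are the ones quoted above.

```python
def p_job(a: int, s: int, l: int) -> float:
    """Probability that the student gets the job, read off the tree.

    Assumed encoding: a=1 applied, s=1 strong SAT, l=1 strong letter.
    """
    if a == 0:
        # Did not apply: S and L are never consulted.
        return 0.2
    if s == 1:
        # Applied with a strong SAT score: the letter is ignored.
        return 0.9
    # Applied with a weaker SAT score: the letter decides.
    return 0.6 if l == 1 else 0.1


if __name__ == "__main__":
    print(p_job(a=1, s=0, l=1))
```

Each `return` corresponds to one leaf of the tree, so the function has four leaves.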

So we can see that we have a CPD that in this case depends on three binary

variables. And so really we would need to represent,

in principle, eight different probability distributions over the J variable.
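
To make the count concrete, the tree can be expanded into the full table it stands for. This is a sketch under assumptions: `p_job` and the 0/1 encoding (1 = applied, strong SAT, strong letter) are illustrative names and conventions, not part of the lecture. Counting distinct values only recovers the number of leaves here because the four leaf probabilities in this example happen to differ.

```python
from itertools import product


def p_job(a: int, s: int, l: int) -> float:
    """Probability of getting the job, read off the example tree."""
    if a == 0:
        return 0.2
    if s == 1:
        return 0.9
    return 0.6 if l == 1 else 0.1


# One table row per joint assignment to the three binary parents (A, S, L).
table = {(a, s, l): p_job(a, s, l) for a, s, l in product((0, 1), repeat=3)}

num_rows = len(table)                    # rows a table CPD would store
num_distinct = len(set(table.values()))  # distinct distributions actually used

if __name__ == "__main__":
    for row, p in table.items():
        print(row, p)
    print(num_rows, num_distinct)
```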

But we've only represented four, because in certain contexts some of the variables

don't matter. So in fact this notion of a variable not

mattering is related to the notion of context-specific independence, which we've

defined previously. So one can formalize this, in fact, as a

context-specific independence. So let's look at this tree, and think

about which context-specific independencies arise in the context of

this tree-structured CPD. So, looking at the first one: is J,

the variable that we care about, independent of L

in the context a1, s1?

Well, we can see that in the context a1, s1, the recruiter never looks at the

letter, so in fact J is independent of L in this context, so the answer to this

one is yes. OK, what about the next one?

Is J independent of L given a1 alone? Well, in this case we're going

down here, and now there are two scenarios: one in which S is equal to s1, but the

other in which S is equal to s0, and in this case the recruiter does look at the letter.

And so this one is in fact not true. What about the next one?

Is J independent of L and S given a0? So let's look at the a0 case.

And sure enough, in the a0 case, there's no dependence on either L or S.
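
The three checks so far can be mimicked mechanically. The sketch below mirrors the way the tree is being read here: fix the context, then ask whether the leaf that is reached ever changes with the variables in question, for every setting of the remaining parents. The names `p_job` and `independent_of` and the 0/1 encoding are assumptions for illustration, and this is a test on the CPD alone, not on a full joint distribution.

```python
from itertools import product

PARENTS = ("A", "S", "L")


def p_job(A: int, S: int, L: int) -> float:
    """Probability of getting the job, read off the example tree."""
    if A == 0:
        return 0.2
    if S == 1:
        return 0.9
    return 0.6 if L == 1 else 0.1


def independent_of(targets, context):
    """True if the CPD never varies with `targets` once `context` is fixed."""
    free = [v for v in PARENTS if v not in context and v not in targets]
    for free_vals in product((0, 1), repeat=len(free)):
        fixed = dict(context, **dict(zip(free, free_vals)))
        seen = set()
        for target_vals in product((0, 1), repeat=len(targets)):
            assignment = dict(fixed, **dict(zip(targets, target_vals)))
            seen.add(p_job(**assignment))
        if len(seen) > 1:
            # Some setting of the targets changes the distribution over J.
            return False
    return True


if __name__ == "__main__":
    print(independent_of(("L",), {"A": 1, "S": 1}))  # context a1, s1
    print(independent_of(("L",), {"A": 1}))          # context a1 alone
    print(independent_of(("L", "S"), {"A": 0}))      # context a0
```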

So this third one is also true. The last one is a little bit interesting

because it's a mix of context-specific and non-context-specific independence.

So we're asking if J is independent of L in the context of s1.