0:34

Let's start by modeling repetition. So in this case, imagine that we're repeatedly tossing the same coin again and again. So we have an outcome variable, and what we'd like to model is the repetition of multiple tosses. And so we're going to put a little box around that outcome variable, and this box is called a plate.

2:00

of the CPD explicitly into the model. So this random variable theta is the actual CPD parameterization. And I'm putting it in explicitly so that I can show how the different variables depend on it. And so if we have the parameters here, we can see that theta is outside of the plate.
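As a minimal sketch of what the plate stands for, the snippet below unrolls the coin-toss plate into a ground network. The function name `unroll_coin_plate` and the node labels are hypothetical, chosen only for illustration. The point it shows is that theta, being outside the plate, appears once and is a parent of every copy of the outcome variable.

```python
def unroll_coin_plate(num_tosses):
    """Return the edges of the ground network for a plate over coin tosses.

    The shared parameter node "theta" sits outside the plate, so it is
    created once and becomes a parent of every replicated outcome variable.
    """
    edges = []
    for t in range(1, num_tosses + 1):
        # One copy of the outcome variable per toss, all sharing theta.
        edges.append(("theta", f"Outcome(t{t})"))
    return edges


for parent, child in unroll_coin_plate(3):
    print(f"{parent} -> {child}")
```

The number of outcome nodes grows with the number of tosses, but there is still only one theta node.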

3:04

Let's look at a slightly more interesting example. Going back to our university with multiple students, we now have a two-variable model where we have intelligence and grade. And we now index that by different students s, which again indicates that we have a repetition, a copying of this template model. In this case, I only made two copies, one for student 1 and the other for student 2.

3:30

And once again, we might want to encode dependence on the parameters. So we might have theta I, which represents the CPD for I, and we might have theta G, which represents the CPD for G. And we would have exactly the same idea: theta I influences the two I variables and theta G influences the two G variables, and again, they're outside of the plate.
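The two-student picture can be sketched in code as follows. This is only an illustration: the function name `unroll_student_plate` and the labels `theta_I` and `theta_G` are hypothetical, and the edge from I to G is assumed from the student example used throughout the course.

```python
def unroll_student_plate(students):
    """Unroll the student plate into a ground network, with shared parameters.

    Each student gets its own I and G variable. theta_I and theta_G live
    outside the plate, so every copy of I shares theta_I and every copy of
    G shares theta_G.
    """
    edges = []
    for s in students:
        i_node = f"I({s})"
        g_node = f"G({s})"
        edges.append(("theta_I", i_node))  # shared CPD parameters for I
        edges.append((i_node, g_node))     # assumed template edge I -> G
        edges.append(("theta_G", g_node))  # shared CPD parameters for G
    return edges


for parent, child in unroll_student_plate(["s1", "s2"]):
    print(f"{parent} -> {child}")
```

Note that no edge crosses from one student's copy to another's; the only thing the copies have in common is the parameter nodes.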

Sometimes, in many models, we will include those parameters explicitly within the model. But often when you have a parameter that's outside of all plates.

4:46

And courses we're going to call little c, and students we're going to call little s. And so now let's think about how you might replicate variables that correspond to properties of courses and variables that correspond to properties of students. So the difficulty variable belongs in the course plate because it's a property of the course. So it's going to be the difficulty of the course. And now let's think about how we are going to put students in. One possibility is that we're going to nest.

5:20

Now what that means is that each variable here, both of these variables, is indexed by both s and c. Because when a variable is nested in a plate, it has the indices of all the plates that it's nested in. So if the intelligence variable is in both the s plate and the c plate, it's going to be indexed by both.

5:48

So let's build that model and see what it looks like when we sort of unravel the courses and unravel the students. It looks like this. Let's say this is a two-course, two-student model. So we have the difficulty of course one and the difficulty of course two. And now we have the variables in the nested plate, I and G, and we can see that they're both parameterized by both student and course.
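A small sketch of the indexing rule for nested plates: a variable is copied once for every combination of values of the plates it sits in. The helper `ground_variables` and the `domains` dictionary are hypothetical names introduced for this illustration.

```python
from itertools import product


def ground_variables(name, plates, domains):
    """List the ground copies of a template variable.

    plates  -- tuple of plate names the variable is nested in, e.g. ("s", "c")
    domains -- maps each plate name to its list of concrete objects
    """
    index_values = [domains[p] for p in plates]
    return [f"{name}({', '.join(combo)})" for combo in product(*index_values)]


domains = {"c": ["c1", "c2"], "s": ["s1", "s2"]}

# Difficulty sits only in the course plate.
print(ground_variables("D", ("c",), domains))
# In the nested model, I and G sit in both plates.
print(ground_variables("I", ("s", "c"), domains))
print(ground_variables("G", ("s", "c"), domains))
```

With two courses and two students this gives two difficulty variables but four intelligence variables and four grade variables, which is exactly the course-specific intelligence the nested model implies.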

7:02

Now, let's think about the implications of this. This tells us that there is a course-specific intelligence for every student, for every student in every course, and that may or may not be what we want. If you're taking radically different courses, where one is an art class and one is a math class, then you could say that there is an art intelligence, representing skill, if you will, in art, and a math skill or math intelligence, and you might actually want to have two different kinds of intelligence and not assume that they're necessarily the same thing. Of course, that kind of complicates the model, and if you have a bunch of courses that are in some ways similar to each other and take a similar set of skills, you might not want to have a bunch of independent intelligence variables.

8:12

And so that gives us an alternative representation, which is what's called plates that are not nested, that overlap with each other. So in this case, we have the course plate, which is this plate over here, and we have the student plate, which is this one over here, and the assumption is that difficulty is a property only of the course. So this is the difficulty.

8:49

And when we unravel this one, what we end up with is a model that looks like this. So in this case, we have a single difficulty for each course and a single intelligence for each student. And over here, I've put the things in the intersection in green: we have the grade of the student in the course, which depends on the difficulty of the course and on the intelligence of the student.
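The unrolled overlapping-plate model can be sketched as below. The function name `unroll_overlapping_plates` and the node labels are hypothetical; the dependencies (grade depends on the course's difficulty and the student's intelligence) are the ones just described.

```python
from itertools import product


def unroll_overlapping_plates(students, courses):
    """Ground network for overlapping student and course plates.

    D is indexed by course only, I by student only, and G, which sits in
    the intersection of the two plates, by both.
    """
    edges = []
    for s, c in product(students, courses):
        grade = f"G({s}, {c})"
        edges.append((f"D({c})", grade))  # difficulty of that course
        edges.append((f"I({s})", grade))  # intelligence of that student
    return edges


edges = unroll_overlapping_plates(["s1", "s2"], ["c1", "c2"])
for parent, child in edges:
    print(f"{parent} -> {child}")
```

Unlike the nested version, each student here has a single intelligence variable that is a parent of all of that student's grades, which is what ties the grades together.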

10:01

So why are these kinds of plate models useful? Let's look at an example to convince ourselves that by building these richly structured models that involve multiple entities, you can actually get much more interesting conclusions. So let's look at this example over here. Imagine that we have this first-quarter freshman who came into our university, and we'd like to figure out what we can determine about him. So let's say that in this particular university, our prior belief is that most students have high intelligence, and so this is the intelligence distribution: 80% high. Now, this student, whom we'll call George, took two classes. He took Geo101 and got an A.

10:44

So the probability that he's intelligent goes up. He took CS101, didn't do so well, got a C. Now the probability goes down, but it doesn't go down to a very low number. And that's because we know from the CPD for grade that we've seen previously that there may be multiple other reasons why students might not do well in a class; for example, it was a really hard class, so everybody did badly [INAUDIBLE]. If these are the only two courses that George took, we're kind of stuck. But now let's think about this in a more holistic context, or collective inference, where we're going to think about a number of students taking a number of classes, and let's imagine that we have a bunch of grades for all of those students. So what we see here are, the green ones are As, the yellow ones are Bs and

Â 11:37

the red ones are Cs, and what you see here is sort of a shorthand for the observed grade variables. I didn't put in all the little dots that represent the grade variables; I just put in these lines that indicate what they are. So you can think of this as a network, if you will. So now let's think about what kind of conclusions we can reach from this network. And even just looking at this by eye, we can see that a bunch of people took CS101, and they all got an A except for our friend George. And furthermore, even if we look at this guy over here, who got a C in every other class that he took, he still managed to ace CS101.
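To make the George example concrete, here is a brute-force sketch of collective inference over the unrolled network. Only the 80% prior on high intelligence comes from the lecture. The grade CPD values, the prior on course difficulty, and the classmates Ann, Bob, and Cat are all assumptions made up for illustration, and the function name `posterior` is hypothetical.

```python
from itertools import product

P_HIGH = 0.8  # prior P(intelligence = high), as stated in the lecture
P_HARD = 0.4  # ASSUMED prior P(difficulty = hard)

# ASSUMED CPD P(grade | intelligence, difficulty); not given in this section.
GRADE_CPD = {
    ("low", "easy"): {"A": 0.30, "B": 0.40, "C": 0.30},
    ("low", "hard"): {"A": 0.05, "B": 0.25, "C": 0.70},
    ("high", "easy"): {"A": 0.90, "B": 0.08, "C": 0.02},
    ("high", "hard"): {"A": 0.50, "B": 0.30, "C": 0.20},
}


def posterior(grades, query):
    """P(query variable is high/hard | observed grades), by enumeration.

    grades -- {(student, course): grade}
    query  -- ("I", student) or ("D", course)
    """
    students = sorted({s for s, _ in grades})
    courses = sorted({c for _, c in grades})
    kind, name = query
    hit = total = 0.0
    for i_vals in product(["low", "high"], repeat=len(students)):
        for d_vals in product(["easy", "hard"], repeat=len(courses)):
            intel = dict(zip(students, i_vals))
            diff = dict(zip(courses, d_vals))
            weight = 1.0
            for v in i_vals:
                weight *= P_HIGH if v == "high" else 1 - P_HIGH
            for v in d_vals:
                weight *= P_HARD if v == "hard" else 1 - P_HARD
            # Same shared grade CPD for every (student, course) pair.
            for (s, c), g in grades.items():
                weight *= GRADE_CPD[(intel[s], diff[c])][g]
            total += weight
            value = intel[name] if kind == "I" else diff[name]
            if value in ("high", "hard"):
                hit += weight
    return hit / total


george_only = {("George", "Geo101"): "A", ("George", "CS101"): "C"}
everyone = dict(george_only)
for classmate in ["Ann", "Bob", "Cat"]:  # hypothetical classmates
    everyone[(classmate, "CS101")] = "A"

print(posterior(george_only, ("I", "George")))
print(posterior(everyone, ("I", "George")))
print(posterior(george_only, ("D", "CS101")))
print(posterior(everyone, ("D", "CS101")))
```

Under these assumed numbers, George's probability of high intelligence is roughly 0.75 given only his own two grades, and it drops to roughly 0.61 once the classmates' As in CS101 are included, because those grades make CS101 look much less likely to be hard. Enumeration is exponential in the number of students and courses, so this only works for toy cases.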

12:45

And so we can reach much more informed conclusions in this setting than we can by reasoning about individuals in isolation. Now, this is a toy example, but we'll see later on examples of collective inference where we have multiple interrelated entities. It could be related pixels in an image, it can be related

Â 14:00

And what we have is that each of these has to be a subset of this. So what does that mean? It means, for example, that for the template variable G(s, c), the G corresponds to the variable A, and s and c correspond to the indices, in this case U1 and U2. And what we have is two template parents: we have I(s) and D(c).

14:51

an index in the parent that doesn't appear in the child. So, for example, we cannot have in this model, for reasons that I'll describe in a minute, the notion of, say, honors for student s depending on the grade of the student in multiple courses.
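The restriction just described, that a parent may not carry an index the child lacks, can be checked mechanically. The sketch below is only an illustration; the function names `extra_parent_indices` and `is_legal_template` and the variable name `Honors` are hypothetical.

```python
def extra_parent_indices(child_indices, parent_indices):
    """Return the parent's indices that the child does not carry.

    An empty set means the parent is allowed under the plate-model rule.
    """
    return set(parent_indices) - set(child_indices)


def is_legal_template(child_indices, parents):
    """Check that every template parent's indices are a subset of the child's.

    parents -- maps each parent variable name to its tuple of indices
    """
    return all(
        not extra_parent_indices(child_indices, indices)
        for indices in parents.values()
    )


# G(s, c) with parents I(s) and D(c): both index sets are subsets of {s, c}.
print(is_legal_template(("s", "c"), {"I": ("s",), "D": ("c",)}))

# Honors(s) with parent G(s, c): the parent has index c, the child does not.
print(is_legal_template(("s",), {"G": ("s", "c")}))
print(extra_parent_indices(("s",), ("s", "c")))
```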

16:23

So specifically, if we have this variable A(U1, ..., Uk), then for any instantiation little u1 up to little uk, which are concrete instantiations of the indices, we would have the following model. We would have the variable A(u1, ..., uk), depending on the specific,

16:59

Which is potentially confusing notation, because the sets are a little bit hard to understand. But really, just think concretely of the example. This says exactly that the grade of a particular student in a particular course depends on the difficulty of that course and on the intelligence of that student. That's all it says, okay? So it's just a general way of saying [INAUDIBLE].
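One way to read the general rule is as a projection: bind the child's indices to concrete objects, then give each template parent only the bindings for the indices it uses. The helper below is a hypothetical sketch of that reading, not notation from the lecture.

```python
def ground_parents(child_indices, assignment, template_parents):
    """Instantiate the parents of one ground copy of a template variable.

    child_indices    -- index names of the child, e.g. ("s", "c")
    assignment       -- concrete objects for those indices, same order
    template_parents -- list of (parent name, tuple of index names)
    """
    binding = dict(zip(child_indices, assignment))
    grounded = []
    for name, indices in template_parents:
        # Each parent keeps only the objects bound to its own indices.
        args = ", ".join(binding[i] for i in indices)
        grounded.append(f"{name}({args})")
    return grounded


# G(s, c) has template parents D(c) and I(s).
print(ground_parents(("s", "c"), ("george", "cs101"),
                     [("D", ("c",)), ("I", ("s",))]))
```

For the grade of George in CS101 this yields the difficulty of CS101 and the intelligence of George. The lookup `binding[i]` would fail for a parent index the child does not have, which is the same subset restriction seen from the instantiation side.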

17:48

So, to summarize, plate models are a language that allows us to define a template for an infinite set of Bayesian networks. Why infinite? Because you can have 3 students, 10 students, 1,000 students, a million students, an unbounded number of students. So there's an infinite set of Bayesian networks that we can use this language to encode. And each of them uses a different combination of domain objects, in our example, for instance, students and courses. The parameters and the structure are reused both within the Bayes net and across the different Bayes nets. So for example, within our university example, we will use the same parameters. And if we have a different university with a different set of students and courses, we would still use the same parameters. These models, by allowing us to represent an intricate network of dependencies, allow us to capture very richly correlated structures in a concise way. Which allows us to do this kind of collective inference, which is potentially a very powerful source of informed conclusions.

Now, I've presented plate models, which are perhaps the earliest and one of the simplest of these languages that allow us to represent template structures. This is a simple one; for example, it has this restriction on the parents not having indices that are not instantiated in the child. And so, for example, you can't represent temporal models here, because t-1 is not instantiated in the variable X(t). So you can't have X(t-1) as a parent of X(t). Not in the plate model. I mean, obviously, we have languages that can do that, but not this one. Similarly, you can't have the genotype of the mother and the genotype of the father affect the genotype of the child, because, once again, the child doesn't instantiate the mother and the father. These are separate indices. And so this is a limited language, but there are many other languages that expand on it in different ways. And they each have different tradeoffs in terms of what they express easily and what they don't. And there's an entire literature on this that we're not going to go into. But it has provided a number of very useful languages for representing these kinds of richly structured models.