0:01

CPDs that exploit additional forms of local structure, some kind of parametric form, are extremely valuable in the general context of graphical models, because they allow us to provide a much sparser representation.

But they are absolutely essential when we have a network that involves continuous variables, because there, tables are simply not an option. So let's look at some examples of networks that involve continuous variables, and see what kinds of representations we might want to incorporate here.

So now let's imagine that we have a continuous temperature variable, say the temperature in a room, and we have a sensor, a thermometer, that measures the temperature.

Now, thermometers aren't perfect, and so what we would expect is that the sensor reads around the right temperature, but not quite. And so one way to capture that is by saying that the sensor S follows a normal distribution. So here's a normal distribution,

Â 1:05

around the true temperature T, with some standard deviation sigma_s.

So this defines, for every value of T, a distribution over S, in a very compact parametric form that really has just the one parameter sigma_s; we simply say that S is a Gaussian around the value of the variable T.

Now let's make the situation a little bit more interesting. This is the temperature now, and this is the temperature soon. So we have T and T'.

Now T', the temperature soon, depends on the current temperature, as well as on the outside temperature, because of some equalization of temperatures from the inside to the outside.

So what model might we have for T' as a function of its two parents, the current temperature and the outside temperature? Well, one model might be some kind of diffusion model, which says that T' is a Gaussian around a mean that's defined as a weighted combination of the current temperature and the outside temperature.

So you combine the two, and because there is stochasticity, noise, in the process, we're going to say that T' isn't exactly equal to this mean, but rather is a Gaussian around it with some standard deviation sigma_T, to be distinguished from the standard deviation sigma_S, which was the sensor's noise.
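As a rough sketch of sampling from this diffusion model (the weight `alpha` and the standard deviation here are made-up illustrative values, not numbers from the lecture):

```python
import random

def sample_next_temp(t_in, t_out, alpha=0.8, sigma_t=0.5):
    """Linear Gaussian CPD: T' ~ N(alpha*T + (1-alpha)*T_out, sigma_t^2).

    alpha controls how strongly the room retains its current temperature;
    the Gaussian noise models the stochasticity in the diffusion process.
    """
    mean = alpha * t_in + (1 - alpha) * t_out
    return random.gauss(mean, sigma_t)
```

With `sigma_t` set to zero the sample collapses to the deterministic weighted mean, which is a quick way to sanity-check the model.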

Let's make life even more interesting. Let's imagine that there's a door in the room.

The door can be open or closed, so it's a discrete variable; it takes two values. And clearly the extent of the diffusion is going to depend on whether the door is open, and we would expect different parameters for the system in the two cases, the two values of the discrete variable.

And so if we write the model now, we're going to have that the temperature soon, T', is going to be a Gaussian whose parameters, alpha and sigma, depend on the value of the door variable. So if D equals zero, we have parameters alpha_0 and sigma_0; and if D equals one, we have a different set of parameters that reflect the different diffusion process.
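One way to sketch this in code: the discrete door variable selects which set of linear Gaussian parameters applies. The parameter values below are invented for illustration, not taken from the lecture:

```python
import random

# Hypothetical parameters: door closed (d=0) means slow diffusion,
# door open (d=1) means faster equalization with the outside.
PARAMS = {
    0: {"alpha": 0.9, "sigma": 0.3},  # alpha_0, sigma_0
    1: {"alpha": 0.6, "sigma": 0.8},  # alpha_1, sigma_1
}

def sample_t_prime(t, t_out, d):
    """T' ~ N(alpha_d*t + (1-alpha_d)*t_out, sigma_d^2), a conditional
    linear Gaussian: the discrete parent d picks the parameter set."""
    p = PARAMS[d]
    mean = p["alpha"] * t + (1 - p["alpha"]) * t_out
    return random.gauss(mean, p["sigma"])
```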

So, just to give all these things names: this model that we had over here is called a linear Gaussian, and we'll define that more thoroughly on the next slide; and this model is called a conditional linear Gaussian, because it's a linear Gaussian whose parameters are conditioned on the discrete variable, door.

So let's generalize these models to a broader setting, where we have a general variable Y, and Y has parents X1 up to Xk.

The linear Gaussian model has the following form. It says that Y is a Gaussian, so that's what the N stands for, whose mean is a linear function of the parents X_i, and that's why it's called a linear Gaussian. And importantly, its variance doesn't depend at all on the parents.
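In symbols, the definition being described can be written as follows (the coefficient names a_0, a_i are a standard convention; the lecture's slide may use slightly different symbols):

```latex
P(Y \mid x_1, \ldots, x_k) = \mathcal{N}\!\left(a_0 + \sum_{i=1}^{k} a_i x_i \;;\; \sigma^2\right)
```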

The variance is fixed. That's the definition of a linear Gaussian CPD. It's obviously restricted; it doesn't capture every situation, but it's a useful model, and a useful first approximation in many cases.

A conditional linear Gaussian introduces into the mix the possibility of one or more discrete parents. In this case we just drew one, A, for simplicity, but you can have more than one. And this is just a linear Gaussian whose parameters depend on the value of A.
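The general pattern can be sketched like this: one table of linear Gaussian parameters per assignment to the discrete parent A. The parameter values are hypothetical placeholders:

```python
import random

# One (offset, weights, sigma) triple per value of the discrete parent A.
# Two continuous parents (x1, x2) in this illustrative sketch.
CLG = {
    "a0": {"offset": 1.0, "weights": [0.5, 0.2], "sigma": 0.4},
    "a1": {"offset": 0.0, "weights": [0.9, 0.1], "sigma": 1.2},
}

def sample_y(a, xs):
    """Conditional linear Gaussian CPD over Y.

    The discrete parent a selects the parameter set; the mean is then a
    linear function of the continuous parents xs, and the variance is
    fixed given a (it may vary with a, but not with the xs).
    """
    p = CLG[a]
    mean = p["offset"] + sum(w * x for w, x in zip(p["weights"], xs))
    return random.gauss(mean, p["sigma"])
```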

Â 5:52

So, writing it down, it looks exactly like this. We now have, for every one of the parameters, the ability for that parameter to depend on A. And in this case, the variance can depend on the discrete parent, but not on the continuous ones.

This is the conditional linear Gaussian model. And again, this is a restricted model; one can certainly generalize beyond it, as we'll show in a moment, but it's a very useful model that's used in a large number of applications.

One example application that we've seen that involves continuous variables is the task of robot localization. I'm not going to show this video again now; we're going to see it again when we talk about temporal models.

But just as a reminder: what we have here is a robot whose location is a continuous quantity, and so are the sensor observations, which give a noisy version of how far away the robot is from an obstacle, looking in each of the different directions.

And so we have both continuous state variables and continuous observations that the robot needs to deal with.

So what kind of observation model makes sense in this setting? Here, let's imagine that this line over here represents the true distance from the robot's current location to a given obstacle.

Â 8:17

Now, the laser, which is a different sensing modality for the same robot, is also a Gaussian around that true distance. But because the laser is a more accurate sensor, the standard deviation is lower for the laser than for the sonar, and that reflects the accuracy of the two different sensing modalities.

Now, this is an idealized version. But, surprisingly, it corresponds in useful ways to the real model. So let's look first at the model that was used in the system, which is the red line, and then we can look at the blue line.

The red line actually involves three different components. So this is the actual sensor model

Â 9:14

used by the robot, and we can see that it has three components.

It has this peak, which is the Gaussian around the true distance.

The next most obvious phenomenon is this big peak over here. This corresponds to a max-range reading: it's what you get if there isn't an obstacle in that direction within a reasonable distance for the laser or the sonar to return any signal. And so that's why there's a very large peak beyond a certain distance; that's the entire rest of the probability mass.

The final, more subtle aspect of this probability distribution is that you see that this is higher than that. That is, there's more probability mass in the density before the obstacle than after the obstacle.

And why is that? Because once you get to the obstacle,

Â 10:22

the beam returns. But before you get to the obstacle, there might be some other, transient things, like a person walking in front of the obstacle, and that's going to return the beam in a way that doesn't represent the actual structure of the map.

And so that's why we have a certain probability of having the beam return sooner than the obstacle, and that probability doesn't exist on the far side, once the obstacle's been reached.

So the actual probability distribution is an aggregation of these three signals: the Gaussian sensor model around the obstacle in a given direction, the uniform distribution before the obstacle, which accounts for the spurious returns, and the max-range reading at the end.
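A hedged sketch of such a three-component mixture density follows; the mixture weights and noise parameters are invented for illustration (real systems fit them to measured data):

```python
import math

def range_density(z, z_true, z_max,
                  w_hit=0.7, w_short=0.2, w_max=0.1,
                  sigma=0.2, eps=0.05):
    """Mixture density for a range reading z given true distance z_true.

    Three components, mirroring the lecture's description:
      - a Gaussian "hit" around the true obstacle distance,
      - a uniform "short" component for spurious returns before the
        obstacle (e.g. a person walking in front of it),
      - a narrow spike near z_max for max-range readings.
    """
    # Gaussian around the true obstacle distance
    hit = (math.exp(-0.5 * ((z - z_true) / sigma) ** 2)
           / (sigma * math.sqrt(2 * math.pi)))
    # Uniform mass only before the obstacle; zero past it
    short = (1.0 / z_true) if 0.0 <= z < z_true else 0.0
    # Max-range spike, modeled as a narrow uniform band ending at z_max
    maxr = (1.0 / eps) if z_max - eps <= z <= z_max else 0.0
    return w_hit * hit + w_short * short + w_max * maxr
```

Because the short component contributes only before the obstacle, the density at a point just short of `z_true` exceeds the density at the mirror point just past it, reproducing the asymmetry discussed above.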

The red line is the model that was used, and the blue line is the actual measured distances in different settings, shown to check whether the model that was used really represents reality. And the answer is that it does, to a surprising extent.

So this is an example of how continuous distributions are going to be used in a real-world application.

The next example is the robot motion model. So here is a robot, and the robot is heading in a given direction; that's where it thinks it's going.

And the question is, if it moves a certain distance, or rather thinks it's moving a certain distance in a given direction, what is the actual distribution over its next location?

And the answer is a little bit tricky, because robots actually have a heading, and there's a certain angular uncertainty alpha, which is the difference between where they think they're going and where they're actually going. And then there's also noise on the distance delta that they think they moved.

And so when you put all these together, the actual cloud, the distribution of where the robot is going to be following that move, is this weird banana-shaped distribution. It's centered around the area where the robot thinks it is, but it has a banana shape that's induced by the uncertainty about the angular trajectory.

And if you actually run this for a while, you can see that the banana shape gets more and more diffuse. So here is the first banana shape; and now the robot turns, and there is more uncertainty over the heading, and so you get a larger and larger banana-shaped distribution over the robot's position, assuming that there's no evidence to correct the position based on, say, sonar or laser readings.
