CPDs that exploit additional forms of local structure, some kind of parametric form, are extremely valuable in the general context of graphical models, because they allow us to provide a much sparser representation. But they are absolutely essential when we have a network that involves continuous variables, because there, tables are simply not an option. So let's look at some examples of networks that involve continuous variables, and see what kinds of representations we might want to incorporate here.

Imagine that we have a continuous temperature variable, say the temperature in a room, and we have a sensor, a thermometer, that measures the temperature. Now, thermometers aren't perfect, so what we would expect is that the sensor reads around the right temperature, but not exactly. One way to capture that is to say that the sensor S is a normal distribution around the true temperature T, with some standard deviation sigma_S. This defines, for every value of T, a distribution over S, in a very compact parametric form that really has just the one parameter sigma_S: S is a Gaussian around the value of the variable T.

Now let's make the situation a little more interesting. Say this is the temperature now, T, and this is the temperature soon, T'. The temperature soon depends on the current temperature, as well as on the outside temperature, because of some equalization of temperatures from the inside to the outside. So what model might we have for T' as a function of its two parents, the current temperature and the outside temperature? One model might be a simple diffusion model, which says that T' is a Gaussian around a mean that's defined as a weighted combination of the current temperature and the outside temperature. So you combine the two, and because there is stochastic noise in the process, T' isn't exactly equal to that combination, but rather is a Gaussian around that mean with some standard deviation sigma_T, to be distinguished from the standard deviation sigma_S, which was the sensor noise.

Let's make life even more interesting. Imagine that there's a door in the room. The door can be open or closed, so it's a discrete variable; it takes two values. Clearly the extent of the diffusion is going to depend on whether the door is open, and we would expect different parameters for the system for the two values of the discrete variable. So if we write the model now, the temperature soon, T', is going to be a Gaussian whose parameters, alpha and sigma, depend on the value of the door variable: if D equals zero, we have parameters alpha_0 and sigma_0,T, and if D equals one, we have a different set of parameters that reflect the different diffusion process.

Just to give all these things names: the first model, the diffusion model, is called a linear Gaussian, and we'll define that more thoroughly on the next slide; and the second model is called a conditional linear Gaussian, because it's a linear Gaussian whose parameters are conditioned on the discrete variable, door.
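To make the temperature example concrete, here is a minimal sketch of these two CPDs written as sampling functions. The specific numbers (the sensor noise, the mixing weights alpha, and so on) are made up purely for illustration; only the functional form comes from the model described above.

```python
import random

SIGMA_S = 0.5  # sensor noise standard deviation (illustrative value)

# Diffusion parameters indexed by the door variable D.
# D = 0 (closed): slow equalization; D = 1 (open): faster, noisier.
PARAMS = {0: {"alpha": 0.9, "sigma_t": 0.3},
          1: {"alpha": 0.7, "sigma_t": 0.8}}

def sample_sensor(t):
    """Linear Gaussian CPD: S ~ N(t, SIGMA_S^2)."""
    return random.gauss(t, SIGMA_S)

def sample_next_temp(t, t_out, d):
    """Conditional linear Gaussian CPD:
    T' ~ N(alpha*t + (1 - alpha)*t_out, sigma_t^2),
    where alpha and sigma_t depend on the discrete parent D."""
    p = PARAMS[d]
    mean = p["alpha"] * t + (1 - p["alpha"]) * t_out
    return random.gauss(mean, p["sigma_t"])

# Example: one step of the process with the door open.
t, t_out = 22.0, 10.0
t_next = sample_next_temp(t, t_out, d=1)
s = sample_sensor(t_next)
```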
To generalize these models to a broader setting, consider a general variable Y whose parents are X1 up to Xk. The linear Gaussian model has the following form. It says that Y is a Gaussian, which is what the N stands for, whose mean is a linear function of the parents Xi, and that's why it's called a linear Gaussian; and, importantly, whose variance doesn't depend on the parents at all. The variance is fixed. That's the definition of a linear Gaussian CPD. It's obviously restricted, it doesn't capture every situation, but it's a useful model, and a useful first approximation in many cases.

The conditional linear Gaussian introduces into the mix the possibility of one or more discrete parents. In this case we drew just one, A, for simplicity, but you can have more than one. This is simply a linear Gaussian whose parameters depend on the value of A. Writing it down, every one of the parameters now has the ability to depend on A. Note that the variance can depend on the discrete parent, but not on the continuous ones. This is the conditional linear Gaussian model. Again, it's a restricted model, and one can certainly generalize beyond it, as we'll show in a moment, but it's a very useful model that's used in a large number of applications.
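Writing out the two definitions just given explicitly (the coefficient names w_i are a notational choice, not fixed terminology):

```latex
% Linear Gaussian CPD: the mean is linear in the parents, the variance is fixed.
P(Y \mid x_1, \ldots, x_k) = \mathcal{N}\!\left(w_0 + \sum_{i=1}^{k} w_i x_i,\; \sigma^2\right)

% Conditional linear Gaussian CPD: for each value a of the discrete parent A,
% a separate set of coefficients and a separate variance.
P(Y \mid a, x_1, \ldots, x_k) = \mathcal{N}\!\left(w_{a,0} + \sum_{i=1}^{k} w_{a,i}\, x_i,\; \sigma_a^2\right)
```

One example application that we've seen that involves continuous variables is the task of robot localization. I'm not going to show the video again now; we'll see it again when we talk about temporal models. But just as a reminder: what we have here is a robot whose location is a continuous quantity, and so are the sensor observations, which give a noisy version of how far away the robot is from an obstacle, looking in each of the different directions. So we have both continuous state variables and continuous observations that the robot needs to deal with.

What kind of observation model makes sense in this setting? Imagine that this line represents the true distance from the robot's current location to an obstacle. That is, conditioning on the robot's current location, we ask: if I look in this direction, how far is it before I hit an obstacle? In this case the distance is, say, 220 centimeters. What this tells us is that the sonar reading, the red curve, is a Gaussian around that true distance. The laser, which is a different sensing modality on the same robot, is also a Gaussian around the true distance, but because the laser is a more accurate sensor, its standard deviation is lower than the sonar's. That reflects the accuracy of the two different sensor modalities.

Now, this is an idealized version, but surprisingly it corresponds in useful ways to the real model. Let's look first at the model that was actually used in the system, which is the red line, and then we can look at the blue line. The red line actually involves three different components. This is the actual sensor model used by the robot, and we can see that it has three components. First, there is the peak, which is the Gaussian around the true distance. The next most obvious phenomenon is the very large peak at the far end, which corresponds to a max-range reading.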
A max-range reading is what you get if there isn't an obstacle in that direction within a reasonable distance for the laser or the sonar to return any signal. That's why there's a very large peak beyond a certain distance: it carries the entire rest of the probability mass. The final, more subtle aspect of this probability distribution is that the density is higher before the obstacle than after it; there's more probability mass before the obstacle. Why is that? Once the beam reaches the obstacle, it returns. But before it gets to the obstacle, there might be transient things, like a person walking in front of the obstacle, and those return the beam in a way that doesn't represent the actual structure of the map. So there is a certain probability of the beam returning sooner than the obstacle, and that probability doesn't exist on the far side, once the obstacle has been reached. The actual probability distribution is therefore an aggregation of these three signals: the Gaussian sensor model around the obstacle in a given direction, the roughly uniform distribution before the obstacle for spurious returns, and the max-range reading at the end. The red line is the model that was used, and the blue line is the actual measured distances across different settings, shown to test whether the model that was used really represents reality. The answer is that it does, to a surprising extent. So this is an example of how continuous distributions are used in a real-world application.

The next example is the robot motion model. Here is a robot heading in a given direction; that's where it thinks it's going. The question is: if it moves a certain distance, that is, it thinks it's moving a certain distance in a given direction, what is the actual distribution over its next location? The answer is a little bit tricky, because robots have a heading, and there is a certain angular uncertainty alpha between where they think they're going and where they're actually going. Then there's also noise on the distance delta that they think they moved. When you put all of these together, the actual distribution over where the robot is going to be, following that move, is this weird banana-shaped distribution: it's centered around the area where the robot thinks it is, but has a banana shape that's induced by the uncertainty about the angular trajectory. And if you actually run this for a while, the banana shape gets more and more diffuse. So here is the first banana shape, and now the robot turns and there is more uncertainty over the heading, and you get a larger and larger banana-shaped distribution over the robot's position, assuming that there's no evidence to correct the position based on, say, sonar or laser readings.
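A minimal simulation of this motion model, assuming simple Gaussian noise on both the heading and the distance (the noise magnitudes here are invented for illustration): sampling many next positions for the same intended move traces out exactly this banana-shaped cloud, because a fixed angular error displaces the robot along an arc.

```python
import math
import random

def sample_motion(x, y, heading, delta, sigma_alpha=0.15, sigma_delta=0.05):
    """Sample the robot's next pose given an intended move of length delta
    along `heading`. Illustrative noise model: Gaussian error on the heading
    (sigma_alpha radians) and Gaussian error on the distance, scaled by the
    distance moved."""
    actual_heading = random.gauss(heading, sigma_alpha)
    actual_delta = random.gauss(delta, sigma_delta * delta)
    return (x + actual_delta * math.cos(actual_heading),
            y + actual_delta * math.sin(actual_heading),
            actual_heading)

# Many samples of the same intended move: spread along an arc (heading
# noise) and radially (distance noise) -- the banana shape.
samples = [sample_motion(0.0, 0.0, 0.0, 1.0) for _ in range(1000)]
```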
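And going back to the measurement model above, here is a rough sketch of that three-component mixture as a sampler. The mixture weights, noise value, and max range are invented for illustration; a real system fits these to data.

```python
import random

def sample_beam(true_dist, max_range=500.0, sigma=10.0,
                w_hit=0.7, w_short=0.2, w_max=0.1):
    """Sample one range reading given the true distance to the obstacle.
    Three components, as described above: a Gaussian around the true
    distance, a uniform 'short' return before the obstacle (people and
    other transient objects), and a max-range reading when nothing
    returns. Weights and sigma are illustrative, not fitted."""
    u = random.random()
    if u < w_hit:              # beam hits the obstacle
        return min(max(random.gauss(true_dist, sigma), 0.0), max_range)
    elif u < w_hit + w_short:  # spurious return before the obstacle
        return random.uniform(0.0, true_dist)
    else:                      # nothing returned: max-range reading
        return max_range

readings = [sample_beam(220.0) for _ in range(5)]
```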