Let's talk a little bit about expectations. If you're taking this class, you already know that the kth moment of a random variable X is the expected value of X to the k, which is the integral from minus infinity to plus infinity of x to the k times f(x) dx, where f is the associated density. The integral would be replaced by a sum if we had a discrete mass function.

Now imagine that instead of X being a scalar, X is a p-dimensional vector, so X is p by one. Then the kth moment of the ith component of the vector X is nothing other than the multivariate integral of xi to the k times f(x1, ..., xp) dx1 up to dxp. It's worth asking whether this definition is consistent with the prior definition if we only knew, say, the marginal distribution of xi. So we have two definitions here. On the one hand, xi is the ith component of the vector X, and if we knew its marginal distribution, we could apply the univariate definition directly. On the other hand, over here we're saying let's get this expected value by integrating over the entire joint distribution, over all of the components of X.

So how would you get to that marginal distribution f(xi)? It's a little bit sloppy notation to use f for both the joint and the marginal, so let me put a little m over the marginal to distinguish it. The way we get at that marginal distribution is to take the joint distribution and integrate it over all of the variables except xi. Okay, so that's how we would get the marginal distribution. But then, if you look at this multivariate definition of the expected value, we can reorder this collection of integrals into whatever order we want. And so we could first do the integral with respect to every one of these variables except xi.
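To make this concrete, here is a small numerical sketch (not from the lecture itself) using a bivariate normal density; the means, variances, and correlation are illustrative choices. It computes E[x1] both ways: as a double integral against the joint density, and by first integrating out x2 to get the marginal.

```python
import numpy as np

# Numerical check that integrating x1 against the joint density gives the
# same answer as integrating x1 against the marginal obtained by first
# integrating out x2. Bivariate normal with mu = (1, 0), unit variances,
# rho = 0.5 -- all parameters are illustrative, not from the lecture.
mu1, mu2, rho = 1.0, 0.0, 0.5

dx = dy = 0.02
x = np.arange(-7.0, 9.0, dx)   # grid for x1 (covers mu1 +/- 8 sd)
y = np.arange(-8.0, 8.0, dy)   # grid for x2
X, Y = np.meshgrid(x, y, indexing="ij")

# Joint density of the bivariate normal
q = (X - mu1) ** 2 - 2 * rho * (X - mu1) * (Y - mu2) + (Y - mu2) ** 2
f = np.exp(-q / (2 * (1 - rho ** 2))) / (2 * np.pi * np.sqrt(1 - rho ** 2))

# Route 1: E[x1] as a double integral over the joint density
E_joint = np.sum(X * f) * dx * dy

# Route 2: integrate out x2 to get the marginal f^m(x1), then integrate x1
f_marginal = np.sum(f, axis=1) * dy
E_marginal = np.sum(x * f_marginal) * dx

print(E_joint, E_marginal)   # both approximately 1.0 (= mu1)
```

Reordering the Riemann sums is exactly the reordering-of-integrals argument above, which is why the two routes agree.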
Doing those inner integrals first is exactly calculating this marginal distribution, and then we're back to the univariate definition above. So the point being that if you have, let's say, a bivariate normal distribution and you calculate the expected value of one of its components, you get the same expected value as if you worked only with the marginal distribution of that one variable, so everything is consistent. And if you do it not with the full p-dimensional joint distribution but with the joint distribution of just five components of the vector, for example, and xi happens to be one of those five components, and you calculate the expected value over those five components, you're still going to get the same answer. So in other words, the expected value of an element of a random vector is the same no matter which joint distribution containing that variable, or which marginal distribution for that variable, you use. It's always consistent.

And then the basic summary is that the expected value of a random vector is just the vector of its element-wise expected values, okay? So that's just to get you started on expected values. We're really only going to take expected values with respect to one particular distribution, and that is the multivariate normal.
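The two summary facts above can be checked by simulation. This is a Monte Carlo sketch (again, the mean vector and covariance matrix are illustrative choices, not from the lecture): the sample mean of a multivariate normal draw approximates the element-wise mean vector, and keeping only a subset of the components leaves those components' means unchanged.

```python
import numpy as np

# Monte Carlo sketch: the expected value of a random vector is the vector
# of element-wise expected values, and each component's mean is the same
# whether you look at the full vector or any sub-vector containing it.
# Mean vector and covariance below are illustrative choices.
rng = np.random.default_rng(0)
mean = np.array([1.0, -2.0, 0.5])
cov = np.array([[1.0, 0.5, 0.2],
                [0.5, 2.0, 0.3],
                [0.2, 0.3, 1.5]])

samples = rng.multivariate_normal(mean, cov, size=200_000)  # shape (n, 3)

# Element-wise sample means approximate the mean vector
print(samples.mean(axis=0))   # approximately [1.0, -2.0, 0.5]

# Keeping only components 0 and 2 (a "marginal" of the vector) does not
# change those components' means
sub = samples[:, [0, 2]]
print(sub.mean(axis=0))       # approximately [1.0, 0.5]
```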