Let's talk a little bit about multivariate variances and covariances. So we're going to define, for a random vector X, which, say, is n by 1, the variance of X as the expected value of the outer product of X minus mu with itself: Var(X) = E[(X - mu)(X - mu) transpose], where mu is the vector expected value of X, E[X]. This quantity has the property that, for example, the first diagonal element is the expected value of (X1 - mu1) squared. Okay, so it's just the ordinary variance of the first element of the vector. The second diagonal entry of this matrix is just the expected value of (X2 - mu2) squared. The first off-diagonal element of this matrix, either above the diagonal or below the diagonal, is the expected value of (X1 - mu1) times (X2 - mu2), and that is exactly the covariance between X1 and X2. So the (i, j)th element of this matrix is the covariance between the ith element of the vector X and the jth element of the vector X. This quantity is called the variance-covariance matrix. And just like the variance calculation for univariate random variables has a shortcut formula, the variance calculation for multivariate random variables also has a shortcut calculation. So the variance of X is the expected value of (X - mu)(X - mu) transpose. Let's use our rules: that's the expected value of X X transpose minus X mu transpose minus mu X transpose plus mu mu transpose. Now, mu is not random, so we can pull it out of the expected value, and the expected value is a linear operator, so it moves across these sums. So we can write this as the expected value of X X transpose, minus the expected value of X times mu transpose, minus mu times the expected value of X transpose, plus mu mu transpose, which has nothing random in it. But the expected value of X times mu transpose is just mu mu transpose, because remember mu is defined as the expected value of X.
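The definition above can be sketched numerically. Here's a minimal NumPy check (the variable names and the mixing matrix are my own illustration, not from the lecture): we estimate Var(X) by averaging outer products of centered draws, then confirm the diagonal entries are the ordinary variances of each coordinate and the off-diagonal entry is the bivariate covariance.

```python
import numpy as np

# Estimate Var(X) = E[(X - mu)(X - mu)^T] by averaging outer products
# over many draws of a 2-vector with correlated coordinates.
rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 2)) @ np.array([[1.0, 0.5],
                                              [0.0, 1.0]])
mu = X.mean(axis=0)                      # estimate of E[X]

C = X - mu                               # centered draws
V = C.T @ C / len(X)                     # average outer product

# Diagonal entries are the ordinary variances of each coordinate...
print(np.allclose(np.diag(V), X.var(axis=0)))
# ...and the off-diagonal entry matches the bivariate covariance.
print(np.isclose(V[0, 1], np.cov(X[:, 0], X[:, 1], ddof=0)[0, 1]))
```

Note `ddof=0` so that `np.cov` divides by n, matching the plain average of outer products.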
And mu times the expected value of X transpose is just mu mu transpose again, so we get minus mu mu transpose, minus mu mu transpose, plus mu mu transpose. So the shortcut formula is: the variance of X is the expected value of the outer product of X with itself minus the outer product of the expected value of X with itself, E[X X transpose] - mu mu transpose, okay? So that's a simple shortcut formula. The variance has nice properties, not unlike the mean. It would be nice if the variance were a linear operator, but it's not. So we cannot say, for example, that the variance of X + Y is equal to the variance of X plus the variance of Y, unless the vectors X and Y are independent, or at least uncorrelated. What we can say is, first of all, that shifts have no effect on the variance. So if we take the variance of X shifted by a constant vector b, that's just the variance of X again, just like in the univariate case. If you picture the distribution of a random variable, it has the same variance if we shift it over by a little bit. So the variance doesn't change if we shift. Another important property concerns the variance of A times X, when we have a matrix that we'd like to pull out of a variance: that is equal to A times the variance of X times A transpose. So when we pull a matrix out of a variance, it sandwiches the variance: A is the bread, and the variance of X is the meat. When you pull it out, it has to go on both sides, as A and A transpose. Looking back at the definition one more time, I also want to point out that the variance-covariance matrix is clearly symmetric. If you were to take the transpose of this matrix, remember that the transpose moves inside the expected value, you'll find that you get the expected value of the same exact thing.
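The shortcut formula, the shift property, and the sandwich rule can all be checked in a few lines. In this sketch, sample averages stand in for expectations (the particular b vector and A matrix are arbitrary choices of mine); all three identities hold exactly for the sample variance-covariance matrix, not just approximately.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50_000, 3))         # rows are draws of a 3-vector

def var(M):
    """Sample variance-covariance matrix; rows of M are observations."""
    C = M - M.mean(axis=0)
    return C.T @ C / len(M)

mu = X.mean(axis=0)

# Shortcut formula: Var(X) = E[X X^T] - mu mu^T.
shortcut = X.T @ X / len(X) - np.outer(mu, mu)
print(np.allclose(var(X), shortcut))                 # True

# Shifts have no effect: Var(X + b) = Var(X).
b = np.array([5.0, -3.0, 2.0])
print(np.allclose(var(X + b), var(X)))               # True

# Sandwich rule: Var(A X) = A Var(X) A^T.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])
print(np.allclose(var(X @ A.T), A @ var(X) @ A.T))   # True
```

Note that `X @ A.T` applies A to each draw, so its rows are the transformed vectors A times x.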
So it is symmetric, which is a good thing, because we know, for example, that the (i, j)th off-diagonal element, the covariance of Xi and Xj, is equal to the covariance of Xj and Xi; the bivariate covariance operator is symmetric in its arguments, so the matrix has to be symmetric. It's nice that we can see that property very directly. So those are some of the key things to note about multivariate variances, or variances of vectors. We'll use these facts a lot throughout the class, so it would be nice to commit them to memory, especially this formula about pulling a matrix out of a variance calculation. That's quite useful.
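As a last quick sketch (again with made-up data), the symmetry just noted is easy to see: the sample variance-covariance matrix equals its own transpose, because Cov(Xi, Xj) = Cov(Xj, Xi).

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(1_000, 4))
C = X - X.mean(axis=0)
V = C.T @ C / len(X)                     # sample variance-covariance matrix

# Symmetric: V[i, j] == V[j, i] for every pair (i, j).
print(np.allclose(V, V.T))               # True
```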