which is a measure of the spread of the distribution, plus a constant, namely one half log 2 pi e.

In particular, as sigma, the standard deviation, tends to 0, the differential entropy tends to minus infinity.

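As a quick numerical check (a sketch of my own, not part of the lecture), the formula one half log 2 pi e sigma squared can be evaluated directly, and the values decrease without bound as sigma shrinks:

```python
import math

def gaussian_entropy(sigma):
    # Differential entropy of N(mu, sigma^2) in nats: (1/2) log(2*pi*e*sigma^2).
    # Note that it does not depend on the mean mu, only on the spread sigma.
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

for sigma in [1.0, 0.1, 0.01, 0.001]:
    print(sigma, gaussian_entropy(sigma))
```

The printed values tend to minus infinity as sigma tends to 0; unlike discrete entropy, differential entropy can be negative.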

The next two theorems are the vector generalizations of Theorems 10.43 and 10.44, respectively.

Let X be a vector of n continuous random variables with correlation matrix K tilde. Then the differential entropy of X is upper bounded by one half log of (2 pi e) to the power n times the determinant of the correlation matrix K tilde, with equality if and only if X is a Gaussian vector with mean 0 and covariance matrix K tilde.

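As an illustrative check (my own sketch; the matrix below is just an example), the bound one half log of (2 pi e) to the power n times det K tilde can be compared against a Monte Carlo estimate of the differential entropy of a zero-mean Gaussian with covariance K tilde, for which the bound is attained with equality:

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical 3x3 correlation matrix K_tilde (positive definite, chosen for illustration).
K = np.array([[2.0, 0.5, 0.3],
              [0.5, 1.5, 0.2],
              [0.3, 0.2, 1.0]])
n = K.shape[0]

# Upper bound of Theorem 10.45: (1/2) log((2*pi*e)^n det(K_tilde)), in nats.
sign, logdet = np.linalg.slogdet(K)
bound = 0.5 * (n * np.log(2 * np.pi * np.e) + logdet)

# Monte Carlo estimate of h(X) = -E[log f(X)] for a zero-mean Gaussian with
# covariance K_tilde; it should agree with the bound, since equality holds here.
x = rng.multivariate_normal(np.zeros(n), K, size=200_000)
K_inv = np.linalg.inv(K)
log_f = -0.5 * (n * np.log(2 * np.pi) + logdet) - 0.5 * np.einsum('ij,jk,ik->i', x, K_inv, x)
h_est = -log_f.mean()

print(bound, h_est)  # the two values are close
```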

Theorem 10.46 says that, for a random vector X with mean mu and covariance matrix K, the differential entropy is upper bounded by one half log of (2 pi e) to the power n times the determinant of K, with equality if and only if X is a Gaussian vector with mean mu and covariance matrix K.

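To see that the inequality can be strict for a non-Gaussian distribution, here is a one-dimensional sketch of my own: a uniform random variable on the interval from minus a to a has variance a squared over 3 and differential entropy log 2a, which is strictly below the Gaussian bound with the same variance:

```python
import math

a = 1.0
var = a ** 2 / 3                                     # variance of Uniform[-a, a]
h_uniform = math.log(2 * a)                          # differential entropy of Uniform[-a, a], in nats
bound = 0.5 * math.log(2 * math.pi * math.e * var)   # Gaussian upper bound (n = 1 case)
print(h_uniform, bound)                              # h_uniform is strictly smaller
```

The gap, one half log of pi e over 6, is a constant independent of a.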

We now prove Theorem 10.45.

Define the function r_{ij}(x) to be x_i times x_j, and let the (i,j)-th element of the matrix K tilde be k tilde ij. Then the constraints on the pdf of the random vector X, namely the requirement that the correlation matrix be equal to K tilde, are equivalent to setting the integral of r_{ij}(x) f(x) dx over the support of f to k tilde ij.


This is because r_{ij}(x) is equal to x_i times x_j, so the integral is equal to the expectation of X_i times X_j, that is, the correlation between X_i and X_j, for all i, j between 1 and n.


Now by Theorem 10.41, the joint pdf that maximizes the differential entropy has the form f star of x equals e to the power minus lambda_0 minus the summation over all i and j of lambda_{ij} x_i x_j, where x_i x_j is r_{ij}(x).


Here, the summation over all i, j of lambda_{ij} x_i x_j can be written as x transpose L x, where L is the n by n matrix whose (i,j)-th element is lambda_{ij}.


Thus, f star is the joint pdf of a multivariate Gaussian distribution with mean 0. To see this, we only need to compare the form of f star of x with the pdf of a Gaussian distribution with mean 0.

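This comparison can be made concrete numerically (a sketch with an example covariance matrix of my own choosing): for a zero-mean Gaussian with covariance K tilde, the pdf is exactly of the form e to the minus lambda_0 minus x transpose L x, with L equal to one half the inverse of K tilde and lambda_0 equal to one half log of (2 pi) to the power n times det K tilde:

```python
import numpy as np

K = np.array([[1.0, 0.4],    # an example covariance matrix K_tilde
              [0.4, 2.0]])
n = K.shape[0]
K_inv = np.linalg.inv(K)

# Matching f*(x) = exp(-lambda_0 - x^T L x) against the zero-mean Gaussian pdf
# gives L = (1/2) K_inv and lambda_0 = (1/2) log((2*pi)^n det(K)).
L = 0.5 * K_inv
lambda_0 = 0.5 * np.log((2 * np.pi) ** n * np.linalg.det(K))

def f_star(x):
    return np.exp(-lambda_0 - x @ L @ x)

def gaussian_pdf(x):
    return np.exp(-0.5 * x @ K_inv @ x) / np.sqrt((2 * np.pi) ** n * np.linalg.det(K))

x = np.array([0.7, -1.2])
print(f_star(x), gaussian_pdf(x))  # the two densities coincide
```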

Then for all i, j between 1 and n, the covariance between X_i and X_j is equal to the expectation of X_i times X_j minus the expectation of X_i times the expectation of X_j. Since the expectations of X_i and X_j are both equal to 0, this is just the expectation of X_i times X_j.
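Numerically (again a sketch of my own), the sample second-moment matrix and the sample covariance matrix of zero-mean data agree up to sampling noise, illustrating that the covariance equals the correlation when the means are 0:

```python
import numpy as np

rng = np.random.default_rng(2)

K = np.array([[1.0, 0.3],
              [0.3, 1.0]])
x = rng.multivariate_normal(np.zeros(2), K, size=100_000)

second_moment = x.T @ x / len(x)      # estimates E[X_i X_j], the correlation
covariance = np.cov(x, rowvar=False)  # estimates Cov(X_i, X_j)

# With zero means the two coincide in expectation.
print(np.max(np.abs(second_moment - covariance)))
```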