But if you're selling a diamond, you might be interested in knowing, okay, not

what I'd get if I collected all the diamonds of this particular mass and

took the average price that they were valued at.

Not that, but if I were to sell this particular diamond,

what's the range of possible values

that would be reasonable as a price for this diamond? And that's a different thing,

so there's a difference in this context between a confidence interval for

the mean value, in other words the value of the line or the plane or whatever,

at that particular collection of X values, versus a prediction interval

that incorporates the uncertainty that is included in the Ys themselves, okay.

So imagine we want to predict Y naught,

which is the price of this diamond for this particular mass,

where we haven't actually observed the Y at this particular value of X naught.

Think of it as a new value of Y.

Well, think about the quantity y

naught minus x naught transpose beta-hat, okay?

That's the difference between our actual y naught at that particular

value of x naught, the new realized value of y and

what we would predict at this value of x naught, where, not beta naught, just beta.

Where again our beta-hat hasn't used this y naught in its calculation, okay?

So now the variance of

this is the variance of y naught plus

the variance of, let's say, y naught hat.

Okay, and I can move that variance across that difference again, with the two variances adding,

because this beta-hat didn't involve that y naught,

this potential new value of y naught, in its calculations, so they're independent.
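The independence argument above can be checked numerically. The following is a minimal simulation sketch, not part of the lecture: the design matrix, the true coefficients `beta`, the noise level `sigma`, and the new point `x0` are all made-up values chosen for illustration. It redraws the data many times with the design held fixed and compares the variance of y naught minus y naught hat with the sum of the two separate variances.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: intercept plus one regressor, fixed design.
n, sigma = 30, 2.0
beta = np.array([1.0, 3.0])                      # made-up true coefficients
X = np.column_stack([np.ones(n), rng.uniform(0, 1, n)])
x0 = np.array([1.0, 0.5])                        # new point (with intercept term)

# (X'X)^{-1} X', the matrix that maps a response vector to beta-hat.
proj = np.linalg.solve(X.T @ X, X.T)

reps = 20000
# Each row of Y is one simulated data set; y0 is drawn separately,
# so beta-hat never uses it.
Y = X @ beta + sigma * rng.standard_normal((reps, n))
y0 = x0 @ beta + sigma * rng.standard_normal(reps)
y0_hat = (Y @ proj.T) @ x0                       # x0' beta-hat for each data set

var_diff = np.var(y0 - y0_hat)
var_sum = np.var(y0) + np.var(y0_hat)
print("Var(y0 - y0_hat):         ", var_diff)
print("Var(y0) + Var(y0_hat):    ", var_sum)
```

The two printed numbers should agree up to simulation noise, which is what independence of y naught and beta-hat implies.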

Well, the variance of y naught is sigma squared, plus the variance of y naught hat,

we just did that a second ago, that's sigma squared, x naught transpose,

X transpose X inverse, x naught, okay?

So this variance is sigma squared times 1 plus x naught transpose,

X transpose X inverse, x naught, and if I wanted to estimate it

I would plug in S squared for sigma squared.
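Written out, the spoken derivation looks like this. The notation is an assumption of this sketch: x_0 is taken to be a column vector of regressors for the new observation and X is the design matrix used to fit beta-hat.

```latex
\begin{align*}
\operatorname{Var}\left(y_0 - x_0^\top \hat\beta\right)
  &= \operatorname{Var}(y_0) + \operatorname{Var}\left(x_0^\top \hat\beta\right)
     && \text{($y_0$ and $\hat\beta$ are independent)} \\
  &= \sigma^2 + \sigma^2\, x_0^\top (X^\top X)^{-1} x_0 \\
  &= \sigma^2 \left\{ 1 + x_0^\top (X^\top X)^{-1} x_0 \right\}.
\end{align*}
```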

And then what I'm going to ask you to do for

homework, because it should be old hat for you now,

is to prove to yourself that y naught minus x naught transpose beta-hat, over S times the square root of

1 plus x naught transpose, X transpose X

inverse, x naught, follows the t-distribution

with n minus p degrees of freedom.
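To make the difference between the two intervals concrete, here is a small sketch in Python with NumPy. It is an illustration only: the function name `interval_half_widths`, the diamond masses and prices, and the new point `x0` are made up, and the t critical value is passed in by hand (2.447 is the 0.975 quantile for 6 degrees of freedom, as found in a standard t table).

```python
import numpy as np

def interval_half_widths(X, y, x0, t_crit):
    """Return (fit, mean_half_width, pred_half_width) at the new point x0.

    t_crit is the t quantile with n - p degrees of freedom, supplied by
    the caller (an assumption of this sketch).
    """
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    s2 = resid @ resid / (n - p)                 # S squared
    h0 = x0 @ XtX_inv @ x0                       # x0' (X'X)^{-1} x0
    fit = x0 @ beta_hat
    mean_hw = t_crit * np.sqrt(s2 * h0)          # interval for the value of the line
    pred_hw = t_crit * np.sqrt(s2 * (1.0 + h0))  # interval for a new y0
    return fit, mean_hw, pred_hw

# Made-up data, for illustration only: diamond mass and price.
mass = np.array([0.12, 0.15, 0.17, 0.20, 0.23, 0.25, 0.29, 0.33])
price = np.array([220.0, 330.0, 350.0, 500.0, 560.0, 650.0, 780.0, 950.0])
X = np.column_stack([np.ones_like(mass), mass])  # intercept and mass
x0 = np.array([1.0, 0.21])                       # new diamond of mass 0.21

t_crit = 2.447  # 0.975 quantile of t with 8 - 2 = 6 degrees of freedom
fit, mean_hw, pred_hw = interval_half_widths(X, price, x0, t_crit)
print("fitted value:            ", fit)
print("mean interval:      +/-  ", mean_hw)
print("prediction interval: +/- ", pred_hw)
```

The prediction interval is always the wider of the two, because of the extra 1 under the square root, which carries the variability of the new y naught itself.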