0:00

In this video we'll introduce the t-distribution and

discuss its origins and mechanics.

In a nutshell, the t-distribution is useful for

describing the distribution of the sample mean

when the population standard deviation, sigma, is unknown, which is almost always the case.

We'll start our discussion with a review of the conditions for inference we've seen so

far, as motivation for why we need this new distribution.

What purpose does a large sample serve?

As long as your observations are independent and

the population's distribution is not extremely skewed, a large sample will

ensure that the sampling distribution of the mean is nearly normal

and that the estimate of the standard error is reliable.

Remember, we estimate the standard error of the sampling distribution

as s over the square root of n, where s is the sample standard deviation.

That is the best estimate we have for

the unknown population standard deviation sigma.

If the sample size is large enough, chances are s is indeed a good estimate

for sigma, and therefore your overall standard error estimate is reliable.
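As a concrete illustration, here is a minimal Python sketch of this standard error estimate; the sample values are made up purely for illustration:

```python
import math

# Hypothetical sample of n = 9 measurements (illustrative values only).
sample = [4.2, 5.1, 3.8, 4.9, 5.5, 4.0, 4.7, 5.2, 4.4]

n = len(sample)
mean = sum(sample) / n

# Sample standard deviation s, with n - 1 in the denominator.
s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))

# Estimated standard error of the sample mean: s over the square root of n.
se = s / math.sqrt(n)
```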

But what if the sample size is small?

You might be thinking: in the age of big data, why are we talking about small

samples?

It is true that in certain disciplines,

especially when data are automatically recorded, like webpage clicks or

a Twitter stream, small sample sizes might be irrelevant.

However, there are certainly disciplines where this is not the case.

Think for example about a lab experiment or

a study that follows a nearly extinct mammal species.

So we need methods that work well for both large samples and small samples.

The uncertainty of the standard error estimate

is addressed by using the t-distribution.

This distribution also has a bell shape, so it's unimodal and symmetric, and

it looks a lot like the normal distribution.

However, its tails are thicker.

Comparing the normal and

t-distributions visually is the best way to understand what we mean by thick tails.

Notice that the peak of the t-distribution doesn't go as

high as the peak of the normal distribution.

In other words, the t-distribution is somewhat squished in the middle, and

the additional area is added to the tails.

This means that under the t-distribution observations are more likely to fall

two standard deviations away from the mean than under the normal distribution.

This means that confidence intervals constructed using the t-distribution will

be wider, in other words more conservative,

than those constructed with the normal distribution.
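To see the "wider interval" claim numerically, we can compare two-sided 95% critical values from the normal and t-distributions; this is a sketch assuming scipy is available:

```python
from scipy import stats

# Two-sided 95% critical values: z* from the normal, t* from t-distributions.
z_star = stats.norm.ppf(0.975)         # roughly 1.96
t_star_50 = stats.t.ppf(0.975, df=50)  # roughly 2.01
t_star_10 = stats.t.ppf(0.975, df=10)  # roughly 2.23

# The t critical value is always larger than z*, so t-based intervals are
# wider, and the gap grows as the degrees of freedom shrink.
```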

3:18

Let's talk practicalities next.

How do we actually use the t-distribution in statistical inference?

The answer is simple.

Use the t-distribution for inference on a single mean or for comparing two means

when the population standard deviations are unknown, which is basically always.

Calculate the t statistic just like you would calculate a Z statistic:

the sample mean minus the null value, divided by the standard error of

the sample mean.

And find the p-value as the probability of an observed or

more extreme outcome given that the null hypothesis is true, just like before,

except using the t-distribution instead of the Z distribution,

using R, the distribution calculator app, or a table.
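As a sketch of this recipe in Python, with made-up summary statistics and assuming scipy is available:

```python
from scipy import stats

# Hypothetical summary statistics (illustrative, not from the video):
x_bar = 4.6  # sample mean
mu_0 = 4.0   # null value
s = 0.6      # sample standard deviation
n = 10       # sample size

# t statistic has the same form as a Z statistic:
# (sample mean - null value) / standard error.
se = s / n ** 0.5
t_stat = (x_bar - mu_0) / se

# Two-sided p-value: probability of a result at least this extreme
# under the null, using the t-distribution instead of the normal.
df = n - 1
p_value = 2 * stats.t.sf(abs(t_stat), df)
```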

4:01

First, a little bit of mechanics.

We're going to calculate three probabilities.

(a) The probability that the absolute value of Z is greater than 2,

which is 0.0455. (b) The probability that the absolute value of

t with 50 degrees of freedom is greater than 2, which is 0.0509.

Remember, we talked about thicker tails and

a higher percentage of observations falling further than 2 standard

deviations away from the mean under the t-distribution.

We're starting to see the effect of this with the larger

tail area under the t-distribution.

Let's take things a little further and decrease the degrees of freedom to 10.

The new probability is 0.0734.

In summary, as we go from

the normal to a t-distribution with somewhat high degrees of freedom

to a t-distribution with low degrees of freedom, the probability of the test

statistic being more than two standard deviations away from the mean increases.
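These three tail areas are easy to verify in Python (a sketch assuming scipy is available):

```python
from scipy import stats

# Two-sided tail area beyond |2| under each distribution.
p_z = 2 * stats.norm.sf(2)        # normal: about 0.0455
p_t50 = 2 * stats.t.sf(2, df=50)  # t with 50 df: about 0.0509
p_t10 = 2 * stats.t.sf(2, df=10)  # t with 10 df: about 0.0734

# Lower degrees of freedom means thicker tails, hence a larger tail area.
```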

Next, suppose you have a two-sided hypothesis test, and

your test statistic is 2.

Under which of these scenarios would you be able to reject the null hypothesis

at the 5% significance level?

For the first scenario, we have a p-value of 4.55%, which

is indeed less than 5%, so we would reject the null hypothesis.

We would also mention that this p-value is pretty close to 0.05.

In the second scenario, the p-value is greater than 5%, so

we would fail to reject the null hypothesis, though again we

would mention that the p-value is pretty close to 5%.

And in the last scenario we would definitely fail to reject the null

hypothesis.

We can see that as we get more conservative with a t-distribution with

lower degrees of freedom,

we also become less likely to be able to reject the null hypothesis.

We'll discuss how to calculate degrees of freedom for a particular study or data

set in the following videos, but generally degrees of freedom is tied to sample size.

This means that if your sample size is small, it is not as easy to reject the null

hypothesis, and stronger evidence is needed in order to be able to do so.

6:07

Before we get to working with the t-distribution and

using it in inference examples, let's pause for a moment and

talk about where this distribution comes from.

It actually has a peculiar name: Student's t-distribution.

This name comes from the pseudonym "Student" used by William Gosset

in the papers where he developed much of the foundation for this distribution.

William Gosset was the head experimental brewer at the Guinness brewing company

in the early 1900s, and his main role was to experimentally brew and

gradually improve a consistent and economical barrel of the Guinness stout.

This required sometimes working with small samples because maybe he would just

have a few batches to try.

Therefore, much of the development of the t-distribution comes from trying to make

the Guinness beer taste better.

Since the Guinness company was worried about their trade secrets getting out,

Gosset was asked to publish any work he was doing under a pseudonym, and

Student was the name that he chose for himself.

While others, like Fisher, continued to work on the t-distribution,

extending Gosset's foundational work,

it's named after Gosset's pseudonym, Student.

So next time you're having a pint of Guinness, say cheers for statistics.