Learn fundamental concepts in data analysis and statistical inference, focusing on one and two independent samples.

From the course by Johns Hopkins University

Mathematical Biostatistics Boot Camp 2

From the lesson

Hypothesis Testing

In this module, you'll get an introduction to hypothesis testing, a core concept in statistics. We'll cover hypothesis testing for basic one and two group settings as well as power. After you've watched the videos and tried the homework, take a stab at the quiz.

- Brian Caffo, PhD, Professor, Biostatistics

Bloomberg School of Public Health

So the Z test that we're talking about requires the assumptions of the CLT, and for n to be large enough for it to be applicable. If n is small, then you could just do Gosset's Student's t test in the same way; you're just replacing the normal quantiles with the Student's t quantiles.

The probability of rejecting the null hypothesis when it's false is called power. Remember, we set the type one error rate, which is the probability of rejecting the null hypothesis when it's true, so we force the type one error rate to be small. The probability of failing to reject the null hypothesis when in fact the null hypothesis is false is called the type two error rate. Power is 1 minus that; it's the probability of rejecting the null hypothesis when it is false. And so, power is a good thing. You want to reject the null hypothesis when it's false. And unfortunately, power is not typically under our control after the experiment has been conducted.
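In symbols, the three quantities just defined can be summarized as below. This is only a restatement of the spoken definitions; the symbols beta and H_0 are standard notation assumed here rather than taken from the lecture.

```latex
\begin{align*}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ true}) && \text{(type one error rate)} \\
\beta &= P(\text{fail to reject } H_0 \mid H_0 \text{ false}) && \text{(type two error rate)} \\
\text{power} &= 1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ false})
\end{align*}
```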

So the way that people combat this is, prior to conducting the study, they do a power calculation where they vary the sample size or, if it's simple enough, just calculate the sample size needed to obtain a certain level of power, using guesses for what they think the standard error and the hypothesized significant effect would be. And that's what we'll talk about next lecture.

Okay. So let's actually go through the t calculation for this example. Suppose that n is 16 rather than 100, as we were considering before, so we're going to use a t test. Then, look at this equation right here. We want 5% to be the probability that X bar minus 30, the value under the null hypothesis, divided by the estimated standard error, now s over square root 16, is larger than the t quantile now instead of the z quantile. Again, that's the 1 minus alpha quantile, with 15 degrees of freedom.

So our test statistic now is this standardized observed mean: 32, our observed mean, minus the hypothesized value, 30, divided by the standard error, 10 over square root 16. The square root of 16 then moves up into the numerator, and that works out to be 0.8, and the t critical value is 1.75. And so now we fail to reject, and it shouldn't be surprising, right? We're changing what used to be multiplication by square root 100 to now square root 16. And so the test statistic went down substantially, while the quantile that we're comparing it to went up. Because remember, the t is a heavier-tailed distribution than the normal, so it shouldn't be surprising that we now fail to reject.
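The comparison between the two sample sizes can be sketched in a few lines of Python. This is a minimal illustration, not the lecturer's own code: it assumes the earlier n = 100 example used the same observed mean (32) and standard deviation (10), and it takes the t critical value 1.75 directly from the lecture rather than computing it. The variable and function names are made up for the example.

```python
import math
from statistics import NormalDist

# Values quoted in the lecture (assumed to be shared by both sample sizes).
x_bar, mu0, s = 32.0, 30.0, 10.0
alpha = 0.05

# One sided critical values: the normal quantile is computed,
# the t quantile with 15 degrees of freedom is the lecture's quoted 1.75.
z_crit = NormalDist().inv_cdf(1 - alpha)
t_crit_15 = 1.75

def standardized_mean(n):
    """Return sqrt(n) * (x_bar - mu0) / s, the standardized observed mean."""
    return math.sqrt(n) * (x_bar - mu0) / s

stat_100 = standardized_mean(100)  # multiplication by sqrt(100)
stat_16 = standardized_mean(16)    # multiplication by sqrt(16)

reject_z = stat_100 > z_crit       # Z test with n = 100
reject_t = stat_16 > t_crit_15     # t test with n = 16

print(f"n=100: statistic {stat_100:.2f} vs z quantile {z_crit:.3f}, reject: {reject_z}")
print(f"n=16:  statistic {stat_16:.2f} vs t quantile {t_crit_15:.2f}, reject: {reject_t}")
```

The statistic shrinks because the multiplier drops from 10 to 4, and the threshold it is compared against grows, so both changes push toward failing to reject.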

Okay. So in the previous slide, we did the one sided test. Let's now do the two sided test, and we're going to move through these things quickly because I'm hoping at this point in the class that you're getting used to these kinds of calculations.

So we want to now test whether mu is different from 30 as the alternative. And maybe you could say that doesn't make a lot of sense in this case, because the way I framed the problem was that we're looking at a population particularly susceptible to having a high RDI, so why don't we just test mu greater than 30? Well, let's just, for the sake of argument, just to show you the calculations, do different from 30. But also, I would say that in many journals and avenues of scientific inquiry, they demand a two sided test even if the one sided test is the natural direction to consider. So let's do a two sided test.

So what we want is to test whether or not our observed mean X bar is significantly different from our null hypothesized value for the population mean, 30. That would be if it's significantly larger than 30 or significantly smaller than 30. So we could just say, well, maybe we will look at the absolute value of X bar minus 30, which would look at whether it's too small, below 30, or too large, above 30. And then of course, because we want to standardize our statistic, we're going to divide by the standard error of the mean, s over square root 16. And we know that X bar minus 30 over s over square root 16, if the data are iid Gaussian, follows a t distribution.

And so, if we want alpha, the type one error rate, to be specified so that the probability that this test statistic is too large or too small is exactly alpha, well, what we could then pick is the t quantile t 1 minus alpha over 2 with 15 degrees of freedom. And what this says is that the probability of this random t statistic being larger than the t 1 minus alpha over 2 quantile is alpha over 2. The probability that this test statistic on the negative end is less than the t alpha over 2 quantile with 15 degrees of freedom, which is a negative value, is also alpha over 2. So we put alpha over 2 in the lower tail, alpha over 2 in the upper tail, and that yields a total probability of alpha. And in the next slide, I'll describe that a little bit. And this calculation is, of course, all done under the null hypothesis that mu equals 30.
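Written out, the argument for splitting alpha across the two tails looks roughly like this. The symbol T for the standardized statistic is introduced here only for brevity; everything is computed under the null hypothesis that mu equals 30, and the last step uses the symmetry of the t distribution.

```latex
\begin{align*}
T &= \frac{\bar{X} - 30}{s / \sqrt{16}} \sim t_{15} \quad \text{under } H_0 : \mu = 30, \\
P\left(|T| > t_{1-\alpha/2,\,15}\right)
  &= P\left(T > t_{1-\alpha/2,\,15}\right) + P\left(T < t_{\alpha/2,\,15}\right)
   = \frac{\alpha}{2} + \frac{\alpha}{2} = \alpha, \\
t_{\alpha/2,\,15} &= -\,t_{1-\alpha/2,\,15}.
\end{align*}
```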

So we'll reject based on our test statistic, X bar minus 30 over s over square root 16, which in this case is 0.8. So when we take the absolute value, it remains 0.8. And we're going to reject if it's either too large or too small. But again, remember, the critical value is calculated now using alpha over 2 rather than alpha, because we want alpha over 2 probability of rejecting for too large, and alpha over 2 probability of rejecting if the test statistic is too small, meaning too far negative. So, in this case, the critical value is 2.13, and notice, of course, that's a larger value than when we just use alpha, because we're going further out into the tail, so it's harder to reject for the two sided test than it is for the one sided test. So since we failed to reject for the one sided test, we're, of course, going to fail to reject for the two sided test.
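The two sided decision can be reproduced with a short standard-library sketch. The lecture does not show how the quantile is obtained, so the approach below (Simpson's rule on the t density plus bisection) is just one assumed way of getting it without extra packages; the helper names `t_pdf`, `simpson`, `t_cdf` and `t_quantile` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def t_cdf(x, df):
    """P(T <= x) for x >= 0, using symmetry of the density around zero."""
    return 0.5 + simpson(lambda u: t_pdf(u, df), 0.0, x)

def t_quantile(p, df, lo=0.0, hi=20.0):
    """Quantile for p >= 0.5, found by bisection on t_cdf."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Numbers from the lecture example.
x_bar, mu0, s, n = 32.0, 30.0, 10.0, 16
alpha = 0.05

stat = (x_bar - mu0) / (s / math.sqrt(n))
crit_one = t_quantile(1 - alpha, n - 1)      # one sided critical value
crit_two = t_quantile(1 - alpha / 2, n - 1)  # two sided critical value
reject_two = abs(stat) > crit_two

print(f"|statistic| = {abs(stat):.2f}")
print(f"one sided critical value = {crit_one:.2f}")
print(f"two sided critical value = {crit_two:.2f}")
print(f"reject two sided: {reject_two}")
```

Because the two sided critical value sits further out in the tail than the one sided one, a positive statistic that fails the one sided test also fails the two sided test.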

Okay. Let's just briefly show you the two sided calculation again and where the alpha over 2 comes from. So here, I'm setting a sequence of x values from minus 4 to plus 4. I'm evaluating the t density with 15 degrees of freedom at those points, and then let me plot. And there's my t distribution. Okay, now I'm going to shade in that area right there. That's 2.5%. And let's say my alpha, my type one error rate that I want, is 5%, and that value right there is 2.13. So for the t distribution, the 97.5th percentile is 2.13 when you have 15 degrees of freedom.

Then let's do the same thing for the lower quantile. Sorry about that. There we go. That's better. And that's 2.5% right there, and that's negative 2.13, and then that's 95%. So what we're saying is we calculate our normalized test statistic, X bar minus 30 over s over square root 16, and look at the probability that the absolute value of that statistic is bigger than 2.13. In other words, the probability that that statistic is too large positive, above 2.13, is 2.5%, and the probability that it's too small negative, in the form of being less than negative 2.13, is 2.5%. So the probability that its absolute value is bigger than 2.13 is 5%, including the upper tail 2.5% and the lower tail 2.5%. So the probability that the test statistic lies in the rejection region is 5%.
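The shaded areas can be checked numerically. The sketch below is not the plotting code from the video; it is an assumed standard-library approach that integrates the t density with 15 degrees of freedom between minus 2.13 and plus 2.13 and gets each tail by symmetry. The helper names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

df = 15
crit = 2.13  # two sided critical value quoted in the lecture

# Area between the two critical values (the unshaded middle of the plot).
central = simpson(lambda x: t_pdf(x, df), -crit, crit)

# The density is symmetric, so the remaining area splits evenly.
upper_tail = (1 - central) / 2
lower_tail = upper_tail
rejection_region = upper_tail + lower_tail

print(f"central area     = {central:.4f}")
print(f"each tail        = {upper_tail:.4f}")
print(f"rejection region = {rejection_region:.4f}")
```

The results come out close to 95%, 2.5% and 5% rather than exactly equal, because 2.13 is a rounded critical value.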

Â Coursera provides universal access to the worldâ€™s best education,
partnering with top universities and organizations to offer courses online.