Learn fundamental concepts in data analysis and statistical inference, focusing on one and two independent samples.


From the course by Johns Hopkins University

Mathematical Biostatistics Boot Camp 2


From the lesson

Hypothesis Testing

In this module, you'll get an introduction to hypothesis testing, a core concept in statistics. We'll cover hypothesis testing for basic one and two group settings as well as power. After you've watched the videos and tried the homework, take a stab at the quiz.

- Brian Caffo, PhD, Professor, Biostatistics

Bloomberg School of Public Health

So let's actually go through this calculation.

So, remember, beta is the type II error rate, so 1 minus beta is the power. And let's assume n is large, so that we can just do standard normal calculations rather than t calculations. Okay, so here, notice our s is replaced by sigma, the true population standard deviation.

So here we have our test statistic, which is X-bar minus 30, over sigma over square root n. Under the null hypothesis that mu equals 30, this is a Z statistic. Since we're testing whether mu is larger than 30, our rejection region is where this normalized mean is large, that is, larger than a standard normal quantile. If we want an alpha-level type I error rate, we grab Z1 minus alpha, the 1 minus alpha standard normal quantile. So, for example, if we wanted a 5% error rate, we would pick the number 1.645, which is the 95th percentile of the standard normal distribution.
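As a quick check on the 1.645 figure, here is a minimal Python sketch that looks up the upper-tail critical value. It uses `statistics.NormalDist` from the standard library; the helper name `critical_value` is made up for illustration and is not from the lecture.

```python
from statistics import NormalDist


def critical_value(alpha: float) -> float:
    """Return z_{1 - alpha}, the (1 - alpha) quantile of the standard normal."""
    return NormalDist().inv_cdf(1.0 - alpha)


# 5% type I error rate for the one-sided test of mu > 30
z_95 = critical_value(0.05)
print(round(z_95, 3))  # 1.645
```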

Then 1 minus beta is the probability that we reject: the probability that the statistic is larger than the cutoff, the critical value, calculated under the alternative hypothesis, given that mu is in fact mu a. So now this statistic is no longer a Z statistic, because we're considering the alternative hypothesis, not the null hypothesis. If the data are Gaussian, it is of course still normally distributed, just with a different mean. And if n is large and we're applying the Central Limit Theorem, then this is, again, not converging to a Z statistic.

So what we need to do is convert it to a Z statistic. The easiest thing to do is to add and subtract the mean under the alternative, and we do that on this line here. Then, in the next line, we simply take the correctly normalized mean, X-bar minus mu a, the mean under the alternative, which is what we are assuming to be true, divided by the standard error, sigma over square root n. And now we're calculating the probability that that is larger than Z1 minus alpha, minus this quantity over here: mu a minus 30, over sigma over square root n. Now, again, this standardized mean is a Z statistic, because we're doing the calculation under the alternative.

So we want the probability that Z is larger than this quantity over here, and we can perform this calculation because we know Z1 minus alpha, we're assuming we know sigma, we of course know n, and we know 30. So mu a is the only thing we have to plug in. And that is the fact about power calculations: you have to plug in the particular value of the mean that you're interested in.
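The slide that the lecture points to is not reproduced in this transcript. The steps as described (add and subtract mu a, then move the extra term to the right-hand side) can be written out as follows. This is a reconstruction from the spoken description, with Z denoting a standard normal random variable.

```latex
\begin{align*}
1 - \beta
  &= P\!\left( \frac{\bar X - 30}{\sigma/\sqrt{n}} > z_{1-\alpha} \;\middle|\; \mu = \mu_a \right) \\
  &= P\!\left( \frac{\bar X - \mu_a + \mu_a - 30}{\sigma/\sqrt{n}} > z_{1-\alpha} \;\middle|\; \mu = \mu_a \right) \\
  &= P\!\left( \frac{\bar X - \mu_a}{\sigma/\sqrt{n}} > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma/\sqrt{n}} \;\middle|\; \mu = \mu_a \right) \\
  &= P\!\left( Z > z_{1-\alpha} - \frac{\mu_a - 30}{\sigma/\sqrt{n}} \right)
\end{align*}
```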

Okay, so let's actually do a specific version of this calculation. Suppose that we want to calculate the power of detecting an increase in the mean RDI of at least two events per hour above our null hypothesis of 30. So we want to calculate the power if the alternative mean, the population mean Respiratory Disturbance Index, is 32, when our null hypothesis is that it's 30. Again, this is under the assumption that the type I error rate is 5%. So, again, assume normality, and that the sample in question will have a standard deviation of four events per hour. What will be the power if we took a sample of size 16?

Okay, so here our Z1 minus alpha is 1.645, and our mu a minus 30, over 4 over square root of 16, works out to be 2. So we want the probability that a standard normal is bigger than 1.645 minus 2, which is just the probability that a standard normal is bigger than negative 0.355, which is 64%. So, under this set of assumptions, the probability of detecting an alternative of two events per hour above the hypothesized value is 64%.
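The arithmetic above can be reproduced with a short Python sketch. The function name `power_one_sided_z` and its parameter names are illustrative choices, not something from the course; the inputs (null mean 30, alternative 32, sigma 4, n 16, alpha 5%) are the ones stated in the lecture.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(mu0, mu_a, sigma, n, alpha=0.05):
    """Power of the one-sided Z test of H0: mu = mu0 against Ha: mu > mu0.

    Assumes sigma is known and the sample mean is (approximately) normal.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)   # z_{1 - alpha}
    shift = (mu_a - mu0) / (sigma / sqrt(n))   # (mu_a - mu0) / standard error
    # P(Z > z_crit - shift) for a standard normal Z
    return 1.0 - std_normal.cdf(z_crit - shift)


rdi_power = power_one_sided_z(mu0=30, mu_a=32, sigma=4, n=16)
print(f"{rdi_power:.2f}")  # 0.64
```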

And this is, of course, a lower bound: the power only gets larger as the alternative moves farther above 30 events per hour. This makes sense, of course, right? Because the bigger the difference is from the null, the easier it should be to detect. If the true population mean is 100 events per hour, we should have a high probability of detecting that, a higher probability than if the true mean is 30.01 events per hour, which seems like it would be very hard to detect.
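To see how power grows as the alternative moves above 30, here is a small sketch that evaluates the same one-sided power formula at several values of mu a. The defaults (sigma 4, n 16, alpha 5%) are the lecture's example; the particular list of alternatives and the name `power_at` are just illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power_at(mu_a, mu0=30.0, sigma=4.0, n=16, alpha=0.05):
    """One-sided Z-test power at alternative mean mu_a (sigma assumed known)."""
    z = NormalDist()
    shift = (mu_a - mu0) / (sigma / sqrt(n))
    return 1.0 - z.cdf(z.inv_cdf(1.0 - alpha) - shift)


# Alternatives from barely above the null to very far above it
alternatives = [30.01, 31.0, 32.0, 34.0, 100.0]
powers = [power_at(m) for m in alternatives]

for m, p in zip(alternatives, powers):
    print(f"mu_a = {m:6.2f}  power = {p:.3f}")
```

At 30.01 the power is barely above the 5% type I error rate, and it increases with every step away from the null.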

Relative to 30, it's such a small change. So this power, 64%, is a lower bound for all values above 32.

Instead of calculating power given a sample size, a variance, and a value of the alternative, we could flip the question around and say: imagine we have a power that we'd like to achieve for a particular value of the alternative. What sample size would we need to achieve it? This is called a sample size calculation. Both of these calculations are typically done at the phase of designing the study, when you actually want to figure out how many subjects to have in the study, or whether or not to conduct the study if your number of subjects is constrained.

So here we do this calculation exactly, where we calculate the sample size we would need to get 80% power. 80% is a very common benchmark in science. You can argue whether 80% is enough power, but it is somewhat of a benchmark in, say, clinical trials. I would admit, though, that most clinical trials do two-sided tests, and here we're calculating power for a one-sided test.

So here we want 0.8 to be the probability that our test statistic, which, appropriately normalized, is a Z statistic under the alternative, is greater than the standard normal quantile Z1 minus alpha. But remember, when we converted our test statistic so that it was normalized appropriately under the alternative, we got this extra term: mu a minus 30, over the standard error, sigma over square root n. This calculation is then, of course, done under the alternative, which is when we normalize with respect to mu a, and you get this extra term out here.

So if we want this probability to be 80%, then we know that the quantity here on the right, Z1 minus alpha, minus mu a minus 30 over sigma over square root n, has to be equal to the 20th percentile of the standard normal distribution, right? We know what Z1 minus alpha is, we know what mu a is, we obviously know what 30 is, and then there's sigma over square root n. We know all of those except n, and we can just solve for n. And that is a so-called sample size calculation.
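Setting Z1 minus alpha minus (mu a minus 30) over (sigma over square root n) equal to the 20th percentile and solving for n gives n = (sigma × (z_{1-alpha} − z_{0.20}) / (mu a − 30))², rounded up to a whole subject. The lecture does not plug numbers into this step, so the sketch below reuses the earlier RDI values (mu a of 32, sigma of 4, alpha of 5%) purely as an assumption. The function name `sample_size_one_sided_z` is also illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_one_sided_z(mu0, mu_a, sigma, power=0.80, alpha=0.05):
    """Smallest n giving at least `power` for the one-sided Z test (mu_a > mu0).

    Solves z_{1-alpha} - (mu_a - mu0) / (sigma / sqrt(n)) = z_{1-power} for n.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha)   # z_{1 - alpha}
    z_low = z.inv_cdf(1.0 - power)     # 20th percentile when power = 0.80
    n_exact = (sigma * (z_alpha - z_low) / (mu_a - mu0)) ** 2
    return ceil(n_exact)               # round up to a whole number of subjects


# Assumed inputs: the RDI example from earlier in the lecture
n_needed = sample_size_one_sided_z(mu0=30, mu_a=32, sigma=4)
print(n_needed)  # 25
```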

And logic would dictate, and of course the mathematics works out this way, that if we solve this calculation for a particular value of mu a, it's going to be applicable for every larger value of mu a, because the direction of the alternative is larger. We're going to need a smaller sample size to detect bigger effects. So once we do this calculation for a specific value of mu a, it holds: that sample size will give us 80% power or higher for all larger values of mu a. And so usually you pick mu a to be the smallest effect that you would reasonably want to detect.
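A short check of that claim, as a sketch under assumed numbers: with the earlier RDI inputs (null mean 30, sigma 4, alpha 5%) and mu a of 32, solving the 80% power equation gives a sample size of 25 after rounding up. The code below takes that n as given and evaluates the power at 32 and at a few larger alternatives; the name `power_at` and the alternatives chosen are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power_at(mu_a, n, mu0=30.0, sigma=4.0, alpha=0.05):
    """One-sided Z-test power at alternative mean mu_a with sample size n."""
    z = NormalDist()
    shift = (mu_a - mu0) / (sigma / sqrt(n))
    return 1.0 - z.cdf(z.inv_cdf(1.0 - alpha) - shift)


n = 25  # assumed: sample size for 80% power at mu_a = 32 in the RDI example

design_power = power_at(32.0, n)                 # at the design alternative
larger_effect_powers = [power_at(m, n) for m in (33.0, 34.0, 36.0)]

print(f"power at 32 with n = {n}: {design_power:.3f}")
for m, p in zip((33.0, 34.0, 36.0), larger_effect_powers):
    print(f"power at {m:.0f} with n = {n}: {p:.3f}")
```

Every larger alternative has power above the design value, which is why the smallest effect of interest is the one to plan around.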

Â Coursera provides universal access to the worldâ€™s best education,
partnering with top universities and organizations to offer courses online.