Learn fundamental concepts in data analysis and statistical inference, focusing on one and two independent samples.

From the course by Johns Hopkins University

Mathematical Biostatistics Boot Camp 2

From the lesson

Hypothesis Testing

In this module, you'll get an introduction to hypothesis testing, a core concept in statistics. We'll cover hypothesis testing for basic one and two group settings as well as power. After you've watched the videos and tried the homework, take a stab at the quiz.

- Brian Caffo, PhD, Professor, Biostatistics

Bloomberg School of Public Health

Hi, my name is Brian Caffo, and this is Mathematical Biostatistics Boot Camp, lecture three, where we're going to talk about two-sample tests.

Okay, so today we're going to first talk about matched data. Matched data brings up an interesting immediate discussion on the topic of regression to the mean, which has a very famous history. And then we're going to talk about two independent groups.

So, when you're comparing two groups, one of the first things you want to ascertain is whether or not the data are paired. It's a little less clear than you might think at first. The most obvious instance where data are paired is when the observations are both on the same subject. Take as an example a trial where you have a treatment, say some medication. In one case, imagine you randomize the treatment to one group of people and randomize, say, a placebo to another group of people. That's clearly not paired: you have one group of people that received the treatment and another group that received the placebo.

An instance where it's obviously paired would be if, in some random order for every person, you gave them the treatment, had maybe a washout period, and then gave them the placebo, where for some people you gave them the placebo first, then had a washout period, and then gave them the treatment. In that case it's obviously paired, because you had the same person receiving both the treatment and the control.

So, in both those cases the question would be to compare the treatment versus the placebo, but in one case you would have two independent groups of separate people, and in the other case you would have each person receiving both. In the latter case the data would be paired. In some cases pairing is maybe a little less obvious. For example, suppose you're not looking at something that's assigned, and let's say you want to compare people with high blood pressure to people with low blood pressure.

And you wanted to compare some other characteristic. Let's see, we used sleep as an example the other day. So imagine you wanted to compare the low blood pressure people to the high blood pressure people on their RDI, their respiratory disturbance index. That seems like it would be unpaired, because the low blood pressure people would be one group of people and the high blood pressure people would be another group of people.

But they could be paired, because often what people do in these kinds of experiments is get, say, their low blood pressure people, and then for every person in that group they try to closely match, in terms of age and gender and other characteristics, another person in the high blood pressure group. In that case it would be sort of softly paired. It wouldn't be the exact same person, of course, because a person can't have low blood pressure and high blood pressure at the same time. But what you could have is person one from the low blood pressure group very closely matched, in terms of age and weight and other characteristics, to a person in the high blood pressure group. And then they are paired; they're just paired in a different way.

So that's called matching. When, for every person in one group, you try to find another person in the other group who has very similar characteristics for all these other variables that you're maybe not interested in, but that you think could contaminate the comparison, that process is called matching. It's a very effective way to deal with variables you think might confound the relationship that you want to look at.
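As a rough illustration of the matching idea just described, here is a minimal sketch in Python. It assumes a simple greedy nearest-neighbor rule on age within the same gender; that rule, the `greedy_match` helper, the `max_age_gap` threshold, and the subject records are all hypothetical and are not taken from the lecture. Real studies typically match on more variables and may use other matching schemes.

```python
def greedy_match(group_a, group_b, max_age_gap=5):
    """Pair each subject in group_a with the closest-in-age unused subject
    of the same gender in group_b (hypothetical greedy matching rule)."""
    available = list(group_b)  # subjects in group_b not yet used in a pair
    pairs = []
    for subject in group_a:
        candidates = [
            other for other in available
            if other["gender"] == subject["gender"]
            and abs(other["age"] - subject["age"]) <= max_age_gap
        ]
        if not candidates:
            continue  # no acceptable match; this subject stays unmatched
        best = min(candidates, key=lambda other: abs(other["age"] - subject["age"]))
        available.remove(best)  # each subject is used in at most one pair
        pairs.append((subject["id"], best["id"]))
    return pairs


# Made-up subjects, for illustration only.
low_bp = [
    {"id": "L1", "age": 40, "gender": "F"},
    {"id": "L2", "age": 55, "gender": "M"},
    {"id": "L3", "age": 30, "gender": "M"},
]
high_bp = [
    {"id": "H1", "age": 57, "gender": "M"},
    {"id": "H2", "age": 42, "gender": "F"},
    {"id": "H3", "age": 41, "gender": "F"},
]

pairs = greedy_match(low_bp, high_bp)
print(pairs)
```

Each resulting pair can then be treated like two measurements on one "soft" subject, and subjects without an acceptable match are simply left out of the paired comparison in this sketch.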

So, in any case, the discussion in this slide is simply to distinguish between the two main types of two-group comparisons. One instance is where the observations are paired, either by matching or something else; the other is where the two groups are independent.

Okay, so let's consider the instance where the observations are paired. This is the easy case, because then everything reduces down to basically a one-sample problem. When the observations are paired, one strategy is to take the difference between the paired observations and do a one-sample t-test of the null hypothesis that the population mean difference is zero versus the alternative that the population mean difference is non-zero, or one of the other two (one-sided) alternatives. That would be a very straightforward thing to do. Your test statistic would just be the ordinary one-sample test statistic: the average of the differences, minus the hypothesized mean (typically 0), divided by the standard error, which is the standard deviation of the differences divided by the square root of the number of pairs of observations.
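Written out, the statistic described in words above looks like the following. The symbols are notation assumed here for illustration, not taken from the slides: $X_i$ and $Y_i$ are the two measurements in pair $i$, $D_i$ is their difference, $n$ is the number of pairs, and $\mu_0$ is the hypothesized mean difference.

```latex
D_i = X_i - Y_i, \qquad
\bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i, \qquad
S_D = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(D_i - \bar{D}\right)^2},
\qquad
T = \frac{\bar{D} - \mu_0}{S_D / \sqrt{n}}
```

Under the null hypothesis, and assuming the differences are independent draws from a normal distribution, $T$ follows a Student's $t$ distribution with $n - 1$ degrees of freedom.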

So you want to make sure that you put in the number of pairs of observations there, not the total number of observations. When you start out you have 2n observations, one for the first measurement and one for the second measurement in each pair, so when you subtract them you wind up with half the number of observations that you started with.

So, at any rate, the principal way to do this is to simply take the differences, throw out the original data, and treat the differences as a one-sample problem. That's the ordinary paired two-group t-test.
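The recipe above (take differences, then run a one-sample test on them) can be sketched in a few lines of Python using only the standard library. The `paired_t_statistic` helper and the measurements are hypothetical and for illustration only; the sketch computes the statistic and degrees of freedom and leaves the p-value lookup to a t table or a statistics package.

```python
import math
import statistics


def paired_t_statistic(first, second, mu0=0.0):
    """Return (t, degrees_of_freedom) for a paired t-test.

    `first` and `second` hold the two measurements for each pair, in the
    same order. `mu0` is the hypothesized mean difference (typically 0).
    """
    if len(first) != len(second):
        raise ValueError("paired data need the same number of observations")
    # Reduce the two paired samples to a single sample of differences.
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)  # number of pairs, NOT the 2n total observations
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 denominator
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = (mean_diff - mu0) / (sd_diff / math.sqrt(n))
    return t, n - 1


# Made-up paired measurements (e.g. the same five subjects measured twice).
baseline = [10.0, 12.0, 9.0, 11.0, 13.0]
followup = [12.0, 13.0, 9.0, 14.0, 15.0]

t_stat, df = paired_t_statistic(baseline, followup)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

Note that `n` is the number of differences, so ten raw measurements here give five pairs and four degrees of freedom.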

Â Coursera provides universal access to the worldâ€™s best education,
partnering with top universities and organizations to offer courses online.