Learn fundamental concepts in data analysis and statistical inference, focusing on one and two independent samples.


From the course by Johns Hopkins University

Mathematical Biostatistics Boot Camp 2


From the lesson

Techniques

This module is a bit of a hodgepodge of important techniques. It includes methods for discrete matched pairs data as well as some classical non-parametric methods.

- Brian Caffo, PhD, Professor, Biostatistics

Bloomberg School of Public Health

So, I've been using the words confounder and confounding a lot, so let me define them. Variables that are correlated with both the explanatory and response variables are confounders, and they can distort an estimated effect. So, in this case, the victim's race was correlated with both the defendant's race and whether or not the death penalty was given.

Â and this is a,

Â this is I think, kind of an old school definition of confounding.

Â There's, there's a modern definition given by causal

Â inference that causal inference classes that, that I think

Â you, you, you might be if you are interested

Â in the field of statistics will be worth learning.

Â I think ultimately with a, with a

Â confounder here, we, we haven't really distinguished

Â between something that's causally related with the race in the death penalty

Â versus something that has a statistical association with race and death penalty.

Â In this class, we're mostly going to be talking about, things

Â that have a statistical association with the explanatory and response variables.

Â Where there is kind of a plausible causal connection between them.

Â Okay.

Â so, you know, again, putting aside the rather

Â difficult and lengthy discussion of what is a confounder?

Â how do we select our confounders, this, you know, how do

Â we, you know that, that discussion, putting it to the side.

Â Let's assume we have a single confounder, how do we adjust for it?

Â Well, you know, there's several ways regression is probably the

Â biggest and most common way to, to, to adjust for confounders.

Â But, the kind of an old school

Â way in categorical data analysis.

Â Is to stratify with a confounder and

Â then co, combine the straightest specific estimates.

Â And so requires, this requires appropriate

Â wei, weighting of the straightest specific estimates,

Â and we'll talk in a minute about how do you do the appropriate weighting?

Â And unnecessary stratification has its own set of problems, so

Â you know again, you know just bringing back this discussion.

Â The, the solution to the confounding

Â problem is not just to stratify or adjust for everything in sight.

Â Right.

Â That's not the solution, because that has its own host of consequences.

Â you know, for example in, in any of these Simpson's paradox

Â examples imagine a giant database.

Â And you're interested in, say, the death penalty.

Â and you had a giant database with lots of other other variables.

Â For sure you could find one variable that reverses the association, but has no

Â bearing on whether or not a person received the death penalty, right?

Â So, so, adjusting for that confounder will reverse the association.

Â But has no real

Â business for being adjusted for.

Â And so, it's a hard topic you know admittedly it's

Â a hard topic of selecting confounders and achi, achieving balancing.

Â Between the right amount of confounder adjustment and the

Â with over adjustment, balancing between that and over adjustment.

Â so let's stipulate for the time being, we

Â have a confounder and we want to adjust for it.

Â And what I really want to talk about is the method

Â for stratifying and then combining stratus with specific estimates.

Â And then because then we will be able to teach you some nice methodology.

Â And then, as we take more statistics, courses you'll learn

Â more about the delicate surgery of, of dealing with statistical confounding.

Â Okay, so I, here I have aside, but it's an important aside,

Â suppose you have two scales and what I mean by scales, I mean,

Â things for weighing objects.

Â And, let's assume both scales are so

Â called, unbiased, they both have some variance associated

Â with them, who weigh the same thing over and over again, you get different answers.

Â But one has a variance of one pound and the

Â other has a variance of nine pound, they're both unbiased.

Â so, confronted with weights from both scales,

Â would you give both measurements equal credence, so,

Â let's supposed we weigh an object.

Â And that our first weight was this variable X1.

Â And we're going to assume that it's normal.

Â Mu in the variance of the first scale, sigma 1 squared.

Â And then X2, because both scales are unbiased, we're going to assume that

Â it's normal and it has the same population mean, mu, and the same.

Â different variant sigma 2 squared and

Â let's assume both sigma 1 and sigma 2 are known,

Â we want to estimate mu this unknown weight of the objects.

Â Okay, so, we measured it with one scale with

Â one precision another scale with another precision, we're assuming

Â both scales are unbiased and that if we measured

Â the same object over and over and over again.

Â Again, the average would be about right. so, If, if we characterize

Â this in this way.

Â I'm hoping what everyone can do in the class is set up the likelihood.

Â Multiply the two.

Â add the 2 log likelihoods, or multiply the 2 log likelihoods, and take the log.

Â and then, come up with the fact that the log

Â likelihood from mu, disregard any terms that, that don't involve mu.

Â And I'm hoping everyone could come with the fact that the likeli,

Â the log likelihood for mu looks like this, bottom equation right here.
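The slide isn't reproduced in this transcript, but under the model just described, the log likelihood for mu (dropping terms that don't involve mu) and the maximizer derived in the next few lines are presumably:

```latex
\ell(\mu) = -\frac{(x_1 - \mu)^2}{2\sigma_1^2} - \frac{(x_2 - \mu)^2}{2\sigma_2^2},
\qquad
\ell'(\mu) = \frac{x_1 - \mu}{\sigma_1^2} + \frac{x_2 - \mu}{\sigma_2^2} = 0
\;\Longrightarrow\;
\hat{\mu} = \frac{r_1 x_1 + r_2 x_2}{r_1 + r_2},
\quad r_i = \frac{1}{\sigma_i^2}.
```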

Â Okay.

Â And you know, you can, let's solve for a maximum likelihood estimate

Â so, the easiest way to do that right now would be to take the derivatives, set the

Â derivatives equal to zero and you get this answer.

Â X1 times r1 plus x2 times r2 divided by r1 plus r2, or in other words, x.

Â Times p plus x2 times 1 minus p, where p is r1

Â over r2 plus r2 and 1 minus p is r2 over r1

Â plus r2. And in this case, ri is 1 over sigma

Â squared sub i and then p is, of course, 1

Â over r1 divided by r1 plus r2. So, why does this makes sense?

Â This makes a lot of sense to me now.

Â but the first time you see it, you might say this makes

Â no sense but let me describe why this makes a ton of sense?

Â Okay, so notice what each ri is. It's 1 over the variance.

Â So,

Â if let's say sigma one

Â is huge. In other words that scale

Â stinks, it has this huge variants, then the weight r1

Â times x1. The weight given to the measurement from

Â that scale is very low. And then conversely you know, if, if sigma

Â 2 is very small.

Â Then, I get r1 which is 1 over sigma squared will be a huge number

Â and then, x2 is given a gigantic weight and then we divide by r1 plus r2.

Â So that, so that when we weight these two things.

Â X1 and x2.

Â We, we get a convex combination, p times x1 plus 1 minus p times x2.

Â So, that it's an average.

Â It's just a weighted average.

Â Okay?

Â and by the way you can do this always

Â if you want to take a generalized form of average right?

Â You, you know, then you want r1 is the weight for x1, and

Â r2 is the weight for x2, they have to be positive, of course.

Â and you want to turn it into an average, then divide the whole by r1 plus r2

Â and then you'll turn it into p times 1 and 1 minus p times the other.

Â If r1 equals r2, they'll be the strict arithmetic average of the two numbers, if

Â r1 is different from r2, it will weigh one of the observations more than the other.

Â Well, the, the answer in this case is that

Â we want to weight by the inverse of the variance

Â giving high variance measurements, low weight and low weight

Â and low variance measurements high weight, which to me then.

Â And makes a ton of sense.

Â Okay. So, any way, the general principle.

Â Instead of averaging over several unbiased estimates, take an

Â average weighted according to the inverse of the variances.

Â And this is so ingrained in statistical

Â practice now that people do it without thinking.

Â They don't go through this exercise of deriving maximum likelihood equations.
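That general principle fits in a few lines of code. Here is a minimal sketch in Python (the function name `ivw_mean` is my own, not from the course):

```python
def ivw_mean(estimates, variances):
    """Weighted average of unbiased estimates, each weighted by the
    reciprocal of its variance (r_i = 1 / sigma_i^2)."""
    weights = [1.0 / v for v in variances]  # the r_i
    total = sum(weights)                    # r_1 + ... + r_n
    # Dividing by the total turns the r_i into convex weights p_i that sum to 1.
    return sum(w * x for w, x in zip(weights, estimates)) / total

# With equal variances this reduces to the ordinary arithmetic average.
print(round(ivw_mean([2.0, 4.0], [5.0, 5.0]), 6))  # 3.0
```

The same combination rule is what gets used to pool stratum-specific estimates, assuming each stratum's estimate is approximately unbiased with a known (or well-estimated) variance.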

Â so, in our case, sigma 1 squared was 9, was 1.

Â Sigma 2 squared was 9. So, in this case, you can work it out.

Â P works out to be 0.9.

Â So, it works out to be 0.9 times

Â the first measurement plus 0.1 times the second measurement.

Â The first measurement getting a lot more weight.

Â Because the scale you know?

Â Has is, you know? Has 1 9th the variance of the other one.
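Plugging in the lecture's numbers directly (the two weighings x1 and x2 below are made-up values for illustration):

```python
sigma1_sq, sigma2_sq = 1.0, 9.0            # variances of the two scales
r1, r2 = 1.0 / sigma1_sq, 1.0 / sigma2_sq  # precisions (inverse variances)
p = r1 / (r1 + r2)                         # weight on the first scale's reading
x1, x2 = 10.2, 11.0                        # hypothetical weighings, in pounds
mu_hat = p * x1 + (1 - p) * x2             # pooled estimate of the true weight
print(round(p, 3))       # 0.9
print(round(mu_hat, 3))  # 10.28
```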

Â Coursera provides universal access to the worldâ€™s best education,
partnering with top universities and organizations to offer courses online.