A practical, example-filled tour of simple and multiple regression techniques (linear, logistic, and Cox PH) for estimation, adjustment, and prediction.


From the course by Johns Hopkins University

Statistical Reasoning for Public Health 2: Regression Methods


From the lesson

Module 4: Additional Topics in Regression

- John McGready, PhD, MS, Associate Scientist, Biostatistics

Bloomberg School of Public Health

So, in this section, we'll talk briefly about another use of propensity scores, above and beyond standard adjustment (that is, including the propensity score as a predictor in a multiple regression that also includes the predictor of interest). We'll talk about a method called propensity score matching that is sometimes used. So hopefully, this lecture will give you a basic overview of propensity score matching, its purpose, and situations in which it may be a better alternative to traditional adjustment with multiple regression, or even to the straightforward adjustment with propensity scores like we saw in lectures 10A and 10B.

So, again, the reason we have to adjust, and the reason propensity scores can be helpful, is that in some non-randomized studies there is a very specific outcome/predictor association of interest, but because of the study design, the observational nature, confounding is a threat. In such situations there may be other potential predictors, but the research interest in these other predictors is in terms of using them for adjustment only. The research is not concerned with the adjusted associations between the outcome and these predictors after adjusting, if you will, for the main potential predictor of interest. These are used only to get a better, more comparable comparison of the outcome between equivalent exposed and unexposed groups on the main predictor of interest.

However, in some such studies the exposed and unexposed groups, as determined by the primary predictor levels, are very different with regard to their values, or distributions, of potential confounders. There may be only a subset of the unexposed, perhaps, that have confounder distributions similar to those in the exposed group. In this scenario it may make sense to restrict the comparison of the outcome between the exposed and unexposed groups to this subset.

Let me show you an example where the distribution of propensity scores between the unexposed and exposed leaves some gaps for both groups, but in this example mostly for the unexposed. You can see that there's a whole portion of the propensity score distribution where there's no crossover with the values among the exposed, and there's also a little bit of the distribution in the exposed group that shares no values with the unexposed.

In this situation we might adjust by the traditional approach: multiple regression of the outcome on the primary predictor of interest, including each potential confounder as a separate predictor in a large multiple regression model. Or we might do what we advocated in 10A and 10B: a multiple regression of the outcome on the primary predictor of interest, with the propensity score as our only adjustment variable, either in quartiles or quintiles or however we choose to do it. But when we have such differing distributions of confounders for some subgroups of each group, either approach may estimate an artificial comparison: those exposed versus those unexposed who are the same on all other variables. "All other variables" means either those used in the traditional multiple regression as adjustment variables, or those used to create the propensity scores.

So ostensibly, when adjusting with the propensity score, you're adjusting for those variables that were used to create it. So what do I mean by an artificial comparison of those exposed versus unexposed who are the same on all other variables? Well, in this type of situation there are at least some confounders for which there do not exist exposed and unexposed subjects with the same values; or there are some members of each group that do not share values with some members of the other group.

So again, in this example it's more explicit with the unexposed: you can see there's a whole group, more than 25% of the unexposed group, with propensity score values lower than the lowest observed propensity score value in the exposed group. This may also cause analytic problems.

Because if you think about the process of adjustment: behind the scenes, if we do a multiple regression where we adjust for propensity scores, it essentially breaks the observations up into groups based on their propensity scores, especially if we're adjusting by quartiles or quintiles. Within each of the propensity score groups it estimates the outcome/exposure relationship, and then it averages those estimates across the different propensity score groups.
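The stratify, estimate, and average logic just described can be sketched in a few lines of code. This is a minimal, hypothetical illustration, not anything from the lecture itself: the function name and the toy data are invented, the strata are simple fifths of the sorted scores, and the within-stratum estimate is just a difference in mean outcomes.

```python
def quintile_adjusted_difference(ps, exposed, outcome):
    """Average the exposed-vs-unexposed mean outcome difference
    across propensity score quintiles, weighted by stratum size."""
    n = len(ps)
    order = sorted(range(n), key=lambda i: ps[i])  # sort subjects by score
    cut = n // 5
    estimates, weights = [], []
    for q in range(5):
        # contiguous fifth of the sorted scores (last stratum takes the rest)
        idx = order[q * cut:(q + 1) * cut] if q < 4 else order[4 * cut:]
        treated = [outcome[i] for i in idx if exposed[i]]
        control = [outcome[i] for i in idx if not exposed[i]]
        if not treated or not control:
            continue  # no overlap in this stratum: nothing to estimate
        estimates.append(sum(treated) / len(treated) -
                         sum(control) / len(control))
        weights.append(len(idx))
    return sum(e * w for e, w in zip(estimates, weights)) / sum(weights)

# toy data: a constant exposure effect of 2 on top of a score-related trend
ps = [i / 20 for i in range(20)]
exposed = [i % 2 for i in range(20)]
outcome = [2 * exposed[i] + ps[i] for i in range(20)]
print(quintile_adjusted_difference(ps, exposed, outcome))  # about 2.05
```

The `continue` branch is exactly the problem raised next in the lecture: strata with no exposed (or no unexposed) members contribute nothing to the average, or, if forced in, contribute very imprecise estimates.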

Well, if there are some propensity score groups where there's no way to estimate the outcome/exposure relationship, because there are no exposed members in that grouping, or relatively very few unexposed, then we may actually inflate the uncertainty of the estimate. We're averaging in quantities that have very little precision, because of the imbalance of observations with that propensity score between the two groups we hoped to compare.

So the idea of propensity score matching is that for each observation in the exposed group, we match one or more observations in the unexposed group who have the most similar propensity scores, and then compare the outcome between the exposed observations and this subset of matched unexposed observations.

So what's generally done is something like this, and you'll see it in the examples I give you. It can be very complicated, depending on the relative number of persons in the unexposed and exposed groups, and on what to do when there are more exposed members than unexposed in a certain range of propensity score values, etc.

But the basic idea is this: the researchers take the propensity score distribution, break it up into groups like we were talking about, and then within those groups they match everybody with a given value in the exposed group to somebody in the unexposed group with the most similar value. This is called propensity score caliper matching, where the calipers are the groupings; they could be quintiles of the propensity scores, etc.

The difficulties come in how to approach this when, in some groupings or calipers, there are fewer unexposed persons than exposed. Then you have to go to plan B and sometimes match multiple exposed persons to the same unexposed person, and then handle, later in the analysis, the fact that you've duplicated a match. But the general idea is that you break the propensity score distribution into groups and do the matching within those groups.
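The matching procedure just described can be sketched roughly as follows. This is a deliberately simplified, hypothetical version (the function name and the fixed caliper width are mine, not from the lecture or the papers): each exposed subject is matched, with replacement, to the unexposed subject whose propensity score is closest, provided that score falls within the caliper; exposed subjects with no unexposed neighbor inside the caliper go unmatched.

```python
def caliper_match(exposed_ps, unexposed_ps, caliper_width=0.2):
    """Return (exposed_index, unexposed_index) pairs, matching each exposed
    subject to the nearest unexposed score within the caliper."""
    pairs = []
    for i, ps_e in enumerate(exposed_ps):
        # unexposed candidates whose score lies within the caliper
        candidates = [(abs(ps_u - ps_e), j)
                      for j, ps_u in enumerate(unexposed_ps)
                      if abs(ps_u - ps_e) <= caliper_width]
        if candidates:
            _, j = min(candidates)      # nearest neighbor
            pairs.append((i, j))        # with replacement: j may repeat
        # else: no comparable unexposed subject, so this exposed
        # subject is left out of the matched comparison
    return pairs

print(caliper_match([0.30, 0.90], [0.25, 0.50]))  # [(0, 0)]
```

Matching with replacement is the "plan B" mentioned above; a real analysis then has to account for unexposed subjects that appear in multiple pairs.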

So let's just look at two examples of where this is used. Both of these articles, actually, if you look them up (if you go to Welch and search them out), spend a large amount of time talking about how they matched, because like I said, it can be complicated depending on the numbers of unexposed versus exposed in the different subgroups of the propensity score distribution.

Now let's look at this first one, abstinence pledges and subsequent sexual activity, from an article published in Pediatrics in 2009. In the abstract they state the objective: the US government spends more than $200 million annually on abstinence-promotion programs, including virginity pledges. This study compares the sexual activity of adolescent virginity pledgers with matched nonpledgers by using more robust methods than past research.

And they talk about how they recruited the subjects. This was based on the National Longitudinal Study of Adolescent Health respondents: a nationally representative sample of middle and high school students who, when surveyed in 1995, had never had sex or taken a virginity pledge, and who were greater than 15 years of age. Adolescents who reported taking a virginity pledge on the 1996 survey were matched with nonpledgers.

>> About two and a half times as many: 289 of those who took a pledge and 645 who did not, by using exact and nearest-neighbor matching within propensity score calipers. That was just what I was referring to: breaking the propensity score distribution up into bins, essentially. They matched within propensity score calipers on factors including pre-pledge religiosity and attitudes toward sex and birth control.

Pledgers and matched nonpledgers were compared five years after the pledge on self-reported sexual behaviors; positive test results for chlamydia, gonorrhea, and Trichomonas vaginalis; and safe sex outside of marriage, by use of birth control and condoms in the past year and at last sex.

So, the results they found: five years after the pledge, 82% of the pledgers actually denied having ever pledged. That's sort of an interesting finding unto itself.

Pledgers and matched nonpledgers did not differ in premarital sex, sexually transmitted diseases, and anal and oral sex variables. Pledgers had 0.1 fewer past-year partners, but did not differ in lifetime sexual partners or age at first sex. Fewer pledgers than matched nonpledgers used birth control and condoms in the past year, and birth control at last sex.

And so the researchers conclude that the sexual behavior of virginity pledgers does not differ from that of closely matched nonpledgers, and that pledgers are less likely to protect themselves from pregnancy and disease prior to marriage. Virginity pledges may not affect sexual behavior, but they may decrease the likelihood of taking precautions during sex. Clinicians should provide birth control information to all adolescents, especially virginity pledgers.

So let me just show you what they talked about, because this is a little bit of insight into what they did, and the article is really detailed about how they did the matching. Matched sampling is a nonparametric method for assessing program outcomes by comparing a program group with similar nonprogram respondents. We created a group of nonpledgers as similar as possible to pledgers on all prepledge factors that may influence sexual behavior.

So, outcome differences between pledgers and matched nonpledgers cannot be attributed to pre-existing differences. So they're trying to adjust for these.

Past studies compared self-selected virginity pledgers with the general population of nonpledgers and attempted to adjust for the vast pre-pledge differences by using traditional regression models: the ones I was talking about, where virginity pledge, yes or no, was the primary predictor of interest, and all other potential control variables were entered individually into the model.

Both matching and regression yield associative rather than causal inference, but matching creates more valid comparisons and results, for three reasons. First, regression models rely on dubious parametric assumptions. It's true that they have structural assumptions, and that's something that should be talked about in regression. But the fact is, quite frankly, because propensity scores are based on regression estimation, there's some of that in this process as well, so this doesn't let the researcher off the hook for that.

But regression on the whole cannot adjust, even on average, for large differences between program and nonprogram groups. That's what I was talking about: if there are certain subsets of persons in each of the groups for whom there are no similar persons in the other group in terms of the propensity scores, regression can have trouble estimating, with precision, the difference between the exposed and unexposed groups after adjustment.

Second, matching computes outcome differences only once, after verification that the matched nonprogram group is similar to the program group. This separation assures that the model is selected independently of the study results, in contrast to regression, with which it's impossible to verify model correctness without seeing results.

And this gets at some of the ideas of model selection we were talking about before. If a researcher were looking at multiple regressions, adjusting for various combinations of confounders, he or she might iterate until finding the subset of confounders that are statistically significant. So they're actually trying to validate their results by seeing the results.

And here what they're saying is: we first set up the matching algorithm without estimating the outcome/exposure relationship, and once we're satisfied with the matching algorithm, we then go ahead and do the adjustment.

Third, matching (and this is true of propensity scores in general) allows many more variables for adjustment than does straight-up regression, where you include each of the variables as an individual predictor. In this study, the author controlled for 112 variables, which would be problematic in a regression with 289 pledgers. So they took the information in 112 variables and reduced it to a single number through the propensity score.
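To make the "single number" concrete: a propensity score is typically the fitted probability of exposure from a logistic regression of exposure on all the covariates. Here is a small plain-Python sketch of that reduction, with invented synthetic data (nothing here is from the article); in practice you would use a statistics package rather than hand-rolled gradient ascent.

```python
import math
import random

def fit_propensity(X, exposure, steps=1500, lr=0.5):
    """Logistic regression of exposure on the covariates via gradient
    ascent; returns each subject's fitted probability of exposure."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)                        # intercept + one slope each
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for row, y in zip(X, exposure):
            eta = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
            resid = y - 1 / (1 + math.exp(-eta))  # observed minus fitted
            grad[0] += resid
            for j, x in enumerate(row):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return [1 / (1 + math.exp(-(beta[0] + sum(b * x for b, x in
                                              zip(beta[1:], row)))))
            for row in X]

# synthetic stand-in: 3 covariates, exposure driven by the first one
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]
exposure = [1 if random.random() < 1 / (1 + math.exp(-row[0])) else 0
            for row in X]
scores = fit_propensity(X, exposure)  # 200 covariate rows -> 200 single numbers
```

However many covariates go in (112 in the study), each subject comes out with one score, and all the subsequent stratification or matching works on that score alone.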

So for these reasons, matched sampling has been advocated for studies in medicine and public health, and is used increasingly often in the medical literature for situations, perhaps, like this, where there's a very different confounder distribution between the groups being compared.

And I'm going to read to you a little bit more. Sorry, I know you can read, but I just think they do a nice job of reinforcing one of the ideas we set up here. So, in an ordinary regression, virginity pledgers would be compared with all nonpledgers, but these groups differed one year before taking the pledge.

Comparing the 289 pledgers and all 3,151 nonpledgers at wave one, before matching: pledgers were less sexually experienced and expected more negative, and fewer positive, psychosocial effects of sex and birth control use, with lower birth control efficacy and knowledge.

Pledgers had greater levels of religious belief, involvement, and born-again affiliation, more religious parents, and fewer substance-using friends, and were more likely to expect marriage before age 25. They also differed in the proportions female and Asian with foreign-born parents, and had lower Peabody vocabulary scores.

And this next passage is not contiguous with the first. It says: turning to outcomes five years after the pledge, 81.9% of virginity pledgers claimed to have never pledged.

Virginity pledgers and matched nonpledgers did not differ in 12 of 14 sexual behaviors, 3 of 3 STD test results, and 4 of 4 marriage-related outcomes. Pledgers reported an average of 1.09 past-year vaginal sex partners, 0.11 fewer than nonpledgers, and 2.31% fewer pledgers reported having been paid for sex than nonpledgers. Unmarried pledgers were less likely to report using birth control and condoms in the last year, and birth control at last sex, but did not differ in reporting condom use at last sex or in condom breakage.

So here's the table where they actually do the comparison. The idea is that what we're seeing here are the adjusted results, because these are among the matched samples. What they're reporting in this column is the difference, and the 95% confidence interval for the difference, in either the mean or the proportion.

And then they call this the t test, though as we know, for comparing proportions we wouldn't call it a t test. What they're reporting here is something we're pretty familiar with: the standardized distance, the difference between the two groups divided by its standard error. So the idea is, if this is greater than 2 in absolute value, the result is significant with a p-value of less than 0.05.
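As a quick sketch of that column's arithmetic, using the sexual intercourse percentages discussed next (72.66% of the 289 pledgers versus 76.24% of the 645 matched nonpledgers): the statistic is the difference in proportions over its standard error. The unpooled standard error below is a generic choice, and may not be exactly how the authors handled the matched data, so treat this as illustrative only.

```python
from math import sqrt

def standardized_difference(p1, n1, p2, n2):
    """Difference in two proportions divided by its (unpooled) standard
    error; |value| > 2 roughly corresponds to p < 0.05."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

z = standardized_difference(0.7266, 289, 0.7624, 645)
print(abs(z) > 2)  # False: a gap of about 3.6 points is not significant here
```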

So for example, if we look at sexual intercourse: 72.66% of the pledgers had engaged in sexual intercourse by the time of the follow-up study, versus 76.24% of the nonpledgers. So the pledgers had a lower proportion, on the order of 3.6 percentage points, but it was not statistically significant.

And similarly, there are other measures here, like age at first sex: those who pledged were about half a year older on average at first sex compared with those who didn't pledge, where the average was 20.7 years. But this result was not statistically significant.

And the two things they highlighted as statistically significant were the number of past-year partners, which was lower by about 0.11 on average and had an effect size, or distance measure, of negative 2.45; and having ever been paid for sex, which was statistically significantly lower in the pledging group.

Let's look at one more example where they use propensity score matching: drug use and intimate partner violence. This was done in the American Journal of Public Health. For the objective, they say: we examined whether frequent drug use increases the likelihood of subsequent sexual or physical intimate partner violence, and whether intimate partner violence increases the likelihood of frequent subsequent drug use.

They used a random sample of 416 women on methadone, assessed at three points in the study: baseline, called wave 1; 6 months, called wave 2; and 12 months, wave 3. Propensity score matching and multiple logistic regression were employed.

They found that women who reported frequent crack use at the second study period, wave 2, were more likely than non-drug-using women to report intimate partner violence at wave 3, with an odds ratio of 4.4, which was statistically significant. And frequent marijuana users at wave 2 were more likely than non-drug users to report intimate partner violence at wave 3 as well, with an odds ratio of 4.5. In addition, women who reported IPV, intimate partner violence, at wave 2 were more likely than women who did not report intimate partner violence to indicate frequent heroin use at wave 3, with an odds ratio of 2.7.

So let's just look a little bit at how they sampled and matched. They randomly selected 753 women from the total population of 1,700 women enrolled in 14 methadone maintenance clinics in New York City, and of the 753 women, 559 agreed to participate. They go on to describe this; ultimately they ended up with 416 women who were eligible, agreed to participate, and completed a baseline [INAUDIBLE] criteria.

And their eligibility criteria were: being a female between the ages of 18 and 55; being enrolled in the methadone maintenance program for at least three months; and, during the past year, having had a sexual or dating relationship with someone described as a boyfriend, girlfriend, spouse, regular sexual partner, or father of her children.

They write: we used propensity score matching to reduce the selection bias that can occur in an observational study. This heuristic nonparametric technique in effect reconstructs a sample that mimics the results of the random assignment component of a randomized clinical trial, by selecting groups that have similar values on observed confounders and that differ only with respect to a treatment variable of interest.

Propensity score matching can eliminate this bias if we are able to balance, across the treatment and control groups, all the covariates that are associated with both the treatment and the outcome. Propensity scores were calculated using the attributes for observed confounders measured at wave 1, treatment variables at wave 2, and outcome variables at wave 3.

This analysis plan ensures that the confounders temporally precede treatment assignment, which in turn precedes the determination of the outcome variable.

And they go on to say that the confounders include demographics, history of trauma, psychological distress, social support, and HIV risks. For hypothesis one, the treatment variable is frequent drug use measured at wave 2, and the outcome variable is intimate partner violence at wave 3, as we saw in the results section.

And they go on to describe their other hypotheses. Then they say: after using propensity score matching procedures to select a final sample of participants for which valid causal effect sizes could be obtained, we used multiple logistic regression to test each hypothesis. For each type of drug, adjusted odds ratios and their associated 95% CIs were examined to test the hypothesis.

And here's the table they report. For hypothesis one, frequent drug use increases the likelihood of subsequent IPV. The outcome they're showing is whether or not the woman reported having experienced intimate partner violence at the last study follow-up, wave 3, and the predictor is her reporting of frequent drug use at wave 2. So women who reported using cocaine at wave 2 had 60% higher estimated odds than women who didn't of experiencing intimate partner violence at wave 3, but this was not statistically significant.

However, as you can see, and as they reported in the abstract, crack use and heroin use were statistically significantly associated with a large increase in the relative odds of experiencing intimate partner violence at the third follow-up period.
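A reminder of the mechanics behind these numbers (generic to logistic regression, not the paper's actual model; the coefficient below is hypothetical): each predictor's adjusted odds ratio is e raised to its fitted log-odds coefficient, so an odds ratio of 1.6 is the same statement as "60% higher odds," and 4.4 means 4.4 times the odds.

```python
from math import exp, log

beta = log(1.6)                 # hypothetical fitted coefficient (a log odds ratio)
odds_ratio = exp(beta)          # back on the odds ratio scale
pct_increase = (odds_ratio - 1) * 100
print(round(odds_ratio, 1), round(pct_increase))  # 1.6 60
```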

And then for the other hypothesis, the predictor is whether the woman reported having experienced intimate partner violence at wave 2, with regard to her drug use at wave 3. So, if the outcome was frequent use of cocaine at wave 3, then women who had experienced intimate partner violence at wave 2 had 2.1 times the odds of women who had not of using cocaine frequently at wave 3, although that was not statistically significant.

And crack and heroin, again, were the two drugs statistically significantly associated with increased use among women who had experienced intimate partner violence at the previous study period.

So, hopefully this whole lecture set has given you a look at some alternative methods for adjustment that can be useful. One thing to keep in mind, though, is that all of these methods mine the same territory as what we did in multiple regression. They may have some advantages, analytically especially, but nothing can actually adjust for confounders that were never measured. So the utility of these other methods involving propensity scores is limited by the number of potential confounders measured by the researcher, just as the methods we discussed with multiple regression were as well.

Â Coursera provides universal access to the worldâ€™s best education,
partnering with top universities and organizations to offer courses online.