This video is on assessing balance, and in particular, assessing balance when you use inverse probability of treatment weighting. The weighting creates a pseudo-population, or a weighted population, and we would like to be able to assess whether we've achieved balance on the covariate distributions between the treated and control groups. So after we've fitted our propensity score model and created the weights, we can apply the weights to our data, and then what we would like to know is: did weighting work? Did this IPTW method give us balance in our covariate distributions, in the same kind of way that you would expect to have balance if you had a randomized trial? What we would like is for our weighted sample to look a lot like it would have looked had we actually randomized.

We can do this in a couple of ways. A common thing that people do is create something like a Table 1, which is again these kinds of summary statistics, for example the mean of each covariate stratified on treatment group, but in this case using weights. So do these weighted means look similar? Typically we would then use standardized differences to assess whether there's balance. We can also look at plots, so you could look at tables and also at plots to see how well the weighting has done.

As a reminder of what a standardized difference is in general: right now we're not thinking about weighting, just imagine we want to look at our raw data, for example, where we have treated subjects and control subjects. To get a standardized difference, we take the sample mean for the treated subjects and the sample mean for the control subjects, take the difference, and then divide by what's essentially a pooled standard deviation. As an example, you could take the variance for the treated subjects plus the variance for the control subjects, divide by 2, and take the square root to get a pooled standard deviation. This gives us the standardized difference in means, and you would do this for each covariate; the sample mean here is for a particular covariate, and we would repeat the calculation for every covariate. Also as a reminder, it's very common to just report absolute values, so you might want to take the absolute value. Standardized differences can be positive or negative, but a lot of times people just report the absolute value, because we're mostly interested in the magnitude of the difference as opposed to the direction of the difference.

Okay, so how do we obtain standardized differences after weighting? It ends up being the same kind of idea, where we are interested in a mean difference divided by a pooled standard deviation, except here it's going to be weighted means and essentially a weighted pooled standard deviation. So we would stratify on treatment group, and then for each covariate, separately for each covariate, we would calculate a weighted mean and a weighted variance, and then put them together using the same kind of formula. You could do this essentially by hand, since you have the weights and the data, or you could use one of the many software packages that were developed for surveys. I mentioned the survey package in R, for example, and a lot of statistical software has options for getting weighted means, weighted variances, and so on.
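As an illustration of that by-hand calculation, here is a minimal R sketch of the (weighted) standardized mean difference for a single covariate. The data frame and column names (dat, x, treat, ipw) are placeholders for illustration, not variables from the lecture.

```r
# Minimal sketch: (weighted) standardized mean difference for one covariate.
# Assumes a data frame `dat` with a covariate x, a 0/1 treatment indicator
# treat, and inverse probability of treatment weights ipw (placeholder names).

weighted_var <- function(x, w) {
  m <- weighted.mean(x, w)
  sum(w * (x - m)^2) / sum(w)            # simple weighted variance
}

smd_weighted <- function(x, treat, w) {
  m1 <- weighted.mean(x[treat == 1], w[treat == 1])  # weighted mean, treated
  m0 <- weighted.mean(x[treat == 0], w[treat == 0])  # weighted mean, control
  v1 <- weighted_var(x[treat == 1], w[treat == 1])
  v0 <- weighted_var(x[treat == 0], w[treat == 0])
  (m1 - m0) / sqrt((v1 + v0) / 2)        # mean difference over pooled SD
}

# The unweighted (raw) version is the same thing with weights of 1:
smd_raw <- function(x, treat) smd_weighted(x, treat, rep(1, length(x)))

# Report the absolute value, since magnitude is what matters:
# abs(smd_weighted(dat$x, dat$treat, dat$ipw))
```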
So we can weight our data and then get these standardized differences on the pseudo-population, and of course the goal is that these standardized differences are low after we've weighted. As an example, consider the right heart catheterization data, where in that data set RHC is the treatment and no RHC is the control. Here there are a few thousand subjects in each group: an n of about 3,500 in one group and 2,100 in the other, so that's control versus treatment. And then I just chose a handful of variables to compare: age, sex, blood pressure, and then a few comorbidities, medical diagnoses such as heart failure, colon cancer, and so on. To the far right, we have SMD, which is the standardized mean difference, and this is all in the raw, unweighted data. You'll notice that there are a few standardized differences that are fairly large. Again, we would like the standardized differences to be less than 0.1, but for blood pressure the standardized difference is about 0.4, and we see a 0.4 at the bottom for sepsis, for coma, and so on. So there are quite a few variables showing imbalance.

Next, we'll use the same kind of technique, but on the weighted data. So we have weighted means now, and the SMD column here is the standardized difference based on the weighted data; you could think of these as standardized mean differences on the pseudo-population. Now you'll see that all of the standardized differences are less than 0.1 in absolute value; in fact, the largest one is just 0.04. In general, there seems to be good balance between the two groups. If you look at a couple of the more extreme cases from before, for example blood pressure, previously the means were about 85 versus 68, and now we have about 78, well, about 79, in each group, so very similar now. So the weighting seems to have achieved balance. This is how you can check for balance: apply the inverse probability of treatment weights to your data and calculate standardized mean differences for the covariates on the weighted data. In fact, if you were publishing a paper using inverse probability of treatment weighted methods, you would probably want to include a table like this, looking at both the raw and weighted data, to show how much imbalance there was originally versus how much balance you have after weighting.

You could also use a plot to look at this, and this is a very common thing that's done with either matching or, in our case, inverse probability of treatment weighting. Typically there would be a lot more variables than this; you're probably controlling for 20 or more confounders. Here we just have a handful of them to illustrate the main idea. This is a graph of standardized mean differences versus the confounders. Right here is the raw, unweighted data, whereas this bluish-green color is the weighted data. You'll notice that there's a vertical line at 0.1, because that's a common rule of thumb: you would want your standardized differences to be less than 0.1. And this is sorted by the amount of imbalance, so the most imbalanced variables appear at one end of the plot and the least imbalanced at the other. You'll see that in the raw data there are quite a few confounders where there was imbalance, where the standardized difference on the raw data is greater than 0.1.
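For producing a table like this in practice, one option (a sketch, not the lecture's own code) is the tableone package together with the survey package in R. The data frame `dat`, the columns `treatment` and `ipw`, and the covariate list below are hypothetical, not the actual RHC variable names.

```r
# Sketch of a raw vs. weighted "Table 1" with standardized mean differences,
# using the tableone and survey packages (placeholder data and column names).
library(tableone)
library(survey)

xvars <- c("age", "sex", "meanbp", "chf")   # hypothetical covariates

# Unweighted Table 1 with SMDs (the "raw data" table)
tab_raw <- CreateTableOne(vars = xvars, strata = "treatment",
                          data = dat, test = FALSE)
print(tab_raw, smd = TRUE)

# Weighted (pseudo-population) Table 1: wrap the data in a survey design
design_w <- svydesign(ids = ~1, data = dat, weights = ~ipw)
tab_w <- svyCreateTableOne(vars = xvars, strata = "treatment",
                           data = design_w, test = FALSE)
print(tab_w, smd = TRUE)    # ideally all SMDs < 0.1 after weighting

# ExtractSmd(tab_raw) and ExtractSmd(tab_w) return the standardized
# differences, which can be plotted side by side for a Love-style plot.
```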
Whereas for the weighted data, everything looks really nice and close to 0, as you would hope. So in this case, it looks like weighting did well for this sample and the set of variables we considered.

If you actually did have imbalance after weighting, there are some things you could do. One thing is to refine your propensity score model. At this point you haven't looked at the outcome at all, so you wouldn't be cheating to go back and refine your model; this is still really the design stage, and you want to make sure that you have good balance. So you could go back and ask: am I not including something in my propensity score model that I need? Maybe I've made a linearity assumption or something that isn't warranted. Then you could reassess balance. So potentially you do a little bit of back and forth in terms of variable selection and model selection, and you can iterate until you're happy with the results.
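To make that back and forth concrete, here is a hedged R sketch of one iteration of the loop: fit a propensity score model, compute the weights, re-check balance, and, if needed, refit a richer model. The model formula and column names are hypothetical, not the lecture's.

```r
# One iteration of the "refine and re-check" loop (hypothetical variables).
# Fit a propensity score model for a binary treatment:
ps_fit <- glm(treatment ~ age + sex + meanbp,
              family = binomial(), data = dat)
ps <- predict(ps_fit, type = "response")

# Inverse probability of treatment weights:
dat$ipw <- ifelse(dat$treatment == 1, 1 / ps, 1 / (1 - ps))

# ...re-check weighted SMDs as above. If a covariate remains imbalanced,
# try a richer specification (e.g., a quadratic term or an interaction)
# and reassess balance before ever looking at the outcome:
ps_fit2 <- glm(treatment ~ age + I(age^2) + sex * meanbp,
               family = binomial(), data = dat)
```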