Hi! This video is on optimal matching. We're going to begin with an example that illustrates why greedy matching is not optimal, and then we'll discuss what optimal matching actually is. So we'll begin to think about why greedy matching is not optimal. And remember by optimal, what we're really talking about here is in terms of a global distance measure. So for example, optimal would mean the smallest total distance if you took the distance of all matched pairs and added them up. Ideally, that would be as small as possible, and we'll call that globally optimal. And greedy matching is not globally optimal. So we'll look at an example where suppose we want to match on age. So we'll just really simplify things and say there's one confounder, it's age. And now, we'll imagine that we have 3 treated subjects and 12 control subjects. So here's our treated subjects here that and their ages. So we have three treated subjects, their ages are 45, 38, and 41. And then we have available controls, and these are just the ages of the controls and you'll see that there's 12 different ages there. And what I'm going to do is use the greedy algorithm and match them one at a time. And so, we have treated subjects and control subjects over here, and as I find matches, I'm just going to put them over in that right-hand column. So we'll begin here. And so the first subject is 45 years old and the best match for them was a control subject, that's 44 years old. So I place that over here. So we matched a treated subject to a control subject, the first treated subject was 45, the best match, meaning the smallest distance in age, was a 44 year old. So next, we'll look at the second treated subject which is a 38 year old. I put a... I crossed off the 44 year old because they're no longer available to be matched. And we find that a 36 year old is the best match, meaning we're being really here so we're just going to find the person whose age is the closest to the treated subject in absolute value. So now we've matched a 38 year old with a 36 year old. And we also crossed off the 36 year old from the list; that person is no longer available to be matched to a treated subject. So next, we move to the 41 year old. The 41 year old we could actually either match to a 47 year old, which is what I did here. But they also could be matched to the 35 year old and the distance would be the same, so I just picked one of them. But in either case, the distance is six. And now we could add up just the absolute distance here. And you'll see that the first matched pair has a distance of one, the second matched pair has a distance of two, the third matched pair has a distance of six, and so we have a total distance of nine. So, overall, the ages between treated and controls differ by nine. So that's with greedy matching. But, you could actually find a better overall set of matches. So here I list a better set of matches. So, rather than using the greedy matching algorithm, I just selected a set of matches that are better. So again, we have treated subjects over here and the treated subjects are 45, 38, and 41. And I matched this control with the first treated subject, this control with the second treated subject, and this control with the third treated subject. And we'll see, if you add up those distances, we have a difference of two for the first pair, a difference of two for the second pair, and a difference of three for the third pair. So the total distance here is seven, whereas if we go back, the total distance there was nine. So we have a smaller total distance here. So this just illustrates that with the greedy algorithm, by always taking whatever the smallest distance is, you aren't thinking about the big picture. There might be times when you want to save a match for a later subject and accept a slightly less good match now. But the greedy matching you never looking ahead – you're not thinking about the big picture, you're just always grabbing the best match. And so, you're not necessarily going to hit the optimal in terms of this global kind of distance. So this is just illustrating the main idea that greedy matching is not necessarily optimal and usually will not be, in fact, in terms of minimizing the total distance. So what does optimal magic do? Well optimal matching is going to minimize the global distance. So global meaning the total distance. So, you could imagine the big picture would be like let's say I'd considered all possible matches between treated subject and controls. So imagine that I matched, I did every possible combination and I calculated the global, the total distance and then I chose a set of matches that had the smallest total distance. Well that would involve a lot of computation, so it's something that could be done. But if you have a big dataset, that would involve a lot of calculation. So it's a difficult problem. It can be solved using, basically a network flow optimization approach, details of which are way beyond this course. But it's such a way, it's a difficult optimization problem but one that people have solved. But even though people have solved it or figured out how to do it, it's still computationally demanding. So in general, optimal matching is if you have a small dataset, it's very feasible; on a big dataset, you know, the computational burden might become too much. But the big picture idea is optimal matching is going to minimize this global distance. And there are R packages that can do this, such as optmatch and rcbalance. Okay, so 'Is it computationally feasible?' is the question. So is optimal matching computationally feasible? Well, it really depends on the size of the problem. And what we mean by size typically here would be the number of possible treatment-control pairings. As an example, if you had 100 treated subjects and 1000 controls, there's 100,000 possible pairings. And in fact, that would actually be the kind of the size of a problem that this could actually solve in a reasonable amount of time. And in fact, your computer probably could handle even a million treatment control pairings, but it will take a while. Once you get into thousands of subjects and possibly, you know, a million treatment control pairings, your computer might be working for a long time, but it is typically feasible to have about that many. But once you got to something like a billion pairings, let's say, 10,000 treated patients and 100,000 control subjects, it's likely that you're not going to be able to use optimal matching. Your computer might be running for a month or longer, so it's not really feasible. So this is something to be aware of that even though optimal matching should be better than greedy matching, there's sometimes when optimal matching just isn't going to be feasible. There is something you can do about it though. There are different methods that can make optimal matching more feasible, and this wouldn't really be full global optimal matching but sort of some kind of compromise. So, you could place some kind of constraints, you can impose constraints to make optimal matching feasible. So for example, imagine that we might want to match within hospitals in a multi-site clinical study. So maybe we have a study that has several hospitals and then patients within hospital. Rather than do an optimal matching on the entire dataset all at once, you could do optimal matching within hospitals. So the idea would be that you only accept matched pairs, so matched treated to control pairs within a given hospital. So within a hospital, you do an optimal matching. Well this has greatly reduced the size of the problem. You know, let's say there's 10 hospitals all of equal size. You know, you would basically be reducing the size of the optimization problem by a 10. You'll be dividing by 10 essentially, so it might become feasible. So this would be – you could think of these as blocks, you might want to match within blocks. Another example might be matching within primary disease category. So primary disease category, you know, what might be somebody goes to the hospital, what their primary diagnosis code is. You know, so they could have heart failure or they could have COPD and so on. And so you might want to match treated and control subjects within primary disease category, and that would be an example of a block. And if you potentially then can optimize within these blocks and it have something that's computationally feasible. So this is also known as sparse matching, and you can do this in the R package rcbalance. And you can also, with this kind of approach and this software package in particular, you can also tell it to tolerate some mismatches as long as fine balance is achieved. So we've discussed prime fine balance a little before, but just as a reminder, you might be willing to accept matches that aren't perfect. Maybe you're matching a male to a female, as long as the overall distribution of sex, for example, is the same in the treated and the control groups. So there's a number of different ways that you can make these matching problems a little more feasible or maybe you're optimizing given constraints and so on. And there's software that sort of makes all of these things feasible, where you can really tailor what you want in a particular study using the software, and it's relatively straightforward and computationally feasible.