This video is on causal effects. Here we're going to define more formally what we mean by causal effects, and in particular we'll discuss two types: average causal effects and the causal effect of treatment on the treated. We're going to define them formally using statistical notation and potential outcomes, so this video is going to be a little more technical. We're also going to spend some time on the difference between conditioning on a variable or variables versus manipulating or setting variables. So first we're going to talk about hypothetical worlds and average causal effects. Let's begin by thinking of some population of interest, which we're depicting with a circle. This circle represents the whole population of people that you're interested in; it's your target of interest. So if you are interested in people who have diabetes and which treatment is better, then your population would be the population of diabetics. And now we're going to think about hypothetical worlds. Remember, with potential outcomes we were thinking about hypothetical worlds and hypothetical interventions, and we're still thinking about that now. So we're not really thinking about data yet. We're just imagining what we would ideally like to see, and what we would ideally like to see is two hypothetical worlds. In World 1, which is depicted by this grayish circle, everyone in our population gets treatment A=0. Treatment A=0 could be no treatment, it could be a placebo, whatever you imagine. So this is a world where our entire population, every single person, got treatment A=0. Then there's another hypothetical world, World 2, where everyone received the other treatment, A=1, and we're depicting that with this light blue circle.
But the most important thing here is that World 1 and World 2 have the exact same people; it's the same population. In one case we do one thing to them, and in the other case we do another. If we were able to observe both of these worlds simultaneously, we could collect the outcome data from everyone in the population and take the average value in each world: the mean of Y in World 1 and the mean of Y in World 2. The difference between those two means would be the average causal effect. So this is what we mean by an average causal effect. It's an average in the sense that it's a mean, and it's a population-level effect: it's over the whole population. We're asking, what would the average outcome be if everybody got one treatment versus if everybody got the other treatment? Of course, in reality we're not going to see both of these worlds. But this is what we define as the average causal effect, this is what we would like to see, and this is what we're hoping to estimate. We can define that more formally using statistical notation. Here, the E refers to expected value, which is just the mean, and we're taking the average of the difference of these two potential outcomes. Remember, Y^1 is the potential outcome if treated with A=1, and Y^0 is the potential outcome if treated with A=0. So this quantity, the average causal effect, is the average value of Y if everybody was treated with A=1 minus the average value of Y if everybody was treated with A=0. And that's exactly what I showed you on the previous slide: the mean of Y in World 2 minus the mean of Y in World 1. In the case where Y is binary, for example, this would just be a risk difference. In fact, it would be a causal risk difference, because the mean of a binary variable is just a probability, or a risk.
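Written out, the definition just described looks like the following. This is a sketch in the notation of this video (E for expected value, Y^1 and Y^0 for the potential outcomes); the second line simply restates the binary-outcome case as a difference of probabilities.

```latex
\begin{align*}
\text{Average causal effect} &= E\left(Y^1 - Y^0\right) = E\left(Y^1\right) - E\left(Y^0\right) \\
\text{Binary } Y:\quad E\left(Y^1 - Y^0\right) &= P\left(Y^1 = 1\right) - P\left(Y^0 = 1\right)
\end{align*}
```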
So the mean of a difference of potential outcomes, where the outcome is binary, is a causal risk difference. Now we'll present an example of an average causal effect, where we're comparing regional versus general anesthesia for hip fracture surgery on the risk of major pulmonary complications. Here our population, that circle from the previous slides, is the population of people who are undergoing hip fracture surgery. That's the population we're targeting. Our outcome Y is major pulmonary complications, and that's just a binary variable, yes or no, 1 or 0. And our treatment is regional versus general anesthesia. So suppose that the average causal effect here is -0.1. Right now we're just imagining that we know what it is, so that we can think about what the causal effect means. At this point we're not thinking about estimation or any of that. We're imagining that we just happen to know what this causal effect is, and we want to interpret it. What it means is that the probability of major pulmonary complications is lowered by 0.1 if given regional anesthesia compared with general anesthesia. But it's a little bit difficult to think about what a difference in probabilities, or a difference in risk, means, so one thing that's helpful is to quantify it in terms of numbers of people. Imagine 1,000 people were going to have hip fracture surgery. If the causal effect is -0.1, we would expect about 100 fewer of them to have pulmonary complications if they all received regional anesthesia than if they all received general anesthesia. Now we can look at another example. Suppose treatment is thiazide diuretics versus no treatment among hypertensive patients, and our outcome Y is systolic blood pressure. So our population, that circle we're interested in, is hypertensive patients, patients with hypertension.
And we're interested in whether it's effective to treat with thiazide diuretics. Now imagine that the true population average causal effect is -20, where again the outcome is systolic blood pressure. What this would mean is that if the whole population of hypertensive patients took thiazide diuretics, on average they would have a systolic blood pressure that's about 20 units lower than if they did not take this medication. Next we're going to focus on a very important topic, which is conditioning versus setting, and in particular we're thinking about treatment. Some of this has to do with statistical notation, and some of it is conceptual, and it's really crucial to get this distinction down. So first, what we mean by conditioning is the usual statistical sense, where it's given, or conditional on, a particular variable. In this case, we're focused on treatment. The main idea is that the average causal effect, which is the mean of the difference of potential outcomes, the expected value of Y^1 - Y^0 that we see on the left side of this inequality, is not in general going to be the same thing as the expected value of your observed Y given A=1 minus the expected value of Y given A=0. This idea is really fundamental to causal inference. So let's think about what these different types of quantities mean. Whenever you see this vertical line in the expected value, one way you could read it is, the expected value of Y given A=1. But I usually like to think of it as defining a subpopulation. So rather than saying given A=1, you could read it as the expected value of Y among people who have A=1, or among the subpopulation of people who have A=1. So this vertical line, this conditioning, is really restricting to a subpopulation of people.
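In symbols, the distinction between setting and conditioning can be sketched as follows, in the notation of this video. The left side sets treatment for the whole population; the right side compares the subpopulations who actually have A=1 and A=0.

```latex
\[
\underbrace{E\left(Y^1 - Y^0\right)}_{\text{setting treatment}}
\;\neq\;
\underbrace{E\left(Y \mid A = 1\right) - E\left(Y \mid A = 0\right)}_{\text{conditioning on treatment}}
\quad \text{(in general)}
\]
```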
The expected value of Y given A=1 means: what is the average value of Y in the subpopulation of people who actually received treatment A=1? But they might differ from the population as a whole in important ways. For example, people at high risk for flu might be more likely to choose to get a flu shot. If that's the case, then when we take the expected value of Y among people who actually got the flu shot, we're taking the expected value of Y in a higher-risk population. And that's different from the expected value of the potential outcome Y^1, because Y^1 is the outcome if everyone in the whole population got treatment. It's not restricting to a subpopulation. So when I say setting treatment, I mean manipulating it, as with potential outcomes. And when I say conditioning on treatment, I really mean restricting to subpopulations. We can see this pictorially. Again, imagine our population of interest is a circle, but in reality some people get A=0, so that's this gray partial circle on the left. You'll notice that it's not the original population anymore. It's the population of people who actually got A=0. We're not hypothetically manipulating anything at this point. This is just reality: some people actually get A=0, and that's them. On the right-hand side, the blue partial circle is the population of people who actually got A=1. And you'll notice that these are not the same population of people. We could take the mean of the outcome in each of those subpopulations, and the difference would be an average difference in the outcome between subpopulations that are defined by treatment group. But the people who get treatment A=0 might differ in fundamental ways from the people who get treatment A=1.
So we haven't isolated a treatment effect, because these are different people, and they might have different characteristics in general. That's why this distinction is so important: the thing we're targeting is the average causal effect, where we're manipulating treatment on the same group of people, versus the thing that we actually observe, which is the difference in means between subpopulations that are defined by treatment. So, conditioning versus setting. As I mentioned, this first line refers to the expected value of Y in the subpopulation that actually received treatment A=1. That's different from the expected value of Y^1, which is a potential outcome: that's the mean of Y if the whole population was treated with A=1. So in general, this contrast between the mean among treated people and the mean among untreated people is not a causal effect, because it's just comparing two different groups of people. And these groups might differ in ways other than their treatment. They differ on treatment, but they also might differ on a lot of other variables. Whereas the expected value of Y^1 - Y^0, the expected value of this difference in potential outcomes, is a causal effect, because we're comparing the same population of people, and the only thing that's different is treatment. In one case we give everyone treatment A=0, and in the other case we give everybody A=1. So we've isolated the treatment effect in that sense. This is a very crucial distinction that we'll need to keep in mind throughout the course. We're always targeting something involving a mean of potential outcomes. But the average causal effect is just one possible causal effect. There are a lot of other causal effects that you might be interested in, and what you choose might depend on the particulars of your study, your research question, or even what data you have available to you.
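To make the conditioning-versus-setting distinction concrete, here is a small simulation sketch. Everything in it is made up for illustration: the 30% high-risk share, the risk values, and the rule that high-risk people are more likely to be treated are assumptions, not numbers from any study. Because it is a simulation, we get to see both potential outcomes for every person, which real data never gives us.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000  # simulated population size (arbitrary)

# Hypothetical risk factor: about 30% of the population is high risk.
high_risk = rng.binomial(1, 0.3, size=n)

# Assumed outcome risks: without treatment 0.40 (high risk) or 0.10 (low risk);
# treatment lowers everyone's risk by 0.05.
p0 = 0.10 + 0.30 * high_risk
p1 = p0 - 0.05

# Potential outcomes for the SAME people in the two hypothetical worlds.
u = rng.random(n)
y0 = (u < p0).astype(int)  # outcome if treatment is set to A=0
y1 = (u < p1).astype(int)  # outcome if treatment is set to A=1

# Setting: everyone gets A=1 versus everyone gets A=0.
ace = y1.mean() - y0.mean()

# Reality: high-risk people are more likely to actually receive treatment.
a = rng.binomial(1, np.where(high_risk == 1, 0.8, 0.2))
y = np.where(a == 1, y1, y0)  # only one outcome is observed per person

# Conditioning: compare the subpopulations defined by actual treatment.
naive = y[a == 1].mean() - y[a == 0].mean()

print(f"E(Y^1) - E(Y^0)         = {ace:.3f}")
print(f"E(Y | A=1) - E(Y | A=0) = {naive:.3f}")
```

Under these assumptions the first number should come out near -0.05, while the second comes out positive, because the treated group is mostly high-risk people. The treatment looks harmful when we condition, even though setting treatment lowers risk for everyone.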
Some of this will be covered later on in the course, but I'll give a few examples of other causal effects. Here, instead of a difference in potential outcomes, we're taking a ratio: the expected value of Y^1 divided by the expected value of Y^0. If we have a binary outcome, that would be a causal relative risk. We're taking the risk if everyone was treated relative to the risk if no one was treated. Another somewhat common example is the causal effect of treatment on the treated. You'll notice here that we condition on A=1, and that's what we mean by on the treated. We actually are restricting to a subpopulation, the subgroup of people who actually received treatment, but we're contrasting potential outcomes for that group, so we're still in causal effect territory. If this one's confusing, I'll show on the next slide what it represents in a picture, and I think it will be clearer. We might be interested in this because we want to know how treatment works among people who are actually treated. Sometimes this is important because there might be some subpopulation of people who would just never be interested in this particular treatment. Maybe it's surgery, and they are just not interested in surgery. But among people who want the surgery, we want to know how well it works. In that case, we would want to know the causal effect of treatment on the people who actually were treated. Another thing you might be interested in is causal effects within subpopulations. This is also known as heterogeneity of treatment effects, where there might be some subpopulation which I'll define by V. So V is just some variable, and we might want to know, what is the causal effect in this subpopulation?
So maybe it's among people who have a biomarker that's high or low, or the causal effect among men, or among women, or by age, or by race. We're still isolating a treatment effect, but within certain subpopulations. So let's look at the real world versus another example of a causal effect. We saw this before: in the real world, some people get treated and some people don't, and these are different populations. But what we might actually be interested in, in the case of the causal effect of treatment on the treated, is the subpopulation who are treated. So that's this little semicircle; that's the treated population. You'll notice that when we get to World 1 versus World 2, we're restricting to the same group of people. We have this gray semicircle and this blue semicircle, but they're the same shape; they represent the same people. And now we're giving everyone treatment A=0 versus treatment A=1. This is the causal effect of treatment on the treated. It's a causal effect because it's the same group of people: the population of people who in reality were treated. In World 1, we imagine what would have happened if we had not treated them. And World 2 is what actually happened, where we did treat them. So that would be the causal effect of treatment on the treated. The important distinction here is that we're still comparing the same group of people. We can define a subpopulation, and that's fine, as long as on the treatment side of things we are doing the same thing to everybody. So in World 1 we're treating nobody, in World 2 we're treating everybody, and then we're contrasting the means. But again, we're going back now to the fundamental problem of causal inference: we only observe one treatment and one outcome for each person.
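The other causal effects mentioned above can be sketched in the same kind of simulation. Again, all numbers are invented for illustration: V is a hypothetical high-risk indicator, the treatment is assumed to help high-risk people more, and high-risk people are assumed to be more likely to be treated.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000  # simulated population size (arbitrary)

# Hypothetical subgroup variable V: 1 = high risk (about 30%), 0 = low risk.
v = rng.binomial(1, 0.3, size=n)

# Assumed risks: untreated 0.40 / 0.10; treated 0.30 / 0.08 (high / low risk).
p0 = np.where(v == 1, 0.40, 0.10)
p1 = np.where(v == 1, 0.30, 0.08)

# Both potential outcomes for every person (visible only in a simulation).
u = rng.random(n)
y0 = (u < p0).astype(int)
y1 = (u < p1).astype(int)

# Assumed treatment pattern: high-risk people are treated more often.
a = rng.binomial(1, np.where(v == 1, 0.8, 0.2))

ace = (y1 - y0).mean()                # E(Y^1 - Y^0): whole population
rel_risk = y1.mean() / y0.mean()      # E(Y^1) / E(Y^0): causal relative risk
att = (y1 - y0)[a == 1].mean()        # E(Y^1 - Y^0 | A=1): effect on the treated
effect_v1 = (y1 - y0)[v == 1].mean()  # E(Y^1 - Y^0 | V=1): high-risk subgroup
effect_v0 = (y1 - y0)[v == 0].mean()  # E(Y^1 - Y^0 | V=0): low-risk subgroup

print(f"average causal effect   = {ace:.3f}")
print(f"causal relative risk    = {rel_risk:.3f}")
print(f"effect on the treated   = {att:.3f}")
print(f"effect among V=1 / V=0  = {effect_v1:.3f} / {effect_v0:.3f}")
```

Each quantity contrasts Y^1 and Y^0 on one fixed group of people; only the group changes. Under these assumptions the effect on the treated is larger in magnitude than the population average effect, because the treated group contains more of the high-risk people, for whom the assumed benefit is larger.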
And so what we're moving into is, how do we use observed data to link observed outcomes to potential outcomes? In reality, for each person we're going to see one treatment and one outcome. But we want to infer something about what would have happened. How are we going to do that? What we'll need to do is make assumptions that link the observed data to potential outcomes. So a major focus of the course will be, once I have data, how do I link these things? How do I estimate causal effects from observational data? So we've seen what the fundamental challenge is, and we've defined causal effects. Now we need to focus on how to link observed data and potential outcomes together so that we can get the causal effects of interest.