In this video, we're going to talk about hypothetical interventions, and we're going to think about what is meant by causal effects and which causal effects are well defined versus those that are not so well defined. We're going to discuss manipulable variables, what is meant by manipulable variables, and hypothetical interventions in general. And then we'll also discuss the fundamental problem of causal inference: what it is, why it's a challenge, and we'll start to think about what we can do to overcome that problem. So this video is largely about hypothetical interventions. So we'll begin by thinking about what we mean by an intervention. And the reason we do that is because it's cleanest to think about causal effects of interventions. And sometimes these are also known as actions. And so what we typically mean by an intervention is a variable that can be manipulated, whether hypothetically or in practice, where we actually could manipulate it. So we'll talk more about what that means, but that's really what we're thinking about here: the kind of variable that we could intervene on, or manipulate, or take an action on. And if you can do that, even if it's just a thought exercise, then we can think of causal effects of those types of variables. So Holland wrote, "no causation without manipulation." And this is a widely shared quote in the causal inference world. And the main idea is that it is difficult to think about causal effects for variables that you can't think about manipulating. But causal effects of interventions are generally well defined. So for example, suppose we want to compare the causal effect of drug A versus drug B. In principle, we can imagine manipulating that. We could give some people drug A, we could give some people drug B, we could decide who gets what. And in that sense, we're manipulating it. We can imagine taking some kind of action.
And so if we wanted to know what the causal effect of drug A compared to drug B is, it's somewhat easy to understand what we mean by that. At least in principle, hypothetically, we could intervene on what drug somebody takes. Related to that is this idea that there's one version of treatment, and sometimes this is also referred to as a no-hidden-versions-of-treatment kind of assumption. So for example, suppose we were interested in the causal effect of body mass index, BMI, on health outcomes. And this is a common thing that people are interested in. You'll hear a lot of discussion of what is the health impact of obesity, for example. But we have a problem here because there are a lot of ways that one could achieve a particular BMI value. So whether you're thinking about body mass index in general, or about weight or obesity, imagine we want to hypothetically manipulate BMI, which is the exposure variable of interest here. We have a problem because there are many potential ways you could do that. And these different ways might be associated with different outcomes. So we could imagine, for example, you might prescribe a weight loss medication and that might help somebody achieve a particular BMI, or maybe they exercise and they achieve a particular BMI in that way, or maybe it's diet, or maybe it's surgery. So there are a lot of possible ways that you could achieve a particular BMI, which means there's not one version of treatment, if we're imagining body mass index as the treatment or exposure variable. There are a lot of ways in which you could get it to a particular value. And the method you use to get it to that particular value might be related to the outcome. As an extreme example, you could think of smoking. So in some cases, people smoke and lose weight, right?
So smoking might be one way that you can get a lower BMI, but of course, smoking might be associated with the outcome here in the [INAUDIBLE]. So this is difficult territory, thinking about what is the causal effect of body mass index, or what is the causal effect of obesity. Because there's not one version of treatment, that's one reason, and also it's not the kind of thing that is easy to manipulate. So in this case it might be better to think of a causal effect of a particular intervention that aims at manipulating weight. So if we're interested in the causal effect of weight or BMI or obesity on some outcome, rather than try to estimate what that is, because it's a difficult thing to define, instead you can think of what is the causal effect of some intervention that's aimed at affecting weight or body mass index. Another difficult case is immutable variables. Here we're talking about things like race, or gender, or age. Things that we can't really change or manipulate, even hypothetically, as a research investigator. So, when we think about potential outcomes or counterfactuals, we're imagining what could hypothetically happen if treatment was something else. So suppose we're interested in the causal effect of race on some outcome. It's very difficult to think of a potential outcome. What would the outcome be if your race was different? Because there's no way for us to actually manipulate that directly. So that's an example of a variable that we can't manipulate. And that makes causal inference quite difficult. It's the same thing with age. We can see that as people get older they're at higher risk of certain outcomes. But it's harder to imagine what it would mean to actually physically manipulate their age. So causal effects, in these cases, are more difficult to define. So when we think about what a causal effect means for these types of variables, we get into territory that's just much more challenging.
And so one way around this, rather than thinking about the causal effect of these kinds of variables directly, when you can't directly intervene on them, is, as I mentioned earlier, to think about related things that you can manipulate. So interventions, where you can take an action. So with race, for example, you might be interested in whether there is discrimination in hiring practices based on race, right? We can't imagine changing somebody's actual race and seeing whether they would be hired or not. But we could potentially send out identical resumes where the only difference is the name: a name that people would tend to associate with an African American person versus a white person. And so the name on the resume is something we can directly manipulate, and then we could potentially identify a causal effect of that particular type of intervention, that is, whether there is discrimination in hiring based on whether the name sounds African American or white. And there have been a lot of actual studies of this kind that have taken place. In terms of obesity, you can imagine bariatric surgery, for example: what is the causal effect of bariatric surgery? Because there, a decision is actually made: should I get bariatric surgery or not? So that's something where an action is taken, it's an intervention, it's manipulable, and so we have some hope of identifying the causal effect there. Socioeconomic status is really such a broad concept that it's not very easy to manipulate somebody's socioeconomic status. But we could imagine an intervention where, for example, you give somebody a gift of money, and ask what is the impact of this extra income, this extra money, on some outcome that you're interested in. So the gift of money is something we can manipulate directly. And for the remainder of the course we're mostly going to focus on treatments or exposures that we could think of as interventions.
So at a minimum, these are variables where you could hypothetically imagine intervening. So one way to help figure out whether this is a manipulable type of variable, whether this is a hypothetical intervention, is to ask: can we imagine a randomized trial? So you can think of a hypothetical trial. Can you picture a hypothetical randomized trial that could exist and that could manipulate this variable? So it's possible that it would be unethical to actually carry out the trial, but we should be able to imagine that it could happen. So, for example, ethically we probably can't randomize people to smoke or not, but we can imagine, hypothetically, doing it, right? It might not be ethical, but we can at least imagine it. So in that case we could think of smoking as something that is manipulable in principle, so we could imagine a causal effect of smoking. Whereas we can't really imagine a hypothetical trial that would change somebody's race, or change somebody's BMI. But that's not to say that there aren't causal effects of age, of race, of gender, or obesity. I think most people believe that there are. It's just that it's hard to define these: what formally do we mean by a causal effect in those settings? And we really can't fit it very well in the potential outcomes framework, and we're going to use the potential outcomes framework to identify causal effects throughout the course. So it's widely recognized that there are causal effects here, with race, gender, obesity and so on, but it doesn't fit very cleanly into this kind of causal framework in a formal way with data and estimation and so on. So, those are very hard problems, and we're going to focus on the slightly easier problem of variables that we can think of as hypothetical interventions. So we focus on hypothetical interventions because their meaning is well defined. So that's one of the key reasons. But also they're potentially actionable. And this is actually a really crucial point.
If you knew the causal effect of obesity, let's just imagine that there is one, what could you do about it, right? Well, your next step would be, what can I do about it? You would start imagining actions. So if we found out that obesity was shortening people's lives, if we really believed that was the causal effect, then we'd want to figure out what we could do about it. We would propose interventions. And so what we'd really want to know then is the causal effect of that particular intervention. So now we're getting back to causal effects of hypothetical interventions. And so if we stick to this framework of hypothetical interventions, then we end up with output that's potentially actionable. So then what are causal effects? So we'll imagine, now, broadly speaking, that a causal effect occurs when the potential outcomes Y1 and Y0 are not equal to each other. So we'll say that variable A, this exposure or treatment A, had a causal effect on Y if Y1 differs from Y0. So if the potential outcome under treatment differed from the potential outcome under no treatment. So as an example, let's say the outcome is whether my headache is gone one hour from now. So it's either yes or no. So our observed outcome is Y, and Y equals 1 if your headache is gone an hour from now and 0 otherwise. And let's imagine that your possible treatment is ibuprofen. Because that's our treatment, we'll call that A, and A=1 means you take the medication and A=0 means you don't. And so then in reality we only see one thing, so you might make a statement like this: I took ibuprofen and my headache is gone, therefore the medicine worked. I'm sure in real life you hear people say things like this. I hear this all the time, where somebody did something, some outcome was observed, and they say therefore it worked. But that's actually not proper causal reasoning.
What they really just told you, so they gave you a long sentence, but what they really said is that Y superscript 1 equals 1. That's really all they told you. In other words, remember Y superscript 1 means the outcome, in this case headache gone, under treatment, and it is equal to 1, which means my headache went away. Y1=1 means I took the treatment and my headache went away. But what would have happened had you not taken ibuprofen? Right, so this is unknown at this point. In other words, what is your counterfactual? If you hadn't taken treatment, would your headache have gone away anyway, right? We don't actually know that. So there's only a causal effect if these potential outcomes are not equal to each other. So if your headache would have gone away anyway, then we can't say that taking the medication caused your headache to go away. So hopefully now you're identifying a problem. And that problem is known as the fundamental problem of causal inference. So the fundamental problem of causal inference is that we can only see one potential outcome for each person. And so, in the previous example, we don't know what would have happened had they not taken ibuprofen, right. So that's something we can never see for that person. But we just said that we need to know it to know if there was a causal effect. So this is a problem, in fact it's the fundamental problem. However, you can make certain assumptions to estimate what's known as a population-level, or average, causal effect. So rather than think about whether this treatment worked for me as an individual, we can think about a population as a whole. So what we can never know is what might be called a unit-level causal effect or an individual-level causal effect. What would have happened to me had I not taken ibuprofen? I actually can't know that for sure. So we're not really going to think about causal effects for individual people.
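To make the fundamental problem concrete, here is a minimal sketch in Python. The people, names, and outcome values are all made up for illustration; in real data nobody gets to see both Y1 and Y0 for the same person. The point is only that two people can report exactly the same thing ("I took ibuprofen and my headache is gone") while only one of them actually has a causal effect.

```python
# Hypothetical people. In real data only one of y1/y0 is ever recorded.
# y1: headache gone (1) or not (0) if the person takes ibuprofen
# y0: headache gone (1) or not (0) if the person does not take it
# a:  the treatment the person actually took (1 = ibuprofen, 0 = none)
PEOPLE = [
    {"name": "Ann",  "y1": 1, "y0": 0, "a": 1},  # ibuprofen made the difference
    {"name": "Bob",  "y1": 1, "y0": 1, "a": 1},  # would have recovered anyway
    {"name": "Cara", "y1": 0, "y0": 0, "a": 0},  # no recovery either way
    {"name": "Dev",  "y1": 1, "y0": 0, "a": 0},  # would have been helped, but untreated
]


def has_individual_effect(person):
    """A causal effect exists for this person only if Y1 differs from Y0."""
    return person["y1"] != person["y0"]


def observe(person):
    """Return what we actually get to see: the treatment taken and the
    single potential outcome that corresponds to that treatment."""
    a = person["a"]
    y = person["y1"] if a == 1 else person["y0"]
    return {"name": person["name"], "a": a, "y": y}


if __name__ == "__main__":
    for p in PEOPLE:
        print(observe(p), "| true individual effect:", has_individual_effect(p))
```

In this made-up table, Ann and Bob look identical in the observed data (A=1, Y=1), but only Ann's headache went away because of the medication. The function `has_individual_effect` can only be written because the simulation knows both potential outcomes; with real data it could not be computed.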
Estimating individual-level causal effects is pretty much a hopeless problem, but what is possible is asking something like this: what would the rate of headache remission be if everyone took ibuprofen when they had a headache versus if no one did? So now we're not trying to figure out a causal effect for each individual, but really for the population as a whole: if everybody did one thing versus if everybody did another thing, what would the rates of headache remission be? So we would be able to say something about how well an exposure or treatment works in general, as a general statement at a population level. So most of the course is going to be focused on that: estimating these kinds of population-level causal effects. So this is how we deal with the fundamental problem of causal inference: by looking at whole populations as opposed to looking at individuals.
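As a rough illustration of the population-level idea, the sketch below simulates a hypothetical randomized trial in Python. Everything in it is an assumption made for the example: the remission probabilities are invented, and treatment is assigned by a coin flip, which is one setting in which a simple comparison of the two groups can recover the average causal effect. The simulation knows both potential outcomes for every person, so it can compare the true population effect with what an analyst would estimate from observed data alone.

```python
import random


def simulate_trial(n=100_000, seed=0):
    """Simulate n people with made-up potential outcomes and coin-flip treatment.

    Returns (true_effect, estimated_effect), where
    true_effect      = mean(Y1) - mean(Y0) over everyone (unknowable in practice)
    estimated_effect = remission rate among treated - rate among untreated
    """
    rng = random.Random(seed)
    sum_y1 = sum_y0 = 0
    treated_n = treated_y = 0
    control_n = control_y = 0

    for _ in range(n):
        # Assumed: 40% of headaches go away within an hour with no treatment.
        y0 = 1 if rng.random() < 0.4 else 0
        # Assumed: ibuprofen never hurts, and helps half of the remaining people.
        y1 = 1 if (y0 == 1 or rng.random() < 0.5) else 0
        sum_y1 += y1
        sum_y0 += y0

        # Randomized assignment: a fair coin decides who takes ibuprofen.
        a = 1 if rng.random() < 0.5 else 0
        if a == 1:
            treated_n += 1
            treated_y += y1   # only Y1 is observed for treated people
        else:
            control_n += 1
            control_y += y0   # only Y0 is observed for untreated people

    true_effect = (sum_y1 - sum_y0) / n
    estimated_effect = treated_y / treated_n - control_y / control_n
    return true_effect, estimated_effect


if __name__ == "__main__":
    true_effect, estimated_effect = simulate_trial()
    print(f"true average causal effect:      {true_effect:.3f}")
    print(f"estimate from observed outcomes: {estimated_effect:.3f}")
```

Under these invented probabilities the remission rate is about 0.7 if everyone takes ibuprofen and about 0.4 if no one does, so the average causal effect is about 0.3. The estimate uses only one observed outcome per person, yet it lands close to the true value, which is the sense in which population-level effects are estimable even though individual-level effects are not.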