Hi. In this last lecture in this unit on linear models, what I wanna talk about is something I like to call the big coefficient. Now here's the idea. If we have a simple linear regression model, we have some equation like Y = a1 x1 + a2 x2 + b, right? And x1 and x2 are called the independent variables, and Y's the dependent variable. So, for example, Y might be sales of a product, x1 might be advertising in magazines, and x2 might be advertising on television. Now we can look at these two coefficients, a1 and a2, and figure out which one's bigger. And what that's telling us is whether we get more bang for the buck from advertising in magazines or from advertising on television. If it's television, if a2 is bigger than a1, then that's where we spend our money. So the idea is you put your assets, you put your resources, on the variables that have the bigger coefficients.
So this big coefficient thinking has led to something that people like to call Evidence-Based blank. So there's Evidence-Based Medicine. What you do is you look at all sorts of different treatments that have been tried on patients. And you gather all that evidence and you figure out which ones have the biggest coefficient. So does diet have a bigger coefficient than exercise? Does the medication have a bigger coefficient? Which medication has the biggest coefficient? And that's where you put your resources. There's also Evidence-Based Philanthropy. If you want to improve a community or improve a country, you look and you say, which coefficient has the biggest bang for the buck? Is it, you know, spending on children? Is it spending on health care? Is it spending on women? Is it spending on education? What is it? And based on that, you can make better decisions. Now, remember, we've talked in this unit about how linear models are better than just thinking up stuff without any evidence. It's absolutely true, so I'm totally in favor of evidence-based thinking.
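To make the big coefficient idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the spending numbers are made up, the sales figures are generated from Y = 2.0 x1 + 3.5 x2 + 10 with no noise, and `fit_linear` is a hypothetical helper that does ordinary least squares by solving the normal equations. It also assumes both kinds of advertising are measured in the same units, since otherwise comparing a1 and a2 directly would not tell you much.

```python
# Hypothetical data: ad spend in the same units (say, thousands of dollars).
# Sales are generated from Y = 2.0*x1 + 3.5*x2 + 10 with no noise, so the
# fit should recover coefficients very close to those values.
spend = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5), (0, 3)]  # (magazines, television)
sales = [2.0 * x1 + 3.5 * x2 + 10 for x1, x2 in spend]


def fit_linear(rows, ys):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    Returns the coefficients in column order, with the intercept last.
    """
    X = [list(r) + [1.0] for r in rows]  # append a column of ones for b
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, ys))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


a1, a2, b = fit_linear(spend, sales)
bigger = "television" if a2 > a1 else "magazines"
print(f"a1 = {a1:.2f}, a2 = {a2:.2f}, b = {b:.2f}; bigger coefficient: {bigger}")
```

With this made-up data, a2 comes out bigger than a1, so big coefficient thinking would say to spend on television. With real data you would also have noise, and you would want to check how reliable each coefficient is before acting on it.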
Let me explain sort of how it works. What you do is you construct some sort of model. And by that I mean you have some understanding of what variables you think matter, and possibly even the structural form of the equation relating those variables. So it could be nonlinear. Then what you do is you go gather data. And after you gather that data, you identify the important variables and change those variables, okay?
Now, there's a movement towards what people call Big Data. And the Big Data movement says that, well, maybe we don't even need the models anymore. This is what some people say. They say, here's what we do. We first just go gather the data. That's the first thing. Then you find the pattern. Then you identify the important variables. So, no need for the model. Now I wanna make the point that I think that's not true. I think that's an overstatement, and let me explain why, right? Big data does not obviate the need for models.
First let's just think of the broad reasons why we have models. One is just to understand how the world works. So even if you see the pattern, right, identifying the pattern is completely different from understanding where it came from, right? So you could recognize, wow, we've done a ton of experiments, and force seems to equal mass times acceleration, right? That's very different from having a model that explains why that's the case, right? You might recognize that heavy objects and light objects seem to fall at basically the same rate. Still, we'd like to have a model that explains why that's true. So identifying a pattern in no way gives us any sense of explanation, all right?
But there are bigger reasons to be careful, even when you're just trying to affect policy, about doing evidence-based whatever without a model, just based on pure data. First is that correlation is not causation. Remember the example of the equestrian team, right? If you run all sorts of data, you could find something like the following.
This variable seems to matter, but the thing is, it could be that the variable doesn't matter at all, because it's just correlated with something else that matters, like the equestrian team. Second, and this may be the most important point: linear models tell us the sign and magnitude of these variables' effects, but only within the data range, right? So if I have a bunch of data here, like this, right, and I fit this model to it, that doesn't necessarily tell me anything about what's going on up here. So what I'd like to do is have a model, perhaps, that gives me some indication of whether I think that linear relationship is going to continue to hold.
So let me explain what this means. The first problem is feedback, and let me give two examples. The first one: let's take anti-lock brakes in cars. If you looked at data on accidents, you could say that one of the things that seems to be causing accidents is cars bumping into the car in front of them. You could think, if we can just get cars to stop sooner, we'd reduce the number of accidents. And so you put money and resources into developing anti-lock brakes, and, in fact, initially that seems to save a lot of lives. But what might happen over time [INAUDIBLE] people are like thinking electrons. If before, people kept maybe a 40-foot or 30-foot gap between them and the car in front of them, now that they have anti-lock brakes, they may creep up a little bit and start driving closer to the car in front of them. And a lot of the benefit of the anti-lock brakes will fall off. And so if I had a little graph that had speed of braking, right, instead of being a nice linear graph, if I take into account the feedback, right, the benefit may fall off, right? Instead of being linear, it may flatten out, okay?
Another example: let's go back to education. Class size. You may fit some data on what happens when class sizes fall from 25 to 20. I put student performance here.
There is a lot of data here, and it seems to show performance going up. So you might say, well, let's move it all the way to 15. And you could think, if I extrapolate from this, if I reduce my class size to 15, performance should go way, way up. But it could be that if you move class size to 15, the performance sort of just plateaus. And the reason why is that there are all these other causes for why performance doesn't increase, like family support, general health, right, resources in the community. So even if you reduce class size to 15, there may be a diminishing effect because of feedback, right? One big reason why there'd be a feedback in this case is that you need to hire a whole bunch more teachers. And you might not be able to get the same sort of teacher quality that you had when you had 25 students per teacher, if you have to hire something like double the number of teachers, let's say. The teachers aren't gonna be the same quality, and the students might not do as well. So again, these feedbacks mean that you have to be careful about extrapolating a line outside the data range.
There's a bigger problem with the fact that your data exists only within a small region. And this is what I call the problem of multiple peaks. Suppose you have a bunch of data and it's all around here. Then we run some regression and we see, whoa, it seems to be sloping upward. So we start moving in this direction, and then we find this peak. We've got data in this range here. Using this data, we figured out this is the optimal thing to do. But in doing so, we completely miss this entire peak over to the right. We completely miss this other opportunity because we were sort of blinded by the data, because we only had data in this small range.
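Both problems can be sketched in a few lines of Python. All the numbers and functions here are hypothetical, made up purely for illustration: `true_performance` is an assumed world where the class-size benefit plateaus below 20 students, and `landscape` is an assumed payoff curve with a small peak near the data and a bigger peak far away. Neither is fitted to real data.

```python
def true_performance(class_size):
    # Assumed "real world": each student removed below 25 adds 2 points,
    # but the benefit stops growing once class size reaches 20 (a plateau).
    return 60 + 2 * min(25 - class_size, 5)


# We only observe class sizes 20..25, where the relationship is exactly linear.
sizes = [20, 21, 22, 23, 24, 25]
scores = [true_performance(s) for s in sizes]

# Simple one-variable least squares fit: score = slope * size + intercept.
mean_x = sum(sizes) / len(sizes)
mean_y = sum(scores) / len(scores)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, scores)) / sum(
    (x - mean_x) ** 2 for x in sizes
)
intercept = mean_y - slope * mean_x

# Extrapolate the fitted line to a class size of 15, outside the data range.
predicted_at_15 = slope * 15 + intercept
actual_at_15 = true_performance(15)
print(f"line predicts {predicted_at_15:.1f}, assumed world gives {actual_at_15}")


def landscape(x):
    # Assumed payoff with two peaks: a small one at x = 3, a big one at x = 12.
    return max(10 - (x - 3) ** 2, 25 - (x - 12) ** 2)


def hill_climb(start):
    # Keep stepping to a better neighbor; stop when neither neighbor improves.
    x = start
    while True:
        best = max((x - 1, x + 1), key=landscape)
        if landscape(best) <= landscape(x):
            return x
        x = best


local_peak = hill_climb(2)                    # start where our data is
global_peak = max(range(0, 16), key=landscape)  # what a full search would find
print(f"hill climbing stops at {local_peak}, best point overall is {global_peak}")
```

In this sketch the fitted line overshoots at a class size of 15 because the plateau sits outside the observed range, and the hill climber that starts near the data settles on the nearby small peak without ever seeing the larger one.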
So this leads to a distinction that I wanna make between what I call the big coefficient, which is climbing our current hill, and something I'm gonna call the new reality, which is finding a completely different hill, asking: is there something entirely new and different? So let me be really clear. I'm not saying big coefficient thinking is wrong. I think it's really useful. We want to have models, we want to identify the important coefficients, and you wanna think: will those coefficients likely hold outside their range? And if they do, then you want to change those variables and hopefully effect change in a meaningful way, in a good way. However, you also want to take into account the fact that you might want non-marginal changes. You may wanna do something big and new. And to think about the effects of something big and new, it's often useful to construct models of those entire systems to see what you think is gonna happen.
So let me give you some examples of this. So [INAUDIBLE] in healthcare. Big coefficient thinking might be: tax cigarettes, right, because lung cancer is a leading cause of death. This reduces the number of people who get lung cancer, and you'll also raise money that you can spend on healthcare. Win-win. New reality thinking might be something like [LAUGH] universal healthcare. Let's give everyone healthcare and let's try and, you know, improve the health of Americans, or the health of people in any other country, through a universal healthcare system.
Let's look at traffic. Big coefficient thinking might be: increase the number of high-occupancy vehicle lanes, the lanes for cars carrying two or three people, right? Again, makes total sense. New reality thinking, though, would be: why not a rail system? The United States doesn't have much of a rail system, so why not create a huge rail system, either within a few cities on the East Coast or the West Coast, and try and move some of that traffic off the highways?
Again, new reality versus big coefficient. Last one, and this is kind of a fun one. When I was growing up there was a study showing that oat bran significantly reduced cancer. Now, it turned out this was sort of a small-n study. When they did subsequent studies, the effect wasn't as big as they thought. But because the coefficient looked big at first, they started putting oat bran in everything, including in pretzels, right? And this is, again, big coefficient thinking: this will reduce cancer, let's give everybody oat bran. New reality thinking would be: let's try and get everybody in a fitness regime, right? So let's try to create some fitness regime where people are out there exercising an hour a day. That's a completely different thing than sprinkling a little oat bran in the pretzels, right? It's fundamentally changing how we live our lives. It's a new reality.
Now this plays out in policy circles all the time. So if you look at something like the American Jobs Act, right? This was a $447 billion program. It's a lot of money. And it did things like create tax credits for new employees, and subsidies to hire veterans, and payroll tax holidays. These are all big coefficient logic programs. The idea is that we've looked at a lot of data, and we see that these sorts of policies get people to spend that money initially, or hopefully get people to hire employees initially, right? So these are programs that we think give us the most bang for the buck. But again, most of this program, most of this $447 billion, was big coefficient thinking. That's good, right? But we can contrast it with new reality thinking. So what's new reality thinking? A new reality policy is something like the interstate highway system. So in 1956, the United States government allocated $25 billion for 41,000 miles of roads. Now what that would cost now, if you just used, basically, sort of an inflation index like the CPI, would be about $200 billion.
But they've actually figured out what the cost per mile of road is these days, not in cities but between cities: it's about $10 million. So 41,000 miles would be about $410 billion, about the same as the American Jobs Act. But this was new reality. This wasn't big coefficient. This was creating an entirely new system. I'm not saying either one is better than the other, cuz you can do a new reality program that doesn't work at all. But the point I'm making is this: evidence-based methods are really useful. And if you're gonna do some minor change, or tweak a whole bunch of variables, you should use evidence, you should figure out those coefficients, and you should put your money in the big coefficients. At the same time, you have to keep in mind the fact that big coefficient thinking can ignore the new reality, right? It can blind you to completely new and different ways of thinking about the world and making improvements in it. So this is where models become so important, right? One of the things we do in model thinking, as model thinkers, is construct models to think about what happens if we move outside that data range, right? So you hear people talk about thinking outside the box. How do we think clearly outside the box? Well, one way we do so is by thinking with models, all right? Thank you.