In this segment I want to talk a little bit about causality. In any introductory statistics course, one of the important things you learn is to chant "correlation is not causation, correlation is not causation." It's a sort of semi-religious incantation, and it's particularly important when we think about people analytics. Why is this so important? When we do people analytics, it's because we're thinking of changing something, right? It's because we're thinking there's something we could do differently.

So take some of the topics we've been talking about. I talked about different ways in which you can staff jobs. Suppose we had found that people who move jobs through formal posting do worse. Well, then should we be thinking about discouraging posting and doing more sponsorship? Suppose, as we analyze performance, we find that people who have been longest in the job tend to be lower performers. Does that mean we should be moving people between jobs more so they don't get stale? Suppose we find that people who have taken a training program perform better. Well, should we invest more in training and train more people? Or suppose we even find that people who have done a training program see their performance improve more. Again, does that suggest this training program works? Maybe, but we want to make sure that other things aren't happening instead.

There are really two types of problems that we want to be very careful about. One of them is what we call an omitted variable problem, or omitted variable bias. We find that X is associated with Y. So, when people have moved jobs through posting, which is X, they're more likely to perform worse, which is Y. Does that mean that X causes Y? Maybe, but it could also mean there's some third variable, here O, that is driving both X and Y. So I talked about how, if we found that posting jobs was associated with lower performance, maybe we should do less posting.
Well, an alternative explanation is that the reason we post jobs is that they're difficult to fill. We can't identify anyone we currently have who could do that job, so we post it to see if there's anybody out there. Well, jobs that we don't have the people to do are probably going to end up being performed worse. And so it might not be that the posting is causing the problem, but rather that the thing that's causing us to post is causing the problem.

The other thing I talked about is, maybe we find people improve performance after training? Seems like good evidence that training improves performance. But in the literature on wages and training, there's this really interesting phenomenon called the Ashenfelter dip. Somebody discovered that people's wages actually tend to dip before they enter training. Because why do people enter training? Because they feel they need to improve. When wages increase afterwards, part of that is just reversion to the mean. People have temporary problems, and those tend to go away. That also happens to be when they have training. You can imagine the same thing in your organization. People have a bad quarter, they get sent to training. Well, that bad quarter was down to a bunch of things, and the next quarter's going to be better, okay? It looks like the training has caused the improvement, but really it's that bad quarter that's causing both the fact that they go to training and the fact that their performance looks better next quarter.

The other big kind of problem is reverse causality. So rather than X causing Y, Y causes X. The fact that people are performing well causes them to do things differently. So, I talked about what happens if we find that the people who are performing worse in their roles are the ones who have been there the longest. Does that mean we should rotate people around? Does that mean people are getting stale in their job? Well, it could do. But it could also be this kind of Peter principle that I talked about.
That is, people who have been in the job for a while have not been promoted out, and so they're the lower performers. The better performers are the ones who get promoted out. And so in that case, performance is causing people to stay in a job longer, rather than the other way around. And similarly, if people tend to send their best people to training, then we're likely to see that the higher performers have more training. That's not the training improving the performance, it's the other way around.

Okay, and so a lot of the time, what looks like an obvious inference about what we should do may not be so obvious once we really understand the chain of events underlying the patterns that we see in our data. There are a variety of ways of addressing this. I think the basic question you always need to ask when you're doing this is: what's leading to differences in our main predictor variable? Okay, we see an effect of posting: why are some jobs posted and other jobs filled by sponsorship? We see an effect of training: who gets training? We see an effect of time in job: why do some people stay longer in their jobs than others? Without understanding what's driving those things, you want to be very cautious about making a strong argument that those things are themselves causing performance, or whatever outcome variable you have.
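To make the bad-quarter example concrete, here is a minimal simulation sketch in Python. Everything in it is an illustrative assumption rather than something from real data: the function name `simulate_training_illusion`, the choice to send the bottom 20% of first-quarter performers to training, and the model of performance as stable ability plus quarterly noise. Training is given zero true effect, so any apparent gain for trained people comes purely from reversion to the mean.

```python
import numpy as np


def simulate_training_illusion(n_employees=10_000, bottom_share=0.2, seed=0):
    """Return the mean Q1-to-Q2 performance change for (trained, untrained).

    Hypothetical setup: performance = stable ability + quarter-specific noise.
    Training has NO effect on Q2 performance by construction.
    """
    rng = np.random.default_rng(seed)

    # Stable component of performance that does not change between quarters.
    ability = rng.normal(0.0, 1.0, n_employees)

    # Each quarter adds independent, temporary luck (good or bad).
    q1 = ability + rng.normal(0.0, 1.0, n_employees)
    q2 = ability + rng.normal(0.0, 1.0, n_employees)  # no training effect added

    # Assumption: people who had the worst first quarter get sent to training.
    cutoff = np.quantile(q1, bottom_share)
    trained = q1 <= cutoff

    change = q2 - q1
    return change[trained].mean(), change[~trained].mean()


trained_gain, untrained_gain = simulate_training_illusion()
print(f"Mean change, trained:   {trained_gain:+.2f}")
print(f"Mean change, untrained: {untrained_gain:+.2f}")
```

Under these assumptions, the trained group shows a clear improvement and the untrained group shows a small decline, even though training does nothing in the simulation. The bad quarter drives both who gets trained and how much better the next quarter looks, which is exactly the question to ask of real data: who gets training, and why?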