0:00

In that extended example we saw the value of doing good analytics with these performance measures. What first appeared to be significant differences in skill turned out to be purely chance. And that really emphasises the importance of persistence: you need to find ways of testing for persistence. So in that case, we just looked at a split sample: how performance varied in one year, and then how it varied in the next year. Finding ways to do that is one of the most fundamental ways you can parse signal from noise. We're gonna focus on four additional issues for the rest of the module: regression to the mean, sample size, signal independence, and process versus outcome. These are all important concepts to have in mind as you dig into your data. And they are also

0:48

issues that we tend to have if we only reason about data intuitively. So these are issues that data can improve, that analytics can improve. But the fix isn't automatic: you can still make these mistakes, even with data. So let's drop into them.

The first is regression to the mean, and I want to start with a very simple model of performance, where you think of performance in terms of real tendency plus luck; we've been talking about this a little bit. We can formalize it, and don't get too put off by the baby math here, but in formal terms you can think of performance as y = x + e: y is performance, x is true ability, and e is some error, randomly distributed around 0. Now what does that mean when we sample on extreme performance? What underlies extreme success and failure?

1:43

If that's the model of the world, and everything we've been saying so far says it is, that there's some noise in these performance measures, what does it mean when we sample on extreme performance? Well, it means that extreme success suggests that the person might in fact have superior ability, or tried very hard, but also that they got lucky: the error was positive. And conversely, extreme failure perhaps means inferior ability, or that they did not try very hard, but also negative error, that they got unlucky. As we sample very extremely on a performance measure, a noisy performance measure, and they're all noisy, we can be sure that when we go to the extremes we get extreme error as well. So what are the consequences of that? There's one very important consequence, and that is that in subsequent periods, the error won't be extreme again. It will regress to the mean. You'd expect it to be zero; error is by definition zero on average. And if you've got very positive error in one period, you would expect less error in the following period. This is a notion called regression to the mean, and it's one of the most important notions in performance evaluation.
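You can see this mechanism in a small simulation. A minimal sketch in Python, with made-up parameters (standard-normal ability and noise, a seed chosen arbitrarily): the people at the very top of the period-1 performance distribution got, on average, a big positive error draw, but their period-2 error is a fresh draw centered on zero.

```python
import random

random.seed(1)
N = 10_000

# Model from the lecture: y = x + e, with x true ability and e zero-mean
# noise. Both standard normal here (an assumption for illustration).
ability = [random.gauss(0, 1) for _ in range(N)]

def noise():
    return random.gauss(0, 1)

# Period 1: observe performance and sample on the extreme (top 1%).
period1 = [(x + noise(), i) for i, x in enumerate(ability)]
top = sorted(period1, reverse=True)[: N // 100]

# The top performers' period-1 error was, on average, strongly positive...
avg_err1 = sum(y - ability[i] for y, i in top) / len(top)

# ...but period-2 error is a fresh draw, so it averages roughly zero:
# their performance regresses toward their true ability.
avg_err2 = sum(noise() for _ in top) / len(top)

print(f"mean error of top 1%, period 1: {avg_err1:+.2f}")
print(f"mean error of same people, period 2: {avg_err2:+.2f}")
```

The point of the sketch is that nothing about the people changed between periods; only the luck was re-drawn.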

So, an example: there was a study a few years ago of mutual fund performance in the 1990s. The study divided the 1990s into two halves, 1990-1994 and 1995-1999, and looked at the top ten performing funds from the first half of the decade. Here I'll show them to you; we anonymized them, so these are just funds A through J, with their performance in the early 1990s. There were 283 funds in this study; these are only the top 10 performing funds. Then they did two things. They went and asked how these funds performed in subsequent years, and they did an interesting thing in between: they asked people to predict what would happen in the next few years. What did they think performance would be in the second half of the decade? So here are the predictions, the estimations, from the people that they asked.

3:51

They didn't think the top performing fund, A, would again be the top performer; they thought maybe 10th. And so on down the list. E, the 5th best performing fund, they thought maybe 44th. So you can see that they didn't expect the funds to be as good: they expected some regression to the mean.

4:14

What actually happened? It ranged: 129th, 21st, 54th, and so on. The interesting thing is that on average, the funds' rank was 142.5. What is the significance of 142.5? It's the middle of the 283 funds in the study. In other words, the average performance of the top ten funds in the second period, the second half of the 90s, was perfectly average for this sample. They regressed entirely. The top ten mutual funds in the first half of the 90s, the early 90s, regressed entirely to the mean in the second half of the 90s. If that's the case, what does that say about how much skill versus luck was involved in how those funds did in the first half of the 90s?

5:03

If they regress all the way to the mean in the second period, it suggests that there was no skill. The differences that we saw, and there are huge consequences to those differences, because we know that new fund flows go to successful funds, were in fact entirely based on luck.
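You can check that "perfectly average" is exactly what a pure-luck world produces. A hypothetical simulation (invented returns, not the study's data): 283 funds whose returns are nothing but noise, ranked in two periods. The period-1 top ten's average period-2 rank lands near 142, the middle of the pack.

```python
import random

random.seed(7)
N_FUNDS, TRIALS = 283, 2000  # 283 funds, as in the study

total = 0.0
for _ in range(TRIALS):
    # Pure-luck model: each fund's return in each period is independent noise.
    r1 = [random.random() for _ in range(N_FUNDS)]
    r2 = [random.random() for _ in range(N_FUNDS)]

    # Top 10 funds by period-1 return.
    top10 = sorted(range(N_FUNDS), key=lambda f: r1[f], reverse=True)[:10]

    # Rank every fund in period 2 (1 = best) and average the top 10's ranks.
    order2 = sorted(range(N_FUNDS), key=lambda f: r2[f], reverse=True)
    rank2 = {f: r for r, f in enumerate(order2, start=1)}
    total += sum(rank2[f] for f in top10) / 10

avg_rank = total / TRIALS
print(f"average period-2 rank of the period-1 top ten: {avg_rank:.1f}")
```

If skill mattered at all, the top ten's second-period average rank would sit noticeably above the middle; the study's 142.5 matches the zero-skill sketch.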

So, there are many other examples. Danny Kahneman, the Nobel prize winner, gives a famous example involving an officer in the Israeli Air Force.

5:26

He was working with officers in the Israeli Air Force; this was early in Kahneman's career. And the officer told him, "Punishment is more effective than praise. Whenever I punish a pilot after a really poor flight, I see better performance the next time. Whenever I praise a pilot after an excellent flight, I see worse performance the next time. Therefore, it must be that punishment is more effective than praise." What's a more parsimonious explanation? The more parsimonious explanation is that there's a little chance involved in whether a pilot has a good flight or a bad flight.

5:56

After a good flight, if there's some chance involved, you would expect that the following flight wouldn't be as good. And conversely, after a bad flight, if there is some chance involved, you would predict that the next flight would, on average, be better. This is exactly why we have to be so careful about regression to the mean. We have the wrong model of the world if we don't appreciate regression to the mean. We walk around like the Israeli Air Force officer, who believed that it was all about praise and punishment, as opposed to mere statistical regression to the mean.

There's another example; we're not gonna pick only on Israeli Air Force officers. This one comes from Tom Peters and the original business bestseller, In Search of Excellence. Peters and Waterman were McKinsey consultants, no less. And they did a study. It began as an internal study, and they eventually published it as a hugely best-selling book on what determines excellence in companies. They selected 43 high-performing firms and tried to learn what they could about business practices from these top 43 firms.

6:59

But subsequently, other folks evaluated the performance of those 43 firms, and what did they find? Five years later, there were still some excellent companies, and there were some that were solid but not exactly at the top of their industries. And then there were quite a few in weakened positions, and there were even some of the supposed 43 excellent companies that were fully troubled. Now, this is exactly what you'd expect from regression to the mean, and it suggests that in the sample Peters and Waterman had grabbed as supposedly excellent firms, perhaps the firms were on average a little bit better, but they had necessarily been lucky to make it into that sample. To be called the 43 most successful firms in the world, essentially, they were necessarily lucky, and in subsequent periods they're not gonna have luck break their way again.

7:48

So this is something you'll see any time you sample based on extreme values: if you sample on one attribute, any other attribute that's not perfectly related will tend to be closer to its mean value. We've been talking about performance at points in time. If you sample on extreme performance in one time period, the subsequent time period won't be as extreme, whether you sampled on the extremely good or the extremely bad. But it can also be attributes within an individual or within an organization. Suppose you sample on a person's running speed, and then look at their language ability. These things are imperfectly related, right? So if you looked at the fastest runners, how would you expect them to perform on some language ability test? They wouldn't be as high. The fastest runners will, almost by definition, not be the people with the best language ability. But that's not because there's some inverse relationship between language and running ability. It's that these two traits are simply imperfectly correlated, and so when you sample on the extreme you have to expect regression to the mean on any other attribute.
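A quick sketch of that idea with invented numbers: two standardized traits with an assumed correlation of 0.4. Sample the top 1% on one trait, and the group's average on the other trait sits far closer to the mean, shrunk by roughly the correlation.

```python
import math
import random

random.seed(3)
N, RHO = 10_000, 0.4  # RHO: an assumed, made-up correlation between traits

# Generate two standard-normal traits with correlation RHO.
z1 = [random.gauss(0, 1) for _ in range(N)]
z2 = [random.gauss(0, 1) for _ in range(N)]
speed = z1  # say, standardized running speed
verbal = [RHO * a + math.sqrt(1 - RHO**2) * b for a, b in zip(z1, z2)]

# Sample the top 1% on running speed.
top = sorted(range(N), key=lambda i: speed[i], reverse=True)[: N // 100]
avg_speed = sum(speed[i] for i in top) / len(top)
avg_verbal = sum(verbal[i] for i in top) / len(top)

# The group is extreme on speed, but on the other, imperfectly correlated
# trait it has regressed most of the way back toward the mean.
print(f"top runners' mean speed score:  {avg_speed:+.2f}")
print(f"top runners' mean verbal score: {avg_verbal:+.2f}")
```

Nothing inverse is going on between the traits; the shrinkage toward zero is just imperfect correlation plus extreme sampling.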

8:50

We could spend a day on regression-to-the-mean effects. There aren't many concepts more important for understanding the world than regression to the mean. We could spend hours on this. And I would be very happy if you walked away from this course with only two or three ideas, if this is one of them. Because it's gonna help your reasoning about the world. Why is this so hard? Why is this such a hard concept to stay with, to live by?

Well, there are a few things that get in the way. Among others, we have this outcome bias. I mentioned it with Hershey and Baron; they're the ones who came up with that study originally. We tend to believe that good things happen to people who work hard, and bad things happen to people who work badly, and we draw too strong an inference from this. We tend to judge decisions and people by outcomes and not by process. This is a real problem, and it gets in the way of our understanding of this regression-to-the-mean model of the world.

Two others. One is hindsight bias. Once we've seen something occur, we have a hard time appreciating that we didn't anticipate it occurring. In fact, we often misbelieve that we anticipated that that's exactly what would happen. We show hindsight bias, and again, if that's the way we reason about what happens, then we're not gonna appreciate that what happens next could well be just regression to the mean.

10:14

And finally, narrative seeking. We want to make sense of the world, we want to connect the dots; we come to believe things better when we can tell a causal story between what took place at time one and what took place at time two. And if we can tell a causal story, then we have great confidence in our ability to predict what happens next. We seek these stories, as opposed to what I've been telling you, which is this dry, statistical reason for why things happen. We seek stories. And that again gets in the way of our understanding of the statistical processes that actually drive what's going on. So, in short, we make sense of the past. We are sense-making animals, and we make sense of the past. But there's not a lot of sense to be had in mere regression to the mean, and that sense-making is going to get in our way when we predict what happens next. We try to find stories that connect all the dots, and by doing that, we give chance too small a role in those stories.

So there was an internet meme a year or two ago that captures this well. If this is knowledge, distributed in your experience of the past, this is knowledge you might have. And with that knowledge, perhaps you can add some experience and start connecting the dots, drawing some lines, creating something from that knowledge. That's good; that's what we want experience to do. But then, sometimes we're inclined to get a little too creative and overfit those lines, and we turn what should be a pretty straight grid, pretty parsimonious connections, into something that is unlikely to replicate in the future. It might be a very satisfying interpretation of the past, but it's overfit, and an overfit interpretation of the past is going to make very bad predictions about the future.
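The lines-and-dots picture translates directly into statistics. A small sketch with invented data: points generated from a straight line plus noise, an "overfit" model that memorizes every training dot versus a parsimonious least-squares line, and the two compared on fresh data from the same process.

```python
import random

random.seed(5)
xs = [i / 100 for i in range(200)]

def sample():
    # True process: a straight line y = 2x, plus noise.
    return [2 * x + random.gauss(0, 1) for x in xs]

train, test = sample(), sample()  # same process, fresh luck each time

# "Overfit" model: connect every dot, i.e. memorize the training data exactly.
overfit_pred = train

# Parsimonious model: least-squares line fit to the training data.
n = len(xs)
mx, my = sum(xs) / n, sum(train) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, train)) / sum(
    (x - mx) ** 2 for x in xs
)
intercept = my - slope * mx
line_pred = [slope * x + intercept for x in xs]

def mse(pred, actual):
    return sum((p - a) ** 2 for p, a in zip(pred, actual)) / n

print(f"overfit model, error on the past (train): {mse(overfit_pred, train):.2f}")  # 0.00
print(f"overfit model, error on the future (test): {mse(overfit_pred, test):.2f}")
print(f"simple line,   error on the future (test): {mse(line_pred, test):.2f}")
```

The memorizing model explains the past perfectly and predicts the future worse than the plain line, which is the meme's point in numbers.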
