Learn how probability, math, and statistics can be used to help baseball, football, and basketball teams improve player and lineup selection as well as in-game strategy.


From the course by University of Houston System

Math behind Moneyball

24 ratings

From the lesson

Module 6

You will learn how two-person zero sum game theory sheds light on football play selection and soccer penalty kick strategies. Our discussion of basketball begins with an analysis of NBA shooting, box score based player metrics, and the Four Factor concept which explains what makes basketball teams win.

- Professor Wayne Winston, Visiting Professor

Bauer College of Business

So, Dean Oliver worked a long time for the Denver Nuggets. I believe he played college basketball for Caltech. He worked for the Denver Nuggets, became the director of ESPN stats, and recently he went back to the NBA to be the analytics director, I believe, for the Sacramento Kings. So you remember we talked about the Giants in baseball, what made them win. We talked about how there's hitting, there's pitching, there's fielding, and base running a little bit, but that's usually not [INAUDIBLE]. Okay, and in the NFL you could break it down to pass offense, pass defense, run offense, run defense, and special teams. Okay, so in basketball, how can you break down what makes a team win if you want to see where your team ranks on various things? So I have data from the 2007-2008 NBA season here. And the four factors, what are the four things that you need to be good at to be good at basketball? Well, you need to be able to shoot well and stop the other team from shooting well.

And so that's one factor, which is your effective field goal percentage minus the opponents'; the difference is what matters. So here Miami shot 52.44% effective field goal percentage and their opponents 47.51%. Well, you also need to get to the foul line, so you take free throw attempts divided by field goal attempts. So the Heat would get 36 free throw attempts per 100 field goal attempts, their opponents 29.9. Really, free throws made would probably be better, but they usually use free throw attempts. Okay. Then you want to not turn the ball over. Now, for the difference on free throw attempts between your team and the other team, a positive difference is good. But on turnovers a positive difference is bad; it means you turned the ball over more than your opponents did. So turnovers per 100 possessions for the Heat was 13.52 and for their opponents 12.6. And then rebound percentage. What percentage of your missed shots did you rebound on offense? What percentage of the opponents' missed shots did the opponents rebound? Take the difference. So those four differences are your four factors, and basically the question is what each of those is worth and then how you rank them. So in the regression data, we've got the difference columns placed next to each other.
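As a small illustration of the differences described above, here is a sketch in Python. The effective field goal formula is the standard one, (FGM + 0.5 × 3PM) / FGA; the function names and the dictionary layout are hypothetical, and the only data used are the three Heat figures quoted in the lecture (the rebound percentages are not given, so they are left out).

```python
def effective_fg_pct(fgm, fg3m, fga):
    """Effective field goal percentage: a made three counts as 1.5 made shots."""
    return 100.0 * (fgm + 0.5 * fg3m) / fga


def factor_differences(team_and_opponent):
    """Map each factor name to (team value - opponent value)."""
    return {
        name: round(team - opponent, 2)
        for name, (team, opponent) in team_and_opponent.items()
    }


# (team, opponent) pairs as quoted in the lecture for the Heat.
heat = {
    "efg_pct": (52.44, 47.51),   # effective field goal percentage
    "ft_rate": (36.0, 29.9),     # free throw attempts per 100 field goal attempts
    "tov_rate": (13.52, 12.6),   # turnovers per 100 possessions
}

heat_diffs = factor_differences(heat)
for name, diff in heat_diffs.items():
    print(f"{name}: {diff:+.2f}")
```

Note the sign convention: a positive difference helps on shooting and free throws, but hurts on turnovers.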

Okay, and what you're trying to predict is the number of wins. Okay. So we're going to run a regression on the season, where you have the EFG difference as one independent variable and the free throw attempts divided by field goal attempts difference as another variable.

So it's going to be these four columns. Okay, so we've got rows 56 through 86 there. That should work; we're going to have labels. I can put this on the same worksheet. Okay, so our R squared is 93.1. I've got the results somewhere else, and I think that's the right answer.

Okay. So let's see how well this model does for this data in predicting how many wins a team has. So the R squared is 93%: 93% of the variation in wins is explained.
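The lecture runs this regression in a spreadsheet. As a rough equivalent, here is a minimal ordinary least squares sketch in plain Python that returns the coefficients, the R squared, and the standard error of the regression. Everything in it is an assumption for illustration: the function names are hypothetical, and the demo data are randomly generated with made-up coefficients, not the 2007-2008 NBA figures.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_wins_regression(rows, wins):
    """OLS with an intercept. Returns (coefficients, r_squared, std_error)."""
    design = [[1.0] + [float(v) for v in row] for row in rows]
    n, p = len(design), len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, wins)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    ss_res = sum((y - f) ** 2 for y, f in zip(wins, fitted))
    mean_y = sum(wins) / n
    ss_tot = sum((y - mean_y) ** 2 for y in wins)
    r_squared = 1.0 - ss_res / ss_tot
    std_error = (ss_res / (n - p)) ** 0.5  # residual degrees of freedom: n - p
    return beta, r_squared, std_error


# Synthetic demo: 30 teams, four factor differences, made-up coefficients.
rng = random.Random(0)
demo_rows = [[rng.gauss(0, 2) for _ in range(4)] for _ in range(30)]
demo_wins = [
    41 + 3.0 * r[0] + 1.0 * r[1] - 4.0 * r[2] + 1.0 * r[3] + rng.gauss(0, 3)
    for r in demo_rows
]
demo_beta, demo_r2, demo_se = fit_wins_regression(demo_rows, demo_wins)
print([round(b, 2) for b in demo_beta], round(demo_r2, 3), round(demo_se, 2))
```

In practice a statistics package would also report the p-values mentioned in the lecture; this sketch stops at the fit statistics.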

The standard error, which we know is important, is 3.72. So you double that: 95% of the time, the four factors predict wins to within two standard errors, so two times 3.72, about seven wins.
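The two-standard-error rule above is simple arithmetic, sketched here in Python. The standard error of 3.72 comes from the lecture; the 41-win prediction and the function name are hypothetical, chosen only to show the band.

```python
def two_standard_error_band(predicted_wins, std_error, multiplier=2.0):
    """Return the (low, high) range within `multiplier` standard errors."""
    half_width = multiplier * std_error
    return predicted_wins - half_width, predicted_wins + half_width


# 3.72 is the standard error quoted in the lecture; 41.0 is a made-up prediction.
low, high = two_standard_error_band(41.0, 3.72)
print(f"about 95% of the time, actual wins fall between {low:.2f} and {high:.2f}")
```

The half-width is 2 × 3.72 = 7.44, which is the "about seven wins" in the lecture.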

And we can see all these variables are significant. The p-values are very low. Rebounds don't appear to be quite as important. And the coefficients make sense. Okay, if you have one more turnover per 100 possessions, you're going to win about 3.7 fewer games. And if you think about it, that makes sense.

Let's say it's on offense. Well, you don't have 100 possessions per game, but okay, so one more turnover: it's going to cost you a possession, which is worth about a point, so that would cost you about three wins. But then you're also giving them a better possession when you turn the ball over; they'll average more points per possession. So that coefficient makes sense, and you can interpret the other coefficients similarly.

Okay, free throw differences: if you had one more free throw attempt per hundred field goal attempts, that's a little less than one extra free throw per game, which might be about 0.7 points, okay? And, well, you might have scored on that possession anyway, so it's not clear how to evaluate what that coefficient should be. But all the p-values are low and that's good. So what percentage of basketball is based on shooting, offense minus defense? Free throws, offense minus defense? Notice the four factors. What percentage of basketball is based on each of those four?

And so I did a bit of an analysis here, and this you could use in any regression. It's a crude way to figure out how important each variable is. So, I've copied the coefficients from our regression over here.

And so then we ask ourselves: if we could improve by one standard deviation on each factor, going from average to one standard deviation above average, how many more wins would we get? And then we can see what percentage of the total each factor accounts for. Now I can tell you Dean Oliver said.

Okay. Now let's see what we get here. So we need the standard deviation for each of these differences, so I've got that with the STDEV function. For instance, the standard deviation of EFG offense minus defense is 2.82, of free throw attempts divided by field goal attempts is 3.75, etc. And so, let's take the difference in effective field goal percentage. To make yourself one standard deviation better than average, which means moving from the 50th percentile to about the 84th percentile, you would have to go up by 2.82 on effective field goal percentage; multiply that by the absolute value of the coefficient here, and that would be worth ten wins. On free throw difference it'd be worth three wins, and on turnover difference it's worth about five wins. I took the absolute value of the coefficients. And on run defense, sorry, rebounding, it's about 1.3 wins, and if you see what this adds up to, it's about 19. So ten wins out of 19 is about 53%. So you can see, right here I get almost exactly what Dean Oliver says, although Basketball Reference seems to disagree with these percentages.
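The crude importance calculation described above can be sketched as follows: multiply the absolute value of each regression coefficient by that factor's standard deviation to get wins per standard deviation, then divide each by the total. The function names and dictionary keys are hypothetical; the wins-per-standard-deviation inputs are the rounded values quoted in the lecture, so the shares come out slightly different from the spreadsheet's.

```python
def wins_per_standard_deviation(coefficients, std_devs):
    """|coefficient| x standard deviation for each factor."""
    return {name: abs(coefficients[name]) * std_devs[name] for name in coefficients}


def importance_shares(wins_per_sd):
    """Each factor's share of the total wins gained from a one-SD improvement."""
    total = sum(wins_per_sd.values())
    return {name: value / total for name, value in wins_per_sd.items()}


# Rounded wins per standard deviation as quoted in the lecture.
lecture_wins_per_sd = {
    "shooting": 10.0,
    "free_throws": 3.0,
    "turnovers": 5.0,
    "rebounding": 1.3,
}

shares = importance_shares(lecture_wins_per_sd)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

With these rounded inputs the total is 19.3, so shooting comes out a little under the roughly 53% quoted in the lecture, which divides ten by 19.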

more important than Dean Oliver got, and I sort of agree with that, and I get that rebounding is less important than Dean Oliver got. There isn't much differential on rebound percentages.

Well, that's the end of our video on the four factors. But for any GM, one of the first things he or she should do in basketball is look at where the team stands on the four factors and see who they can get in the free agent market or via the draft who might impact those four factors. Okay, well, we'll see you in the next video, when we'll talk probably
