I'm starting with one of the simplest questions in basketball statistics.
If you wanted to know the free throw percentage in the NBA, you could simply count the total free throws players made and divide it by the total free throws players took. If you do this for the last ten years in the NBA, you get 76.7%.
But how certain of this number are we? What does that question even mean? One way to think about it: if the league had started yesterday and we only had one day of games to go on, we wouldn't have enough data to be sure. Maybe in our single day players took 55 free throws and made 50 of them. Using the same method, we would say there is a 91% free throw rate across the league. But maybe the players got lucky. They only took 55 attempts, so they could have hit 50 of them by chance. Maybe tomorrow they'll make 45 out of 55.
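To put a rough number on that one-day example, here is a minimal Python sketch. It uses the Wilson score interval, which is a standard approximation for a binomial proportion and not necessarily the method behind the figures in this post. The function name wilson_interval is mine, chosen for illustration.

```python
import math

def wilson_interval(successes, attempts, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p_hat = successes / attempts
    denom = 1 + z**2 / attempts
    center = (p_hat + z**2 / (2 * attempts)) / denom
    half_width = (
        z
        * math.sqrt(p_hat * (1 - p_hat) / attempts + z**2 / (4 * attempts**2))
        / denom
    )
    return center - half_width, center + half_width

# The single-day example from the text: 50 makes in 55 attempts.
low, high = wilson_interval(50, 55)
print(f"50 of 55: observed {50 / 55:.1%}, plausible range {low:.1%} to {high:.1%}")
```

With only 55 attempts the plausible range spans many percentage points, which is the sense in which one day of games is not enough to be sure.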
So how certain are we that the free throw rate is 76.7%? Using a binomial model of uncertainty, we can quantify how certain we are. Using the last ten years of data, we have a ton of games to go off of, so we are actually very confident in our 76.7% number.
In fact, it's very likely the free throw rate is somewhere between 76.6% and 76.9%. So in this case, using our simple 76.7% is completely reasonable.
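Why does more data make the number so much tighter? Under a binomial model the standard error of an observed rate shrinks with the square root of the number of attempts. The sketch below assumes the normal approximation to the binomial, and the attempt counts are hypothetical sizes picked for illustration, not the actual league totals.

```python
import math

def binomial_standard_error(rate, attempts):
    """Standard error of an observed proportion under a binomial model."""
    return math.sqrt(rate * (1 - rate) / attempts)

rate = 0.767  # the league-wide rate quoted in the text

# Hypothetical sample sizes, roughly "one day" up to "many seasons".
for attempts in (55, 5_500, 550_000):
    margin = 1.96 * binomial_standard_error(rate, attempts)
    print(f"{attempts:>7} attempts: {rate:.1%} +/- {margin:.2%}")
```

Every hundredfold increase in attempts cuts the margin by a factor of ten, so a decade of free throws leaves very little room for the true rate to hide.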
The name of the newsletter is Binomial Basketball because the binomial distribution is simple but can be used to understand more complicated phenomena. Here we started simple, but it can get very complex very quickly.
Stan Model.
You can stop reading here. This section is only for people curious about the underlying probability model, either because they want to understand the details or because they want to expand on it themselves. Here's my Stan model.
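// The simplest binomial model.
// Models the total free throw rate using a binomial model.
data {
  int<lower=0> n_attempts;   // total free throws taken
  int<lower=0> n_successes;  // total free throws made
}
parameters {
  real<lower=0, upper=1> theta;  // league-wide free throw rate
}
model {
  // Very wide prior; truncated to [0, 1] by the bounds on theta,
  // so it is nearly flat over the allowed range.
  theta ~ normal(1, 10);
  n_successes ~ binomial(n_attempts, theta);
}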
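
For readers who want to sanity-check the model without running Stan, here is a rough Python sketch. It rests on an assumption that is mine, not part of the model above: because the normal(1, 10) prior is almost flat between 0 and 1, I swap in an exactly flat prior, which makes the posterior for theta a Beta(n_successes + 1, n_attempts - n_successes + 1) distribution. The function name posterior_interval is made up for this example.

```python
import random

def posterior_interval(n_successes, n_attempts, draws=20_000, seed=0):
    """Central 95% posterior interval for theta, assuming a flat prior.

    With a flat prior, the binomial likelihood gives a
    Beta(successes + 1, failures + 1) posterior, sampled here directly.
    """
    rng = random.Random(seed)
    alpha = n_successes + 1
    beta = n_attempts - n_successes + 1
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    return samples[int(0.025 * draws)], samples[int(0.975 * draws)]

# The one-day example from the post: 50 makes in 55 attempts.
low, high = posterior_interval(50, 55)
print(f"95% posterior interval for theta: {low:.1%} to {high:.1%}")
```

Feeding in larger counts narrows the interval, which is the same behavior you should see in the posterior draws from the Stan model.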