Mathematical Formulas: Don't let one ruin your day

This is not my review of Harding’s weekend wins, although that will be up this evening. Stats will be updated at some point tomorrow and will include tonight’s UAM-Philander Smith non-conference game.

If you’re not interested in the statistical side of basketball, this post probably isn’t for you, but I thought I would pass along this bit of information because of some problems I ran into with this weekend’s update.

I use Offensive Efficiency Rating (or OER/ORtg/OEff) quite regularly to evaluate players on this site, but it’s an immense formula and probably isn’t worth fully explaining here. For that, I recommend Dean Oliver’s Basketball on Paper, which is an absolute must-read anyway if you’re into basketball stats. Essentially, it takes all of the good things a player can do on offense (shoot efficiently, create opportunities for teammates via assists, not turn the ball over) and boils them down into one rating, which represents the number of points that player “produces” per 100 possessions.

Some players are not incredibly efficient, but they use a lot of possessions on offense, while others are efficient within their limited roles. Both are valuable, and OER is helpful for determining that value. It’s a bear of a formula to calculate, since it takes into account such little (but important) things as what percentage of a player’s field goals are assisted, which helps to properly allocate scoring possessions across the team.

Specifically, the calculation I just mentioned is called Qast, and it’s a nasty formula in its own right, accounting for a player’s time on the floor, field goals and assists (both individually and by the team). Theoretically, the percentage of assisted field goals has to be between 0% and 100%, but Qast in its raw form doesn’t take that into account.
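For readers who want to see the shape of it, here is a minimal Python sketch of Qast as it is commonly transcribed from Oliver's work. The function and parameter names are illustrative labels, not anything from the book, and the transcription itself is an assumption that should be checked against Basketball on Paper before relying on it.

```python
def qast(mp, fgm, ast, team_mp, team_fgm, team_ast):
    """Raw (unbounded) estimate of the fraction of a player's made
    field goals that were assisted.

    mp, fgm, ast                -- player's minutes, made field goals, assists
    team_mp, team_fgm, team_ast -- the same totals for the whole team
    (team_mp counts all five positions, e.g. 200 for a 40-minute game)
    """
    # Share of the available minutes at one position that this player played.
    share = mp / (team_mp / 5)

    # Term that dominates for heavy-minute players: teammates' assists
    # relative to team field goals.
    heavy = 1.14 * (team_ast - ast) / team_fgm

    # Term that dominates for short-minute players: team assists and
    # field goals scaled to the player's minutes, minus his own totals.
    # The denominator can approach zero or go negative, and nothing here
    # guards against that (an exact zero raises ZeroDivisionError).
    light = ((team_ast / team_mp) * mp * 5 - ast) / (
        (team_fgm / team_mp) * mp * 5 - fgm
    )

    return share * heavy + (1 - share) * light
```

The weak spot is the denominator of the second term. It is roughly the number of field goals the whole team would be expected to make in the player's minutes, minus the player's own makes. When a player makes nearly as many shots in a short stint as that team pace predicts, the denominator shrinks toward zero and the ratio balloons. With hypothetical totals such as `qast(4, 3, 0, 200, 33, 15)`, the result comes out well above 1.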

99.9% of the time, that’s not an issue, but on the far extreme of high productivity per minute, it can cause total chaos. This was the case in Harding’s 104-95 win over Delta State a few weeks back. I just noticed the issue today, and the problem, essentially, is that Qast can be tricked into thinking that more than 100% (or less than 0%) of a player’s baskets were assisted.

In this case, DSU’s Anthony Fizer lit up the Bisons, scoring 10 points (three 3s and a free throw) in just four minutes on the floor. Qast thought, based on those gaudy numbers, that 726% of his field goals were assisted, ultimately wreaking havoc on his total scoring possessions (calculated at -13), points produced (-37), and OER.
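One guard, and the one raised in the comments on this post, is to clamp the estimate to the range 0% to 100% before it feeds anything downstream. The sketch below shows that clamp along with the field goal term of scoring possessions as it is commonly transcribed from Oliver's work. The function names are hypothetical, the formula transcription should be checked against the book, and the example assumes Fizer attempted exactly three field goals, which the box score line above does not state.

```python
def clamp_fraction(value):
    """Force an estimated fraction into the valid range [0.0, 1.0]."""
    return max(0.0, min(1.0, value))


def fg_part(fgm, fga, pts, ftm, qast_value):
    """Field goal term of scoring possessions (commonly cited form).

    Credit for each made field goal is reduced according to how many
    of the player's baskets are estimated to have been assisted.
    """
    if fga == 0:
        return 0.0  # no attempts, so no field goal credit to allocate
    points_per_shot_factor = (pts - ftm) / (2 * fga)
    return fgm * (1 - 0.5 * points_per_shot_factor * qast_value)


# Fizer's line: 3 made field goals, 10 points, 1 made free throw.
# ASSUMPTION: 3 field goal attempts (not given in the post).
raw = fg_part(fgm=3, fga=3, pts=10, ftm=1, qast_value=7.26)
fixed = fg_part(fgm=3, fga=3, pts=10, ftm=1,
                qast_value=clamp_fraction(7.26))
print(raw, fixed)
```

Under the three-attempt assumption, the raw 726% figure pushes this term to about -13.3, while the clamped value gives 0.75. The clamp stops the formula from going negative, but it does not make the single-game result intuitive.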

It took me long enough to figure this out that I don’t have a Harding update ready yet, although I will at some point this evening.  This has been the only instance of this kind of formula destruction in OER, and it will be corrected in tomorrow’s stat update.

Fear not, Anthony Fizer.  My spreadsheets are not biased against you.


4 thoughts on “Mathematical Formulas: Don't let one ruin your day”

  1. So is the solution as simple as capping the estimated assisted FG percentage at 100% on the high end and at 0% on the low end? Or does that cause problems somewhere else?

    And of course, the obvious next question is, are there any other parts of the formula where it’s possible to get an estimate of something that’s outside that something’s allowed range?

  2. That’s the suggestion Oliver made on the APBRmetrics forum a few months ago when someone else noticed that issue.

    I played around with the numbers, and the value it gives now for the FG component of “Scoring Possessions” still doesn’t make perfect intuitive sense for that outlying game. Since Qast is zero, the FG component is artificially low (around 1, even though he made 3 FGs), inflating his efficiency significantly for the game.

    It’s not as big an issue in the context of a full season’s games, or even a handful of games, but in the context of one game, it makes me less confident in Offensive Ratings. While there are not many extreme box score lines, the formula shouldn’t break because of one.

    I’ve been looking for other areas that might be affected by a similar issue, but I haven’t found any yet.

  3. I just took a quick glance at it and played around with some different numbers. Mathematically, I’m not exactly sure what was going on, but I didn’t look at it that closely. I stopped looking for the problem when I read that he sets an artificial floor of 0 and a cap of 100%.

    I’ll e-mail you the file I was using to play around with the formula. I also need to check the formula against the book again, to make sure I’m not compounding the problem with a keying error of my own.
