Sunday Stats: Strength of Record

Filling in KenPom’s largest deficiency

[Photo: Purdue at Michigan | Rick Osentoski-USA TODAY Sports]

Michigan is game-breakingly good. And I’m not just saying that because they took my beloved Boilermakers out behind the woodshed on Saturday. They literally broke the analysis I put together for this week’s column. But I’m getting ahead of myself. Let me back up.

Wins or Efficiency?

There are two schools of thought when it comes to college hoops. One says that wins and losses should be all that matter, since that’s why you play the game. The other says that each possession gives you information about the quality of the two teams and that you need to look at every possession to get the full picture about who’s better than whom.

I’m firmly in the latter camp, but both sides kind of have a point.

Let’s take an exaggerated example. I love exaggerated examples.

We have two teams who both play the same five-game schedule. Let’s say against five really good teams: Duke, Michigan, Kansas, Virginia, and Furman. Team A loses all five games by a single point. Team B loses four games by 20 points, but beats Kansas by one. Which team is more impressive?

Well obviously the computers will like Team A, since they are pretty much dead even with some really good teams. But their resume includes no wins. Team B doesn’t look nearly as good, but, hey, they beat Kansas.

If the two teams played, Team A would be favored. But most people would argue Team B had a better season. They’d finish ahead of Team A in the standings, and the schedules can’t get fairer than identical.

When it comes to the NCAA Tournament, I’ll buy that we should look at what teams accomplished and not how they’re predicted to do relative to other bubble teams. (College football is exactly the opposite, by the way. The CFP’s stated criterion is “the four best teams”. Most rational people are convinced this is a ploy to get two SEC teams in the playoff every so often.) In other words, if it were down to Team A and Team B for the 68th slot in the Big Dance this March, Team B should get in.

So that raises a natural question: can we mathematically measure accomplishments, caring only about wins and losses, while still working within a good predictive statistical framework? Yeah, we can. We want to look at a metric called Strength of Record.

Strength of Record

The idea behind Strength of Record is simple—you take a hypothetical D-I team and predict how they’d do against your team’s schedule. Then you slide around their efficiency score until you find the spot where the predicted number of wins is equal to the number of actual wins. That’s the efficiency score that’s equivalent to your team’s win/loss record. Then you rank teams by that adjusted efficiency score. In essence, you’re comparing the strength of their record (thus the name) as opposed to the strength of the teams themselves.

ESPN builds out a Strength of Record metric, but I don’t like it because we can’t take it apart into its component pieces and play with it. As always, I prefer to use KenPom. But Mr. Pomeroy hasn’t added a SOR metric to his site. Can we make our own? Hell yes.

The Nerdy Part

Here’s how you build out a KenPom Strength of Record metric:

  1. Pull adjusted offensive and defensive efficiency margins for all D-I teams from kenpom.com
  2. Calculate a Pythagorean win probability for each team. You can think of it as the likelihood that a team beats a randomly-drawn D-I opponent. The formula is (OE^11.5)/((OE^11.5)+(DE^11.5)), where OE and DE are the adjusted offensive and defensive efficiency scores.
  3. Adjust for whether the game is home or away. A few years ago, KenPom used a flat 1.34 points as the adjustment for all teams. These days he calculates team-specific home-court advantages, so I can’t perfectly duplicate his results, but I can get close enough. If a team is at home, add 1.34 to their offensive efficiency and subtract 1.34 from their defensive efficiency; on the road, do the exact opposite. Neutral-site games need no adjustment.
  4. Then for each possible matchup between D-I schools, you have a probability score for each team. Let’s say at home Team A has a score of 0.75 and on the road Team B has a score of 0.50.
  5. Use Bayes’ Theorem to calculate the probability that Team A wins. In our example, it’s (0.75*(1-0.50))/((0.75*(1-0.50))+(1-0.75)*(0.50)) = 0.75. And that result makes perfect sense: in expected-value terms, there’s no difference between playing a random D-I team and playing an average D-I team, and an average team is going to have a score of 0.50, just like Team B. (If you want to understand the logic behind Bayes’ Theorem, that’s a several-hundred-word post for another time.)
  6. For each team you want to look at—which for me was all 14 Big Ten teams—take every game they’ve played against each opponent and calculate their expected win probability. Add up all the win probabilities to get the number of expected wins against the schedule to-date.
  7. If the number of expected wins is higher than the number of actual wins, do the first six steps again, but use slightly worse efficiency numbers for your Big Ten school. If the number of expected wins is lower than actual, use slightly better numbers.
  8. Repeat step 7 until expected wins are equal to actual wins.
  9. You now have 14 adjusted efficiency margins. Put them in order and share them with your readers. (Note that efficiency margin is just OE minus DE. In other words, if you are efficient when you are on offense, that’s good and helps you. If the other team is efficient when you are on defense, that’s bad and it hurts you. A higher margin means that you are more efficient than your opponents.)
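The nine steps above can be sketched in a few dozen lines of Python. To be clear, this is a minimal sketch, not KenPom’s actual code: the function names are mine, the 1.34 home-court bump is the old flat adjustment from step 3, and real inputs would be each team’s scraped schedule and opponents’ efficiencies from kenpom.com.

```python
EXP = 11.5   # KenPom's Pythagorean exponent
HCA = 1.34   # old flat home-court adjustment, in efficiency points

def pythag(oe, de):
    """Step 2: probability of beating a random (i.e. average) D-I team."""
    return oe**EXP / (oe**EXP + de**EXP)

def log5(pa, pb):
    """Step 5: Bayes-style combination of two Pythagorean scores."""
    return pa * (1 - pb) / (pa * (1 - pb) + (1 - pa) * pb)

def win_prob(oe_a, de_a, oe_b, de_b, loc_a):
    """Steps 3-5: P(Team A beats Team B); loc_a is 'home', 'away', or 'neutral'."""
    adj = {"home": HCA, "away": -HCA, "neutral": 0.0}[loc_a]
    pa = pythag(oe_a + adj, de_a - adj)   # home team gets better on both ends
    pb = pythag(oe_b - adj, de_b + adj)   # road team gets worse on both ends
    return log5(pa, pb)

def expected_wins(oe, de, schedule):
    """Step 6: schedule is a list of (opp_oe, opp_de, location) tuples."""
    return sum(win_prob(oe, de, o, d, loc) for o, d, loc in schedule)

def sor_margin(oe, de, schedule, actual_wins):
    """Steps 7-8: bisect on a shift applied to both efficiencies until
    expected wins match actual wins; return the SOR efficiency margin."""
    lo, hi = -60.0, 60.0  # a bigger shift always means more expected wins
    for _ in range(60):
        mid = (lo + hi) / 2
        if expected_wins(oe + mid, de - mid, schedule) < actual_wins:
            lo = mid
        else:
            hi = mid
    return (oe + mid) - (de - mid)  # step 9: the adjusted OE minus DE
```

I used bisection for steps 7 and 8 rather than hand-nudging the numbers, since expected wins only go up as the efficiency shift goes up, so the search is guaranteed to converge.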

Big Ten Strength of Records

Or should it be “Strengths of Record”? Somebody who went to a less STEM-y school than I did can let me know in the comments.

  • Wisconsin: 31.4 SOR efficiency margin (actual EM: 22.2)
  • Ohio State: 26.4 (17.2)
  • Nebraska: 25.6 (18.0)
  • Iowa: 22.6 (14.8)
  • Minnesota: 22.6 (12.4)
  • Michigan State: 21.9 (22.5)
  • Maryland: 21.7 (16.1)
  • Indiana: 17.8 (18.0)
  • Purdue: 15.5 (21.9)
  • Northwestern: 12.7 (12.7)
  • Rutgers: 12.0 (6.4)
  • Penn State: 9.5 (15.3)
  • Illinois: -6.4 (yes, that’s a negative EM!) (7.8)

So how do you interpret this? Well, Wisconsin has a more impressive set of wins than Ohio State, who has a more impressive set than Nebraska, and so on. If the actual EM in parentheses is smaller than the SOR EM, the team has won more games than they “should have”. The opposite is true if the number in parentheses is larger. (Poor Purdue.) Interestingly, Northwestern’s record is exactly what it’s predicted to be.

Wait, Where’s Michigan?

Here’s where it gets weird. Michigan is the only Big Ten team that hasn’t lost, so their actual performance is eight wins in eight games. For expected wins to equal games played, the model has to give them a 100% chance of winning every single game, and that only happens if their offensive efficiency is positive infinity and/or their defensive efficiency is zero. So Michigan would sit atop this chart with an SOR EM that is literally infinite.
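You can watch the asymptote numerically. This tiny self-contained sketch (made-up efficiency margins, split evenly between offense and defense, against an average 100/100 opponent) shows the Pythagorean win probability creeping toward 1 as the margin grows but never reaching it, which is why no finite margin can match an undefeated record.

```python
EXP = 11.5  # KenPom's Pythagorean exponent

def pythag(oe, de):
    # probability of beating an average (100, 100) D-I team
    return oe**EXP / (oe**EXP + de**EXP)

# hypothetical margins: the probability approaches but never reaches 1
for margin in (20, 40, 80, 160):
    p = pythag(100 + margin / 2, 100 - margin / 2)
    print(f"EM {margin:>3}: P(beat an average team) = {p:.8f}")
```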

I knew the Wolverines were good, but man, I didn’t know they were that good.