Adjusting for Strength of Schedule

Kevin Zatloukal
Jan 6, 2024

Anyone who knows me knows how closely tied I am to the University of Washington: I’m an alum, my wife is an alum, my parents are alums, my eldest child currently attends, and I teach there! I’m also a third-generation UW football season ticket holder. With the Huskies playing for a national championship on Monday, this has been me all week:

Of course, I have no intention of turning this site into a sports blog. However, as I noted last time, when following my interests turns up an opportunity to bring machine learning to bear, I feel motivated to write about it. And this week, in the embarrassing number of hours I have spent studying the statistics of UW’s opponent, the Michigan Wolverines, I ran into another such opportunity.

The Statistical Matchup

Many commentators seem to make their initial assessment of a team (whose games they probably haven’t watched) by looking at their season-long stats. A reasonable place to start is to look at the total yards gained by that team’s offense and allowed by that team’s defense.

On that metric (see here), Washington has the 10th-ranked offense and Michigan has the 70th. On defense, however, Michigan ranks 1st, while Washington ranks 94th out of 130 teams. That seems quite bad. Indeed, many have noted that no team with such a poorly ranked defense has ever won the championship during the CFP era.

Some see those statistics as evidence that these teams (particularly Washington) are not very good. However, I see it as evidence that these statistics are not very good. The odds of a team with the 10th best offense and 94th ranked defense going 14–0 are extremely low. It’s much more plausible that these statistics are just not good ways of measuring the quality of these two teams.

One way to make the rankings more sensible is to switch from total yards to yards per play. Obviously, there will be fewer yards gained when there are fewer plays in the game, all else equal, but that is not strong evidence that the offense is worse or the defense better. With Washington having a pass-heavy offense, their games generally had more plays, making their defense appear worse by total yards allowed. Contrariwise, with Michigan having a run-heavy offense, their games generally had fewer plays, making their defense appear better by this metric.
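To make the effect of pace concrete, here is a small Python sketch. The two defenses and their season totals are made-up numbers chosen for illustration, not real team statistics.

```python
# Hypothetical season totals: (yards allowed, plays faced).
# These numbers are invented purely to illustrate the effect of pace.
defenses = {
    "fast-paced team": (5200, 1000),
    "slow-paced team": (4500, 800),
}

def rank_by(metric):
    """Return defense names ordered from best (lowest value) to worst."""
    return sorted(defenses, key=lambda name: metric(*defenses[name]))

by_total = rank_by(lambda yards, plays: yards)
by_per_play = rank_by(lambda yards, plays: yards / plays)

print(by_total)     # slow-paced team first: 4500 < 5200 total yards
print(by_per_play)  # fast-paced team first: 5.2 < 5.625 yards per play
```

The defense that faced more plays looks worse by total yards even though it allowed fewer yards each time the ball was snapped.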

If we switch to yards per play, the rankings change as follows:

We can see that three of the four rankings improve, giving results that are more sensible. These are clearly two of the best football teams in America, so we should expect the rankings to be high. It is much more reasonable to believe that a team that is top-5 on one side of the ball and middle-of-the-road on the other could go 14–0 than a team that is 10th in offense and objectively bad on defense. However, we can make the rankings even more reasonable with a bit more work.

Strength of Schedule

The number of yards allowed depends not only on the defense but also on the offense they faced. A defense would be expected to give up more yards if they are playing better opponents, all else equal.

While, in many cases, the difference in the quality of the offenses faced is not large enough to matter, the difference between the offenses that Washington and Michigan played was stark. Here are the rankings of the opponents played by each team in their 14 games so far:

Those in the top 30 (the best offenses) are drawn in red, and those in the bottom 30 (the worst offenses) are drawn in green. As you can see, Michigan played half its schedule against bottom-30 offenses, while Washington played half its schedule against top-30 offenses.

As a result, you would expect that the quality of Washington’s defense is better than it appears from their statistics. To figure out how much better, we can turn to the simplest of machine learning tools, namely, linear regression.

A Linear Model of Offense and Defense

We will model each team’s offense and defense as separate entities. That seems fairly reasonable given that they are never on the field at the same time: one team’s offense plays the other team’s defense and vice versa when possession changes. An individual game is, thus, made up of two separate competitions that take turns playing out.

Our linear model will have a variable for each of the 130 FBS offenses and each of the 130 FBS defenses. Each game gives us two equations from the two competitions described above. For example, when Washington played Texas in the Sugar Bowl, Washington’s offense generated 7.60 yards per play versus Texas’s defense, which translates to the equation:

In the same game, Texas’s offense generated 7.01 yards per play versus Washington’s defense, which translates to this equation:

The constant term (c) in these equations is meant to represent the yards per play for an average team so that the variable for Washington’s offense represents how much it increases the yards per play above average and the variable for Texas’s defense represents how much it decreases the yards per play below average. We can enforce that representation by adding two more equations that force the average offense and defense value to be 0:
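For reference, the equations described in this section can be written out as follows. The notation is an assumption made here for illustration: o_t stands for the offense variable of team t and d_t for its defense variable, with a better defense having a larger d_t so that it is subtracted. Only the constant c and the two yards-per-play values come from the text above.

```latex
\begin{align*}
c + o_{\text{Washington}} - d_{\text{Texas}} &= 7.60 \\
c + o_{\text{Texas}} - d_{\text{Washington}} &= 7.01 \\
\sum_{t} o_t &= 0 \\
\sum_{t} d_t &= 0
\end{align*}
```

The first two lines are the two competitions within the Sugar Bowl, and the last two are the constraints that pin the average offense and the average defense at zero, so that c can be read as the yards per play of an average offense against an average defense.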

During the regular season, there were 739 games between FBS teams, so we end up with 1480 equations in 262 variables. With 5.65 data points per variable, we should be fine using ordinary regression. (There is no need to introduce more sophisticated tools like Lasso regression.)
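As a rough sketch of how such a system can be set up and solved, here is a pure-Python version on toy data. Everything in it is an assumption for illustration: the three teams and their yards per play are invented, the function names `least_squares` and `fit_ratings` are hypothetical, and the sign convention (defense values are subtracted) is one reading of the description above. It is not the code behind the results in this post. For the full problem, a library routine such as `numpy.linalg.lstsq` would be the more usual choice.

```python
# Toy data, invented for illustration: (offense, defense, yards per play).
# Each game between two teams contributes two records, one per offense.
games = [
    ("A", "B", 7.0), ("B", "A", 5.0),
    ("B", "C", 5.5), ("C", "B", 5.0),
    ("A", "C", 6.5), ("C", "A", 4.0),
]

def least_squares(rows, rhs):
    """Minimize ||A x - b||^2 via the normal equations (A^T A) x = A^T b.

    Solved by Gauss-Jordan elimination with partial pivoting. Assumes A
    has full column rank; otherwise a pivot would be zero.
    """
    m = len(rows[0])
    # Augmented matrix [A^T A | A^T b].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(m)]
        + [sum(r[i] * b for r, b in zip(rows, rhs))]
        for i in range(m)
    ]
    for col in range(m):
        pivot = max(range(col, m), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(m):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [a - factor * p for a, p in zip(aug[k], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]

def fit_ratings(games):
    """Fit c, per-team offense values, and per-team defense values."""
    teams = sorted({team for off, dfn, _ in games for team in (off, dfn)})
    n = len(teams)
    index = {team: i for i, team in enumerate(teams)}
    rows, rhs = [], []
    # Unknowns are ordered [c, offense values..., defense values...].
    for off, dfn, ypp in games:
        row = [0.0] * (1 + 2 * n)
        row[0] = 1.0                    # constant term c
        row[1 + index[off]] = 1.0       # offense raises yards per play
        row[1 + n + index[dfn]] = -1.0  # defense lowers yards per play
        rows.append(row)
        rhs.append(ypp)
    # Two extra equations: offense values sum to 0, defense values sum to 0.
    rows.append([0.0] + [1.0] * n + [0.0] * n)
    rhs.append(0.0)
    rows.append([0.0] * (1 + n) + [1.0] * n)
    rhs.append(0.0)
    x = least_squares(rows, rhs)
    offense = {team: x[1 + index[team]] for team in teams}
    defense = {team: x[1 + n + index[team]] for team in teams}
    return x[0], offense, defense

c, offense, defense = fit_ratings(games)
print(round(c, 3), offense, defense)
```

The toy data was generated without noise from c = 5.5, offense values of +1, 0, and -1, and defense values of +0.5, -0.5, and 0 for teams A, B, and C, so the fit recovers those values up to floating-point rounding. With real game data the equations will not be exactly consistent, and least squares returns the best compromise.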

Results

Ranking offenses and defenses by the model output, we get a much more sensible result:

Both teams are now top-25 in both offense and defense. That is much more consistent with the fact that both teams went 14–0.

The largest changes above are that the mediocre / poor parts (Washington’s defense and Michigan’s offense) show up as much better than before when accounting for the level of the competition. Also, Washington’s offense, which started out top-10, is now ranked top-5.

The one ranking that went in the opposite direction is Michigan’s defense. They started out ranked #1. That dropped to 4th once we accounted for how few plays they had to defend (due to Michigan’s run-heavy offense). After accounting for the weaknesses of the offenses they played, however, they drop to 11th.

Predicting the Championship Game

The final results make clear that this should be a close game, and I think that is right. Anyone making definitive claims about who will win is probably just seeking attention….

That said, since we are here, I imagine you want to know what the model says. It predicts that Washington’s offense will put up 6.34 yards per play against Michigan’s defense, while Michigan’s offense will put up 5.94 yards per play against Washington’s defense, giving a slight edge to the Huskies.
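Each of those predictions is just the linear model evaluated for one side of the matchup. In assumed notation (c for the yards per play of an average matchup, o for an offense's fitted value, d for a defense's fitted value, with a better defense having a larger d), they read:

```latex
\begin{align*}
c + o_{\text{Washington}} - d_{\text{Michigan}} &\approx 6.34 \\
c + o_{\text{Michigan}} - d_{\text{Washington}} &\approx 5.94
\end{align*}
```

The symbols are illustrative; only the two predicted values come from the model output quoted above.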

Go dawgs!
