
How to Rate College Football Teams

Other Criteria: Performance, Improvement, and Common Opponents

Performance

At its base, performance is simply score differential. Winning easily is better than barely winning, and barely losing is better than losing by a lot of points. Performance is generally the last factor you should look at when rating teams, primarily using it to break up teams that seem otherwise equal. But when you do look at it, you must assess it properly, and unfortunately, many people don't.

Problems With Average Score Differential

There are a lot of problems with simply looking at average score differential. If Team A is 6-0 and has won by an average of 15, and Team B is 6-0 against the same teams by an average of 30, is Team B better? Not necessarily. Even if those averages are 15 and 40, Team B may not be better. One team might be oriented toward ball control and defense, while the other runs a no-huddle passing offense through all four quarters regardless of score. Similarly, whatever the offensive styles of the two teams might be, one coach may start running out the clock relatively early in a game he is winning, while another may prefer running up the score.

Another problem lies in the following example. Team A defeats 4 opponents by 1 point each, and 2 others by 49 points each. Team B defeats the same 6 opponents by 10 points each. Team A has therefore beaten those six teams by an average of 17, while Team B has beaten them by only an average of 10-- a significant looking difference. But the fact is that Team A barely won 4 of those games, whereas Team B won all 6 by more than a touchdown. Team B actually has the better performance 4 out of 6 times, and should be rated higher.
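The arithmetic in that example is easy to check. Here is a minimal Python sketch of it; the function names are my own, invented for illustration, and the margins are listed opponent by opponent in the order given above.

```python
from statistics import mean

# Margins of victory from the example above, one entry per common opponent.
team_a = [1, 1, 1, 1, 49, 49]
team_b = [10, 10, 10, 10, 10, 10]


def average_margin(margins):
    """Plain average score differential, the statistic being criticized."""
    return mean(margins)


def games_better(margins_x, margins_y):
    """Count the common opponents against whom X had the larger margin."""
    return sum(x > y for x, y in zip(margins_x, margins_y))


print(average_margin(team_a))        # 17
print(average_margin(team_b))        # 10
print(games_better(team_b, team_a))  # 4
```

The average favors Team A, but the game-by-game count favors Team B four times out of six, which is the point of the example.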

Overtime is also a problem for score differentials. Winning by 3 points in regulation is better than winning by 7 in overtime. Some teams have even won by 13 in overtime. But any overtime score is, in score differential terms, essentially a tie. You could think of an overtime win or loss as a win or loss by half a point, regardless of the actual score.
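If you keep your own table of margins, one way to apply this is to flatten every overtime result to half a point before comparing teams. The sketch below is only an illustration of that idea; the function name and the `overtime` flag are assumptions of mine, and the scores are made up.

```python
def effective_margin(points_for, points_against, overtime=False):
    """Score differential, with any overtime result counted as +/- 0.5."""
    raw = points_for - points_against
    if overtime and raw != 0:
        # An overtime game was tied after regulation, so only the result
        # (win or loss) is kept, not the size of the final margin.
        return 0.5 if raw > 0 else -0.5
    return float(raw)


# Hypothetical scores: a 3-point regulation win vs. a 7-point overtime win.
print(effective_margin(24, 21))                 # 3.0
print(effective_margin(34, 27, overtime=True))  # 0.5
```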

Proper Assessment of Performance

Huge scores are flashy, dazzling the eyes and minds of most voters, but the most important dividing lines for football scores are smaller. Winning by 4 points is slightly better than winning by 3, because it would have taken more than a field goal to catch the 4-point winner. But by far the most important dividing line is the touchdown: winning by 9 points is meaningfully better than winning by 8, because it would have taken two scores to beat the 9-point winner. The 7- or 8-point winner, however, was one unlucky bounce or Hail Mary away from a possible loss. Games won and lost by more than a touchdown, and by less, are the main thing you should be looking at when assessing performance.
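Those dividing lines can be written down as simple buckets. This is a rough sketch, not a formula: the cutoffs (3 and 8 points) come from the paragraph above, but the tier labels and function names are mine.

```python
from collections import Counter


def margin_tier(margin):
    """Place a final margin (positive or negative) into one of three tiers."""
    size = abs(margin)
    if size <= 3:
        return "within a field goal"
    if size <= 8:
        # 8 still counts as one score: a touchdown plus a two-point conversion.
        return "within a touchdown"
    return "more than a touchdown"


def tier_counts(margins):
    """Tally how many games fall into each tier."""
    return Counter(margin_tier(m) for m in margins)


print(margin_tier(8))  # within a touchdown
print(margin_tier(9))  # more than a touchdown
```

Counting games per tier, rather than averaging the margins, keeps a pair of 49-point blowouts from hiding four 1-point escapes.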

But even there you need to be careful. For example, in a close game, one team might take the lead with some 30 seconds left. The other team, desperate to come back, throws an interception that is returned for a touchdown. Clearly this game was closer than the actual score will show. On the other hand, one team might be dominating the whole game, only to give up a touchdown in the final minute, then an onside kick, then another touchdown on the last play, to win by only 7. This game was not as close as the actual score indicates. Which is why it is important to know what actually happened in the games, rather than just looking at the scores.

It is also important to compare apples to apples. When comparing the performance of two teams, you may need to break their games up into categories by power level of opponent. Close wins over bad teams are a negative. Close wins over top 10 teams are not. If one team has a lot more close games than another, it might be because that team played a lot more rated and winning opponents.
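To make the apples-to-apples idea concrete, here is a small sketch that counts close wins separately for each class of opponent. Everything in it is an assumption for illustration: the 8-point cutoff for "close," the three opponent classes, and both schedules, which are invented and do not describe any real team.

```python
from collections import Counter

CLOSE_MARGIN = 8  # assumption: "close" means within one touchdown


def opponent_class(rank):
    """rank is the opponent's poll ranking, or None if unranked."""
    if rank is None:
        return "unranked"
    return "top 10" if rank <= 10 else "ranked 11-25"


def close_wins_by_class(games):
    """games: (opponent_rank, margin) pairs; a positive margin is a win."""
    tally = Counter()
    for rank, margin in games:
        if 0 < margin <= CLOSE_MARGIN:
            tally[opponent_class(rank)] += 1
    return dict(tally)


# Hypothetical schedules, invented purely for illustration.
team_x = [(3, 7), (8, 3), (None, 35), (None, 28), (20, 14)]
team_y = [(None, 6), (None, 2), (None, 31), (22, 17), (None, 24)]

print(close_wins_by_class(team_x))  # {'top 10': 2}
print(close_wins_by_class(team_y))  # {'unranked': 2}
```

Both made-up teams have two close wins, but Team X's came against top 10 opponents and Team Y's came against unranked ones, so only Team Y's count against it.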

Southern Cal vs. LSU in 2003

For an example of assessing performance, let's look at the co-national champions of 2003, 12-1 Southern Cal (AP poll champ) and 13-1 Louisiana State (BCS champ). LSU played five games against rated opponents to USC's two, and LSU defeated #3 Oklahoma, whereas USC's top win came over #6 Michigan. So strength of schedule heavily favors LSU. In order for the AP Poll to rate USC #1, they must have been better performance-wise. But LSU won their games by an average of 22.9 points, and USC won theirs by an average of 22.7 points. AP poll wrong, case closed?

Cue Lee Corso: "Not so fast, my friend!" LSU had a slightly higher average score differential because they stomped on their cupcakes by more (such as Louisiana-Monroe by 42 and Louisiana Tech by 39). USC, however, was never threatened by any of their bad opponents, so the fact that they didn't stomp them by as much as LSU did is irrelevant. What matters is how many close calls each team had. And USC had only one: their overtime loss at Cal. The closest anyone else came to them was 14 points, and only two came within 20. LSU, on the other hand, had three close calls other than their loss, winning those games by 7, 7, and 3 points. All of those came against rated opponents, but USC defeated their rated opponents (both top ten) by 14 and 27 points.

So USC did outperform LSU in 2003, despite the average score differential favoring LSU. The only question is whether or not they outperformed them enough to make up for the big difference in strength of schedule. And that is in the eye of the beholder. I myself would go with LSU at #1. But USC is a valid choice too. On the basis of performance.

Performance Does Not Trump Actual Wins and Losses

Always remember that assessing performance takes a back seat to assessing relevant records: actual wins and losses. In fact, it's not even in the same car. Or road. It's wandering around in a back alley. Putting performance in the driver's seat only leads to accidents. Like Sagarin's absurd preferences for Florida State over national champion Alabama in 1992, Ohio State over national champion Tennessee in 1998, or USC over national champion Ohio State in 2002. Or like almost all computer systems rating Oklahoma ahead of national champion Penn State in 1986. Penn State was unbeaten that season, and they defeated Miami, who defeated Oklahoma-- whether or not Oklahoma (or Miami) outperformed Penn State overall is completely irrelevant in the light of those facts.

Improvement and Late Season Emphasis

Most people who rank college football teams (possibly all of them) emphasize the second half of a season over the first half, and they put the greatest emphasis on the bowl game results. It makes sense to give more weight to a bowl game than to a regular season game, because with the extra time to recuperate and prepare, teams theoretically show up at their best: healthier and with sharper game plans. And if two teams are comparable overall, but one was markedly better in the second half of the season than it was in the first half, it makes sense to rate the improved team higher. But that is all that improvement and late season performance should be used for: breaking up teams that are otherwise comparable. Like overall performance, it should be one of the last things you look at.

Lastgamitis

Unfortunately, AP poll voters tend to woefully overemphasize late season performance, especially the last game played. In fact, they emphasize it to such a degree that in some cases it looks like 90% of their ranking of a particular team is based on that team's last game (though that one game may represent less than 10% of its season). I call it Lastgamitis, and AP poll voters desperately need an inoculation for it.

Lastgamitis can have much less effect on "name" teams and teams that were highly ranked in the preseason, but where it really hits hard is with "no name" teams and teams that were not highly ranked (or ranked at all) in the preseason. But in any case, of all the poor practices I have covered regarding AP poll voters, I think Lastgamitis is the one that leads to the most logical problems with the final AP top 25 over the years.

Echoing points I have made in earlier sections of this guide, here is something that happens nearly every season in the AP poll. At the end of the regular season, there will be a conference champion who is ranked ahead of a conference runner-up that they defeated. The conference champ will go to a major (BCS) bowl and play another top ten team. The runner-up will go to a minor bowl and play a low-ranked or unranked team. When the champion loses and the runner-up wins, the runner-up will pass the champion and be ranked higher in the final AP top 25. Given the opponents each team faced in its bowl game, this is incredibly illogical, and yet as of the end of the 2009 season, it has happened in each of the last 3 seasons, and a great many times over the course of history.

Georgia Tech vs. Virginia Tech in 2009

At the end of the regular season in 2009, Georgia Tech was rated #9 in the AP poll and Virginia Tech was rated #12. Georgia Tech went to a BCS bowl and played a top ten team. Virginia Tech went to a minor bowl and played an unranked team. Georgia Tech lost to Iowa (#7, 11-2), Virginia Tech beat Tennessee (7-6, unrated), and voilà: in the final poll Virginia Tech (10-3) was #10 and Georgia Tech (11-3) was #13. Never mind that Georgia Tech beat Virginia Tech and was the ACC champion.

Now I understand that it is quite possible for a conference runner-up to have a better season than a conference champion, and due to better non-conference results, end up ranked higher. But it doesn't happen nearly as often as the AP poll voters appear to think it does. Georgia Tech 2009 is not a case of poor nonconference results-- otherwise Virginia Tech would have been rated higher before the bowls. Georgia Tech 2009 is a case of thoughtless Lastgamitis. Punishing one team for playing a top ten opponent and rewarding the other for playing an unrated opponent, to the point of ignoring both the head-to-head result between the two and a conference title, is just senseless.

Michigan 1961

Lastgamitis affects every final poll, and every weekly poll during every season, but let's look at two of the worst cases in history. For the first, and I would say the worst of all time, we have to go all the way back to 1961. Michigan was 6-3 that season, losing to #2 Ohio State, #6 Minnesota, and #8 Michigan State. But they defeated the #12, #16, and #20 teams. So where do you think they were rated? Logically, one would think #9-11, but I obviously wouldn't be bringing them up if that were the case. Were they ranked about #18, behind two teams they lost to? That would indeed be egregious. Downright stupid even. But try this on for size-- they were not ranked at all. And it wasn't because they had 3 losses-- seven 3-loss teams were ranked (not counting bowl losses-- the AP poll ended before the bowls), including all three rated teams that Michigan defeated.

What we have here is a bunch of AP voters who decided that Michigan's last game was the only game they played that year. And Michigan's last game was the infamous 50-20 loss to Ohio State (finished #2, 8-0-1). That's the game where Ohio State, getting the ball on their own 20 with 34 seconds left, scored a touchdown on four passes (the only time Woody Hayes ever seemed to call for a pass was when he had a big lead against Michigan). Then they went for 2. And when asked why they went for 2, Woody Hayes famously said, "Because we couldn't go for 3." Or maybe he said that after another Michigan game where he did the same thing.

In any case, while Hayes always tried to run up the score against Michigan as much as possible, it is interesting to note that OSU only led this game 21-12 going into the fourth quarter. So one bad game, and basically one bad quarter, against an unbeaten team, #2 in the country, wipes out the entire rest of the season? Ludicrous.

Texas 1990

For the second example, we move forward almost 30 years to 1990. That season, 10-2 Texas lost to the #1 and #3 teams, and they defeated the #10, #11, #15, and #17 teams (and that is a very impressive number of victories over rated opponents). So where would you rank them? If you said #12, congratulations, you could be an AP poll voter. And you also shouldn't be allowed to vote.

Texas' last game here was also infamous, being the 1991 Cotton Bowl, where they lost badly to Miami (finished #3, 10-2), who was at their bad-boy worst. This was the game where Miami players yelled taunts into the Longhorn locker room before the game, and continued taunting the rest of the day. The game where they celebrated every touchdown like they had just won the national championship on the final play. The game where they danced a conga on the sideline. The one where they pushed and shoved their way to a record 16 penalties for 202 yards, then picked up first downs anyway (including a 1st and 40).

The one Miami won 46-3. A terrible embarrassment for Texas (and some of us would say for Miami, too, but that is another story), to be sure, but why are these voters so incapable of stepping back and reminding themselves that it is, after all, just one game? Now Texas forever sits in that AP top 25 directly behind two teams they defeated (including, of course, the Southwest Conference runner-up, whom Texas defeated 45-24 that year).

Every Game Counts

Here's the bottom line: every game counts. Every game matters. I think it makes sense to put less emphasis on the first few games played, but those games are still games played, and should not simply be ignored as if they never happened. In the same way, it makes sense to put more emphasis on the bowl game result, but you shouldn't dismiss the regular season. A bowl game, in the end, is still just one game.

The entire season's record and head-to-head results are what really matter. Performance, especially recent performance, and especially especially last game performance, are secondary factors. To reiterate, you should generally only use them to break up teams with otherwise comparable records.

Common Opponents

Looking at common opponents is such a bad way to rank teams that I hesitate to include it among ranking criteria at all. For the most part, it belongs more in the "how not to rank teams" discussion. Unless you are comparing teams from the same conference, looking at common opponents just gives you way too small a sample size, and is thus very misleading.

As an example, in 1999, Virginia Tech and Florida State went unbeaten in the regular season. They played three common opponents: Miami, Clemson, and Virginia. Virginia Tech stomped on all 3, winning by an average of 25.7 points. FSU, on the other hand, beat Miami by 10, Clemson by only 3, and won the three games by an average of 12.7 points. Looks like Virginia Tech was better, right? And yet FSU was ranked #1 by almost everyone, and when the two teams played, FSU was favored to win, and did win (by 17).

So what happened? Well, those 3 games were just that: 3 games. Both teams played 11. And in the other 8 games, FSU defeated two other rated teams, and Virginia Tech defeated none (and in fact edged a bad West Virginia team by only 2 points). So FSU should have been rated higher before they even played, and they were rated higher.

But most of you already know the fallacy of emphasizing common opponents, so I'll end by discussing when it is actually valid. As I implied in the first paragraph of this subsection, common opponents can be very useful when you are comparing teams from the same conference-- especially when two comparable conference teams did not play each other, or when three (or more) teams are tied in the conference standings. Teams from the same conference obviously give you a larger, fairer sample size for common opponents.

And even if you are comparing two teams that aren't from the same conference, if everything else is equal, then you can, as the last tie-breaker, look at common opponents (in lieu of flipping a coin).

Next: How to Rank the Top 25 Teams