Does ESPN’s Total Quarterback Rating correlate to winning?

By now everyone has weighed in on ESPN's new Total Quarterback Rating (TQR) metric.  

Most of the critiques have centered on the following perceived shortcomings of the model:

  1. The Clutch Index.  In short, can one really measure clutchness?
  2. The proprietary nature of the model.  In short, only ESPN knows what goes into the recipe.
  3. The dividing of credit.  In short, we rely on ESPN's video reviewers to tell us how much credit a quarterback contributed to each play.

Since I waited two full weeks to react, I didn't want to beat the same drums.  Better stats people than I had already contributed to these discussions. 

Instead, I decided to take a different angle altogether.  I simply decided to look and see if TQR correlated to winning.

Here's what Trent Dilfer, who was on the team that developed TQR, had to say about the subject:

Forever we’ve lacked a quantitative way of explaining winning or losing quarterback play. The old passer efficiency rating had nothing to do with winning or losing because it gave every down an equal weight and it credited the quarterback for something that was more influenced by the receiver or offensive line, or punished the quarterback for something he had no say in. By filtering it down to critical downs and weighing the importance of each down, the really smart people in ESPN Stats & Information have come up with a number that will best describe how much the quarterback contributed to winning or losing a football game. This is a total game-changer. Ten years from now, this will be the rating that personnel people will refer to when talking about quarterbacking.

Sounds pretty good to me.  In fact, I should preface all of this with the notion that I'm not against improving the passer rating system.  Every stat geek in the world knows the current passer rating system's shortcomings, and even mainstream fans have a passing knowledge that the system fails to take into account a quarterback's abilities rushing the football.  Fans of John Elway have been looking for a better quarterback stat for decades in order to measure his true value in the Valhalla of NFL quarterbacks.  Elway took--nay, willed--his team to three Super Bowls early in his career.  Then, later in his career, and with a little help from his friends Gary Zimmerman, Terrell Davis, Shannon Sharpe, Rod Smith, and Eddie McCaffrey, he won two Super Bowls in a row.  However, because Elway's career passer rating was a rather pedestrian 79.9, the naysayers continue to throw rocks.  After all, 23 starting quarterbacks each had higher ratings than this in 2010.

What Elway lacked in rating, however, he made up for in wins.   We are told wins are what matter.  ESPN interviewed several quarterbacks for their special on TQR, including Ben Roethlisberger, Joe Flacco, Matt Ryan, Matt Cassel, and Tim Tebow.  To a man, everyone agreed that when evaluating quarterbacks, wins and losses were the con carne of football stats.

TQR and winning

Which brings us back to TQR. I took the entire TQR database from 2008-2010 (and the quarterbacks listed therein) and ran a regression against winning percentage over the same time period.  In short, I wanted to see if TQR did exactly what Dilfer and others at ESPN said it would do--correlate to wins.  I also did the same with the standard passer rating system that was created in 1971.  This is the same rating system that Ron Jaworski said (and I agree) was "antiquated and needed to be revised to really reflect all the components that make up quarterback play."

The following chart summarizes the results of the regression in statistical terms:

Metric                      Win Correlation
Total Quarterback Rating    0.6765
Standard Passer Rating      0.6526

(Note: p-values less than .05)

For those not inclined towards statistics, a correlation of 1.0 describes perfect correlation (a perfect "fit").   A correlation of 0.0 shows no relationship.  A correlation of .70 or greater is generally considered strong.  Note, a strong correlation doesn't absolutely mean that TQR or passer rating causes winning.  Correlation is not causation (if you don't use this disclaimer, the stats trolls form an army and they kill you, or at least pound you on message boards).
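For anyone who wants to see the arithmetic behind a correlation figure, here is a minimal sketch of the Pearson correlation coefficient in plain Python. It is not necessarily the exact procedure behind the numbers above, and the ratings and winning percentages in it are invented purely for illustration; `pearson_r` is a hypothetical helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical quarterback ratings and winning percentages (NOT real data)
ratings = [45.0, 52.5, 60.1, 68.3, 75.9]
win_pct = [0.375, 0.438, 0.500, 0.625, 0.688]

r = pearson_r(ratings, win_pct)
print(f"correlation: {r:.4f}")
```

Swap in one row per quarterback (rating in one list, winning percentage in the other) and the same function produces the kind of figure shown in the chart.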

Both TQR and passer rating show a moderately strong correlation to a quarterback's winning percentage.  However, what really stood out to me is that TQR barely beats out its older and uglier cousin in correlating to winning.  Interestingly enough, TQR and passer rating are highly correlated to each other at .8750.  It makes total sense, then, that neither metric would be significantly more correlated to wins than the other.  TQR and passer rating are so highly correlated, in fact, one wonders why ESPN would bother calling the new metric a game changer.
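One way to put those three numbers side by side is to square them. In a simple one-variable regression, the squared correlation is the share of variance in one measure that the other accounts for. The sketch below uses only the figures reported above; the variable names are my own, and it assumes the reported values are ordinary Pearson correlations.

```python
# Correlations as reported in the article
R_TQR_WINS = 0.6765
R_PASSER_WINS = 0.6526
R_TQR_PASSER = 0.8750


def shared_variance(r):
    """Squared correlation: share of variance accounted for in a one-variable fit."""
    return r * r


tqr_r2 = shared_variance(R_TQR_WINS)
passer_r2 = shared_variance(R_PASSER_WINS)
overlap_r2 = shared_variance(R_TQR_PASSER)

print(f"TQR vs. winning percentage:           {tqr_r2:.1%}")
print(f"Passer rating vs. winning percentage: {passer_r2:.1%}")
print(f"TQR vs. passer rating:                {overlap_r2:.1%}")
print(f"Gap between the two metrics:          {tqr_r2 - passer_r2:.1%}")
```

By this yardstick TQR accounts for roughly 46% of the variation in winning percentage and passer rating roughly 43%, while the two ratings share about 77% of their variation with each other.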

Total Quarterback Rating and other well-known alternatives

After recognizing the close fit between TQR and passer rating, I was interested in exploring some other well-known quarterback metrics to see if there was anything better. Thus, I ran the regression--using the same period of 2008-2010-- against the following quarterback metrics:

  1. Pro Football Focus' Quarterback Rating System
  2. Touchdown rate (TD%)
  3. PFR's Adjusted Net Yards per Attempt (ANY/A)
  4. Advanced NFL Stats' Expected Points Added Per Play (EPA/P) and Win Probability Added Per Game (WPA/G)
  5. The Football Outsiders' DYAR and DVOA 

Here's what I found:

Metric                           Win Correlation
Total Quarterback Rating         0.6765
Standard Passer Rating           0.6526
Pro Football Focus QB Rating     0.5978
TD %                             0.5649
ANY/A                            0.6626
Win Probability Added/Game       0.7514
Expected Points Added/Attempt    0.7044
DYAR                             0.6450
DVOA                             0.6631

(note: p-values less than .05)

Surprisingly, or perhaps not so surprisingly to fans of Brian Burke's work, both WPA/G and EPA/P had a higher correlation to winning than did TQR (for a detailed explanation of Burke's two stats, go here).  WPA/G actually showed a strong correlation.  I chuckled at this when I saw the data because ESPN goes out of its way to thank Burke for advancing some important statistical concepts over the years, but then immediately implies they had to finish the heavy lifting.  The result was TQR.

It's also worth noting that two other stats (ANY/A and Football Outsiders' DVOA) were just slightly behind TQR.  This further dampens my enthusiasm for the game-changer tag.
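To make the pecking order easier to see, here is a small sketch that sorts the metrics by their reported correlation and shows each one's gap to TQR. The numbers are copied from the table above; the dictionary and variable names are just illustrative.

```python
# Win correlations as reported in the table above
correlations = {
    "Total Quarterback Rating": 0.6765,
    "Standard Passer Rating": 0.6526,
    "Pro Football Focus QB Rating": 0.5978,
    "TD %": 0.5649,
    "ANY/A": 0.6626,
    "Win Probability Added/Game": 0.7514,
    "Expected Points Added/Attempt": 0.7044,
    "DYAR": 0.6450,
    "DVOA": 0.6631,
}

# Sort from highest to lowest correlation with winning percentage
ranked = sorted(correlations.items(), key=lambda item: item[1], reverse=True)
tqr = correlations["Total Quarterback Rating"]

for rank, (metric, r) in enumerate(ranked, start=1):
    # The last column is the difference from TQR's correlation
    print(f"{rank}. {metric:<32}{r:.4f}  {r - tqr:+.4f}")
```

Sorted this way, TQR lands third of the nine, behind both of Burke's stats and only about 0.013 ahead of DVOA.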

Other Questions

Perhaps the more interesting questions are:

  1. Why are TQR and passer rating so highly correlated to one another?
  2. Why are Win Probability Added and Expected Points Added more highly correlated to winning percentage than TQR?

The first question perplexes me, and frankly, I don't have a good answer for it.  TQR should have separated itself from passer rating, yet it doesn't.  Is it simply that passer rating, after all these years of ridicule, actually does a decent job of rating passers in a passer-driven league?

I also find the second question fascinating because TQR supposedly tried to bridge the gap (if one exists) between Win Probability and Expected Points with its clutch metric.

Perhaps dividing up credit to the quarterback on any given play is just too hard a task and introduces far too much error into the process.  This issue has vexed me for years, and as I've watched more and more game tapes, I realize that the task is almost impossible.  Even the smallest breakdown by any one of the eleven players on a given play could place the quarterback's contribution in peril.  This is especially true when facing Haloti Ngata.  But that's another discussion entirely.

Ultimately, by the very fact that ESPN attempted to pull the quarterback's contribution out of EPA and WPA, they may have made their metric less correlated to wins and losses, and as such, betrayed what Trent Dilfer said they wanted to achieve.  After all, EPA and WPA include the full contribution of a quarterback's teammates.  In the end, perhaps the real lesson is this-- a quarterback's contribution to winning is not as important as Dilfer (or the rest of us) think.

After all, isn't that why god created diva wide receivers?

TJ Johnson can be reached through telegraph, ESP, Spanish interpreter, or via email: Follow him on Facebook and Twitter if you want to see him mock "the man."  He assumes you are following It’s All Over Fat Man on Facebook and Twitter, but if you are not, that’s nihilistic, man.

I’m glad we had this talk.  Now, vaya con Dios, Brah.
