"Completion Probability" - A Sports Data Science Article on the Chiefs from KCSN
The latest sports data/analytics piece for KCSN from contributor Joseph Hefner
How accurate is your favorite quarterback? Where does he rank amongst the rest of the NFL? And how can we know if we’re not watching and grading every other QB’s games? We could check the box stats, I suppose.
Box stats are the stats you can find on the ESPN stat tracker page when you’re checking a game. They include things like passing attempts, completions, completion percentage, passing yards, rushing yards, etc.
Box stats are important stats, but they often fail to tell the real story of how a game unfolded. The obvious example of this is garbage time yards, where a defense happily allows 75+ yards in the final minutes of a game, because they’re playing prevent defense. Garbage time stats inflate all of the box stats, and make the offense look better than it actually was (and conversely, the defense worse).
Completion percentage is often used to describe how accurately a quarterback throws the ball. The problem is that it doesn’t account for how deep the QB is throwing or for the game situation. So a QB who throws a lot of short passes, screens, and dump-offs will have a much higher completion percentage than a QB who tries to push the ball downfield.
Completion Probability (CP) tries to account for this. Its model uses historical data (2006 to current) and a number of different factors: yard line, whether the offense is at home, indoor or outdoor stadium, down, yards to go, air yards, pass location, and whether the QB was hit. It is also era adjusted.
Essentially, though, it comes down to this: the farther the QB throws the ball from the line of scrimmage, the less likely the pass is to be completed. That makes intuitive sense. A screen pass is very likely to be completed, but a 50-yard bomb is much less likely to be completed.
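To make that relationship concrete, here is a toy sketch in Python. It is not the public CP model: the real model uses all the factors listed above, and the function name, the logistic shape, and both coefficients below are made up purely for illustration. The only thing it is meant to show is that the estimated probability falls as air yards go up.

```python
import math

# Hypothetical coefficients, chosen only so the curve slopes downward.
# They are NOT taken from the public CP model.
TOY_INTERCEPT = 2.0
TOY_AIR_YARDS_SLOPE = -0.06


def toy_completion_probability(air_yards: float) -> float:
    """Toy estimate of completion probability from air yards alone."""
    score = TOY_INTERCEPT + TOY_AIR_YARDS_SLOPE * air_yards
    # Logistic function squeezes any score into the 0-to-1 range.
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    for depth in (0, 10, 25, 50):
        cp = toy_completion_probability(depth)
        print(f"{depth:>2} air yards -> toy CP {cp:.2f}")
```

The exact numbers it prints mean nothing on their own. What matters is the direction: each deeper throw gets a lower toy CP than the one before it.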
Now, the public CP model doesn’t have access to the tracking data that proprietary CP models (like Next Gen Stats) have, which means it’s somewhat less accurate than those models. Maybe someday the NFL will release the tracking data, but until then, we make do with the data we have. Completion Probability takes the factors listed above and estimates how likely a pass is to be caught.
The Completion Percentage Over Expected (CPOE) calculation is easier to understand with a concrete example, so let’s bring in Kadarius Toney’s first catch as a Chief. It was a screen pass on the first offensive play of the game.
The completion probability (CP) for that pass was 88% (0.88 CP). The outcome of the pass was a catch, so it gets listed as a one (1). An incompletion would be a zero (0). CPOE is the difference between the CP and the pass outcome (1 or 0), so in this case, the CPOE was 1 – 0.88 = 0.12 CPOE.
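The arithmetic for a single pass is small enough to write out. This is a minimal sketch, with `cpoe` as a hypothetical helper name rather than anything from the public model's code. It uses the 0.88 CP from the Toney screen, and also shows what the same pass would have scored had it been dropped.

```python
def cpoe(completion_probability: float, completed: bool) -> float:
    """CPOE for one pass: the outcome (1 or 0) minus the completion probability."""
    if not 0.0 <= completion_probability <= 1.0:
        raise ValueError("completion probability must be between 0 and 1")
    outcome = 1.0 if completed else 0.0
    return outcome - completion_probability


if __name__ == "__main__":
    # The screen to Toney: 0.88 CP, caught.
    print(round(cpoe(0.88, completed=True), 2))
    # The same throw, had it fallen incomplete (hypothetical).
    print(round(cpoe(0.88, completed=False), 2))
```

Notice the asymmetry: completing an easy throw earns very little, while missing it costs a lot. Deep throws work the other way around.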
Average the CPOE over the course of a game or a season, and you start to get a picture of which QBs are more accurate than others. Receivers drop balls, of course, which hurts a QB’s CPOE, but with a couple of exceptions, drops are pretty random, so over the course of several games or a season, they don’t affect this stat much.
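Here is roughly what that averaging looks like in code. This is a sketch, not the actual pipeline behind the numbers in this article: the `average_cpoe` name, the "QB A" and "QB B" labels, and every pass in the sample list are invented for illustration.

```python
from collections import defaultdict


def average_cpoe(passes):
    """Average CPOE per quarterback.

    `passes` is an iterable of (qb_name, completion_probability, completed).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for qb, cp, completed in passes:
        outcome = 1.0 if completed else 0.0
        totals[qb] += outcome - cp
        counts[qb] += 1
    return {qb: totals[qb] / counts[qb] for qb in totals}


# Invented passes: both QBs attempt the same three throws,
# but QB A completes the hard one and QB B does not.
sample_passes = [
    ("QB A", 0.88, True),
    ("QB A", 0.40, True),
    ("QB A", 0.65, False),
    ("QB B", 0.88, True),
    ("QB B", 0.40, False),
    ("QB B", 0.65, False),
]

if __name__ == "__main__":
    for qb, value in average_cpoe(sample_passes).items():
        print(f"{qb}: {value:+.3f}")
```

Both made-up QBs complete the easy screen, so the difference between them comes entirely from the harder throw. That is the point of the stat: it rewards completions in proportion to how unlikely they were.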
Here’s a graphic showing both Expected Points Added (EPA) and Completion Percentage Over Expected (CPOE) for all QBs from the 2021 and 2022 seasons. Combining these two metrics into one graphic lets you see how QBs are performing in multiple categories. The top right corner holds the QBs who are performing above average in both stats.
Looking at the graphic above, we see that Tua, Burrow, and Geno Smith(??) have been very accurate, while Zach Wilson and Baker Mayfield have been the least accurate. I think that pretty well matches the eye test. Drew Brees leads this stat for most years he was in the league.
Also note the bubble size. That indicates the number of dropbacks for each QB. Geno and Bridgewater have very small bubbles, which means a small sample size. Tua also has a small sample, though not quite as small. Generally, the larger the sample size, the more confident you can be in a number.
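One rough way to put a number on that confidence is the standard error of the average, which shrinks as the number of passes grows. The sketch below is an assumption-heavy illustration, not something the graphic itself computes: the helper name and the per-pass CPOE values are invented, and the "large" sample is just the small one repeated so that only the sample size changes.

```python
import math
import statistics


def cpoe_standard_error(cpoe_values):
    """Standard error of the mean CPOE: sample standard deviation / sqrt(n)."""
    n = len(cpoe_values)
    if n < 2:
        raise ValueError("need at least two passes to estimate spread")
    return statistics.stdev(cpoe_values) / math.sqrt(n)


# Invented per-pass CPOE values for a QB with only a handful of throws.
small_sample = [0.12, -0.88, 0.60, -0.40]
# The same mix of outcomes repeated 25 times, standing in for a full workload.
large_sample = small_sample * 25

if __name__ == "__main__":
    print(f"4 passes:   +/- {cpoe_standard_error(small_sample):.3f}")
    print(f"100 passes: +/- {cpoe_standard_error(large_sample):.3f}")
```

Both samples have the same average CPOE, but the larger one pins it down much more tightly. That is why a small bubble deserves more skepticism than a big one.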
Mahomes is the 6th most accurate QB over the last two years, while also leading the league in EPA/play by a healthy margin. That’s pretty good, people. I ran this for 2018-2021, just for fun, and Pat is STILL 1st overall in EPA/play and 6th in CPOE. He’s been incredibly consistent the entire time he’s been in the league.
We’ll see if Tua and Geno can continue to put up these numbers the rest of the season, as their sample size gets bigger. Their small sample size is about the only thing we can ding them for right now. We can be very confident that Mahomes is going to continue to dominate these categories going forward, as his sample size is massive at this point.