Diving into Corsi: Limitations and applications of hockey’s ‘fancy stat’

In November, Sporting News presented a primer for hockey analytics. The aim of that primer was to introduce fans to the core concepts of one of the most significant shifts in hockey analysis.

Today, we’re going to do a deep dive into the “fancy stat” that you have likely heard the most about: Corsi.

Concept

Corsi can be tracked in any game situation, but analysts most commonly apply it to even-strength play, and to 5-on-5 in particular. This creates a level playing field for the stat, which is essential because time spent on the power play and the penalty kill would heavily skew the shot share.

Corsi measures all shot attempts (shots that are saved, hit the post or crossbar, are blocked, or miss the net, as well as goals) by an individual, a group of players (e.g. a line or defensive pair), or a team.

If Alex Ovechkin is on the ice and, during his shift, the Washington Capitals attempt two shots while the opposition attempts one, he would have a Corsi For (CF) of 2 and a Corsi Against (CA) of 1 in that shift, which makes for a Corsi differential of +1. If Ovechkin attempted both of the Capitals’ shots during that shift, he would have an individual Corsi For (iCF) of 2.

Because raw counts grow with ice time and are hard to compare across players and across larger samples, Corsi is most frequently represented as a percentage (CF%): shot attempts for divided by all shot attempts, for and against. If Ovechkin played a game where he was on the ice for eight shot attempts for the Capitals and four shot attempts against at 5-on-5, he would finish the game with a 66.67 CF%. Another way to phrase that number is to say that the Capitals took 66.67 percent of the shot share when Ovechkin was on the ice. That number would suggest that Ovi was helping to drive possession for the Capitals.
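The arithmetic behind those numbers can be sketched in a few lines of Python. This is a minimal illustration using the Ovechkin figures from the text. The function name `corsi_summary` is our own and does not come from any stats site.

```python
def corsi_summary(cf: int, ca: int) -> dict:
    """Return Corsi differential and CF% for a set of on-ice shot attempts."""
    total = cf + ca
    # CF% is undefined when no attempts were recorded in either direction.
    cf_pct = 100.0 * cf / total if total else None
    return {"CF": cf, "CA": ca, "diff": cf - ca, "CF%": cf_pct}


# One shift: two attempts for, one against.
shift = corsi_summary(cf=2, ca=1)
# One game: eight attempts for, four against at 5-on-5.
game = corsi_summary(cf=8, ca=4)

print(shift["diff"])          # 1
print(round(game["CF%"], 2))  # 66.67
```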

Possession

It’s essential to note that Corsi is not an exact measurement of possession. It is, however, an indicator of possession that reveals which team is winning the shot share battle and which players are the greatest influencers of it — both positive and negative.

In hockey, regardless of the score, you want to have the puck. 

Having control of the puck means you are either on the attack or preparing to transition to the attack, which means you have a higher probability of scoring a goal. Of course, you can’t win a game without scoring a goal (unless you want to count the shootout, but who wants to do that?). Having the puck is also a great way to prevent your opponent from scoring a goal. 

We can all point to plenty of examples of teams winning games because they excelled at the counterattack despite being out-shot and out-chanced, but over the course of an 82-game regular season, the teams with the most success are typically those that excel in the shot share battle at even strength (and, more specifically, at 5-on-5).

Like Corsi, Fenwick is also an indicator of possession. Unlike Corsi, it excludes blocked shots from its measurement. So it encompasses less data, but can offer different insight for those who value shot-blocking as a skill and want to take it into consideration when analyzing the shot metrics. 
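To see how the two metrics can diverge, here is a small sketch that counts the same made-up sequence of shot attempts both ways. The event labels and the sample data are assumptions for illustration. Real play-by-play feeds use their own naming.

```python
# Hypothetical outcome labels for shot attempts.
CORSI_EVENTS = {"goal", "saved", "post", "missed", "blocked"}
# Fenwick counts everything Corsi does except blocked attempts.
FENWICK_EVENTS = CORSI_EVENTS - {"blocked"}


def share(events, team, kinds):
    """Percentage of counted attempts (outcome in `kinds`) taken by `team`."""
    total = sum(1 for _, kind in events if kind in kinds)
    ours = sum(1 for t, kind in events if t == team and kind in kinds)
    return 100.0 * ours / total if total else None


# Made-up sample of (team, outcome) pairs.
events = [
    ("WSH", "saved"), ("WSH", "blocked"), ("WSH", "blocked"),
    ("WSH", "goal"), ("BOS", "missed"), ("BOS", "saved"),
]

print(round(share(events, "WSH", CORSI_EVENTS), 2))    # 66.67
print(round(share(events, "WSH", FENWICK_EVENTS), 2))  # 50.0
```

In this invented sample, Washington’s edge disappears once the two blocked attempts are removed, which is the kind of difference the two stats can expose.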

Quantity vs. Quality

If shot metrics were an onion, examining the “quantity versus quality” debate would be the next layer of the bulb. The crux of shot quality is that some shots, depending on where and in what situation they are taken, are more valuable than others because they have a higher probability of becoming a goal.

Picture a piñata and imagine that it has a single piece of candy inside. Each blindfolded bash you give it increases your chance of releasing that candy from its cardboard prison. But not every whack with the bat has the same odds of getting that candy free.

If each blindfolded whack represents a random Corsi event, a whack with an aluminum baseball bat would represent a High-Danger Corsi For event (HDCF).

The nitty-gritty behind shot quality comes down to assigning a value to each shot attempt based on its historical probability of becoming a goal. This scans with the eye test: shots taken from the slot, rebounds, and shots taken after receiving a pass from the other side of the slot all have a higher chance of being goals than an unscreened shot taken from the point.
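A rough sketch of that idea follows. The goal probabilities below are invented purely for illustration and are not taken from any published shot-quality model. The point is only that a handful of dangerous attempts can outweigh a pile of harmless ones.

```python
# Invented goal probabilities per attempt type (assumptions, not real data).
GOAL_PROBABILITY = {
    "point_unscreened": 0.02,
    "slot": 0.12,
    "rebound": 0.20,
    "cross_slot_pass": 0.25,
}


def weighted_attempts(attempts):
    """Sum the assumed goal probability of every shot attempt."""
    return sum(GOAL_PROBABILITY[kind] for kind in attempts)


# Team A wins on quantity, Team B on quality.
team_a = ["point_unscreened"] * 10
team_b = ["slot", "rebound", "cross_slot_pass"]

print(len(team_a), round(weighted_attempts(team_a), 2))  # 10 0.2
print(len(team_b), round(weighted_attempts(team_b), 2))  # 3 0.57
```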

Score and Venue Adjustment

When a team is up by a score of 5-1 midway through the third period, they are going to take their foot off of the gas and the team that is trailing is going to put the pedal to the metal. This is why score-adjusted Corsi exists. Home-ice advantage has a measurable effect on the shot share too, which is why venue is adjusted for as well.

Micah Blake McCurdy and others have developed formulas to adjust Corsi and Fenwick values based on the state of the score and the venue. Some of these formulas also take into consideration factors like zone starts. Collectively, these adjustments are commonly referred to as adjusted Corsi. 
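The general shape of a score adjustment can be sketched as a weighted count. To be clear, the weights below are invented for illustration and are not McCurdy’s or anyone else’s published coefficients. Venue is left out entirely to keep the example short.

```python
# Hypothetical weights keyed by the shooting team's score state.
# Trailing teams tend to pile up attempts, so theirs count for a bit less;
# leading teams tend to sit back, so theirs count for a bit more.
STATE_WEIGHT = {"trailing": 0.9, "tied": 1.0, "leading": 1.1}


def adjusted_cf_pct(attempts_for, attempts_against):
    """Each argument maps the shooter's score state to a raw attempt count."""
    wf = sum(STATE_WEIGHT[s] * n for s, n in attempts_for.items())
    wa = sum(STATE_WEIGHT[s] * n for s, n in attempts_against.items())
    return 100.0 * wf / (wf + wa)


# Made-up game in which our team chased the score most of the night.
ours = {"trailing": 30, "tied": 10}
theirs = {"leading": 12, "tied": 8}

raw = 100.0 * 40 / (40 + 20)
adj = adjusted_cf_pct(ours, theirs)
print(round(raw, 2), round(adj, 2))  # 66.67 63.57
```

The adjusted figure comes out lower than the raw one because most of the attempts were taken while trailing.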

Without these adjustments, Corsi numbers can be misleading in both small and large sample sizes. Adjusted numbers are, speaking broadly, more insightful than raw Corsi data because they take into consideration that all-important factor in sports analytics: context.

https://images.daznservices.com/di/library/sporting_news/c0/90/corsi-nhl-standings-2018-19_1j4do5igr5t2g1ck20zsz6cet3.png

The table above shows the top-15 teams in adjusted Corsi from the 2018-19 season. It also shows where they finished in their respective conferences.

As you can see, the big outlier here is the Minnesota Wild, who had a record of 37-36-9 and finished well outside of the playoff race. Last year, the Wild were average on the power play and good on the penalty kill but they had below-average goaltending and the worst shooting percentage in the league (all situations). Pair that with the fact that they play in the Central Division, and you have a team that misses the playoffs.

Application

Let’s look at an example. On Dec. 23, 2019, the Boston Bruins defeated the Washington Capitals 7-3 at TD Garden. This was, in a word, a wacky game.

By the end of the game, the Capitals had crushed the Bruins in the shot share with a 70.18 CF%. Even when we factor in the score state and venue, the Capitals had a 66.44 Adj. CF%. So, how did they lose?

The Bruins scored four goals in the first period (two of which were scored on the power play). They potted a shorthanded goal before the Capitals finally scored with 5:25 left in the second period. It’s also worth noting that the Capitals owned the shot share in the second period with an 84 CF%, which was a valiant response to being down 4-0 after the first 20 minutes of the game. It’s even more important to note that Boston’s last two goals were empty-netters. 

Both the Bruins and Capitals are good hockey teams and good possession teams. Down 4-0 at the first intermission, the Capitals’ only choice to get back into the game was to take chances and open up their game. This resulted in a lopsided 32-8 Corsi advantage in the final two periods, including an 8-3 edge in High-Danger Corsi. The Caps outscored the Bruins 3-0 at 5-on-5 in the last two periods, but hockey isn’t played exclusively at 5-on-5. Washington went 0-for-5 on the power play, allowed a shorthanded goal, and allowed two empty-net goals.

It should be clear by now that winning the shot share battle at 5-on-5 and/or creating more high-danger chances and shot attempts than the opposition will not guarantee you victory. Why? Well, because it’s hockey.

A common criticism of the application of analytics to hockey is that the sport is too fast, too free-flowing and improvisational. Sports analytics began in baseball, a sport that is far more hospitable to the science of statistics due to its very nature.

Almost every event in baseball starts the same way. The pitcher is on the mound and has the ball. There are countless factors to take into consideration beyond that, but the static state of the pitcher hurling the ball towards home plate at the start of almost every play made for fertile ground for statistical analysis and, later, the application of theories and strategies harvested from that data.

Sports don’t get a lot less static than ice hockey. Hockey is fast and, at times, almost chaotic. However, tools like Corsi enable us to predict the success of an individual and a team better than point totals, shots on goal, or wins. The accuracy with which we can project success scales with the number of events we take into consideration. In other words, large sample sizes tell us more than small sample sizes.
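One way to see why sample size matters is to ask how much an observed CF% could wobble by chance alone. The sketch below treats every shot attempt as an independent coin flip, which is a simplification of our own and not how hockey actually works, and uses the standard error of a proportion. The 52 percent share and the attempt counts are hypothetical.

```python
import math


def cf_pct_margin(true_share, attempts, z=1.96):
    """Approximate 95% margin, in percentage points, around an observed CF%.

    Assumes each attempt is an independent trial with a fixed probability
    of belonging to our team (a deliberate simplification).
    """
    se = math.sqrt(true_share * (1 - true_share) / attempts)
    return 100.0 * z * se


# Hypothetical attempt counts: roughly a game, ten games, a hundred games.
for n in (60, 600, 6000):
    print(n, round(cf_pct_margin(0.52, n), 1))
# 60 12.6
# 600 4.0
# 6000 1.3
```

Under these assumptions, a single game’s CF% can swing by double digits from luck alone, while a much larger sample narrows that range considerably.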

If we ignored every game the Bruins played this season before they crossed sticks with the Capitals on Dec. 23, we would likely have a poor understanding of why they win hockey games. When we take into consideration a larger sample of games, that picture becomes much clearer. The Bruins are good because they have an elite power play, one of the best first lines in hockey, and are a good possession team. They are a lot more than a team that can win after building a 4-0 lead in the first 20 minutes thanks to the power play and some good puck luck.

Context

As we have mentioned before, context is everything in analytics. As integral as possession is to success, it is not everything. 

The same is true for Corsi and all other stats. Corsi is a valuable tool, but it should never be viewed as an unassailable truth that can stand alone. Its value is enhanced when we take into consideration as many factors as possible. For many, it represents the hammer in the toolbox, the go-to number after the boxscore stats, but there are many other tools that complement it and thereby increase its value.

All data courtesy of naturalstattrick.com and evolving-hockey.com.
