There is a theory in hockey that states that the team that scores more goals in a game will win the game. Although this theory has seen challenges within the Texas Education Board among others, it has nevertheless gained acceptance throughout the hockey community. The objective of a general manager is to assemble a group of players who can consistently outscore the opposition. There are two facets to this: defense and offense. Defense is about preventing goals whereas offense strives to create goals. Predominantly, goals are scored by forwards. But what qualities do scoring forwards typically possess? How can general managers, and of course fans, identify who among the current crop of NHL forwards can be expected to produce goals consistently? In other words, what do the best forwards do to generate scoring chances? Please join me after the jump to read more about a new ranking system that I have developed.
In homage to the prominence and popularity of statistics like Corsi and Fenwick, I have decided to call my development the Smith Forward Ranking System. I will refer to it as the SFRS in this post, where I will discuss its application to forwards who have played more than 30 games. The data for the past four seasons (the time covered by the SFRS) can be found at www.behindthenet.ca. The underlying premise of the SFRS is that forwards "control" the number of goals they score by controlling the number of shots that they take as well as where on the ice they shoot from. Furthermore, good forwards tend to make their line mates better, resulting in more shots for the team while they are on the ice. As well, good forwards can find shooting opportunities despite playing in unfavourable situations, such as starting in the defensive zone.
What has been excluded?
Before discussing how these statistics can be combined to form the SFRS, it is important to justify why the system excludes measurements such as shooting percentage and quality of teammates.
Figure 1 shows the distribution of shooting percentages in 2009 versus 2010. Figure 2 shows the distribution of shots per 60 minutes in 2009 versus 2010. Clearly, the number of shots a player takes is far more repeatable from one season to the next than his shooting percentage. This makes shot rate a more reliable statistic than shooting percentage.
Figure 1: Shooting percentages in 2009 versus 2010.
Figure 2: Shots per 60 minutes in 2009 versus 2010.
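One way to put a number on the repeatability argument above is a year-over-year correlation for each statistic. The following is only a minimal sketch: the five forwards and all of their values are made up for illustration and are not taken from behindthenet.ca, and `pearson_r` is a helper written for this example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up values for five hypothetical forwards, NOT real NHL data.
shots_per60_2009 = [6.1, 7.4, 8.9, 9.8, 11.2]
shots_per60_2010 = [6.4, 7.1, 9.3, 9.5, 10.8]
shooting_pct_2009 = [8.0, 14.5, 9.2, 12.1, 10.3]
shooting_pct_2010 = [11.7, 9.0, 12.4, 8.8, 10.9]

# A statistic is "repeatable" when one season predicts the next,
# i.e. when the year-over-year correlation is high.
r_shots = pearson_r(shots_per60_2009, shots_per60_2010)
r_pct = pearson_r(shooting_pct_2009, shooting_pct_2010)

print(f"shots per 60, year over year: r = {r_shots:.2f}")
print(f"shooting %,   year over year: r = {r_pct:.2f}")
```

With real data, the same comparison would be run over every forward who met the games-played cutoff in both seasons.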
Quality of teammates is not considered in the SFRS because, as previously mentioned, good players tend to make their line mates better. A player should not be punished for having good line mates, as it could be indicative of the skill level of the player.
Methodology
Ultimately, the variables mentioned earlier – shots per 60 minutes, shot distance, on-ice team shots per 60 minutes, zone shift, and zone starts – serve as the foundation of the SFRS. However, the raw statistics have varying units and thus cannot be combined in a truly meaningful manner. This problem is solved by normalizing all the data with respect to either the rest of the league or a forward’s team. Essentially, all the numbers that comprise a set are divided by the maximum number in the set, resulting in a unit-less number between 0 and 1, where 1 is better. In the case of shot distance, where a smaller number is better, the raw data was first inverted. Zone shift was adjusted according to an expected zone shift for a given forward’s zone start. Figure 3 shows a graphical representation of how the expected zone shift function was determined.
Figure 3: Ozone% versus zone shift.
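The normalization steps described above can be sketched roughly as follows. Several details here are assumptions rather than the exact procedure: "inverted" is taken to mean the reciprocal, the expected zone shift is assumed to come from a straight-line fit of zone shift against Ozone%, and every number is made up for illustration.

```python
def normalize_by_max(values):
    """Divide every value by the set's maximum, so the best value becomes 1."""
    peak = max(values)
    return [v / peak for v in values]

def invert(values):
    """Assumed inversion: reciprocals, so shorter distances score higher."""
    return [1.0 / v for v in values]

def fit_line(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Made-up raw statistics for three hypothetical forwards.
shots_per60 = [6.0, 9.0, 12.0]
shot_distance_ft = [40.0, 32.0, 25.0]
ozone_pct = [45.0, 50.0, 55.0]
zone_shift = [2.0, 0.5, -2.0]

norm_shots = normalize_by_max(shots_per60)
norm_distance = normalize_by_max(invert(shot_distance_ft))

# Adjusted zone shift: actual zone shift minus the shift expected
# for a forward with that Ozone% (assumed linear relationship).
slope, intercept = fit_line(ozone_pct, zone_shift)
adjusted_zone_shift = [zs - (slope * oz + intercept)
                       for oz, zs in zip(ozone_pct, zone_shift)]

print(norm_shots)
print(norm_distance)
print(adjusted_zone_shift)
```

Dividing by the maximum keeps the ordering of the forwards intact while removing the units, which is what allows the statistics to be combined later.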
With the foundation of the SFRS established, the next step is processing the variables. This was accomplished by finding a linear combination of the statistics such that maximum correlation was achieved with goals per 60 (since goals are so important), normalized with respect to the league. The algorithm simply iterates through a number of scenarios and determines the point at which the R² value is a maximum.
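The search described above can be illustrated with a simple grid search. This is a sketch under stated assumptions, not the exact algorithm: the weight grid, the three feature columns, and all values are made up, and `r_squared` and `best_weights` are helper names invented for this example.

```python
from itertools import product

def r_squared(xs, ys):
    """Squared Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov * cov / (var_x * var_y)

def best_weights(features, target, grid):
    """Try every weight combination on the grid; return (best R^2, weights)."""
    best = (-1.0, None)
    for weights in product(grid, repeat=len(features)):
        combined = [sum(w * col[i] for w, col in zip(weights, features))
                    for i in range(len(target))]
        score = r_squared(combined, target)
        if score > best[0]:
            best = (score, weights)
    return best

# Made-up normalized statistics for six hypothetical forwards.
features = [
    [0.45, 0.60, 0.72, 0.80, 0.91, 1.00],  # shots per 60
    [0.70, 0.55, 0.90, 0.65, 1.00, 0.85],  # inverted shot distance
    [0.80, 0.75, 0.85, 1.00, 0.90, 0.95],  # on-ice team shots per 60
]
goals_per60_norm = [0.30, 0.35, 0.62, 0.55, 1.00, 0.81]

grid = [0.0, 0.25, 0.5, 0.75, 1.0]  # assumed candidate weights
best_r2, best_w = best_weights(features, goals_per60_norm, grid)
print(best_r2, best_w)
```

Because correlation does not change when every weight is scaled by the same positive factor, only the ratios between the weights matter in a search like this.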
It was in this stage that a forward’s quality of competition was found to have a positive correlation with goal scoring. This indicates that some of the best forwards play against the toughest competition. This is discussed further here: http://www.arcticicehockey.com/2011/7/27/2294013/further-to-does-qualcomp-matter. Accordingly, quality of competition was ignored in the rest of the algorithm.
Figure 4 shows a plot of the resulting linear combination of the variables versus normalized goals per 60 minutes. This linear combination explains about 30% of the variation in goal scoring. Much of the rest is explained by shooting percentage, which is not player-controlled (see above). Table 1 shows the coefficients for the various underlying statistics.
Figure 4: Linear combination of normalized statistics versus goals per 60 minutes.
Table 1: Final coefficients determined by algorithm.
It is important to note that of all the variables, only zone shift was negatively correlated with goal scoring. In other words, the more a forward starts in the defensive zone compared to his teammates, the less likely he is to score goals. This is not surprising and is in fact an anticipated outcome. In terms of the SFRS, it means that in calculating the ranking of a forward, the zone shift factor must be subtracted from the linear combination of the remaining variables. On the other hand, when producing plots to correlate with goals, the zone shift factor must be added. This is to be consistent with the fact that starting in the offensive zone presents better goal scoring opportunities.
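The sign convention described above can be shown with a small sketch. The coefficients below are placeholders standing in for Table 1, not the real values, and the forward's normalized statistics are made up; `combine` is a hypothetical helper written for this example.

```python
# Placeholder coefficients standing in for Table 1 (NOT the real values).
coefficients = {
    "shots": 0.5,
    "distance": 0.25,
    "team_shots": 0.25,
    "zone_starts": 0.25,
    "zone_shift": 0.5,
}

def combine(stats, coefficients, zone_shift_sign):
    """Weighted sum of the statistics, with zone shift entering as +1 or -1."""
    base = sum(coefficients[name] * value
               for name, value in stats.items() if name != "zone_shift")
    return base + zone_shift_sign * coefficients["zone_shift"] * stats["zone_shift"]

# Made-up normalized statistics for one hypothetical forward.
forward = {
    "shots": 0.8,
    "distance": 0.6,
    "team_shots": 0.9,
    "zone_starts": 0.4,
    "zone_shift": 0.2,
}

# Ranking: the zone shift factor is subtracted.
ranking_score = combine(forward, coefficients, -1)
# Plotting against goals: the zone shift factor is added.
plot_score = combine(forward, coefficients, +1)

print(ranking_score, plot_score)
```

The two scores differ only by twice the zone shift term, so the choice of sign changes a forward's position without touching the other four statistics.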
So what does all this mean? Good forwards will take lots of shots, they will shoot from close to the net, they will allow their team to generate a significant number of shots while on the ice, and they will do so despite disadvantageous zone starts.
Results
And now the moment of truth: in Tables 2 to 5, I present to you the top forwards for the 2007-2008, 2008-2009, 2009-2010, and 2010-2011 seasons (they are quite large, so click on them to expand). Immediately, you should notice some recurring themes. Firstly, this ranking system absolutely loves forwards who take lots of shots. Regrettably, this comes at the expense of playmakers. Forwards like Joe Thornton, Ryan Getzlaf, and even Sidney Crosby are underrated in the SFRS. Secondly, the best forwards in the SFRS are not sheltered. The SFRS highly values forwards who start in their own zone but still manage to produce shots. Thirdly, Sean Bergenheim? The SFRS also favours bottom six forwards who dominate the opposition. Bergenheim, for instance, led the Tampa Bay Lightning in shots for per 60 in 2010-2011. The SFRS ranks truly elite shooting forwards at the top of the list along with the occasional bottom six player. While this can be distracting, please do not lose sight of the big picture. Finally, Phil Kessel is good.
Table 2: 2010 SFR.
Table 3: 2009 SFR.
Table 4: 2008 SFR.
Table 5: 2007 SFR.
Rather than offering any more of my conclusions, I would like to open up the comments to discussion. I hope that this post has allowed for even a basic understanding of the derivation of the SFRS. More importantly, I hope that the results provide some interest to everyone and spark some interesting conversations.