What Statistics Are Meaningful In A Given Season?

What numbers really matter when you're worrying about the success of your team?

Chris McGrath

So we're into the summertime re-hashing of what did and didn't work for various clubs in the NHL this past season. Last off-season we used this time to try to explain the meaning and use of a variety of "advanced" statistics - most of which really aren't that advanced and don't require you to do more than count or calculate a percentage.

For those of you who missed those posts, feel free to click on the links below to get caught up.

Intro to PDO

Intro to Fenwick

Intro to Corsi

So that brings me to where we are in THIS off-season. Frankly a lot of ink and/or blood has been spilled in the debate over the Leafs' season, much of it centred on whether this team was successful thanks to increased toughness, more blocked shots, defensive "systems" that pushed shots to the outside, goaltending, or just good ol' luck. Lastly of course we have the oft-referenced "intangibles", which strike me as an absurd thing to cite, since you're basically pointing to something you can't describe in any way.

Then this week we have Joffrey Lupul go out and tweet this:

I get fairly tired - as I'm sure many of you do - of debating the meaningfulness of various statistics in terms of team success in the NHL. A lot of us discuss the importance of Corsi, Fenwick, SV%, Goals, Hits, Blocked Shots, Fighting Majors, etc. when we want to give our personal views on what really "matters" when it comes to team building.

Unfortunately - this isn't the sort of thing that is really open to debate. Anything we track numerically is quantifiable. If someone somewhere is watching it and tracking it with numbers, we can then compare those numbers to the outcomes we're looking for: wins and points in the standings. The "intangibles" side remains in the ether in all of this - obviously if something is "intangible" then nobody can describe it meaningfully with numbers because, well, it's NOT TANGIBLE. But I digress...

So - with all of the above in mind - I decided to run correlations over the past 6 years of regular season NHL hockey between a range of team stats and each team's point percentage and win percentage. I then ran year over year correlations for each statistic to see how well a team's number in one season carries over to the next. That is to say - I determined which of these stats are repeatable with a relatively high degree of certainty.

Just to describe what a correlation IS for those of you who aren't particularly mathematically or statistically inclined: we measure correlation using Pearson's correlation coefficient, which in the tables below is under the column R. The R value is a number between -1 and +1 that describes the strength and direction of the linear relationship between two variables - in this case the statistic on the table and either Team Pt% or Team Win%. Values closer to +1 are highly positively correlated (i.e. teams that do well in that stat tend to get more points or win more). Values closer to -1 are highly negatively correlated (i.e. teams that rack up a lot of that stat tend not to be doing very well). Values close to ZERO show virtually no linear relationship with wins and losses or points in the standings.
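For anyone who wants to see the arithmetic behind an R value, here is a minimal sketch of Pearson's correlation coefficient in plain Python. The function name `pearson_r` and the five-team sample are made up for illustration; they are not the data or the tooling behind the tables in this post.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of the products of each pair's deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for five teams, NOT real NHL data.
gf_pct = [55.0, 52.0, 50.0, 47.0, 45.0]  # made-up 5v5 GF%
pt_pct = [0.65, 0.60, 0.55, 0.48, 0.45]  # made-up Pt% for the same teams

r = pearson_r(gf_pct, pt_pct)
print(f"R = {r:.3f}")
```

In this made-up sample the teams with the better goal share also have the better point percentage, so R comes out close to +1. Flip one of the lists around and it would come out close to -1.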

When we want to know how MUCH of a result is explainable by a specific variable, we square the R value (the R2 column). This gives us the Coefficient of Determination: the share of the variation in Pt% or Win% that moves in step with that statistic. These values range from 0 to +1. Values closer to +1 account for more of the variation in Pt% or Win%, while values closer to 0 account for almost none of it.
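As a quick check on how the R and R2 columns relate, this small sketch squares a few of the Pt% R values that appear in the tables in this post. The dictionary name `pt_r` is just a label for the example, and because the published R values are already rounded, a squared value can occasionally differ from the printed R2 in the last digit.

```python
# Pt% R values copied from the tables in this post.
pt_r = {
    "5v5 GF%": 0.838,
    "GA/G": -0.722,
    "5v5 PDO": 0.546,
    "Total Hits": 0.025,
}

# Squaring removes the sign, so R2 tells you how strong the
# relationship is but not which direction it runs.
r_squared = {stat: round(r * r, 3) for stat, r in pt_r.items()}

for stat, value in r_squared.items():
    print(f"{stat}: R2 = {value:.3f}")
```

Note that GA/G has a negative R but a positive R2: allowing goals hurts you, and it does so to a degree that tracks roughly half of the variation in Pt%.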

First let us explore the statistics that show a high correlation to wins and points in the standings:

| Statistic | Reliability | Pt% R | Pt% R2 | Win% R | Win% R2 |
| --- | --- | --- | --- | --- | --- |
| 5v5 GF% | 0.175 | 0.838 | 0.702 | 0.816 | 0.665 |
| 5-5 F/A | 0.163 | 0.829 | 0.687 | 0.811 | 0.657 |
| 5v5 Close GF% | 0.079 | 0.808 | 0.653 | 0.773 | 0.597 |
| GA/G | 0.228 | -0.722 | 0.521 | -0.685 | 0.469 |
| 5v5 GA60 | 0.162 | -0.685 | 0.469 | -0.637 | 0.406 |
| G/G | 0.114 | 0.591 | 0.349 | 0.604 | 0.365 |
| 5v5 Close GA60 | 0.050 | -0.625 | 0.391 | -0.560 | 0.314 |
| 5v5 Close FF% | 0.337 | 0.584 | 0.341 | 0.558 | 0.312 |
| 5v5 Close CF% | 0.382 | 0.584 | 0.341 | 0.551 | 0.304 |
| 5v5 Close SF% | 0.323 | 0.560 | 0.314 | 0.537 | 0.289 |
| 5v5 PDO | 0.042 | 0.546 | 0.298 | 0.531 | 0.282 |
| 5v5 Close GF60 | 0.038 | 0.514 | 0.265 | 0.531 | 0.282 |
| 5v5 FF% | 0.330 | 0.542 | 0.294 | 0.523 | 0.273 |
| 5v5 GF60 | 0.064 | 0.500 | 0.250 | 0.515 | 0.266 |
| 5v5 SF% | 0.326 | 0.523 | 0.274 | 0.508 | 0.258 |
| 5v5 CF% | 0.371 | 0.532 | 0.283 | 0.507 | 0.257 |

These are ranked by the strength of their correlation to Win%, positive or negative. Everything listed here is important to a description of how a team is doing in the regular season, and virtually every stat listed is strongly tied to team success. The top seven statistics are all measures of goals for and/or against during the season. Obviously these would have the largest impact on wins and losses. Next come the shot metrics, all of which reflect factors that make a significant difference on the ice. Lastly, in the middle of the shot metrics, you'll notice 5v5 PDO - which is just the sum of a team's 5v5 SH% and SV%. This is yet another meaningful and important way of tracking a team's success in the regular season.
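Since PDO is nothing more than shooting percentage plus save percentage, it takes only a few lines to compute. This is a minimal sketch: the function name `pdo` and the goal and shot totals are hypothetical, and it assumes both percentages are expressed on a 0-100 scale so that a league-average team lands near 100.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO = 5v5 SH% + 5v5 SV%, both on a 0-100 scale."""
    sh_pct = 100.0 * goals_for / shots_for
    sv_pct = 100.0 * (1.0 - goals_against / shots_against)
    return sh_pct + sv_pct

# Hypothetical 5v5 totals for one team, NOT real NHL data.
team_pdo = pdo(goals_for=150, shots_for=1800, goals_against=140, shots_against=1900)
print(f"PDO = {team_pdo:.1f}")
```

A team that scores on 8% of its shots and stops 92% of the shots it faces sits at exactly 100. Anything well above that means the pucks are going in at one end, staying out at the other, or both.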

So we have a collection of the most relevant team metrics in hockey for a single year - but they are NOT all reliable in the long run over multiple years. How can we tell this? Look at the Reliability column right beside the statistic. Those values represent year over year R^2 values for each of the stats over the 6 years of data available. The higher the numbers the more repeatable a given statistic is at the team level year to year.
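To make the idea of a year over year R^2 concrete, here is one way it could be computed: pair each team's value in one season with the same team's value in the following season, then square the correlation across all of those pairs. This is a sketch under my own assumptions (pooling every consecutive-season pair together, a hypothetical `year_over_year_r2` helper, and made-up numbers for three teams), not necessarily the exact procedure behind the Reliability column.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def year_over_year_r2(values):
    """values maps (team, season) -> stat value, with season as an int year.

    Pairs every (team, season) with the same team's (season + 1) value
    and returns the squared correlation across all such pairs.
    """
    this_year, next_year = [], []
    for (team, season), value in values.items():
        following = values.get((team, season + 1))
        if following is not None:
            this_year.append(value)
            next_year.append(following)
    return pearson_r(this_year, next_year) ** 2

# Hypothetical 5v5 Close CF% for three made-up teams, NOT real NHL data.
cf_pct = {
    ("A", 2011): 54.0, ("A", 2012): 53.0, ("A", 2013): 52.5,
    ("B", 2011): 50.0, ("B", 2012): 51.0, ("B", 2013): 49.5,
    ("C", 2011): 46.0, ("C", 2012): 45.0, ("C", 2013): 47.0,
}

reliability = year_over_year_r2(cf_pct)
print(f"Year over year R^2 = {reliability:.3f}")
```

In this toy data the good possession team stays good and the bad one stays bad, so the score comes out high. A stat that bounces around randomly from one season to the next would score close to zero.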

Look carefully at the 5v5 Close metrics. 5v5 Close Corsi For % is quite highly repeatable - it's the most reliable metric on this list year over year. It is also highly informative of a team's likelihood of winning games. If you want a stat that tells you whether your team is doing well AND is likely to still mean something in the future, this is probably the best statistic you can make use of.

Now look carefully at some of the other "important" yearly statistics: 5v5 PDO and the GF60 and GA60 stats. Notice how low their reliability scores are? That's because there's a large amount of variation in how much teams score or how many goals they allow year over year. This is because SH% and SV% are NOT repeatable, reliable statistics at the team level. Yes, one player might be consistently good or consistently bad, but the randomness of all of his team-mates (the guy on the hot streak - the guy in a funk) has a way of balancing all of this out over the course of a season.

So while PDO and Goals For and Against mean a LOT in the standings, they aren't something you can rely upon to remain the same in the future.
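The two-part test described above (does the stat track winning, and does it repeat?) can be sketched as a simple filter. The rows are copied from the first table in this post, but the two cut-offs are arbitrary thresholds I picked for illustration, not values the analysis depends on.

```python
# (statistic, reliability, Win% R2) rows copied from the first table in this post.
rows = [
    ("5v5 GF%", 0.175, 0.665),
    ("5v5 Close FF%", 0.337, 0.312),
    ("5v5 Close CF%", 0.382, 0.304),
    ("5v5 PDO", 0.042, 0.282),
    ("5v5 GF60", 0.064, 0.266),
    ("5v5 CF%", 0.371, 0.257),
]

# Arbitrary illustration thresholds (assumptions, not from the analysis).
MIN_RELIABILITY = 0.30
MIN_WIN_R2 = 0.25

# Keep only the stats that both track winning and repeat year to year.
keepers = [
    name
    for name, reliability, win_r2 in rows
    if reliability >= MIN_RELIABILITY and win_r2 >= MIN_WIN_R2
]
print(keepers)
```

With these cut-offs only the shot-attempt percentages survive. 5v5 GF% and PDO clear the Win% bar easily but fall out on reliability.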

OK, so those are the stats that correlate significantly to a team's winning or losing in a given season (the ones that really make a difference and matter). Now let's have a look at the middling stats, which make a difference but do not correlate quite so highly. These may have more to do with the style of game a team relies upon. They DO affect winning and losing, but they don't correlate as highly to team Win% or Pt% because it is possible for teams to win despite being weaker in one specific area.

| Statistic | Reliability | Pt% R | Pt% R2 | Win% R | Win% R2 |
| --- | --- | --- | --- | --- | --- |
| 5v5 Close PDO | 0.002 | 0.514 | 0.264 | 0.489 | 0.239 |
| 5v5 Close FA60 | 0.337 | -0.477 | 0.227 | -0.458 | 0.210 |
| 5v5 Close CA60 | 0.443 | -0.468 | 0.219 | -0.447 | 0.200 |
| 5v5 Close SF60 | 0.257 | 0.459 | 0.211 | 0.436 | 0.190 |
| 5v5 Close FF60 | 0.258 | 0.457 | 0.209 | 0.434 | 0.188 |
| 5v5 FA60 | 0.376 | -0.442 | 0.195 | -0.433 | 0.188 |
| 5v5 Sv% | 0.070 | 0.483 | 0.233 | 0.433 | 0.187 |
| S/G | 0.312 | 0.457 | 0.208 | 0.428 | 0.183 |
| 5v5 Close SA60 | 0.284 | -0.441 | 0.195 | -0.427 | 0.183 |
| PK% | 0.039 | 0.431 | 0.185 | 0.417 | 0.174 |
| 5v5 CA60 | 0.482 | -0.417 | 0.174 | -0.406 | 0.165 |
| SA/G | 0.321 | -0.398 | 0.158 | -0.404 | 0.164 |
| 5v5 SA60 | 0.321 | -0.405 | 0.164 | -0.401 | 0.161 |
| PP% | 0.073 | 0.384 | 0.148 | 0.393 | 0.154 |
| 5v5 SF60 | 0.260 | 0.412 | 0.170 | 0.392 | 0.154 |
| 5v5 Close CF60 | 0.349 | 0.411 | 0.169 | 0.382 | 0.146 |
| 5v5 FF60 | 0.260 | 0.402 | 0.161 | 0.380 | 0.144 |
| 5v5 Close Sv% | 0.006 | 0.415 | 0.172 | 0.350 | 0.123 |
| 5v5 Close DZFO% | 0.361 | -0.361 | 0.130 | -0.343 | 0.118 |
| 5v5 Close OZFO% | 0.297 | 0.371 | 0.138 | 0.339 | 0.115 |
| 5v5 CF60 | 0.349 | 0.352 | 0.124 | 0.326 | 0.106 |
| FO% | 0.282 | 0.343 | 0.118 | 0.325 | 0.105 |
| 5v5 OZFO% | 0.342 | 0.339 | 0.115 | 0.321 | 0.103 |

What you'll notice in this section are the Face Off statistics, Team 5v5 Sv%, and the component rates that make up Corsi and Fenwick. These are all relevant, but they make a difference at the margins. Of the 90 statistics I examined, some are VERY repeatable: 5v5 CA60, for instance, has the highest reliability score in this table (0.482). Meanwhile others that we classically think of as extremely important - PP% and PK% - are very unreliable year to year, and don't have a major correlation to Win% or Pt% anyway. 5v5 play is obviously far more of a factor in winning than the PP or PK. That's not to say special teams don't matter; it's just possible for teams to win games without those two factors working in their favour.

Face Offs are also noticeably a fairly reliable feature of a team... but they're getting down into the lower end of meaning with respect to correlation to Win% or Pt%. So yes - it's nice that we brought Tyler Bozak back to win draws, but his Face Off prowess is unlikely to make us a better team if he makes the team worse at 5 on 5 when he ISN'T taking a draw.

Lastly - let's look at the statistics with the LOWEST correlations to Win% and Pt%. These statistics show little to no correlation to winning. That means that teams can win or lose regardless of how highly they rate in these statistics.

| Statistic | Reliability | Pt% R | Pt% R2 | Win% R | Win% R2 |
| --- | --- | --- | --- | --- | --- |
| 5v5 Close Sh% | 0.007 | 0.267 | 0.071 | 0.303 | 0.092 |
| 5v5 Sh% | 0.015 | 0.267 | 0.071 | 0.298 | 0.089 |
| 5v5 DZFO% | 0.430 | -0.306 | 0.094 | -0.291 | 0.085 |
| Road MsS | 0.017 | 0.227 | 0.052 | 0.200 | 0.040 |
| Total BkS | 0.060 | -0.172 | 0.030 | -0.182 | 0.033 |
| Road BkS | 0.009 | -0.172 | 0.030 | -0.180 | 0.032 |
| Misconducts | 0.083 | -0.203 | 0.041 | -0.179 | 0.032 |
| Home BkS | 0.223 | -0.151 | 0.023 | -0.161 | 0.026 |
| Home TkA | 0.545 | -0.154 | 0.024 | -0.155 | 0.024 |
| Total MsS | 0.059 | 0.176 | 0.031 | 0.149 | 0.022 |
| Total TkA | 0.335 | -0.116 | 0.013 | -0.124 | 0.015 |
| PIM/G | 0.339 | -0.123 | 0.015 | -0.100 | 0.010 |
| Home MsS | 0.172 | 0.118 | 0.014 | 0.094 | 0.009 |
| Home GvA | 0.645 | -0.085 | 0.007 | -0.092 | 0.008 |
| # Bnch | 0.000 | -0.089 | 0.008 | -0.090 | 0.008 |
| B PIM | 0.000 | -0.089 | 0.008 | -0.090 | 0.008 |
| Total GvA | 0.523 | -0.068 | 0.005 | -0.079 | 0.006 |
| PIM | 0.355 | -0.078 | 0.006 | -0.071 | 0.005 |
| 5v5 Close NZFO% | 0.358 | 0.050 | 0.003 | 0.063 | 0.004 |
| Majors | 0.297 | -0.077 | 0.006 | -0.059 | 0.004 |
| Match | 0.048 | 0.033 | 0.001 | 0.035 | 0.001 |
| Minors | 0.516 | -0.022 | 0.000 | -0.026 | 0.001 |
| 5v5 NZFO% | 0.438 | 0.016 | 0.000 | 0.017 | 0.000 |
| Road TkA | 0.010 | 0.039 | 0.002 | 0.016 | 0.000 |
| Misc | 0.083 | 0.016 | 0.000 | 0.013 | 0.000 |
| Road Hits | 0.031 | 0.015 | 0.000 | 0.003 | 0.000 |
| Home Hits | 0.284 | 0.028 | 0.001 | -0.003 | 0.000 |
| Road GvA | 0.148 | 0.014 | 0.000 | 0.001 | 0.000 |
| Total Hits | 0.117 | 0.025 | 0.001 | -0.001 | 0.000 |

You'll notice that team 5v5 Sh% is near the top of this list. It also has very low reliability. Then you'll see all of the penalty, hit, and RTSS stats. Virtually none of these matter to teams winning and losing. Teams win with an edge, or win without one. They also lose with an edge and lose without one. Being big and tough is NOT a cure-all for a losing franchise... getting better at puck possession and spending more time in the other team's end is.

You'll also see that 3 of the 4 lowest ranked stats are Road, Home and Total hits. None of them matter particularly... but it's interesting to see how reliable Home hits are (0.284) while Road hits are virtually non-repeatable (0.031). A team's home numbers are recorded by the same arena's score keepers all season, while its road numbers come from many different ones, so this strongly suggests bias by score keepers around the NHL. Similarly, Home Giveaways (0.645) and Road Giveaways (0.148) diverge enormously in terms of their reliability year over year.

So what does this all mean? Well - it looks like the Leafs are fixated on building a team that does certain things reliably. They want to win Face Offs, and win Fights, and lead the league in hits and blocked shots. Bring that lunch pail along and get to work etc. Unfortunately none of those things are remotely likely to change their actual place in the overall standings. The reason the Leafs won last year was that they led the league in 5v5 Team Sh% and had a vastly improved 5v5 Sv%. They led the NHL in PDO. But as we can see above, all of those stats are amazingly unreliable and thus not likely to repeat next year.

Unfortunately - when it comes to shot differential metrics - the Leafs were god awful last season, and finished close to the bottom of the rankings. They bought out one of their best possession players in Mikhail Grabovski and let another excellent piece walk in the form of Clarke MacArthur, but decided to keep a stellar Face Off man in Tyler Bozak (who incidentally looks horrible by any puck possession metric). They've replaced some of what they lost by overpaying David Clarkson - who also does all that other stuff that doesn't really matter. They also traded for Dave Bolland, who may or may not be effective in terms of possession.

I don't personally see a huge improvement on the horizon for the club, and think a serious regression is due next year in many of the unreliable statistics the Leafs excelled in this year. If it doesn't happen - then that's wonderful for Randy Carlyle and Dave Nonis, but in the longer term they need to improve their fundamentals if they hope to win consistently.