Fenwick Close Road + PDO A Better Correlation Than Fenwick Tied Road
Regression of P% vs Various Factors
The regression of team winning (P%, in percentage points) against different factors from 2007 to 2011 is shown in the chart below for:
1. P% vs Fenwick Tied Road % (FTR)
2. P% vs Fenwick Close Road % (FCR)
3. P% vs PDO (SH%*1000 + SV%*1000)
4. P% vs FTR + PDO/10, or P% vs (FTR% + SH% + SV%)
5. P% vs FCR + PDO/10, or P% vs (FCR% + SH% + SV%)
Fenwick Close Road + PDO/10 gave the best average R-squared with the lowest standard deviation. Team winning is 54% explained by team shooting + team defence + puck possession (that is, PDO /10+ Fenwick).
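As a rough illustration of how the combined measure and its R-squared could be computed, here is a minimal Python sketch. The function names and the five sample teams are invented for the example and are not the data behind the chart; SH% and SV% are assumed to be entered as fractions (0.080, 0.920) and Fenwick% and P% in percentage points.

```python
def pdo(sh_pct, sv_pct):
    """PDO on the 1000 scale: SH%*1000 + SV%*1000 (inputs are fractions)."""
    return sh_pct * 1000 + sv_pct * 1000


def combined_metric(fenwick_pct, sh_pct, sv_pct):
    """Fenwick% + PDO/10, i.e. Fenwick% + SH% + SV% in percentage points."""
    return fenwick_pct + pdo(sh_pct, sv_pct) / 10


def r_squared(xs, ys):
    """Squared Pearson correlation, which equals R-squared for a one-variable fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Hypothetical teams: (Fenwick Close Road %, SH%, SV%, P%). Made-up numbers.
teams = [
    (52.0, 0.085, 0.925, 60.0),
    (48.5, 0.075, 0.915, 45.0),
    (50.0, 0.080, 0.920, 52.0),
    (46.0, 0.090, 0.910, 47.0),
    (53.5, 0.078, 0.930, 58.0),
]
x = [combined_metric(f, sh, sv) for f, sh, sv, _ in teams]
y = [p for *_, p in teams]
print(f"R-squared of P% vs FCR + PDO/10: {r_squared(x, y):.3f}")
```

A league-average team (50% Fenwick, 1000 PDO) lands at 150 on this scale, which makes it easy to read a team's combined score as points above or below average.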
Effect of Power Play and Penalty Kill
Adding special teams (PK PDO and PP PDO) to the regression for a single year improved the fit by a few percent, though it seems to be scraping the bottom of the barrel of relevant factors. In that regression, PK PDO was 2.5x more important than PP PDO for teams that tend to win.
The Hockey Gods Matter
This implies that teams that win tend to do so based on:
- 46% luck (or some as yet unidentified skill), and
- 54% PDO + FCR% (SH% + SV% + FCR%)
Winning appears to be almost 50% luck based.
Which team will regress this season?
P% plotted against (SH% + SV% + FCR%) gives an indication of which teams may be expected to regress.
MIN, CGY, DAL and NYR are expected to regress downward and CLB upward. The regression accounts for 54% of the results, so 46% of the results of winning are either luck or some as yet unidentified skill. Note that the +-1 and +-2 sigma bands are shown in green and blue respectively. The red regression line is the 2011 best fit.
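The sigma-band idea can be sketched in a few lines of Python: fit a least-squares line, measure each team's distance from it in units of the residual standard deviation, and flag anything outside +-1 sigma. This is only a sketch under assumptions: the function names, the residual-sigma definition (n - 2 degrees of freedom) and the sample numbers are mine, not the ones used for the chart.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def sigma_flags(names, xs, ys, threshold=1.0):
    """Return {team: z} for teams whose residual is beyond +-threshold sigma."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    # Residual standard deviation, using n - 2 degrees of freedom (an assumption).
    sigma = sqrt(sum(r * r for r in residuals) / (len(xs) - 2))
    return {
        name: r / sigma
        for name, r in zip(names, residuals)
        if abs(r / sigma) > threshold
    }


# Hypothetical teams: x is SH% + SV% + FCR%, y is P%. Made-up numbers.
names = ["A", "B", "C", "D", "E", "F"]
x = [146.0, 148.0, 150.0, 151.0, 153.0, 155.0]
y = [44.0, 47.0, 58.0, 51.0, 54.0, 57.0]
for team, z in sigma_flags(names, x, y).items():
    direction = "downward" if z > 0 else "upward"
    print(f"{team}: {z:+.2f} sigma, may regress {direction}")
```

A team sitting above the line has a P% higher than its combined score supports, so it is the candidate to regress downward; a team below the line is the candidate to move upward.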
PensionPlanPuppets.com is a fan community that allows members to post their own thoughts and opinions on the Toronto Maple Leafs and hockey in general. These views and thoughts may not be shared by the editor of PensionPlanPuppets.com.
Comments
Minor nitpick
team shooting + team defence + puck possession (that is, PDO /10+ Fenwick)
Don’t you mean team goaltending?
"We are all agreed that your theory is crazy. The question that divides us is whether it is crazy enough to have a chance of being correct."
- Niels Bohr
Yes, on a team basis it would be the average ES SV% of all the team’s goalies (goaltending). And so a good goalie may inflate a player’s PDO (i.e. the SV% portion).
On a player basis it is the goalie’s ES SV% when the player is on the ice. I’m not sure whether a skater can influence SV% directly. It might be like SH%, in that the player can increase the odds of scoring through certain factors (screening the goalie, shooting from a high-probability location, etc.). And perhaps a player can influence his specific SV% by keeping shooters to the perimeter and clearing the front of the net. I may look at that later.
But the nice thing about the regression on the sum of three factors (SH%, SV% and Fenwick) is that it cancels out some of the distortions that appear when using one statistic on its own.
Think of it like this: a poor defensive player would give up a lot of shots. If he has a great goalie, the high SV% would “mask” his weak two-way play. But then the player’s Fenwick would drop accordingly, somewhat negating the effect of the stronger goalie. So that player’s PDO increase and F% decrease would somewhat offset each other. That is why I think the regression is stronger.
If the player is a strong two-way player with a poor goalie, then his on-ice SV% would drop but his Fenwick would rise.
SH% is the rate at which shots directed at the opposition net are converted to goals.
SV% is the rate at which shots directed at the player’s own net are kept from becoming goals.
A team that blocks a lot of shots or has an effective trap may be able to “inflate” Fenwick but not their SV%. Or vice versa: a hot goalie would inflate SV% and PDO, but the team Fenwick would suggest things need to even out at some point. The two factors work better together.
Ah, the elusive win prediction model
Quick methods question: how did you select games to run the correlation studies? Odd/even, half-year, year over year?
I think the hardest sell on PDO is that its reliability is so low. Even if it does correlate with P%, it’s not going to be predictive because it isn’t repeatable (at least in the studies I’ve seen). Of course, data showing reliability, i.e. r(self), would be definitive. I’ve always wanted to look at that, just never had the time.
I like how you included the SD in the graph at the end, nice touch.
I looked at year-over-year correlation for 2007 to 2010 and half-year (YTD) for 2011. The only half-year study completed was for the 2011 YTD campaign. If I can get the dataset for 2007 to 2010, the intra-year analysis could be completed for those seasons too.
Below I show the split-half (odd/even) and Spearman-Brown prophecy results for each of the Fenwick Tied Road, PDO and Fenwick + PDO measures. By any reliability measure, PDO + Fenwick has a higher correlation with a lower standard deviation. The Cronbach’s alpha is strange, as it suggests that no advanced stat (Fenwick, PDO or Fenwick + PDO) is material; however, the appropriateness of this test for this dataset is questionable (i.e. single variable and variability of P%). I need to do more reading, but I think that because P% is 50% luck, Cronbach won’t give a good measure near 0.7 (as is shown by the Fenwick measure on its own and by Fenwick + PDO).
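For anyone who wants to reproduce the odd/even split, here is a minimal Python sketch of one common way to do it: average each team's odd-numbered and even-numbered games separately, correlate the two sets of averages across teams, then apply the Spearman-Brown correction to estimate full-length reliability. The per-game values and team labels are invented, and this is not necessarily the exact procedure used for the figures in this thread.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Spearman-Brown prophecy for doubling test length: 2r / (1 + r)."""
    return 2 * r_half / (1 + r_half)


def split_half(games_by_team):
    """Correlate each team's odd-game mean with its even-game mean."""
    odd_means, even_means = [], []
    for games in games_by_team.values():
        odd, even = games[0::2], games[1::2]  # games 1, 3, 5... and 2, 4, 6...
        odd_means.append(sum(odd) / len(odd))
        even_means.append(sum(even) / len(even))
    r_half = pearson(odd_means, even_means)
    return r_half, spearman_brown(r_half)


# Hypothetical per-game Fenwick% values in chronological order. Made-up numbers.
games_by_team = {
    "Team A": [52.0, 55.0, 49.0, 53.0, 51.0, 54.0],
    "Team B": [47.0, 45.0, 50.0, 46.0, 48.0, 44.0],
    "Team C": [50.0, 51.0, 49.0, 52.0, 50.0, 48.0],
    "Team D": [56.0, 53.0, 57.0, 54.0, 55.0, 56.0],
}
r_half, r_full = split_half(games_by_team)
print(f"split-half r = {r_half:.3f}, Spearman-Brown = {r_full:.3f}")
```

The same function works for PDO or Fenwick + PDO: only the per-game values passed in change.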

I think the main advantage of Fenwick + PDO is that it can distinguish whether P% is sustainable for a skilled team like Boston, with a higher SV% (Thomas) and SH% (3 strong scoring lines), compared to a team like Minnesota, whose P% is beyond what its PDO + Fenwick would predict. In fact, the measure can tell you whether a more skilled team (Boston) is lucky or "good", which may not be apparent from looking at PDO by itself. That is, I’m suggesting that not all teams regress to 1000 PDO and some may indeed be more skilled than others (though of course on a league-wide basis the average is 1000). If so, the metric better describes why some teams can sustain a higher PDO for an entire season and not regress to the league average (that is why Boston last year could sustain a high PDO and a team like TOR 2009 could sustain a low PDO). Yes, I agree SV% and SH% are not "repeatable", but some teams can sustain a higher or lower than league-average SV% and/or SH% for the season. So for example, the typical NHL team SV% may be 920 +- luck, Boston’s team SV% 940 +- luck, and Columbus’s maybe 900 +- luck. This metric is not perfect but does a better job than Fenwick Tied Road (average and stdev), and does not require the user to supply as much context as PDO may require on its own.
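To put a rough size on the "+- luck" part, one simple approach is to treat every shot as an independent save-or-goal trial with a fixed true SV%, so the luck component is just binomial noise. That independence assumption is mine, and the shot counts below are invented, so treat this Python sketch as an order-of-magnitude check only.

```python
from math import sqrt


def luck_sigma_per_1000(true_rate, shots):
    """One standard deviation of the observed rate, on the 1000 scale,
    assuming each shot is an independent trial with probability true_rate."""
    return 1000 * sqrt(true_rate * (1 - true_rate) / shots)


# Hypothetical shot counts for a team with a true SV% of 920.
for shots in (500, 1000, 2000):
    sigma = luck_sigma_per_1000(0.920, shots)
    print(f"{shots} shots: 920 +- {sigma:.1f}")
```

Under these assumptions the noise shrinks with the square root of the number of shots, which is one way to think about whether a 20-point gap between two teams is big enough to be more than luck over a given sample.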
At the end of the day, Fenwick + PDO works because it better explains why some teams with the same Fenwick (skill) differ in P% (they have better goalies and maybe shooters). Luck explains a lot but it doesn’t explain everything. And of course, even when a team has a small skill advantage in net, we know that luck accounts for close to 50% of the outcome.
At first I thought Cronbach’s alpha relative to P% wasn’t working because it had only 30 subjects (teams). But when I looked at Fenwick + PDO by player, the Cronbach’s alpha relative to +/- per 60 moved up to 0.65.
And when I looked at Fenwick by player relative to +/- per 60, the alpha was less than 0.1 even for a few hundred players. Has anyone looked at this to make sure it is correct?
The split-half approach for either measure was good for teams as well as players.
I’m not sure how to interpret this.
We can’t use Fenwick and these types of advanced stats because we don’t have enough teams or players to get to 0.7?
Or Cronbach is not appropriate (too conservative)?
Or perhaps we need to look at the game-by-game level to get more information?
Thanks for the lengthy response
I appreciate you taking the time to do all this work. For Cronbach’s you definitely need to look game by game. For all of these measures the best results come from game-by-game analysis.
Although I haven’t run the data for myself, some of what you have been describing doesn’t seem to fit the data that Gabe has presented at BTN. The data he posted on PDO suggested that the auto-correlation (i.e. reliability) of about 35 games predicting the next 47 games is around 0.13. Not every study will show the same results, but it seems like controversial data to me. Hopefully I’ll have some time to run the numbers for comparison. I’m intrigued by the idea of adding PDO to Fenwick, although I’m not totally sold on it yet.
I’ll try to post the numbers when I get to it.
BTW below is the standardized alpha, which gives an expected correlation more in line with the split-half results. With a 0.7 cutoff, PDO is marginal, Fenwick good and PDO + Fenwick better.
2010       Cronbach Alpha   Std. Alpha
PDO+FTR    0.0691           0.8823
Fenwick    0.0565           0.7356
PDO        0.0076           0.6781
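For reference, here is a minimal Python sketch of the two alphas as they are usually defined: raw alpha works from the item variances and the variance of their sum, while standardized alpha works from the average inter-item correlation. Because of that, the two can diverge sharply when the items sit on very different scales. Whether that explains the gap in the 2010 numbers above is not something I have verified; the two item lists below are invented purely to show the effect.

```python
from math import sqrt


def variance(xs):
    """Sample variance (n - 1 denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(
        sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)
    )


def cronbach_alpha(items):
    """Raw alpha: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(items)
    totals = [sum(vals) for vals in zip(*items)]
    return k / (k - 1) * (1 - sum(variance(i) for i in items) / variance(totals))


def standardized_alpha(items):
    """Standardized alpha: k * rbar / (1 + (k - 1) * rbar)."""
    k = len(items)
    rs = [pearson(items[i], items[j]) for i in range(k) for j in range(i + 1, k)]
    rbar = sum(rs) / len(rs)
    return k * rbar / (1 + (k - 1) * rbar)


# Two made-up items that move together but sit on very different scales.
items = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
]
print(f"raw alpha = {cronbach_alpha(items):.3f}")
print(f"standardized alpha = {standardized_alpha(items):.3f}")
```

Putting every item on a common scale (for example z-scores) before computing raw alpha removes this particular source of disagreement between the two versions.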
I don’t know how Gabe calculates that curve, but it is impressive, and it is strange that the two analyses don’t match up.
Also, I noticed that PDO variability and Fenwick variability may be inversely related (look at the 2008 data), where Fenwick is better and PDO is worse. Sort of like an unlucky team (bad goalie or poor SH%) would have to shoot more to compensate, and a lucky team (high SH%) would shoot less and protect its lead. This may be part of the reason the addition of PDO + Fenwick works well.
Also, do you run a script to pull game-by-game data? I can’t find game-by-game data anywhere. And I’ll look for your post when it’s done.
Just ran the numbers, for what it’s worth
So I used split-half 5v5 road data for the past 4 seasons. Turns out that’s about 20 games, which isn’t a lot of data, but even at that length I think trends start to bear out.
Stat              r(self)
PDO               0.13
PDO + Fenwick%    0.646
Fenwick%          0.733
I think your idea of trying to combine Fenwick% with another non-possessional stat is a really good one; it just looks like PDO is too random (i.e. non-repeatable) to be useful, imho.
I agree PDO is not consistent. PDO + Fenwick is decent but not as consistent as Fenwick itself. I’m not sure how to trade off r(self) against R-squared, but it seems that even though PDO + Fenwick gives a better regression at the end of the season, for mid or early season Fenwick (or, as others have shown, Corsi) is the better predictor.
I might reframe PDO to see if that can be improved.
Also, how do you get the 20 games or subsets for the test?
I had to code it by hand, which sucked
I have an Access database of all available NHL play-by-play data from 2007-2008 through 2010-2011. I use that to parse information. It’s a bit disorganized, but I can send it to you if you want; it’s basically a huge amount of data. I use queries in Access to grab the data I want, then Excel to run various functions.
