FanPost

Correlation of Fenwick Close vs P% and Normality

Previously I noted, as shown in the table below, that the large year-to-year variation in the correlation (R-squared) of Fenwick Close to P% should raise suspicions among users of Fenwick Close about the reliability of that correlation. This variability shows up in the average R-squared of 35.6% ± 15.1% over the previous 6 seasons. Note that the standard deviation is over 40% of the mean R-squared (15.1%/35.6%), which suggests at best a wide-bodied bell curve of limited utility and at worst a multi-modal distribution function.

Regression_medium
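The 40% figure quoted above is a coefficient of variation (standard deviation divided by mean). As a rough sketch, it can be recomputed from the two summary numbers in the text. The helper `summarize_r2` is a hypothetical name for anyone who wants to type in the six per-season values from the table; those values are not reproduced here, and I am assuming a sample (n-1) standard deviation, which may not be what was used originally.

```python
import statistics

def summarize_r2(r2_values):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    mean = statistics.mean(r2_values)
    sd = statistics.stdev(r2_values)  # sample (n-1) standard deviation: an assumption
    return mean, sd, sd / mean

# Using only the two summary figures reported in the post (in percent):
reported_mean = 35.6
reported_sd = 15.1
cv = reported_sd / reported_mean
print(f"coefficient of variation: {cv:.1%}")  # 42.4%
print(f"mean +/- 1 sd: {reported_mean - reported_sd:.1f}% to "
      f"{reported_mean + reported_sd:.1f}%")  # 20.5% to 50.7%
```

In other words, a season one standard deviation below the mean shows less than half the R-squared of a season one standard deviation above it.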

The Concern with Wide Body Gaussian Distributed Data

To illustrate a potential problem with Fenwick Close, consider the following example of a multi-modal distribution (the distribution shown is tri-modal) that is approximated with a wide-body Gaussian (normal/bell curve). Applying a linear regression, which assumes the data is normally distributed, ignores the existence of these real modes. That is, the linear regression ignores "reality", which can lead to erroneous or false conclusions because the real effects occurring at these three peaks are overlooked. To see if this applies to Fenwick Close vs P%, we need to test the data for normality.

Widebodyvsmultimode_medium
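The picture above can be reproduced numerically. The sketch below builds a made-up tri-modal mixture (the peak locations and widths are my own illustrative choices, not hockey data), fits the single wide Gaussian with the same mean and variance, and counts the peaks in each curve.

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical tri-modal mixture: three equally weighted narrow peaks.
MODES = [-3.0, 0.0, 3.0]
PEAK_SIGMA = 0.7

def mixture_pdf(x):
    return sum(normal_pdf(x, m, PEAK_SIGMA) for m in MODES) / len(MODES)

# Single "wide body" Gaussian matched to the mixture's overall mean and variance.
wide_mu = sum(MODES) / len(MODES)
wide_var = PEAK_SIGMA ** 2 + sum((m - wide_mu) ** 2 for m in MODES) / len(MODES)
wide_sigma = math.sqrt(wide_var)

def wide_pdf(x):
    return normal_pdf(x, wide_mu, wide_sigma)

def count_peaks(pdf, lo=-8.0, hi=8.0, steps=1600):
    """Count strict local maxima of pdf on an evenly spaced grid."""
    ys = [pdf(lo + (hi - lo) * i / steps) for i in range(steps + 1)]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1])

print("peaks in mixture:", count_peaks(mixture_pdf))     # 3
print("peaks in wide Gaussian:", count_peaks(wide_pdf))  # 1
print("density in a valley (x=1.5):", mixture_pdf(1.5), "vs", wide_pdf(1.5))
print("density at a peak (x=3.0):", mixture_pdf(3.0), "vs", wide_pdf(3.0))
```

The wide Gaussian matches the mean and spread but overstates how much data sits in the valleys and understates how much sits at the outer peaks, which is the information that gets lost.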

Testing Normality

To test normality (that the data follows a typical bell curve, for which linear regression is applicable), six seasons of Fenwick Close vs P% data from all 30 NHL teams were tested, and the data failed the Anderson-Darling normality test (46%). Therefore, Fenwick Close vs P% does not conform to a Gaussian or normal distribution, which is confirmed visually below in the plot of predicted P'% - P%. Note that Fenwick Close itself passes the normal distribution test; however, P% and, more critically, the error term (residuals) fail it, and normality of the residuals is an underlying assumption when using linear regression models.

Histogram_medium

The distribution shape is better described as an asymmetric multi-modal distribution with a small positive skew.
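For readers who want to run this kind of check themselves, here is a minimal sketch of an Anderson-Darling normality test written with the Python standard library only. It is not the tool used for the result above. The function name `anderson_darling_normal` is my own, the small-sample adjustment and the 0.752 cutoff are the commonly tabulated values for the case where mean and standard deviation are estimated from the data, and the two samples are synthetic stand-ins for 180 residuals (30 teams x 6 seasons), not the real numbers.

```python
import math
import statistics

def normal_cdf(z):
    # erfc keeps precision in the far tails, so log() never sees an exact 0.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def anderson_darling_normal(values):
    """Adjusted A^2 statistic for normality, mean and sd estimated from the data."""
    xs = sorted(values)
    n = len(xs)
    mean = statistics.mean(xs)
    sd = statistics.stdev(xs)
    z = [(x - mean) / sd for x in xs]
    total = 0.0
    for i in range(1, n + 1):
        # log F(z_i) + log(1 - F(z_(n+1-i))), with 1 - F(z) written as F(-z)
        total += (2 * i - 1) * (math.log(normal_cdf(z[i - 1]))
                                + math.log(normal_cdf(-z[n - i])))
    a2 = -n - total / n
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)

CRITICAL_5_PERCENT = 0.752  # commonly tabulated 5% value for the adjusted statistic

# Synthetic stand-ins, NOT the real residuals.
n = 180
bell = [statistics.NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
skewed = [math.exp(q) for q in bell]  # strongly right-skewed version of the same sample

for name, sample in (("bell-shaped", bell), ("skewed", skewed)):
    a2 = anderson_darling_normal(sample)
    verdict = "fails" if a2 > CRITICAL_5_PERCENT else "passes"
    print(f"{name}: adjusted A^2 = {a2:.3f}, {verdict} at the 5% level")
```

To apply it to the real data, one would pass in the 180 values of predicted P'% - P% in place of the synthetic samples.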

Implications of Non-Normal Distributions

Because the distribution is not a well-behaved normal curve, there are some implications:

  • Over the seasons, a higher number of teams have a slightly negative Fenwick (the median is negative)
  • Teams with a small positive or small negative shot differential win less often or more often, respectively, than a typical normal distribution would suggest (this happens frequently)
  • Teams who outshoot by a large margin lose more than a typical normal distribution would suggest (this happens moderately)
  • Teams who are outshot by a large margin win more than a typical normal distribution would suggest (this happens less frequently).

In summary, Fenwick (shot differential) is a less important factor than what we may expect from a Normal/Gaussian distribution.

What does this mean about Correlation?

But the more important point is that because Fenwick Close vs P% fails the normality test, a linear regression is not appropriate, and the R-squared correlation between Fenwick Close and P% is uncertain and cannot be trusted. The residuals (error term) have excessive skew (too many errors in the same direction), which introduces a systematic error into the linear regression. The likely source of this error is P% itself, which I tested and found not to be normally distributed, and/or perhaps the linear relationship between Fenwick Close and P% is simply not valid.
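To make "skew in the residuals" concrete, here is a small sketch that fits a straight line by ordinary least squares, takes the residuals, and measures their skewness. The data are synthetic and deliberately simple (the error patterns are invented for illustration), and the helper names are my own.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def residuals(xs, ys):
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def skewness(values):
    """Moment-based sample skewness: 0 for a symmetric sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Synthetic example: the same underlying line with two different error patterns.
xs = [float(i) for i in range(40)]
symmetric_err = [(-1.0, 1.0)[i % 2] for i in range(40)]
lopsided_err = [(-1.0, -1.0, -1.0, 3.0)[i % 4] for i in range(40)]  # many small misses, few big ones
y_sym = [1.0 + 2.0 * x + e for x, e in zip(xs, symmetric_err)]
y_lop = [1.0 + 2.0 * x + e for x, e in zip(xs, lopsided_err)]

print("skew, symmetric errors:", skewness(residuals(xs, y_sym)))
print("skew, lopsided errors:", skewness(residuals(xs, y_lop)))
```

In the lopsided case the fitted line sits above most of the points, so it over-predicts most observations and badly under-predicts a few, which is the kind of systematic error described above.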

How to Correct These Errors (And Future Work)

Correcting the systematic error may be as simple as removing the team points from shootouts, or perhaps ignoring results from "blowout" games, but further work is needed, as the systematic errors make the regression and correlation suspect. Conversely, if Fenwick Close is telling us something about "winning", it may be missing some important information, such as goaltending SV% (just speculation), that would make the relationship to winning (P%) more statistically robust.

Another possible remedy is the application of a nonlinear transformation of Fenwick Close. That is, the relationship between Fenwick and P% may be log, quadratic, exponential or some other non-linear expression. Finally, another source of error could be outliers. If this is the case, then we need to ignore the tails of the distribution, which would mean that teams with extremely high or low Fenwick Close are ignored because they distort the fit and introduce these systematic errors.
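One way to explore the transformation idea is to regress P% on a transformed Fenwick Close and compare the R-squared values. The sketch below does that on synthetic, noise-free data that I built to follow an exponential curve, so the "right" answer is known in advance. The numbers (0.55, 0.08, the 44 to 56 range) are invented, and real data would also need the residual normality check repeated after each transformation.

```python
import math

def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R-squared of a one-variable linear fit."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)

# Synthetic, hypothetical data: Fenwick Close % from 44 to 56,
# with P% constructed to follow an exponential curve exactly.
fenwick = [44.0 + 0.5 * i for i in range(25)]
p_pct = [0.55 * math.exp(0.08 * (f - 50.0)) for f in fenwick]

# Candidate transformations of Fenwick Close (the exponential rate is assumed known).
transforms = {
    "linear": lambda f: f,
    "log": lambda f: math.log(f),
    "quadratic": lambda f: f ** 2,
    "exponential": lambda f: math.exp(0.08 * (f - 50.0)),
}

results = {name: r_squared([t(f) for f in fenwick], p_pct)
           for name, t in transforms.items()}
for name, r2 in results.items():
    print(f"{name:12s} R-squared = {r2:.4f}")
```

On this toy data the straight-line fit still scores a high R-squared despite being the wrong shape, which is a reminder that a high R-squared alone does not show the linear model is valid.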

Conclusion
The linear correlation of Fenwick Close to P% is statistically uncertain, but based on observation one could conjecture that Fenwick Close may still be a useful rule of thumb or approximation in hockey analysis. However, care must be taken, as one cannot rule out the possibility that Fenwick Close vs P% may lead to erroneous conclusions.

PensionPlanPuppets.com is a fan community that allows members to post their own thoughts and opinions on the Toronto Maple Leafs and hockey in general. These views and thoughts may not be shared by the editor of PensionPlanPuppets.com.
