Well I didn't end up doing what I set out to do, but this thing I created almost by accident is pretty cool, so I prettied up the spreadsheet and thought I'd share it with you fine folks.
What it is:
This sheet will let you see at a glance, given a certain score situation and number of minutes remaining in the game, what player will give you the better odds to finish that game in the lead (or tied) based on their historical GF/60 and GA/60 rates over the past 4 seasons (1000 minute min). I might expand it soon to let you pick the season range by adding a layered IF statement but I'm going to be really busy the next couple days so I won't get around to it until at least Thursday if it still interests me by then.
How Does it Work:
Using the historical rates of Goals For and Goals Against per minutes, I'm estimating the probability of each discrete outcome, such as 1 goal for and 0 against, given 1,2,3,4 etc. minutes remaining in the game, and then summing all the positive outcomes.
You as the user only need to make 3 selections:
1) Which players are you going to look at: Use the dropdown box or just type using the correct format
2) What is the starting score situation: Choose an number between -8 (trailing by 8) and 8 (leading by 8)
3) Choose your definition of Good Outcome: Either Tied at the end of regulation or Leading at the end of regulation.
After that, the important line to look at is Good Outcome (%) which will show the probability that a good outcome will result if the team plays to that players historical GF/60 and GA/60 rates.
See the following example:
We can see that with in a game with a 1 goal lead, with 10 minutes left a team full of Matt Martin's would have a 76.34% chance of ending the game with a lead, whereas a team full of Kyle Okposo's would have a 75.12% chance. Martin in fact, retains an advantage for almost the entire 3rd period (up to 16 minutes) before Okposo's much superior GF% takes over.
How is it Calculated:
I'm pretty inept at statics, so off the bat, I'm going to lay out the formula I used to calculate this. It passed the sniff test for me, but if it turns out I'm wrong well...you can disregard this whole thing.
I used a Poisson distribution to calculate each discrete outcome from 0GF/0GA to 8GF/8GA.
The excel formula for a given cell is:
Which is e^-(GF/min*#OfMinutesRemaining)*((GF/min*#OfMinutesRemaining)^GF)/GF! * e^-(GA/min*#OfMinutesRemaining)*((GA/min*#OfMinutesRemaining)^GA)/GA!
Fingers crossed that's right, but if there is an error, please let me know in the comments.
Please play around with this and post any interesting findings you might have. If anyone wants the unprotected version of the file, to make their own adjustments, just let me know and I can email it to you.
I also should use this space to point out several ways that this could be improved.
1) By using score situation GF/GA rates
- This can be done with easily accessible public info, it would just require far more complicated math
2) By acquiring data for scoring rates by time left in the game
- You could use a scraping tool to get this info. but I'm not aware of any public sites that host it. And again, it would require much more complicated math
3) As I said above, if anyone actually finds this interesting, I will change it so that you can select the range of years you want to look at.