Predicting the future is what using statistical models of hockey is all about. Statistics, or things that happen in a game that we count up, can also describe past games more accurately than our faulty perceptions can. This is why we add up the goals for and against, after all. But if the measurements, the calculations you subject them to, and the uses to which you put them don’t predict anything, then you might as well just flip a coin when it’s time to talk future wins.
Predicting the future is what Corsi does well. It predicts itself—so if at a certain point a team has a certain Corsi For percentage, you can feel confident they are likely to have that number for the whole season. Not in every game of the season, obviously, but averaged over the 82 games.
It predicts future goals and therefore wins too. And now we have a valuable thing. Now we have the thing that will say this team is more likely to win than that one.
This is a nice, easy-to-read chart that shows you Corsi, Expected Goals (which is a little better at predicting by this analysis), and Goals.
As you go along the bottom axis, the number of games played in a season, you see that Corsi’s R^2 peaks at about 20 games, and Expected Goals peaks at about 30 games. They both have higher R^2 numbers than Goals themselves do. They both predict future goals better than goals do. So if a team scores a lot more goals than they allow in their first 20 games, but their CF% is low, then they are not very likely to score at that rate for the whole season.
What the heck is R^2 though?
Here’s a nice simple answer:
The coefficient of determination (denoted by R^2) is a key output of regression analysis. It is interpreted as the proportion of the variance in the dependent variable that is predictable from the independent variable.
So how much of future wins is predicted by Corsi? The answer is its R^2.
And here’s how you interpret the answer:
An R^2 of 0 means that the dependent variable cannot be predicted from the independent variable.
An R^2 of 1 means the dependent variable can be predicted without error from the independent variable.
An R^2 between 0 and 1 indicates the extent to which the dependent variable is predictable. An R^2 of 0.10 means that 10 percent of the variance in Y is predictable from X; an R^2 of 0.20 means that 20 percent is predictable; and so on.
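If you want to see where a number like that comes from, here is a minimal sketch in Python. It assumes the simplest case, one X predicting one Y with a straight line, where R^2 works out to the squared correlation between the two. The function name r_squared and both lists of numbers are invented for illustration; they are not real team data and not necessarily the method behind the charts in this article.

```python
def r_squared(x, y):
    """R^2 of a one-variable straight-line regression of y on x.

    In this simple case it equals the squared Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Made-up numbers purely for illustration (assumption, not real data):
# X = a team's Corsi For % early on, Y = its win percentage afterwards.
early_cf_pct = [46.0, 48.5, 50.0, 51.5, 53.0, 55.0]
later_win_pct = [0.42, 0.51, 0.47, 0.55, 0.50, 0.60]

print(round(r_squared(early_cf_pct, later_win_pct), 2))
```

A result near 0 would say X tells you almost nothing about Y, and a result near 1 would say X tells you almost everything.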
Please note that in the above chart the Y axis is in hundredths. So that is 0.10 to 0.30.
Let’s look at a messier graph that is Corsi predicting future win percentage:
It’s all kinds of Corsi. Focus on the green line, which is Score and Venue Adjusted Corsi. The green line peaks around 20 games, stays up there and then falls, first slowly and then rapidly.
What that means is that at game 22, say, you can look at a team’s Corsi and say they are fairly likely to win at a good rate over the next 60 games.
When you hit game 42, and you want to predict the outcome of the next 40 games, Corsi works, but not as well.
When you hit game 62 and you want to predict the outcome of the next 20 games, Corsi barely works at all. And by the time there are only 16 games left, it’s barely moving the R^2 needle. This is true for any small selection of games. This is true for any one game. The Corsi in the game you are watching isn’t predictive of its outcome, nor is the Corsi that went before.
Flip a coin ten times. What did you get? Exactly five heads and five tails? Maybe you did, or maybe you got some other proportion. But you know that if you flip that coin enough, the split will get closer and closer to half and half.
So, now you’ve flipped it 900 times and you have 450 each. Flip it ten more. What do you get? You get a random number of heads and tails. And that is exactly why the last few games can’t be predicted by the bulk of what came before.
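You can try this without wearing out your thumb. The sketch below is a small simulation, assuming a perfectly fair coin; the helper name flip and the seed value are arbitrary choices made for the example. It flips 900 times, then repeats the "ten more flips" several times so you can see how much those short runs bounce around.

```python
import random


def flip(n, rng):
    """Return the number of heads in n flips of a fair coin."""
    return sum(rng.random() < 0.5 for _ in range(n))


rng = random.Random(16)  # arbitrary seed so the run is repeatable

first_900 = flip(900, rng)
print("heads in the first 900 flips:", first_900)

# Flip ten more, eight separate times. Each batch is independent of
# everything that came before it.
last_ten = [flip(10, rng) for _ in range(8)]
print("heads in each batch of ten:", last_ten)
```

The long run lands close to half heads, while the batches of ten are free to come up lopsided. Nothing in the first 900 flips reaches forward and fixes the next ten.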
There is randomness—luck—in the outcome of all games. Bad bounces, a goalie sneezes, the ice has a rut, it goes in off someone’s butt—none of that is predicted by Corsi and yet it can win or lose a game. And the fewer games there are, the more magnified the other factors, including randomness, are in determining the outcome.
You can account for the non-luck factors. You can create a probability of a win in any hockey game. The strength of the team, the injury status, home ice, the time of day, and a host of other things can be measured for effect and accounted for, and if you want to look at game probability predictions they are out there.
Probability is not destiny. Luck still gets a say.
So if you take a random number between one and sixteen, that’s how many wins the Leafs will get. (I refuse to consider none, although that’s possible.) Some of those numbers are more probable than others. But all are possible outcomes.
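To put rough numbers on "more probable than others," here is a hedged sketch. It assumes every game is independent and has the same chance of a win, which real hockey games do not, and the 0.55 win probability is a placeholder I made up, not an estimate for the Leafs. Under those assumptions the number of wins follows a binomial distribution.

```python
from math import comb


def win_distribution(games, p_win):
    """Probability of each possible win total, 0 through games.

    Assumes independent games with a constant win probability.
    """
    return [
        comb(games, k) * p_win**k * (1 - p_win) ** (games - k)
        for k in range(games + 1)
    ]


# 0.55 is a hypothetical per-game win probability, chosen for illustration.
dist = win_distribution(16, 0.55)
for wins, prob in enumerate(dist):
    print(f"{wins:2d} wins: {prob:.3f}")
```

Every win total from zero to sixteen gets some probability, the middle ones get the most, and none of them gets all of it.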
So what do I do, you’re saying. Can you cheer for the team? Should you believe in them? Are they worthy of your faith? Will they make the playoffs?
That sounds like dating advice to me, and I don’t do that. You picked them already! You chose the Leafs, and you’re an item, and you can give all the faith you can bear to have fulfilled. I’m not going to tell you you picked wrong.
The Leafs making the playoffs is a bit like the dog who finally catches the car, let’s be honest. Make sure you can take it before you give them your faith. They might actually do it!
Maybe you’re the jaded type who likes to nod sagely and say, "This is why we can’t have nice things." Maybe you want to assume the Leafs will lose so you won’t be disappointed.
If this works for you, you should do that.
All things are possible.
All section headings are from Random.org and are numbers between one and sixteen.