What to expect when you’re expecting (goals)

How exactly does one expect goals?

NHL: Toronto Maple Leafs at Pittsburgh Penguins Charles LeClaire-USA TODAY Sports

It’s no secret that the hockey analytics world is in the Stone Age when compared to other sports. And that’s fine. As a continuous sport where the objective (scoring goals) has a low success rate, hockey is inherently more difficult to model than a sport like baseball or basketball. Baseball has the advantage of being composed mostly of independent, discrete events, while basketball has clearer delineations between each team’s possessions, making it easier to separate offense and defense.

For a long time, advanced stats in hockey boiled down to counting shots. While that proved to be a large incremental improvement, counting shot attempts is not the finish line - we can make adjustments to it so it more accurately describes and predicts the game. We already (very commonly) do so when we use score adjustments. But we can go further.

Thankfully, some very bright people (@DTMAboutHeart and Emmanuel Perry/@MannyElk, among others) have done that, and it’s culminated in a concept called ‘Expected Goals’ (xG). Like any relatively new hockey analytics development, there is still some uncertainty around it. So think of this as an FAQ so you can better understand the concept of xG, what it can and can’t be used for, and where hockey stats go from here. I’ll be borrowing heavily from the writeups of each of the analysts I’ve mentioned. For a deeper dive, I highly recommend checking them out here and here.

What is xG?

You can think of xG as Corsi, but with shots weighted by how likely they are to go in. If that sounds like ‘shot quality’ to you, you’re on the right track. Essentially, xG attempts to adjust for shot quality, based on a variety of factors. These factors will be different from one xG model to another. But to use an example, Perry’s xG model includes adjustments for the following (taken from his post linked above):

Shot type (Wrist shot, slap shot, deflection, etc.)

Shot distance (Adjusted distance from net)

Shot angle (Angle in absolute degrees from the central line normal to the goal line)

Rebounds ([...] Whether or not the shot was a rebound)

Rush shots ([...] Whether or not the shot was a rush shot)

Strength state ([...] – Whether or not the shot was taken on the powerplay)

With these adjustments, the model can recognize that a slap shot from 30 feet, directly in front of the net, has a higher likelihood of becoming a goal than a wrist shot from the goal line. As such, it can give us a more accurate view of the chances a team generates and gives up than Corsi can. Like Corsi, it can be discussed at both an individual and team level.
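To make the "Corsi, but weighted" idea concrete, here is a minimal Python sketch. Every number in it is invented for illustration; none of it comes from Perry's model or any other real one, and the function names are my own. It simply compares counting shot attempts with adding up each attempt's estimated chance of going in.

```python
# Each number is a made-up probability that one unblocked shot attempt
# becomes a goal. A real xG model would estimate these; here they are
# typed in by hand purely for illustration.
team_a_shots = [0.02, 0.03, 0.02, 0.04, 0.03, 0.02]  # many low-quality attempts
team_b_shots = [0.15, 0.22, 0.09, 0.12]              # fewer, more dangerous attempts


def shot_count(shots):
    """Corsi-style view: every attempt counts as 1."""
    return len(shots)


def expected_goals(shots):
    """xG-style view: every attempt counts as its goal probability."""
    return sum(shots)


def share(for_value, against_value):
    """Fraction of the two-team total that belongs to the 'for' team."""
    return for_value / (for_value + against_value)


count_share_a = share(shot_count(team_a_shots), shot_count(team_b_shots))
xg_share_a = share(expected_goals(team_a_shots), expected_goals(team_b_shots))

print(f"Team A share of shot attempts: {count_share_a:.1%}")
print(f"Team A share of expected goals: {xg_share_a:.1%}")
```

In this toy game, Team A wins the shot count but loses the expected goals battle, which is exactly the kind of difference xG is meant to surface.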

How does it make these adjustments?

So without getting too heavily into the mathematical details, a big set of shots is run through a logistic regression, which is a type of regression model used to predict binary events (‘yes or no’ events). In this case, what we’re predicting is whether a shot becomes a goal, and we’re trying to predict it using the characteristics of the shot. These characteristics are what we are adjusting for (things like shot location, shot type, and the other items I mentioned in the previous section). In a non-technical sense, this regression outputs a formula where you can plug in the factors you want to adjust for regarding a given shot, with the result being the estimated probability of it turning into a goal. This is not a very robust mathematical explanation, but for pretending to know how this works so you can seem superior to your friends, it’s more than sufficient. Refer to Perry’s or @DTMAboutHeart’s linked write-ups for more mathematical rigour, if you’re interested.
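If it helps to see the "plug in the factors, get a probability" step, here is a rough Python sketch of the kind of formula a logistic regression produces. The coefficients below are hypothetical numbers I made up so the example runs; they are not the fitted values from Perry's or @DTMAboutHeart's models, and the real models may encode these factors differently.

```python
from math import exp

# Hypothetical coefficients, invented for illustration only.
# A real model would learn these from a large set of historical shots.
INTERCEPT = -1.5
COEF_DISTANCE_FT = -0.04   # farther away -> less likely to score
COEF_ANGLE_DEG = -0.02     # sharper angle -> less likely to score
COEF_REBOUND = 0.8
COEF_RUSH = 0.3
COEF_POWERPLAY = 0.3
SHOT_TYPE_OFFSET = {"wrist": 0.0, "slap": 0.1, "deflection": 0.2}


def goal_probability(shot_type, distance_ft, angle_deg,
                     rebound=False, rush=False, powerplay=False):
    """Estimated probability that a shot with these traits becomes a goal."""
    # Linear combination of the shot's characteristics (the log-odds).
    z = (INTERCEPT
         + SHOT_TYPE_OFFSET[shot_type]
         + COEF_DISTANCE_FT * distance_ft
         + COEF_ANGLE_DEG * abs(angle_deg)
         + COEF_REBOUND * rebound
         + COEF_RUSH * rush
         + COEF_POWERPLAY * powerplay)
    # The logistic function squeezes the log-odds into the range (0, 1).
    return 1.0 / (1.0 + exp(-z))


slap_from_slot = goal_probability("slap", distance_ft=30, angle_deg=0)
wrist_from_goal_line = goal_probability("wrist", distance_ft=15, angle_deg=90)

print(f"Slap shot, 30 ft, straight on: {slap_from_slot:.3f}")
print(f"Wrist shot from the goal line: {wrist_from_goal_line:.3f}")
```

Adding up these per-shot probabilities over a game or a season is what gives you a team's or player's xG.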

You mentioned (at least) two people who both have their own xG models. Are they the same?

No, they are not the same, and this is a bit of a drawback. Unlike shot attempt models (for simplicity, I’ll refer to these as Corsi from here), which everyone can calculate and get to identical (or close to identical) results, xG models are highly individual. The output depends on which factors are adjusted for, the methods used (which may be slightly different), and a variety of other issues. While the two xG models I’m discussing most use similar methods, they are not exactly the same, and won’t give you the exact same results.

Where can I find xG numbers, and how do I know which model to use?

Corsica.hockey, which is likely the biggest hockey stats site now that War-on-ice has closed, uses Perry’s model, and because of that, it is by far the most accessible public model that I know of. It’s updated nightly throughout the season as well, which is a big plus. I should note that @DTMAboutHeart’s model does have a large portion of its data published, but it doesn’t include anything from 2016-2017. That said, it’s still super useful and illuminating to dig through the data they do have. However, the fact that Corsica’s is updated continuously throughout the season makes it more useful when it comes to looking up more current data.

What is it good for?

We wouldn’t be interested in xG if it didn’t represent a step up from what we already had, or provide some other value. Two key uses for stats like this are describing what has happened, and predicting what will happen. In general, xG does a better job of describing play that has happened than Corsi (defined by its correlation to GF%, within sample). However, prediction is more important for our purposes than description, so that’s what we’ll focus on.

Like many things involving xG, the results are model dependent. @DTMAboutHeart’s model actually predicted future scoring within a season on the team and individual level with a higher degree of efficacy than either goals or Corsi (see figure below). So it’s clear some xG models can be better predictors than what we currently have. However, as I said, this is model specific, so be sure to check that the model you’re citing IS actually predictive before using it in that context.

[Figure: Predicting future GF% at team level (within season)]

On the other hand, Perry’s xG model hasn’t been shown to be predictive at the team level, though it is predictive at an individual level (individual expected goals being a better predictor of future goal scoring than individual Corsi or goal numbers). As such, it’s best used for descriptive analysis, as well as goaltender/shooter analysis. As of right now, it shouldn’t replace Corsi as one of the key team stats we care about for predictive purposes.

How big a sample do we need before we can trust it?

When we’re talking within seasons, it becomes reliable around the same time that Corsi does (see @DTMAboutHeart’s article for further details). So it becomes a decent predictor as early as 10-15 games in, and its ability to predict GF% for the rest of the season is maximized around 30 games. Again, this will vary slightly depending on the model.
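For the curious, a predictiveness check of this kind can be roughly sketched in Python as follows: take each team's number over an early chunk of the season, and see how strongly it correlates with GF% over the remaining games. The team values below are entirely made up and say nothing about how any real model performs; the exact method in @DTMAboutHeart's article may differ.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data for eight teams (invented numbers, not real results).
early_xgf_pct = [54.1, 51.3, 48.7, 46.2, 52.8, 49.5, 47.0, 50.4]  # xGF% in early games
early_cf_pct = [52.0, 50.8, 49.9, 47.5, 51.1, 50.2, 48.3, 50.1]   # CF% in the same games
rest_gf_pct = [55.0, 50.2, 47.9, 45.1, 53.3, 50.6, 46.4, 49.8]    # GF% over the rest of the season

r_xg = pearson(early_xgf_pct, rest_gf_pct)
r_cf = pearson(early_cf_pct, rest_gf_pct)

# Squaring r gives the share of variance in future GF% that the early metric explains.
print(f"xGF% vs future GF%: r^2 = {r_xg ** 2:.2f}")
print(f"CF%  vs future GF%: r^2 = {r_cf ** 2:.2f}")
```

Repeating this with different numbers of early games is one way to see when a stat starts to become a decent predictor and when its predictive power peaks.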

What limitations does it have?

As mentioned a number of times, the model dependence is a bit of a downside. When using xG, you have to make sure the model you are citing is valid and supports the inferences you want to make. For example, you shouldn’t use an xG model that hasn’t shown predictive ability for forecasting.

Another notable limitation is that the NHL doesn’t publish shot locations of blocked shots, forcing them to be excluded from xG models. As such, the sample size of xG will always lag behind that of Corsi.

Like Corsi, xG is not a be-all, end-all stat, and is prone to misuse in that regard. Also like Corsi, when we’re looking at the player level, it is subject to all the same factors: teammates, zone usage, competition, and all that fun stuff.

xG is also more of a ‘what’ stat than a ‘why’ stat. It can tell you a lot about a team, but in terms of tactical observations and suggestions, it is limited, and needs to be accompanied by other stats (perhaps microstats - things like board battles, forechecking efficiency, passing networks, and so on), in addition to video analysis.

But these limitations mostly boil down to “consider context, and don’t pretend this is a holy grail”. Keep those in mind, and you’re good.

Why not use scoring chances / high danger scoring chances if we’re concerned with shot quality?

Basically, because scoring chances are a very primitive way of adjusting for shot quality. Rather than attempt to accurately assess the relative likelihoods of shots becoming goals, most scoring chance metrics simply segregate by shot location, throwing away all shots that don’t follow a certain definition of what a scoring chance is (which in and of itself can vary from one system to another). The result is that you omit a lot of useful data, and needlessly constrict yourself to a smaller sample.

In addition to throwing away data, using scoring chances / high danger scoring chances is essentially binning shots. Binning, in this case, assumes that shots of a certain class are all equally dangerous, with differing danger levels between different classes of shots. In a sense, Corsi does this as well, as it treats all shots as equally dangerous, no matter where they are from. However, Corsi has an advantage in that its sample size is naturally far larger than that of scoring chances, meaning it stabilizes and becomes predictive faster.

Since shot quality exists on a continuous scale, adjusting for it via binning doesn’t make conceptual sense, except as a purely descriptive tool. Binning also implies a variety of assumptions that may not be accurate, and is something we should generally avoid without good reason.
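Here is a small Python sketch of why binning sits awkwardly on a continuous scale. Both the 25-foot cutoff and the smooth curve are hypothetical choices I made for the example; real scoring chance definitions and real xG models are more involved than this.

```python
from math import exp

SCORING_CHANCE_CUTOFF_FT = 25  # hypothetical bin edge, chosen for illustration


def binned_danger(distance_ft):
    """Binning: a shot either counts as a scoring chance (1) or is thrown away (0)."""
    return 1 if distance_ft <= SCORING_CHANCE_CUTOFF_FT else 0


def smooth_danger(distance_ft):
    """A made-up smooth curve: danger falls off gradually with distance."""
    return 1.0 / (1.0 + exp(1.0 + 0.05 * distance_ft))


for distance in (5, 24, 26):
    print(f"{distance:>2} ft  binned={binned_danger(distance)}  "
          f"smooth={smooth_danger(distance):.3f}")
```

Under the binned view, shots from 5 and 24 feet are treated as identical while shots from 24 and 26 feet are treated as completely different. The smooth curve tells the more sensible story: the first pair differs a lot, the second pair barely at all.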

Look, I just want to pretend like I know hockey stats. Can I keep using Corsi?

You absolutely can. Corsi is still a very useful metric to keep track of, and is still the best publicly available tool we have when it comes to prediction and forecasting at the team level. Future iterations of xG models will likely have it beat there, but for now, Corsi is still king.

What’s the point of all this math if basic counting is still our best team stat?

Good question, other Arvind. The short answer is that this research and math is incremental, and builds on itself. Current public xG models may not be as useful as Corsi when it comes to prediction, but they provide something for future analysts and researchers to build on and improve. The list of sports where rigorous and high level math hasn’t been useful is a short one that is growing shorter by the year. xG is one of the preliminary steps to getting hockey to that point.

Thanks to Emmanuel Perry for his assistance with this article.