A Theoretical Framework for Optimizing Forward Lines

Do you think Zach Hyman should be Auston Matthews’ winger?

Why?

We all have opinions about line combinations. It’s why we wait with bated breath to see what Kristen Shilton tweets from Leafs practice. These opinions are informed by various things. We have underlying theories about who to play where, or whether stacking lines or spreading out scoring is superior. We have our eyes telling us that two players just don’t seem to click for whatever reason, or our memories telling us about what happened the last time a combination was used.

For coaches, it’s similar. They combine their professional opinion and insight with the results and data already observed, and try to come up with the line combinations they think best set the team up for success.

This leads to the question: can we come up with a framework for thinking about forward lines that gives us another, objective way of determining the best possible combinations?

I’ve been thinking a lot about this recently. And as it turns out, the Leafs are an interesting case study for this sort of thinking, because they have kept their forward lines relatively constant throughout the season. When they have made changes, it’s generally been to exchange one player for another across lines (for example, swapping William Nylander and Connor Brown so the former plays on Nazem Kadri’s wing and the latter plays on Auston Matthews’). There’s been very little wholesale line blending.

The Leafs are also interesting because their lineup combinations have been somewhat controversial (separate from the issue of who they are playing). Whether it’s keeping Hyman tethered to Matthews, moving Nylander away from the young American, keeping Soshnikov on the fourth line, or something else, there have been moments when Leafs fans have questioned the decisions of the coaching staff.

What we want to do is come up with a way of thinking about line combinations that allows us to mix and match players to get the best possible lineup. That is, the lineup that outscores its opponent by the largest amount in a given time frame. Let’s first make a couple of simplifying assumptions to make the problem more manageable, while still maintaining its applicability to a real-world hockey setting.

1. Assume centres can only play centre, and wings can only play their typical side on the wing. As it relates to the Leafs, this means Nylander is treated as a normal RW.
2. Assume a given minute split for each line, as a percentage of all even strength ice time. Obviously, if we were being totally egalitarian, the split would be 25%/25%/25%/25%. For the Leafs, who roll three lines in relatively equal frequency and a less common fourth, something like 27.5%/27.5%/27.5%/17.5% may be more accurate.

With those two assumptions made, we now need to determine something to optimize. It should be clear that what we want to maximize is the team’s scoring differential, and that this will be a straightforward function of the scoring differential of each line, as well as the amount of time that each line plays. Therefore, we’d maximize something of the form:

0.275*GD_Line1 + 0.275*GD_Line2 + 0.275*GD_Line3 + 0.175*GD_Line4

where GD_LineX is the expected goal differential* on a rate basis of a given line X. Each line can be thought of as a set of three players. Essentially, we want to roll the set of lines that leads to the team outscoring its opponent by the largest amount possible.

* To clarify, this is the goal differential we would expect, not the Expected Goal differential. That is, we are not maximizing the differential between Expected Goals For and Expected Goals Against. We are maximizing the difference between what we predict goals for and goals against will be.

More mathematically, we’d represent each line as a vector of length three, with each component representing a player in a specific position (LW, C, RW). GD_LineX is essentially a function that takes a line of players as input and outputs their expected results. What we’re optimizing over is all the possible combinations of these lines for a specific team. For the Leafs, one possible solution, using their current default lines, would look like this (using jersey numbers in lieu of player names):

Line 1: [11 34 12]

Line 2: [47 43 29]

Line 3: [25 42 16]

Line 4: [15 18 26]

If you plug these lines into the quantity we’re maximizing, you’d get the expected goal differential at ES of this particular lineup (henceforth referred to as the objective value). We want to find the combination of lines that makes that value as large as possible. This combination of lines is the optimal solution to the problem.
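As a rough sketch of that calculation in Python: the function below takes a lineup and returns its objective value. The `gd_line` argument is a placeholder for whatever model predicts a line's goal differential rate, and all of the names here are my own for illustration, not an established API. The only "model" used is a constant one, purely as a sanity check.

```python
from typing import Callable, Sequence, Tuple

# A line is (LW, C, RW), identified by jersey number.
Line = Tuple[int, int, int]

# TOI shares from assumption 2: three roughly equal lines and a lighter fourth.
TOI_SHARES = (0.275, 0.275, 0.275, 0.175)


def objective_value(
    lines: Sequence[Line],
    gd_line: Callable[[Line], float],
    toi_shares: Sequence[float] = TOI_SHARES,
) -> float:
    """TOI-weighted sum of each line's predicted goal differential rate."""
    if len(lines) != len(toi_shares):
        raise ValueError("need exactly one TOI share per line")
    return sum(share * gd_line(line) for share, line in zip(toi_shares, lines))


# The default lines listed above.
default_lines = [(11, 34, 12), (47, 43, 29), (25, 42, 16), (15, 18, 26)]

# HYPOTHETICAL stand-in model: predicts the same rate for every line.
# Because the TOI shares sum to 1, the objective value equals that rate.
constant_model = lambda line: 1.0
sanity_check = objective_value(default_lines, constant_model)
```

Swapping `constant_model` for a real prediction function is the only change needed to score an actual lineup.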

There’s one obvious way to do that, which is brute force. That is, try every possible line combination and see which one has the highest objective value. In this case, there are 2,304 possible combinations (thanks to assumptions 1 and 2). However, if we changed assumption 2 so that the TOI proportions were different for every line, there would be 13,824 possibilities. That’s not a huge amount in the grand scheme of things, but in general, it’s better form to not brute force these types of solutions, because it can lead to extended run time and your computer blowing up.
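For anyone wondering where those two numbers come from, here is a small Python check. It assumes twelve forwards with exactly four at each position (my reading of assumption 1), counts the lineups directly, and then confirms the count by enumeration.

```python
from itertools import permutations, product
from math import factorial

PLAYERS_PER_POSITION = 4  # assumes 4 LW, 4 C, 4 RW
POSITIONS = 3

# If every line has its own TOI share, the four line slots are all distinct,
# so each position can be arranged across the slots in 4! ways.
distinct_slot_lineups = factorial(PLAYERS_PER_POSITION) ** POSITIONS

# With the 27.5/27.5/27.5/17.5 split, the top three lines are interchangeable:
# shuffling them among themselves never changes the objective value.
equal_share_lineups = distinct_slot_lineups // factorial(3)

# Confirm both counts by enumerating every assignment of players to slots.
total = 0
unique = set()
slots = range(PLAYERS_PER_POSITION)
for lw, c, rw in product(permutations(slots), repeat=POSITIONS):
    lines = list(zip(lw, c, rw))
    total += 1
    # The first three lines are treated as an unordered set, the fourth as fixed.
    unique.add((frozenset(lines[:3]), lines[3]))

print(distinct_slot_lineups, total)             # both 13824
print(equal_share_lineups, len(unique))         # both 2304
```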

A smarter way is to use a local search algorithm, which is a method of combinatorial optimization. We can think of each line as an ordered sequence of players (where the order corresponds to position), and use this algorithm to intelligently find the best set of lines.
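To make that concrete, here is a minimal sketch of one simple local search variant, first-improvement hill climbing, in Python. The neighbourhood I've chosen (swap two players at the same position between two lines) is an assumption that mirrors how the Leafs have actually changed their lines; other neighbourhoods and smarter variants are possible. The toy scoring function is a meaningless placeholder that uses jersey numbers as ratings, only so the example runs.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Line = Tuple[int, int, int]  # (LW, C, RW)


def objective(lines: Sequence[Line], gd_line: Callable[[Line], float],
              shares: Sequence[float]) -> float:
    """TOI-weighted sum of predicted line goal differentials."""
    return sum(s * gd_line(line) for s, line in zip(shares, lines))


def neighbours(lines: Sequence[Line]) -> Iterator[List[Line]]:
    """Every lineup reachable by swapping same-position players between two lines."""
    for i in range(len(lines)):
        for j in range(i + 1, len(lines)):
            for pos in range(3):
                new = [list(line) for line in lines]
                new[i][pos], new[j][pos] = new[j][pos], new[i][pos]
                yield [tuple(line) for line in new]


def local_search(start: Sequence[Line], gd_line: Callable[[Line], float],
                 shares: Sequence[float]) -> Tuple[List[Line], float]:
    """Accept the first improving swap; stop when no swap improves the objective."""
    current = list(start)
    best = objective(current, gd_line, shares)
    improved = True
    while improved:
        improved = False
        for candidate in neighbours(current):
            value = objective(candidate, gd_line, shares)
            if value > best + 1e-12:  # require a strict improvement
                current, best, improved = candidate, value, True
                break
    return current, best


# PLACEHOLDER model, not a real rating: jersey numbers stand in for player value.
toy_gd_line = lambda line: sum(line) / 100

start = [(11, 34, 12), (47, 43, 29), (25, 42, 16), (15, 18, 26)]
shares = (0.275, 0.275, 0.275, 0.175)
best_lines, best_value = local_search(start, toy_gd_line, shares)
```

Hill climbing only guarantees a local optimum. With a model that includes interactions between linemates, it would be sensible to restart from several random lineups, or to use a variant that can escape local optima.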

So now, with the help of some reasonable assumptions, we’ve identified a framework and algorithm that can theoretically provide us with a set of line combinations that maximizes even strength goal differential. That’s pretty cool! However, there’s one issue, and I sort of glossed over it in introducing this framework.

How do we predict even strength scoring differential for a line combination?

This is where the problem gets really hard. I mentioned that GD_LineX is the expected goal differential on a rate basis of a given line X. But coming up with a function that provides that (given a set of three players, representing a line) is a decidedly non-trivial problem, especially since said prediction needs to be reasonably accurate for it to have any value.

One possibility is to use real-world results for this. However, not every possible line combination is used. In fact, most aren’t, so the data is very sparse. A workable solution is more likely to come from regression or machine learning methods.

Some intrepid analysts have already used these methods with success to create robust Wins Above Replacement (WAR) models, notably DTM About Heart and Emmanuel Perry. These models are similar to one that would predict on-ice even strength goal differential for a player. But adapting that to a group of players is very tricky, especially since hockey involves interactions among players. That is, the success of a player is not wholly independent of his linemates, even in a WAR model that takes them into account.

Constructing a model that can output expected goal differential for a unit of players would be the next step in creating a robust tool that can provide additional insight into whether teams are optimizing their lines correctly. It is by no means easy, but it’s another example of how advanced mathematics and optimization tools can be used by teams to gain a competitive advantage. We have the theoretical framework for doing so - now, we need to apply it.
