Sunday, October 28, 2018

Heading for a Fall - From the Archives 2/24/17

It appears that the Sportdec blog, where I initially posted this, has vanished into the ether. So I figured I'd repost this here.


Saturday, August 4, 2018

In Defense of Per Possession Statistics

A while back now, Ashwin Raman wrote about per possession statistics and how they can be applied in football. One of the takeaways from the article was that per possession statistics were not a suitable choice to replace per 90 statistics as a way to judge a player's value, as the most efficient players with the ball are not necessarily who we would consider the stars of the game. He is certainly correct that a player's efficiency in possession is an inadequate measure of value by itself, but that doesn't mean it tells you nothing useful. On the contrary, when paired with a way to measure the offensive load a player carries, by how many possessions they are involved in, you can get a more rounded picture of a player's attacking contribution than by per 90 stats alone.

To see why, let's take a little trip down memory lane. Way back in the Dark Ages, "analysis" of football players was focused mostly on counting statistics. A player who scored 30 goals was reckoned to be better than someone who scored 20 goals. This functioned well enough to settle some dumb arguments at the pub, but it was pretty basic and suffered from the obvious flaw that it didn't account for the time played by the respective players. The enlightened decided that counting statistics were inferior to rate statistics and moved to goals per 90 minutes, allowing us to see how frequently a player scored given a certain amount of time on the pitch. This was certainly an improvement on the old system. However, as others have shown, the 90 minutes in a game has little actual relation to the action. If West Brom constantly have fewer minutes of on-pitch action because the ball is always on the sidelines, aren't their attackers' per 90 stats artificially deflated? If one team has possession of the ball 90% of the time and the other team can never even get the ball to its forwards, do the forwards of the respective teams really have the same level of opportunity to score? A better rate statistic would have the unit of analysis be more directly related to what is happening on the pitch.

Enter per possession numbers. Let me first state that what I'm talking about here has nothing to do with possession percentage as typically calculated. Instead, possessions (or possession chains) are counted as the number of times a team has the ball in a game, making them the fundamental unit of opportunity in football. (Opta does actually define this now, but as far as I know they do not publish it anywhere. I proxy it by adding up a team's shots, unsuccessful passes, unsuccessful dribbles, and dispossessions.) A team will only have the ball a given number of times in each game, so what they do with it each time is critically important. A team that is efficient in creating chances with their possessions is a good attacking team; a team that turns the ball over frequently and manufactures few shots is a poor one. As a result, teams need to make sure that on each possession the ball gets to the players who can do the most with it. This is why evaluating players by per possession numbers requires looking at two things: efficiency and usage.
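As a rough sketch of the proxy described above: add up the events that end a possession to estimate how many possessions a team had, then divide shots by that total for a simple efficiency figure. The function names and the match totals are made up for illustration and are not taken from any data provider.

```python
def estimate_possessions(shots, unsuccessful_passes, unsuccessful_dribbles, dispossessions):
    """Proxy for possessions: count the events that end a possession."""
    return shots + unsuccessful_passes + unsuccessful_dribbles + dispossessions


def shots_per_possession(shots, possessions):
    """Share of estimated possessions that end in a shot."""
    if possessions == 0:
        return 0.0
    return shots / possessions


# Hypothetical single-match totals for one team (illustrative numbers only).
possessions = estimate_possessions(
    shots=12,
    unsuccessful_passes=95,
    unsuccessful_dribbles=8,
    dispossessions=10,
)
efficiency = shots_per_possession(12, possessions)
print(possessions, round(efficiency, 3))
```

Two teams with the same shot count can look very different here: the one that needed fewer possessions to get those shots is the more efficient attack.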

Let's look at efficiency first. Ashwin was looking at how efficient players were each time they touched the ball, and I agree with him that this is probably a better way to evaluate the tendencies of players than a means to evaluate overall talent level. But as far as value for the team is concerned, we really should be focused on the possessions a player "uses", meaning those where he has the final action of the possession. This is because there is no scarcity in the number of times a player can touch the ball in a given possession (as the Portugal-Mexico game in The Simpsons brilliantly illustrates). There is, however, only one person who can have the final touch on the ball, so how a team allocates those final touches is critically important. Good teams should direct the ball to the players most effective at creating chances with the possessions they use; poor teams will generally have more possessions used by their defenders (or keepers), since they struggle more in build-up.

However, Ashwin's post raises a good question: if it is true teams are best served directing the ball to their most efficient players, why are the most efficient players not the stars of the game? The answer is that creation of good opportunities matters along with converting them. Being able to score tap-ins from crosses put on a plate is a good skill to have certainly, but if you can only convert chances and don't create them for yourself or others your value is limited (I'm looking at you, Jermain Defoe). In order to account for this, we have the concept of Usage Rate. This is a concept borrowed from basketball that looks at what percentage of a team's possessions a given player "uses", or has the final action, weighted by minutes played (you can find the calculation and numbers from the 2017-18 Premier League season here). Usage Rate is a way to measure the burden of creativity on a given player in a team's setup. This is not to say having a high Usage Rate is inherently good or a low Usage Rate bad. Using a ton of possessions if you can't do anything with them is a problem; conversely, for positions where turnovers are more costly you may prioritize safety over creation and a low Usage Rate is preferable. If we are looking for the best attacking players, we'd really want to find the players who create a high volume of chances and still maintain good efficiency with the ball.

For example, if we look at who had a Usage Rate above 15% and had more than 25% of their possessions result in a shot, key pass, or assist (which I'm calling Success Rate) with at least 1,000 minutes in the 2017-18 Premier League season, we get a list of 8 players: Cesc Fabregas, Eden Hazard, Alexis Sanchez, Philippe Coutinho, Kevin De Bruyne, Paul Pogba, Anthony Martial, and Christian Eriksen. To me, that seems like a pretty good list of some of the best attackers in the PL (I'll admit the inclusion of Martial surprised me a little). Use a better efficiency measure (say xG + xA per possession) and the accuracy could be improved further. If you don't account for Usage Rate, Marseille's Kostas Mitroglou and his 55% Success Rate look very good. But factoring in that Mitroglou has only the 10th highest Usage Rate on his own team at 9% provides some much-needed context, and makes Florian Thauvin (19% Usage Rate and 30% Success Rate) and Dimitri Payet (18% Usage Rate and 37% Success Rate) appear the clear stars.
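The filter described above can be sketched in a few lines. This is only an illustration: the `PlayerSeason` class, its field names, and the four players are hypothetical, and the thresholds simply mirror the ones quoted in the text (1,000 minutes, 15% Usage Rate, 25% Success Rate).

```python
from dataclasses import dataclass


@dataclass
class PlayerSeason:
    name: str
    minutes: int
    usage_rate: float      # share of team possessions used, as a fraction
    possessions_used: int  # possessions where the player had the final action
    successes: int         # of those, the ones ending in a shot, key pass, or assist

    @property
    def success_rate(self) -> float:
        if self.possessions_used == 0:
            return 0.0
        return self.successes / self.possessions_used


def high_volume_and_efficient(players, min_minutes=1000, min_usage=0.15, min_success=0.25):
    """Keep players who clear the minutes, Usage Rate, and Success Rate bars."""
    return [
        p for p in players
        if p.minutes >= min_minutes
        and p.usage_rate > min_usage
        and p.success_rate > min_success
    ]


# Hypothetical players, invented to show each way of missing the cut.
players = [
    PlayerSeason("Playmaker", 2800, 0.18, 900, 300),  # high usage, efficient
    PlayerSeason("Poacher", 1500, 0.09, 200, 110),    # efficient, but low usage
    PlayerSeason("Ball hog", 2500, 0.20, 1000, 150),  # high usage, inefficient
    PlayerSeason("Cameo", 400, 0.22, 150, 60),        # too few minutes
]
shortlist = [p.name for p in high_volume_and_efficient(players)]
print(shortlist)
```

The "Poacher" row is the Mitroglou case in miniature: a 55% Success Rate on its own looks excellent, but a 9% Usage Rate keeps him off the list.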

All well and good, you might say, but since the players above generally have very good per 90 stats as well, what's the difference? First, it enables us to make better comparisons between players than using per 90 stats, since we can tell whether they accumulate those via efficiency or volume. Ronaldo and Messi are (still) the best players in the world and each averaged 8.5 shots + key passes per 90 last season. However, they are quite different in how they achieve that, with Messi using 21% of Barcelona's possessions (more than any PL player last year) and Ronaldo shooting it basically every time he touches it. Second, it provides a way to see how teams structure their attacks. Do they go long to a target man (like Peter Crouch or Andy Carroll, both of whom consistently have high usage rates)? Are they focused around a single playmaker, like West Ham in the Payet era? Are their attacks built on wing-play, like Leicester with Albrighton and Mahrez? Usage Rate is a useful tool for identifying such patterns. Finally, it can help in analyzing how a player, or team for that matter, might do in a different context. A player with a high Usage Rate may see their per 90 stats drop in a new team if there are other high Usage players already present, as there is only one ball to go around. This was one of the reasons I was against the Alexis to City transfer (from both sides) and against City signing Nolito. Likewise, if a team loses a player to injury or transfer, how the possessions that player used will be distributed is a key question that Usage Rates can help answer.

All this is definitely not to say that per possession stats are great for everything or that per 90 stats have no purpose. But once you account for creativity by way of Usage Rate, I do think they provide a valuable framework for thinking about the game, at least from an attacking perspective. Because the unit of analysis is more firmly rooted in the action on the pitch, they let you account for things that per 90 statistics can't. Per possession statistics will never, and should never, replace per 90 statistics but they remain under-utilized in analysis of the game.

Tuesday, June 26, 2018

17-18 Usage Rates


Finally got around to updating Usage Rates for the 2017-18 Premier League season. A couple notes:

1) Since Squawka's website has been pretty useless for a while, I used data from Whoscored. This means I went ahead and added dispossessions to the calculation.
2) The two sites appear to count goalkeeper passes quite differently, with Whoscored seeming to include a lot more of them (I assume this is because Squawka only included short passes, but I'm unsure about that).

While the above changes mean the calculation is more accurate, they do make historical comparisons more difficult.

For those unfamiliar with the general concept, I've written about the metric here, here, and here. Usage Rate is an attempt to measure the influence each player has in a team's attack by looking at the percentage of a team's possessions he "uses", or has the final action of a possession, weighted by minutes played:

(Shots+Key Passes+Assists+Unsuccessful Take-Ons+Unsuccessful Passes+Dispossessions)/
((Team Shots+Team UTO+Team UP+Team Dispossessions)*(Minutes Played/Total Minutes))
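For anyone who wants to reproduce the number, here is a minimal sketch of the calculation. It reads the minutes weighting as scaling the team's possessions down to the share of minutes the player was on the pitch, which is the same normalization basketball's usage rate applies; that reading is an assumption, as are the function name and every number in the example.

```python
def usage_rate(player_actions, team_possessions, minutes_played, total_minutes):
    """Share of the team's possessions a player used while on the pitch.

    Assumes team possessions are spread evenly over the season, so a player
    who played 75% of the minutes was present for 75% of the possessions.
    """
    if minutes_played == 0 or total_minutes == 0 or team_possessions == 0:
        return 0.0
    share_of_minutes = minutes_played / total_minutes
    return player_actions / (team_possessions * share_of_minutes)


# Hypothetical season totals for one player (illustrative numbers only).
player = {
    "shots": 60,
    "key_passes": 50,
    "assists": 5,
    "unsuccessful_take_ons": 40,
    "unsuccessful_passes": 250,
    "dispossessions": 45,
}
player_actions = sum(player.values())

# Hypothetical team totals: shots + UTO + UP + dispossessions.
team_possessions = 450 + 350 + 3300 + 400

# 38 matches of 90 minutes; the player was on the pitch for 75% of them.
rate = usage_rate(player_actions, team_possessions, minutes_played=2565, total_minutes=3420)
print(f"{rate:.1%}")
```

Without the minutes adjustment, this player would be credited with 10% of the team's possessions; scaling to his time on the pitch gives a fairer comparison with teammates who played every minute.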

Thoughts/suggestions/comments are all welcome.

Link is here.