The 2011 Update to the MSq£ Model

Much of the last few months has been focused on models built with £XI data, both long- and short-term. (The £XI is the average cost of a team’s line-up over the course of a season, adjusted for TPI inflation.) The resultant M£XI and m£XIR models (see previous blog entries for explanations) both benefit from accounting for the transfer cost of the talent that makes it onto the pitch, and they gain further accuracy by not counting the “dead wood” of transfers riding the bench, for whatever reason, in a match.

Both models do an excellent job of post-match analysis, but they are not as good at pre-match or pre-season analysis as they rely on an accounting of the actual starting XI.  Sure, one might make projections days before a match as to what the starting XI may be based upon injury reports, players in- or out-of-favor with the manager, and other factors.  All of those, however, would require a number of assumptions be made about the rosters on top of the usual simplifications of the M£XI and m£XIR models – this is not a recipe for repeated, statistical success.

The one model that does not require such assumptions is the MSq£ model I introduced in my original post on this blog and The Tomkins Times, provocatively titled Soccernomics Was Wrong: Why Transfers Matter. This post will serve as the latest update to the MSq£ model, using the 2010-11 EPL season results and the 2011 Transfer Price Index database as inputs.

The Player Makeup of the EPL

One of the many contributions the TPI can make each year is an understanding of the types of players that make up the league.  In my response to Stefan Szymanski’s January 2011 review of Pay As You Play I introduced a chart that showed the contribution to the overall Premier League makeup of three types of players – “trainees”, “free transfers”, and “transfers + others”.  The following conclusions were drawn from trends within the graph, which used data through the 2009-2010 season:

…[T]ransfers have consistently accounted for nearly 70% of the Premier League’s players since its inception. That’s not to say 70% of the players transfer teams each year, but rather that at some point in their past they were purchased by the team they played for that season. What has changed over the league’s eighteen years is the number of trainees within it. This number has plummeted from nearly 30% of league player classifications in 1992/93 to below 20% by [the 2009-2010 season]. Much of this change has happened due to an increasing number of free transfers… Free transfers have gone from only 2% of league player classification in 1995 to nearly 10% last season.  Overall, transfers of any variety came to represent 80% of league players by the 2009/2010 season.

This pointed to one of two conclusions: either (1) a lot of teams are wasting money in the transfer market, or (2) the transfer market is a functioning market that rewards teams for their increased expenditures versus the competition via improved table positions.  Scenario 2 was then borne out by the series of posts exploring the impacts of the MSq£ model.

The updated TPI database including the 2010-11 season provides further confirmation of the importance of transfers.  The image below represents the league composition graphic, updated with the 2010-11 data.

The percentage of players in the league listed as trainees continued its downward trend in the 2010-11 season.  Over the 19 seasons of the Premier League, the percentage of players listed as trainees has declined by 0.8 percentage points per season, going from 29.7% to an all-time low of 15.1% in the 2010-11 season.  In contrast, the percentage of free transfers in the league climbed steadily after the Bosman ruling, then declined from 2004-05 through 2008-09.  A modest climb in the percentage of players listed as free transfers in 2009-10 continued in 2010-11 to make them 15.3% of the league’s players.  Last season may very well be considered the tipping point in the transfer vs. trainee debate, as it was the first time free transfers eclipsed trainees as a percentage of players within the league.  In the age of digital media and instant success and failure, the quickest way to success is perceived to be via investment in transfers (paid or free) and not trainees.
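
As a quick check on the per-season figure above, the decline can be reproduced from the two endpoint percentages quoted in this post. This is only a minimal sketch: it assumes a simple endpoint-to-endpoint average over the 18 season-to-season steps contained in 19 seasons, and a trend line fitted to all 19 seasons could give a slightly different number. The function and variable names are made up for the illustration.

```python
# Endpoint values quoted in the post (percent of league players listed as trainees).
trainee_share_1992_93 = 29.7
trainee_share_2010_11 = 15.1
seasons = 19  # Premier League seasons from 1992-93 through 2010-11

# 19 seasons contain 18 season-to-season steps.
steps = seasons - 1


def average_change_per_season(start, end, n_steps):
    """Average change, in percentage points per season, between two endpoints."""
    return (end - start) / n_steps


decline = average_change_per_season(
    trainee_share_1992_93, trainee_share_2010_11, steps
)
print(f"Average change: {decline:.2f} percentage points per season")
```

The result is negative because the share is falling, and it rounds to the 0.8 points per season quoted above.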

Squad Cost Trends

Even with the increased reliance on free transfers, squad CTPP costs continued their ascent.  The graph below summarizes the continued upward trend in average Sq£ (the price of a squad when inflation is taken into account), as well as average £XI for a bit of comparison, with Premier League seasons running from left to right.

One of the adjustments from the original post is that CTPP data from the first three years of the Premier League have been excluded from this graph, as the era of 22 teams in the Premier League introduces drastically reduced average Sq£ and £XI values and skews the regression equation’s results (for statistics nerds, they’re so divergent that they end up violating the “normally distributed residuals” requirement of regression analysis).  Thus, season 1 on the graph is actually the 1995-96 season.

Squad costs have risen at a rate of just over £2M per season since the fourth season of the Premier League, and in 2010-11 they resumed their upward trend to go back above £140M for the average club after two years of declines.  This compares with only a £777k per year increase in average £XI – demonstrating yet again the continual decrease in utilization rates observed in my original post on this topic. (Clubs are gathering bigger collections of players, and having more unused talent.)

It’s not just the £50M purchase of Fernando Torres driving this uptick – it’s the knock-on effect such inflated transfers drive that is best exemplified by the snap (inflated) purchase of Andy Carroll to plug the immediate hole left by Torres. Everyone is paying more, because more cash is being introduced into the transfer market.

Of the 17 teams not relegated at the end of the 2009-2010 season, ten posted gains in their Sq£ while seven posted declines. For the three places that changed hands through promotion and relegation ahead of the 2010-2011 season, the average change was +£6.9M.  The biggest changes in Sq£ values were found at Manchester City and Chelsea, which increased by £104.5M and decreased by £54.5M, respectively. Overall, the average movement per team was +£5.7M from 2009-10 to 2010-11.  The table below provides a summary of each team’s change in Sq£.

Things at the top of the Premier League table showed similar growth for a few teams and shrinkage for a few others, while Manchester United continued to be very consistent in their squad transfer costs.  The graph below summarizes the Sq£ for the Big Six EPL clubs in the Post-Abramovich era, while also translating them into the MSq£ metric used within the regression model.

Chelsea maintained their top position in the Sq£ metric, although they continued their steady decline from their 2006-07 high.  The big news was the inevitable reversal of order in the Manchesters – City finally passed United after three years of heavy investment from Sheikh Mansour.

Liverpool and Tottenham swapped positions after the Torres-for-Carroll-and-Suarez deal drove up the Reds’ squad costs, while loans and non-appearances by Jonathan Woodgate, Kyle Naughton, and Ben Alnwick meant Tottenham took a collective £20.5M hit to their Sq£ metric in 2010-11 that they couldn’t completely fill with incoming transfers. From a statistical perspective, Arsenal has stood still the last two seasons after a lengthy slide down from their undefeated season in 2003-04. They have hovered around an MSq£ of 1.0 with a Sq£ of £153M, adding to supporters’ sense of frustration over not getting the last few pieces they feel their team needs to challenge for a Premier League title.

Effect on the MSq£ Model

What really matters in terms of statistical predictions is how the 2010-2011 season – both in terms of Sq£ and table position – affected the long-term averages used to create the MSq£ regression model.

There was one new team added to the model – Blackpool – which brings the total number of samples in the model to 44. There was also a continued strengthening of the relationship between increased Sq£ and table position – the average change in the MSq£ metric was a minuscule 0.01 while the average table position improved by 0.06.  The table below summarizes each team’s change in average MSq£ and average table position between the 2010 and 2011 models. The table is sorted by change in table position, from lowest (improvement) to highest (degradation).

The biggest gain in table position was seen at West Bromwich Albion, who improved their average finish position by more than one-and-a-half positions.  This is mainly due to a low sample size (four in the 2010 model, and now five in the 2011 model), with the club in relegation form in every one of the previous four seasons except 2004-2005.
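
The small-sample effect described above is easy to see with a running average. The sketch below is only an illustration: the finishing positions are assumed values chosen to resemble a club with four poor seasons followed by a mid-table finish, not figures pulled from the TPI database, and `updated_average` is a made-up helper name.

```python
def updated_average(old_average, old_count, new_value):
    """Return the mean after adding one new observation to an existing mean."""
    return (old_average * old_count + new_value) / (old_count + 1)


# Assumed finishing positions, for illustration only.
small_sample = [19, 17, 19, 20]      # four seasons near the bottom of the table
large_sample = small_sample * 4      # sixteen seasons with the same average
new_finish = 11                      # one mid-table season added to each

for label, sample in (("4 seasons", small_sample), ("16 seasons", large_sample)):
    old_avg = sum(sample) / len(sample)
    new_avg = updated_average(old_avg, len(sample), new_finish)
    print(f"{label}: {old_avg:.2f} -> {new_avg:.2f} (change {new_avg - old_avg:+.2f})")
```

With only four prior seasons a single good finish moves the average by more than a position and a half, while the same finish added to sixteen seasons moves it by less than half a position.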

Similarly, Manchester City also benefited from a combination of only 14 samples (out of 19 Premier League seasons), many seasons at or below an MSq£ of 1.0 (7 out of 14), a rapid rise in MSq£ (0.75 in 2007-08 to 2.76 in 2010-11), and table form to match (an average of 13th over their first 12 seasons followed by an average of 4th the last two years).  Even Manchester United and Chelsea were able to improve their average finish positions given United’s 19th championship and Chelsea finishing second yet again.

The impact of these changes can be seen in the regression analysis below.  As in previous posts, the graph contains the nominal regression line as well as sets of lines denoting the bounds of the 50% and 95% prediction intervals (PI).  Falling within the black lines (50% PI) indicates expected performance.  Falling below the lower black line indicates over performance versus the model, while falling above the upper black line indicates under performance. Below the green line (lower 95% PI bound) is gross over performance, while above the red line (upper 95% PI bound) is gross under performance.
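
The banding described above can be written out as a small decision rule. This is a sketch under stated assumptions, not the code behind the graph: `classify` is a made-up function, the bounds passed to it are hypothetical numbers rather than output from the actual regression, and how a club sitting exactly on a line is treated is my own choice.

```python
def classify(actual_position, pi50, pi95):
    """Label an average finish against 50% and 95% prediction-interval bounds.

    pi50 and pi95 are (lower, upper) pairs of table positions at the club's
    MSq£. Lower numbers are better finishes, so falling below a lower bound
    means the club did better than the model expected.
    """
    lower50, upper50 = pi50
    lower95, upper95 = pi95
    if actual_position < lower95:
        return "gross over performance"
    if actual_position < lower50:
        return "over performance"
    if actual_position <= upper50:
        return "expected performance"
    if actual_position <= upper95:
        return "under performance"
    return "gross under performance"


# Hypothetical bounds for a single MSq£ value (not taken from the model).
pi50 = (9.0, 13.0)
pi95 = (5.0, 17.0)
for position in (4.0, 7.5, 11.0, 15.0, 19.0):
    print(position, classify(position, pi50, pi95))
```

In the real model the bounds vary with MSq£, so each club would be compared against the interval evaluated at its own average MSq£.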

There are a number of conclusions that can be drawn when comparing the updated model to last year’s model:

  • The additional year of data has strengthened the relationship between average MSq£ and average table position as demonstrated by the R² value of the regression.  Through 2010 it was 0.689, while through 2011 it was 0.710. A full 71% of average table position variation can be explained by a club’s average MSq£.
  • The multiplier effect of additional MSq£ on table position (the slope term in the regression equation) has not changed from 2010.
  • Two new clubs have joined the previous list of seven over performers: Stoke City and Fulham. Fulham’s massive improvement by nearly one half of a table position moved them from the category of expected performance to one of over performance.  Stoke City had been outperforming the model in 2008-09 and 2009-10, but were ineligible for consideration until they completed their third Premier League season in 2010-11.
  • Even with their improvement in table position, Manchester City was not able to avoid moving from the expected performance category to the under performance category.  Much like Chelsea, it will take some time of consistent top-table performance to counteract the rapid rise in City’s MSq£ values and previously poor table positions.
  • Oldham and Swindon remain the only two clubs labeled as gross under performers.
  • Of special note is QPR, who are rejoining the Premier League in 2011-12 after a 15-year absence. To date, QPR is the only club to achieve the gross over performer label.  In their first four seasons in the Premier League they averaged a 10th place finish on a shoestring MSq£ of 0.42.  We’ll see if they can keep similar form upon their return in 2011-12; given the way the Premier League has changed in the past 15 years, it’s hugely unlikely.


Overall, the MSq£ model provides a useful guide for clubs and their supporters for setting realistic expectations for their club’s performance given the reality of the squad cost they bring to a season.  The 2010-11 data suggest that with an average Sq£ value of £141M, clubs will have to utilize squads that cost £211.5M or more (an MSq£ of 1.5 or higher) to have a reasonable chance of playing in UEFA competitions (a 25% or better chance of finishing 5th or higher in the table).
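
The £211.5M figure follows directly from the league average. The sketch below assumes that MSq£ is simply a squad’s Sq£ divided by the league-average Sq£ for that season, which is consistent with the numbers quoted in this post (£141M average, £211.5M at an MSq£ of 1.5); the function names are made up for the illustration.

```python
LEAGUE_AVERAGE_SQ = 141.0  # average Sq£ in £M for 2010-11, as quoted in the post


def msq(squad_cost_m, league_average_m=LEAGUE_AVERAGE_SQ):
    """Squad cost as a multiple of the league average (assumed MSq£ definition)."""
    return squad_cost_m / league_average_m


def squad_cost_for(msq_target, league_average_m=LEAGUE_AVERAGE_SQ):
    """Squad cost in £M needed to reach a given multiple of the league average."""
    return msq_target * league_average_m


print(squad_cost_for(1.5))   # squad cost at an MSq£ of 1.5
print(round(msq(153.0), 2))  # Arsenal's quoted £153M as a multiple of the average
```

Under that assumption, Arsenal’s £153M squad works out to a multiple a little above 1.0, in line with the “around 1.0” description earlier in the post.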

All other clubs will simply be fighting it out for the remaining twelve positions that guarantee another year of Premier League competition, hoping to demonstrate enough improvement along the way to attract additional financial resources and the talent that comes with them.

This model will also allow the multiple contributors to the Transfer Price Index blog to assess the positions each club finds themselves in after the summer transfer window closes at the end of the month.  We can then come back with a few statistically-guided predictions for final table positions that will also factor in some good old fashioned intuition to gauge the impact of the 29% of table position variation not captured within the model.

Zach Slaton is the author of A Beautiful Numbers Game blog, and a contributor at the Transfer Price Index and The Tomkins Times blogs. You can follow him on Twitter and Facebook.

