This analysis was done BEFORE I began to learn any systematic techniques in statistics or data mining. Many aspects of this article are immature and need much improvement. But it was the starting point of my passion for data analytics, especially football analytics, so I’ll start my blog with it.
The motive of this analysis is to settle the debates over some issues of Barça’s last season using data analysis. Data analysis can compensate for the vague impressions, short memory and biased opinions that we usually bring to a qualitative analysis. I believe the team has a much more advanced and comprehensive system of data analytics; compared with that, this analysis is very simple and crude. But even with only basic tools and the limited data available online, we can at least get a general idea.
Here I analyze the playing minutes of Barça’s first team players over the last five seasons (08-09 to 12-13), in order to understand the issues of rotation, age structure and the situation of the homegrown players.
All data are taken from the database of Spanish football http://www.bdfutbol.com
Before going into the analysis, let’s define a concept for our convenience: the first team player. We follow a practical definition rather than the official status: those and only those who play 90 minutes or more in total across all the official games of the first team in a season are defined as first team players.
1. Rotation
Here we use the standard deviation (SD) of the playing minutes of all the first team players to measure the degree of rotation. The standard deviation shows how much a set of values differs from its average. With a greater standard deviation the minutes are more scattered and more unequal; with a smaller standard deviation they are more concentrated or, as we usually say, more “equal”. So more rotation corresponds to a smaller standard deviation and vice versa.
Meanwhile, I also calculate the standard deviation for the major players. “Major players” are defined as those who play more than the team’s average minutes. There are about 14 (plus or minus 1) major players in each season, and they are usually the regular starters. The “little rotation” among the major players is also an important aspect of the team.
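A minimal sketch of how these two measures could be computed in Python. The use of the population standard deviation (`pstdev`) is an assumption, since the text above does not fix which formula is used, and the two squads are made-up numbers for illustration, not bdfutbol data.

```python
from statistics import mean, pstdev

FIRST_TEAM_THRESHOLD = 90  # minutes per season, inclusive, as defined above


def rotation_measures(minutes_by_player):
    """Return (whole-team SD, major-player SD) of season playing minutes."""
    # Keep only first team players: 90 minutes or more in the season.
    first_team = [m for m in minutes_by_player.values() if m >= FIRST_TEAM_THRESHOLD]
    team_average = mean(first_team)
    # Major players: those above the first-team average.
    major = [m for m in first_team if m > team_average]
    return pstdev(first_team), pstdev(major)


# Made-up minutes for illustration only (hypothetical players A..F).
even_squad = {"A": 2000, "B": 2200, "C": 1800, "D": 2100, "E": 1900, "F": 60}
uneven_squad = {"A": 4000, "B": 3800, "C": 500, "D": 3900, "E": 300, "F": 60}

even_sd, even_major_sd = rotation_measures(even_squad)
uneven_sd, uneven_major_sd = rotation_measures(uneven_squad)
print(round(even_sd), round(uneven_sd))
```

The evenly used squad gives the smaller whole-team value, which is the sense in which a smaller standard deviation is read as more rotation. Player F is dropped in both squads because 60 minutes is below the 90-minute threshold.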
We can see that for both the whole team and the major players, the standard deviations of season 12-13 are smaller than in all the other 4 seasons. Season 10-11 has the largest whole-team standard deviation and season 11-12 has the largest major-player standard deviation. In particular, Messi played 5221 minutes in 11-12, which makes him the only player in the 5 years to play more than 5000 minutes in a season. Season 11-12 also had the most games (3 more than average), which, along with the lack of rotation in the previous season, may explain some of the difficulties of that season, such as the many injuries and the poor physical condition.
It’s worth mentioning that in season 12-13 no player played more than 4500 minutes; even Messi, who played the most, had only 4095 minutes. This is further evidence that this season had the most rotation.
There is an interesting feature in season 12-13. While all the other seasons have only one peak in the distribution (leaving aside the 0-500 minutes range), season 12-13 has two: 1500-2500 and 3500-4500. The fact that many players fall into the 1500-2500 range indicates that the substitutes played more and contributed more to the games.
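A distribution like the one described above can be sketched by counting players in 500-minute bins. The bin width of 500 is an assumption taken from the ranges quoted in the text, and the sample minutes below are made up apart from the 4095 figure mentioned earlier.

```python
from collections import Counter

BIN_WIDTH = 500  # minutes per bin (assumed from the ranges quoted in the text)


def minutes_histogram(minutes, bin_width=BIN_WIDTH):
    """Count players per bin; key k covers the range [k, k + bin_width)."""
    return Counter((m // bin_width) * bin_width for m in minutes)


# Illustrative minutes only; 4095 is the one figure taken from the text.
sample_minutes = [100, 450, 1600, 1700, 2400, 3600, 4095]
histogram = minutes_histogram(sample_minutes)

for start in sorted(histogram):
    print(f"{start}-{start + BIN_WIDTH}: {histogram[start]}")
```

Peaks are then read off by eye from the printed counts, as in the text, rather than detected automatically.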
There is one subtle issue in considering rotation: injury. Usually, by rotation we mean the choice of players the coach makes when he is free to choose. Let’s call this “active rotation”. But there is another case: when a player is injured there is no choice, so the resulting change in playing time is less relevant. Let’s call this “passive rotation”. When considering the strategy of the team, we usually care more about active rotation.
Ideally, we would take all the injury information into account and calculate the active rotation only. But I don’t have that much information. So let’s deal with one typical player: Messi. The following table shows the total minutes played by Messi in the 5 seasons.
You may suspect that the reduced playing time in season 12-13 was mainly caused by his injury rather than by active rotation (as in 08-09). So let’s take a look. Messi was injured twice in this season. The injury in December 2012 did not have much effect, but the one he suffered in the first leg of the UCL tie against PSG disrupted all his remaining games of the season. So let’s trim away the games after that match and calculate the average playing minutes per game in the healthy period only. That can be taken as a measure of pure active rotation. In principle, the average in the healthy period is higher than the overall average, because the player plays much less during the injury period.
I do this trimming only for season 12-13 and compare its “healthy” average with the overall average of the other 4 seasons. Even in this extreme comparison, the average minutes of season 12-13 are still relatively low, especially compared with 11-12.
In all, from different perspectives we reach the same conclusion: season 12-13 had the most rotation of the past 5 years. Messi’s injury was NOT caused by overplaying, nor do the playing minutes indicate a greater Messi-dependence than in the other seasons. This may seem counterintuitive to some people, because what they remember is only those 2 months when Tito was in New York, in which Barça did almost no rotation at all. But the data show us the whole picture, which is very different.
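The trimming can be sketched as follows. This assumes the average is taken over all of the team’s games in chronological order, with 0 for games the player missed, which is one reading of “average playing minutes per game”. The per-game minutes are made up and the function name is hypothetical.

```python
def average_minutes_per_game(minutes_per_game, healthy_games=None):
    """Average minutes per team game, optionally over the first healthy_games games only."""
    games = minutes_per_game if healthy_games is None else minutes_per_game[:healthy_games]
    if not games:
        raise ValueError("no games to average")
    return sum(games) / len(games)


# Made-up season: the player is injured after the 5th game (hypothetical data).
season = [90, 90, 70, 90, 45, 0, 0, 20]

overall_average = average_minutes_per_game(season)
healthy_average = average_minutes_per_game(season, healthy_games=5)
print(overall_average, healthy_average)
```

Because the trimmed games contain few or no minutes, the healthy average comes out higher than the overall one, so comparing the trimmed 12-13 figure with untrimmed averages of other seasons is biased against the “more rotation” conclusion.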
2. Age Structure
I analyzed both the distribution of the number of players by age and the distribution of playing minutes by age range. Besides the static age structure of the team, this shows how much the team actually depends on players of different ages.
We see that the average age does not change much from year to year and the team keeps a good balance in its age structure. The large difference in average age between the whole team and the major players in season 10-11 indicates a relatively fixed starting XI made up of more experienced players, which is also related to the low degree of rotation.
The following is the distribution of playing minutes by age. Starting from the age of 20, each age range spans 3 years. I chose this width because it gives the smoothest distribution.
We can see that the playing minutes of season 12-13 are concentrated in the age range 23-25. Compared with the average distribution, players under 22 have far fewer playing minutes, and players under 20 have none. In detail, only one player under 20 played in the first team, Deulofeu (18), but he played only 68 minutes, so he is not included. The first 3 seasons have so many minutes for under-20 players because of Bojan, who played more than 1000 minutes in each of them. We can compare the playing minutes of all the players under 22 in the 5 seasons:
We see that the total decreases almost steadily over the years, and season 12-13 has the minimum. It indicates that young players did not get as many chances as in earlier seasons.
In season 12-13, the players older than 25 also have less total playing time than in the other seasons. This is caused by more injuries among the older players. It’s a signal that although the age structure of the team has not changed much compared with former seasons, more and more of the older players may not be able to sustain the continuous intensity of the games. It may also be due to other factors that can lead to wear-and-tear injuries, such as changes in training, diet, etc. Since the team structure has not changed much for the new season 13-14, this situation is expected to continue or even worsen, which means that in the long-term planning of the season, the group of players older than 26 should not be relied upon too heavily because of their physical condition.
Season 12-13 shows that the team depends strongly on players aged 23-25 (now 24-26), which is a good thing in the long run. If everything goes normally, their physical condition is expected to hold up over the next 3-4 years.
Let’s look at season 10-11 again. The peak of playing minutes in this season falls in the age range 29-31. If we consider overall performance, few people would doubt that this season was the peak of the past 5 years. Recalling the conclusion we reached earlier for this season, the whole picture is now clear: it had the least rotation and the most experienced players available to build a stable and strong starting XI. However, the price was paid in the next two seasons in the form of more injuries in general and wear and tear on the older players.
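The binning described above could look like this in Python. It is only a sketch: the function names are hypothetical, the (age, minutes) pairs are made up apart from ages in the ranges the text names, and the “under 20” label for younger players is my assumption about how they are grouped.

```python
from collections import defaultdict


def age_range(age, first_age=20, width=3):
    """Map an age to its range label: 'under 20', '20-22', '23-25', ..."""
    if age < first_age:
        return f"under {first_age}"
    start = first_age + width * ((age - first_age) // width)
    return f"{start}-{start + width - 1}"


def minutes_by_age_range(players):
    """Sum playing minutes per age range for an iterable of (age, minutes) pairs."""
    totals = defaultdict(int)
    for age, minutes in players:
        totals[age_range(age)] += minutes
    return dict(totals)


# Made-up (age, minutes) pairs for illustration only.
squad = [(19, 300), (21, 1200), (24, 3500), (25, 3100), (30, 2800)]
print(minutes_by_age_range(squad))
```

Changing `width` makes it easy to try other range sizes and see which one gives the smoothest distribution.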
3. The Situation of the Homegrown Players
Now let’s enter a topic that concerns many people: the use of the players from La Masia. Some people argue that the season 12-13 almost “abandoned” the homegrown players from La Masia. So let’s find out the truth by comparing the data of this season to the other 4 seasons.
I analyzed all the homegrown players that entered and left the first team from season 08-09 onwards. The homegrowns that entered the first team include those who were promoted from Barça B (and played 90 minutes or more in a season for the first team) and those who were bought back from other teams. The homegrowns that left the first team include those who were sold or loaned out and those who went back to Barça B after playing 90 minutes or more in the first team. There is one special case, Bojan, who entered the first team in 2007; because he is a typical young homegrown player of these years, I count him as promoted in 2008. The following is the number of players who entered and left in each season.
Note: In the summer of 2008 there were also homegrown players who left the first team but they can be considered irrelevant to our analysis.
In the past 5 years, 21 homegrown players entered the first team and 8 left. That is an average of 4.2 entering and 1.6 leaving per season. In season 12-13, 2 new homegrowns entered, which is fewer than average, and 2 left, which is more than average.
“Entering” only means that the playing time over the whole season is 90 minutes or more. For further analysis, we need to know how much these players actually played in the first team. If we only count the total playing minutes of homegrowns, we have the following result:
Of course, because some homegrowns that entered earlier gradually became major players during the years, it’s expected that the total playing time increased each year.
We care more about the promotion of young homegrowns, so let’s have a look at the number and playing minutes of the homegrowns under 22. We can see that both the number and the playing minutes in season 12-13 are less than average.
A more detailed question would be, among these young homegrowns who entered the first team, how many can stay, and how many can become major players? After all, only those who play as major players or at least as frequent substitutes can be counted as “successfully promoted”. So I classify the homegrowns into 3 groups according to their playing minutes:
“Unstable”: playing minutes < 1000
“Half Stable”: playing minutes between 1000 and 2000
“Stable”: playing minutes > 2000
2000 minutes is approximately the average playing time of the whole team and a typical divider of the major and non-major players. The following table shows the number of homegrowns in each category. “Alive” is the total number.
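The three categories can be written as a small classifier. This is a sketch under one assumption: the text does not say where exactly 2000 minutes falls, so here the “Half Stable” range is taken as 1000 to 2000 inclusive. The list of minutes is made up.

```python
from collections import Counter


def classify_homegrown(minutes):
    """Classify a homegrown player by season playing minutes."""
    if minutes < 1000:
        return "Unstable"
    if minutes <= 2000:  # assumption: exactly 2000 counts as Half Stable
        return "Half Stable"
    return "Stable"


def category_counts(minutes_list):
    """Count players per category and add the 'Alive' total."""
    counts = Counter(classify_homegrown(m) for m in minutes_list)
    counts["Alive"] = len(minutes_list)
    return dict(counts)


# Made-up season minutes for five hypothetical homegrown players.
print(category_counts([150, 950, 1400, 2600, 4100]))
```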
We see that season 12-13 has the highest numbers in the Alive, Stable and Half Stable categories, and a lower than average number in the Unstable category.
You may argue that the number of homegrowns in each season includes those who were already promoted or established in previous seasons, which makes the increase in numbers expected and therefore less relevant. To account for this, we can define for each category the ratio of its number to the accumulated number of homegrowns that have entered the first team from 2008 up to the corresponding season. For example, the accumulated number of homegrowns that have entered the first team (counting only entries, not departures) is 21 up to season 12-13, so the “Stable Rate” of season 12-13 is 5/21=0.24. Of course, these rates are biased in the other direction: the accumulated number always increases with time while the number of players who can play in the first team is limited, so the rates should be expected to decrease in the long run. However, over 5 years this does not have a big effect, and we can at least see the picture qualitatively. The following table and graph show these rates for each season. The last row in the table is the sum of the Stable Rate and the Half Stable Rate.
All the indices in season 12-13 are below the average. However, we notice that the first two seasons contribute most to the average. This is expected, because the accumulated number of homegrowns is much smaller in the first two seasons than in the following ones. If we only compare the last three seasons, we see that the Stable Rate does not change much, and season 12-13 has the highest Half Stable plus Stable Rate of the recent 3 years. More noticeably, the Unstable Rate of this season is the lowest of all these years. It indicates that although the homegrowns had fewer chances to play in total, the “quality” was quite high: there were relatively more half stable and stable homegrowns making a significant contribution to the team.
There is an interesting feature of season 12-13 in the graph: the difference among the three rates is very small. As indicated before, the numbers of unstable, half stable and stable players are 4, 3 and 5, respectively. It looks like there are smooth “steps” along the slope of playing minutes for the homegrowns. The half stable players are important in the sense that they fill the gap between the experienced and the immature players, and they are expected to grow into major players in a short period. It’s important for the team to have enough half stable players in reserve in order to have a structure that can evolve stably in the long run, unless the main strategy of the team is buying players instead of using homegrowns. Season 12-13 had a good structure with the most half stable players: 3. Let me give their names: Thiago, Tello, Montoya. (What a loss we had with Thiago…)
In all, the general situation of homegrown players in season 12-13 was not as good as in the previous seasons, but it was still within an acceptable range of fluctuation, and it had quite a healthy structure of playing minutes. After all, this is a very complicated issue. Unlike other technical issues, it’s not an independent index for each season; instead, consecutive years are strongly correlated. It’s impossible to make an objective assessment with only one season of data, or even with all 5 together. Only with a long span of time before and after this period can we get a clearer picture of this issue.
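As a worked example of these rates, here is a small sketch using the season 12-13 figures quoted in this article: 4 unstable, 3 half stable and 5 stable homegrowns, with 21 accumulated entries. The function name is hypothetical.

```python
def category_rates(counts, accumulated_entries):
    """Divide each category count by the accumulated number of homegrowns that entered."""
    return {name: count / accumulated_entries for name, count in counts.items()}


# Season 12-13 figures as quoted in the text.
counts_12_13 = {"Unstable": 4, "Half Stable": 3, "Stable": 5}
rates_12_13 = category_rates(counts_12_13, accumulated_entries=21)

# Last row of the table: Half Stable Rate plus Stable Rate.
half_plus_stable = rates_12_13["Half Stable"] + rates_12_13["Stable"]

for name, rate in rates_12_13.items():
    print(f"{name} Rate: {rate:.2f}")
print(f"Half Stable + Stable Rate: {half_plus_stable:.2f}")
```

The three rates for this season come out at roughly 0.19, 0.14 and 0.24, with a combined Half Stable plus Stable Rate of about 0.38.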
We draw these conclusions about season 12-13 from this analysis:
- Among the past 5 years, this season clearly had the most rotation. The playing minutes were the most equally distributed.
- This season had an age structure featuring young players in the age range of 23-25, thus the general state of the team is expected to be stable during the next 3-5 years.
- Young homegrown players were in general used less in this season than in the previous seasons, yet still within an acceptable range of fluctuation. With the most half stable players, it might have the healthiest structure of playing minutes for the long-term evolution of the team.
In all, this analysis gives us a preliminary look into the issues at Barça related to playing minutes. I feel that this is just the tip of the iceberg of what can be drawn from the data. With more data and more detailed analysis, we may find more interesting patterns that answer our questions or help us spot problems that are otherwise hard to notice. Starting from playing minutes, we can also relate them to other data, such as injuries and the physical condition of the players. This can be important supplementary information for the coach when making long-term decisions about the use of players.