Complex
Network in Football Clubs and players
When I saw the last blog on cricket world cup's complex network I
thought that such kind of networks may be present in many sports one
way or other. As football is one of the most famous games in the
entire world, I focussed my search on this game and found some
interesting information to create a very versatile complex network.
This complex network is based on the football clubs of Brazil and the
main focus is on understanding if there exists some complex network
properties in them.
Now let us first define out network.
Now let us first define out network.
Network:
Bipartite
network. One side Club and other side Player.
Node:
Any
player who has ever played in any club.
Edge:
If
two players have played in the same club, we connect them with an
edge going through that club.
Weight
of edge: Number
of goals scored by a player for a club.
We can immediately see something like the citation netwok here.
Taking the data of 127 clubs and 13,411 players all taken from
Brazillian championship from 1971-2002, we can define many
statistical features for each club and player. Let G denote the
number of goals scored by a club and M denotes the goals conceded
then the following graph shows the variation of Nc and G/M follows
the Gaussian curve fitting.
Lets try to see the degree distribution of each kind of vertex in the
bipartite network. The player probability P(N) show expected decay
which happens exponentially. Similar to citation network, N
corresponds to the number of clubs a player has been involved in .
Club probability distribution was less obvious because of them being
in small number.

One thing to notice is that the probability of finding nomadic players is very less as compared to the one stable in one or two clubs.
Now here comes the most interesting part and the best result to
showcase in the football network.
We define P(M) as the probability that total of M matches are played
by any player (irrespective of the club). There is a kink or elbow in
the graph so obtained in the semi log plot and happens at Mc = 40.
This is where we can fit the cumulative distribution P(Mc) into two
different exponentials.
Pc(M) = 0.150 + 0.857 * 10^(-0.042M) for M <40 and 0.410*
10^(-0.010M) for M>40.
Both of these are shown in the following figure that the fit the given exponential distribution quite well. One of the most obvious conclusion from the graph is that once a player has found some fame it easier for him to keep playing. But one funny conclusion is that the same applies for player with notoriety. If you cross the threshold Mc, then there is high chance that you have achieved stability in the job as a player. This might be seemingly impractical to say, but we can extend this conclusion to the number of matches a player can survive when suffering from bad form.
Without
goals, a football game can never exist. So goals must form most
important dynamics of football. Hence the same was plotted with P(G)
as the probability of a player scoring G goals. The following graph
shows the plot. The cumulative goals scored P(Gc) is also plotted
below the P(G) one.
Again an interesting threshold of Gc=10 is found here which separates region in apparently two different regions following separate power laws. Such laws are found many times in scientific collaboration network.
Again the two equations that best fit the network were worked upon
and found to follow the following.
Pc(G) = -0.259 + 1.256*G^(-0.500) for G<10
Pc(G) = -0.004 + 4.454*G^(-1.440) for G>10
Also P(G)~G^(-1.5) for G<10
and P(G)~G^(-2.44) for G>10
Using the above graph, of the 11 players in a team, we can easily
find which player is supposed to be in what sort of position, like
goal keeper,defender or striker. From this distribution we found that
nearly two thirds of player forms the less-scoring positions.
Making further attempts to generalise more complex network related
features in this field we created a Soccer Player network with making
edges between two players present in a team at the same time. This
merging is same as that of scientist-paper bipartite network. Out of
the obtained graph of 13,411 vertices and 315,566 edges the degree
probability P(k) was calculated and average degree was found to be k
= 47.1. Also getting the Clustering Coeffiecient of the network gave
us
C = 0.79 making it a highly clustered network. The assortavity
coeffiecient was A = 0.12 which makes it assortative network although
less assortative than that of citation network having A=0.46.
The average shortest path was found to be D=3.29 from which we can
logically conclude that there are 3.29 degrees of separation between
the players.
An attempt to understand the time based evolution of the network was
also made. The results obtained were:
Study on temporal evolution shows the following
1) Increasing mean connectivity k: Player's professional life is
turning longer or transfer rate is going up.
2) Clustering Coefficient decreases: Movement of player to some
outside club not under study.
3) The network is becoming more assortative with time. This means
that segregation of players and transfer between the players in the
clubs of similar type.
Conclusion and future works
The study shows that there are many facets in normal life where
complex networks can be applied and lead to great predictions once we
understand the network. Note that this study was centered around
Brazillian clubs. If similar study is promoted for European clubs
involving many countries we might get to see good clustering as well
as better players to choose for transfer choice among the clubs.
Reference:
http://journals.aps.org/pre/pdf/10.1103/PhysRevE.70.037103
Submission By - Nishkarsh Shastri
Submission By - Nishkarsh Shastri
No comments:
Post a Comment