In social networks, communities are groups of people who share common features. Given a set
of nodes, the most common way of identifying communities consists in
classifying them into classes which may be predefined or not. This is what
traditional classification and clustering approaches do respectively.
In the context of real-world graphs, community detection generally aims
at finding a partition of nodes, i.e. disjoint communities where each node
belongs to exactly one community. However it is unrealistic to apprehend that a user belongs to only a single community. Normally a person is part of several overlapping groups. Overlapping communities therefore should be allowed. However computing all overlapping groups leads to numerous problems including time and space complexity of the algorithm and difficulty in the
interpretation of obtained results.
An interesting compromise consists in focusing on the groups related to
one specific node, referred to as ego-centred communities. Despite promising initial results, ego-centred community
detection is still a difficult problem because a single node can still belong
to numerous groups. Therefore, communities identified by a set of nodes i.e multi-ego-centred communities have gained primary focus. It has been established that for real-world graphs, a small set of nodes is generally sufficient to define a unique community, which is not the case
with a single node usually.
A recently developed proximity measure for ego-centred communities is the carryover opinion metric. For a given node of interest, the framework consists in first setting the
opinion of this node to one and the opinion of all other nodes to zero. Then,
at each time step, the opinion of every node is averaged with the opinion of
one of its neighbours. The opinion of the node of interest is then reset to
one. Thus, its opinion does not change throughout the process and remains
equal to one. The speed of convergence is used to characterize the to what extent nodes are similar to the node of interest.
Two conjectures are needed to
carry on:
Conjecture 1: after a sufficient number of iterations, the ranking of
nodes according to their opinion no longer changes.
Conjecture 2: after a sufficient number of iterations, the difference
between the opinion of two nodes decreases proportionally to the difference
between the opinions of any other two nodes.1
Clearly this measure takes into account the whole depth of the graph, is parameter-free and
is fast to compute.
The question is: how may the communities shared by a set of nodes be
unfolded? The idea is that a node belonging to both a community of node1 AND a
community of node2 has to be somewhat similar to both node1 AND to node2. The following steps are taken:
1) For all nodes, evaluate the proximity to node1 and to node2.
2) The proximity to the set {node1, node2} is then given by the
minimum, or by the geometric mean of the similarities to node1
and the similarities to node2. This quantity measures to what
extent node is close to node1 AND node2.
The method is easily generalizable to a set of more than two nodes.
There are other methods for finding
ego-centred communities, all of them based on the optimisation of a quality
function. Quality function techniques, due to the non-convexity of the
optimisation problem, often lead to small communities, while this approach
does not suffer from this drawback.
References:
No comments:
Post a Comment