You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Mar 5, 2024. It is now read-only.
In our kiara centrality measures modules (betweenness centrality, eigenvector centrality) we usually provide the option to account for weighted graphs.
So far, these modules are based on the algorithms in the NetworkX python package, but the problem is that weights are interpreted differently in different NetworkX's centrality algorithms. For some centralities weights are interpreted as the connection strength (eigenvector_centrality documetation) but for others they are interpreted as distancebetweenness_centrality documentation).
This difference is significant for interpreting the results of the centrality algorithm and should not be left to chance, but should become a conscious choice for the user. When weights are interpreted as connection strength, then the algorithm will prioritize edges with high weight values ranking them higher in the centrality calculation. When, on the other hand, the weights are interpreted as distance (or cost), then the algorithm will avoid edges with high weight values.
Example
Take a look at this weighted undirected graph:
The Anna node will always have high betweenness score in this constallation because it connects several clusters. If weights are disregarded, then the nodes Chiara and David will have exactly the same betweenness score. The exact same amount of shortest paths passes through these nodes. When interpreting this as an information network, it could be assumed that their capacity to pass on information to the Boris node is equal.
unweighted betweenness score
Node
0.772
Anna
0.145
Emil
0.145
Frida
0.145
Igor
0.145
Jane
0.081
Chiara
0.081
David
0.018
Gandalf
0.018
Hilda
0.018
Klaus
0.018
Lisa
0.009
Boris
But if we also consider the weights of connections and use the default NetworkX betweenness centrality algorithm with this data, then nodes with low weights (Igor, David, Klaus) would score very high while nodes with high weight will rank lowest:
weighted betweenness score
Node
0.772
Anna
0.290
Igor
0.163
David
0.163
Klaus
0.145
Emil
0.145
Frida
0.018
Gandalf
0.018
Hilda
0.018
Lisa
0.009
Boris
0
Chiara
0
Jane
If this was a network that ( for example) represents the traveling distance between actors as weight, then this ranking would be meaningful (It takes longer to travel from Anna to Boris when taking the route through the Chiara node). However, if this was an information network, then the weights would more likely represent the strength of connections (number of letters, interactions, communication in general) and then this ranking would be misleading. In information/communication networks we generally assume that those connections that are the strongest would be those that would be most likely used. We would expect the Chiara node to have a higher betweenness score than David, because that communication route should be more reliable/efficient etc..
Proposed solution
In order to achive the expected results one could invert all the weights in the edge list (by calculating 1/weight), so that the highest values would become the lowest and then use the NetworkX algorithm with these new inverted weights.
Therefore, for all the NetworkX centralities that are based on shortest path calculation (and therefore interpret weights as distance) I would propose to raise the question of the network type and offer a centrality calculation with inverted weights as an alternative.
betweenness score with inverted weights
Node
0.790
Anna
0.30
Jane
0.163
Chiara
0.163
Lisa
0.145
Emil
0.145
Frida
0.018
Gandalf
0.018
Hilda
0.009
Boris
0
David
0
Igor
0
Klaus
References: Opsahl, T., Agneessens, F., Skvoretz, J., 2010. Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks 32 (3), 245-251.
Note on Gephi: In the current Gephi version (0.10.1), there is currently no way to calculate weighted betweenness centrality. It will always calculate unweighted betweenness. There is an option to consider edge directionality and to normalise betweenness values, but no weighted edges option. The Gephi algorithm is supposed to be based on: Ulrik Brandes, A Faster Algorithm for Betweenness Centrality, in Journal of Mathematical Sociology 25(2):163-177, (2001). Even though Brandes discusses weighted betweenness in that article, it is not currently implemented in Gephi.
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
-
In our kiara centrality measures modules (betweenness centrality, eigenvector centrality) we usually provide the option to account for weighted graphs.
So far, these modules are based on the algorithms in the NetworkX python package, but the problem is that weights are interpreted differently in different NetworkX's centrality algorithms. For some centralities weights are interpreted as the connection strength (eigenvector_centrality documetation) but for others they are interpreted as distance betweenness_centrality documentation).
This difference is significant for interpreting the results of the centrality algorithm and should not be left to chance, but should become a conscious choice for the user. When weights are interpreted as connection strength, then the algorithm will prioritize edges with high weight values ranking them higher in the centrality calculation. When, on the other hand, the weights are interpreted as distance (or cost), then the algorithm will avoid edges with high weight values.
Example
Take a look at this weighted undirected graph:
The Anna node will always have high betweenness score in this constallation because it connects several clusters. If weights are disregarded, then the nodes Chiara and David will have exactly the same betweenness score. The exact same amount of shortest paths passes through these nodes. When interpreting this as an information network, it could be assumed that their capacity to pass on information to the Boris node is equal.
But if we also consider the weights of connections and use the default NetworkX betweenness centrality algorithm with this data, then nodes with low weights (Igor, David, Klaus) would score very high while nodes with high weight will rank lowest:
If this was a network that ( for example) represents the traveling distance between actors as weight, then this ranking would be meaningful (It takes longer to travel from Anna to Boris when taking the route through the Chiara node). However, if this was an information network, then the weights would more likely represent the strength of connections (number of letters, interactions, communication in general) and then this ranking would be misleading. In information/communication networks we generally assume that those connections that are the strongest would be those that would be most likely used. We would expect the Chiara node to have a higher betweenness score than David, because that communication route should be more reliable/efficient etc..
Proposed solution
In order to achive the expected results one could invert all the weights in the edge list (by calculating 1/weight), so that the highest values would become the lowest and then use the NetworkX algorithm with these new inverted weights.
Therefore, for all the NetworkX centralities that are based on shortest path calculation (and therefore interpret weights as distance) I would propose to raise the question of the network type and offer a centrality calculation with inverted weights as an alternative.
References: Opsahl, T., Agneessens, F., Skvoretz, J., 2010. Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks 32 (3), 245-251.
Note on Gephi: In the current Gephi version (0.10.1), there is currently no way to calculate weighted betweenness centrality. It will always calculate unweighted betweenness. There is an option to consider edge directionality and to normalise betweenness values, but no weighted edges option. The Gephi algorithm is supposed to be based on: Ulrik Brandes, A Faster Algorithm for Betweenness Centrality, in Journal of Mathematical Sociology 25(2):163-177, (2001). Even though Brandes discusses weighted betweenness in that article, it is not currently implemented in Gephi.
Beta Was this translation helpful? Give feedback.
All reactions