eva

Reputation: 3

Why do nodes (vertices) in peripheral positions have higher betweenness centrality scores when plotted in an igraph network visualization?

I calculated betweenness centrality for a matrix using the 'igraph' package and obtained the scores. After plotting the network, I found that nodes (vertices) in peripheral positions of the network have higher betweenness centrality scores than the more centrally positioned nodes. Betweenness centrality is defined by "the number of geodesics (shortest paths) going through a vertex or an edge". In that case, should more central nodes have higher betweenness centrality? The scores I am getting, with the highest values located at the periphery of the network, do not fit the definition or the other betweenness centrality plots I have seen. Do you know what's happening here? The original matrix used to create the network is shared on GitHub (https://github.com/evaliu0077/network.matrix.git). My code for plotting the network and the network visualization plot are both attached.

library(igraph)

matrix <- read.csv("matrix.csv")
matrix <- as.matrix(matrix)
network <- graph_from_adjacency_matrix(matrix, weighted = TRUE, mode = "undirected", diag = FALSE)
# drop edges with weight <= 0.1 (this also removes the negative correlations) to plot it later
network <- delete.edges(network, which(E(network)$weight <= 0.1))

set.seed(10)
l <- layout.fruchterman.reingold(network)
plot.igraph(network, layout = l,
            vertex.size = betweenness(network),
            edge.width = E(network)$weight * 2,  # rescaled by 2
            edge.color = ifelse(E(network)$weight > 0.25, "blue", "red"),
            main = "Betweenness centrality for the sample")

Thank you!

Upvotes: 0

Views: 199

Answers (2)

Szabolcs

Reputation: 25703

Pay attention to the meaning of edge weights before you use them.

In the context of betweenness centrality, edge "weights" are interpreted as "lengths", and are used for determining shortest paths. The length of a path is the sum of the weights/lengths of edges along the path. Higher "length" values indicate a weaker link, not a stronger one.
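A minimal sketch of this interpretation, using a made-up three-vertex graph. The vertex names and weight values are illustrative assumptions, not taken from the question's data.

```r
library(igraph)

# Hypothetical toy graph: A-B and B-C are "short" edges, A-C is a "long" one.
edges <- data.frame(from = c("A", "B", "A"),
                    to = c("B", "C", "C"),
                    weight = c(1, 1, 10))
g <- graph_from_data_frame(edges, directed = FALSE)

# Path length is the sum of the weights, so A-B-C (1 + 1) beats the direct A-C edge (10).
d_AC <- distances(g, v = "A", to = "C")[1, 1]

# betweenness() picks up the 'weight' edge attribute by default and treats it as a length.
bw_weighted <- betweenness(g)

# weights = NA makes igraph ignore the attribute and count hops instead.
bw_unweighted <- betweenness(g, weights = NA)
```

With the weights, the shortest A-to-C route runs through B, so B gets betweenness 1. Without them, every pair is directly connected and all scores are 0.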

Are your weight values suitable for this use? Does it make sense to add them up along a path? If they are correlations, then I would say no. You could transform them so that weaker links have higher lengths, for example by inverting the values. You will sometimes see this in the literature, but it is a rather dubious practice. It still does not make much sense to add up inverse correlation values.
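To see what the inversion changes, here is a small sketch with invented correlation-like weights (the values 0.9 and 0.2 are assumptions chosen for illustration). It only shows the mechanics; it does not make summing inverse correlations any more meaningful.

```r
library(igraph)

# Hypothetical correlation-like weights: A-B and B-C are strong, A-C is weak.
edges <- data.frame(from = c("A", "B", "A"),
                    to = c("B", "C", "C"),
                    weight = c(0.9, 0.9, 0.2))
g <- graph_from_data_frame(edges, directed = FALSE)

# Raw correlations used as lengths: the weak A-C edge looks like the shortest route.
bw_raw <- betweenness(g, weights = E(g)$weight)

# Inverted values used as lengths: strong edges become short edges.
bw_inverted <- betweenness(g, weights = 1 / E(g)$weight)
```

With the raw values no vertex lies between any other pair. After inversion, the route A-B-C (about 2.2) is shorter than the direct A-C edge (5), so B becomes the only vertex with nonzero betweenness.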

Similarly, check whether the layout function you are calling makes use of weights, and if so, in what way. Note that your graph is almost complete. Therefore, with layout methods that do not use weights, the vertex positions are completely meaningless. Generally, be careful about reading too much into any kind of network visualization unless there are very obvious effects (such as an indisputable community structure). Here you use igraph's Fruchterman-Reingold layout algorithm, which happens to draw vertices connected by a high-weight edge closer to each other, not further apart. Thus, it interprets weights in exactly the opposite way to the betweenness calculation: a high weight indicates a "strong" connection. Some other layout algorithms, such as Kamada-Kawai, interpret high weights (lengths) as weak (long) connections. Yet other layout algorithms ignore weights completely. It's good to keep this in mind when trying to interpret a network visualization.
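As a rough sketch of how the weight interpretation can be made explicit for each layout, assuming a small complete graph with random positive weights (the graph and its weights are invented for illustration):

```r
library(igraph)

set.seed(10)
# Hypothetical complete graph with positive weights where larger means "stronger".
g <- make_full_graph(6)
E(g)$weight <- runif(ecount(g), min = 0.1, max = 1)

# Fruchterman-Reingold: larger weights pull the endpoints closer together.
l_fr <- layout_with_fr(g, weights = E(g)$weight)

# Kamada-Kawai: weights act as lengths, so invert them to keep "strong = close".
l_kk <- layout_with_kk(g, weights = 1 / E(g)$weight)

# Ignore weights entirely, so positions reflect only which edges exist.
l_plain <- layout_with_fr(g, weights = NA)
```

Passing `weights` explicitly, instead of relying on the `weight` edge attribute being picked up by default, makes it easier to see which convention each call is using.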

Upvotes: 1

Gnoom

Reputation: 193

"should more central nodes have higher betweenness centrality?"

I think the problem is that you're mixing two notions of centrality here. There's the well-defined 'betweenness centrality', and then there's 'nodes that end up in the center of the picture after doing a layout with Fruchterman-Reingold'. They are not the same.

For example, take a complete graph, then add one new node A and connect it only to node B (an arbitrary node of the complete graph). Then B will have a high betweenness, because every shortest path from A to the rest of the graph passes through B, but there's no reason to draw it in the middle of the picture. If I wanted to make a nice picture of this, I would draw A and B at the edge. Maybe Fruchterman-Reingold does that too: it will push A outward because A is not connected to most nodes.
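The construction described above can be checked in a few lines. This is a sketch with an assumed size of five for the fully connected part; the vertex labels are made up.

```r
library(igraph)

g <- make_full_graph(5)      # vertices 1..5, every pair connected
g <- add_vertices(g, 1)      # vertex 6 plays the role of A
g <- add_edges(g, c(1, 6))   # connect A only to vertex 1, which plays the role of B
V(g)$name <- c("B", "K2", "K3", "K4", "K5", "A")

# No weight attribute is set, so shortest paths are counted in hops.
b <- betweenness(g)
```

B is the only vertex with nonzero betweenness (4, one for each shortest path from A to the other four vertices), yet nothing about the structure requires a layout to place it in the middle of the drawing.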

Betweenness-based layout algorithms do exist: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-10-19, but I don't think igraph has one available.

Upvotes: 0
