David Brown
David Brown

Reputation: 938

NetworkX: Djikstra's shortest path vs Brandes algorithm for betweeness centrality

I have a couple of Python scripts that calculate various network measures.

Given a graph (G), the first script calculates the average shortest path from each node to all other nodes and stores this in an Nx1 matrix (L). An implemntation of Djikstra's algorithm from the NetworkX Python library is used for this:

for i in range(num_nodes):
    for j in range(num_nodes):
        dj_path_matrix[i,j] = nx.dijkstra_path_length(G, i, j)

L = np.sum(dj_path_matrix, axis=0)/(num_nodes - 1)

Given the same graph (G), the second script uses an implementation of Brande's algorithm in the NetworkX library to calculate betweenness centrality and stores this in an Nx1 matrix (BC):

BC = nx.betweenness_centrality(G, normalized=True)

My question is: why does it take so much longer to calculate L compared to BC?

The way I understand it, BC of a node is a measure of how often a shortest path travels through that node. As such, to calculate BC, surely you would need to calculate all possible shortest paths in the graph. And surely then, BC should take at least as long as L? Using my scripts, given the same graph, it will take a couple of seconds to calculate BC, but up to half an hour to calculate L.

Upvotes: 0

Views: 351

Answers (1)

Joel
Joel

Reputation: 23827

If I asked you to find the shortest path from a given node u to every other node in the graph you probably wouldn't choose the technique you've got here. It's possible to calculate all of those path lengths more efficiently if we do them all at the same time. We find all paths of length 1. Then we find all paths of length 2 that don't revisit a previously found node. Then all paths of length 3, etc.

This will be much more efficient than calculating a path from u to one node, then starting over and finding the path to the next node. Betweenness centrality is calculated by finding all shortest paths between u and all nodes in G at once rather than each node in sequence.

Upvotes: 1

Related Questions