Reputation: 824
I am trying to compute the number of shortest path between 2 nodes which in the distance 2 of each other in a sparse graph which contains 23000000 vertices and around 9 X 23000000 edges. Right now I am using
for v,d,parent in graph.bfsiter(source.index, advanced=True):
if (0 < d < 3):
to loop through the nodes which are within distance 2 of the source node (I need the nodes which are in distance 1 but I don't need to compute all shortest path for them). And then I use:
len (graph.get_all_shortest_paths(source,v));
to get the number of all shortest paths from source to v (where v is the node that bfsiter gives me which has the shortest distance 2 from the source).
However this is taking too long. For example for the graph described above it takes around 1 second to compute the shortest distance for each (source,v).
I was wondering if someone could suggest a more efficient way to compute the number of all shortest paths using igraph
Upvotes: 2
Views: 1180
Reputation: 3551
Here is an implementation of the answer suggested in the comments. The time consuming part of this code is the graph generation. To run on an already generated/in-memory graph takes very little time.
from igraph import *
import random
# Generate a graph
numnodes = int(1e6)
thegraph = Graph.GRG(numnodes, 0.003)
print("Graph Generated")
# Choose one node randomly and another a distance of 2 away
node1 = random.randint(0, numnodes-1)
node2 = random.sample(set(Graph.neighborhood(thegraph, node1, 2)).difference(
set(Graph.neighborhood(thegraph, node1, 1))),1)[0]
# Find the number of nodes in the intersection of the neighborhood
# of order 1.
result = len(set(Graph.neighbors(thegraph, node1)).intersection(
Graph.neighbors(thegraph, node2)))
print(result)
The intersection of the two neighborhoods is the number of unique paths. A path of length 2 visits 3 nodes. Since we know the start and end point, the only one which may vary is the middle. Since the middle node must be a distance 1 from both of the endpoints, the number of unique middle points is the number of paths of length 2 between the nodes.
Upvotes: 2