Jakob Troidl
Jakob Troidl

Reputation: 79

Fast computation of node distances to set of nodes

For each node (pink nodes) in an undirected/unweighted graph, I want to compute the closest distance to a given set of connected nodes (red nodes). It is essential that the computation is fast, also on large graphs (~4000 nodes) and for a large set of connected nodes. See the illustration below. What is the best algorithm/graph library for me to do this?

enter image description here

I tried to do something like that with NetworkX - shortest_path_length already, but it is too slow. Especially for a large set of connected red nodes.

import networkx as nx

all_distances = []
for node in red_nodes:
    distances = nx.shortest_path_length(graph, source=node) # compute distances to all nodes in the graph from the source node
    all_distances.append(distances)
shortest_distances = filter_for_shortest_distance(all_distances)

Here is an example on how to access a graph that I am working with. The red nodes could be any subset of connected nodes in the graph.

# Import navis
import navis

# Load one of the example neurons
sk = navis.example_neurons(n=1, kind='skeleton')

# Inspect the neuron graph
graph = sk.get_graph_nx()

Upvotes: 3

Views: 552

Answers (1)

Trizalio
Trizalio

Reputation: 861

You can use Dijkstra's algorithm with multiple start points:

import networkx as nx
import matplotlib.pyplot as plt
import typing

Node = int

def get_distances(graph: nx.Graph, red_nodes: typing.Set[Node]) -> typing.Dict[Node, int]:
    node_to_distance = {}
    node_to_neighbours = graph.adj

    # fill node_to_distance for red nodes and init set of reachable nodes
    reachable = set()
    for node in red_nodes:
        node_to_distance[node] = 0
        for neighbour in node_to_neighbours[node]:
            if neighbour not in red_nodes:
                node_to_distance[neighbour] = 1
                reachable.add(neighbour)

    # for each reachable node add neighbours as reachable nodes with greater distance
    while reachable:
        node = reachable.pop()
        distance = node_to_distance[node]
        for neighbour in node_to_neighbours[node]:
            if distance + 1 < node_to_distance.get(neighbour, len(graph)):
                node_to_distance[neighbour] = distance + 1
                reachable.add(neighbour)

    return node_to_distance

def check():
    g = nx.petersen_graph()
    red_nodes = {1, 2}
    node_to_distance = get_distances(g, red_nodes=red_nodes)
    plt.subplot(121)
    node_colors = ["red" if node in red_nodes else "yellow" for node in g.nodes()]
    nx.draw(g, with_labels=True, node_color=node_colors, labels=node_to_distance)
    plt.show()

check()

i've tested it on 40000 nodes graph and it took 42 ms to calc distances with 10000 red nodes

Upvotes: 1

Related Questions