ivygrowing
ivygrowing

Reputation: 75

How to find all simple paths of no more than k lengths starting at a vertex in a directed graph?

I am trying to find all simple paths up to a given length, and working on using BFS to solve this problem. However, I am not sure about the specific algorithm to use. It seems that BFS cannot easily solve this problem. All problems I found were about finding paths between two given vertexes.

However, I wonder how to find all simple paths that have lengths up to a given k, starting at a given vertex in a directed graph. No cycles are allowed.

k lengths means k hops, not the weights of paths.

enter image description here

For example, we have k = 3, and given vertex 1, we want to find paths of length of k = 1, k = 2, k = 3.

when k = 1, we have paths 12 and 13; when k = 2, we have paths 132 and 134; when k = 3, we don't find appropriate paths.

What algorithm can I use to solve the problem?

Upvotes: 5

Views: 533

Answers (3)

M.A.
M.A.

Reputation: 351

A backtracking* version of the DFS traversal used by Goodrich, Tamassia & Goldwasser (2013) in Section 14.3.1 - 14.3.2, can be modified and adapted to solve this problem.

We also want to keep track of visited vertices and corresponding tree edges (or discovery edges) by the use of an ordered Hash Table (which implies your vertex must be hashable), which therein prevents cyclic traversal (by using an Adjacency Map to represent the Graph, we can query traversed vertices in O(1) time and omit back edges).

This will ultimately produce the (implied) strongly connected components (or connected components in the case of an undirected graph) reachable by traversing at most k-1 edges in the graph, by invoking DFS on a starting vertex s, and then recursively calling DFS on all incidental discovery edge vertices at the specified k-range.

* We populate a concurrent path accumulator for every valid recursive DFS call (when traversed edges < k) , and then backtrack and pop the corresponding retraction edges (triggered when traversed edges >= k, until a new edge is traversed and accumulated again) as the recursion unwinds.


The following solution uses an Adjacency Map representation of a Graph as presented by Goodrich et al. (pp. 635-637). For brevity, the Graph implementation which is 100% compatible with this algorithm with zero dependencies can be found here: graph.py.

Methods Graph.insert_vertex(), Graph.insert_edge(), Edge.opposite(), Edge.endpoints() all run in O(1) time, except Graph.incident_edges(v) which takes O(outgoing-degree(v)) time.

Initially, klen_dfs() ought to take O(n) time at the worst case (when finding a vertex reachable in k edges requires traversing the whole graph). However, saving each unique path (with the line path_array.append(p[:])) incites some necessary recomputation amassing to approximately summation(k) operations per each k-length path traversed, that is nk × summation(k), where nk is the number of vertices reachable by traversing k edges. Overall, klen_dfs() takes O((nk× k2) + n) time in worst case.

from collections import OrderedDict
from typing import List
from graph import Graph


def init_graph():
    G = Graph(directed=True)

    v1 = G.insert_vertex(1)
    v2 = G.insert_vertex(2)
    v3 = G.insert_vertex(3)
    v4 = G.insert_vertex(4)

    G.insert_edge(v1, v2)
    G.insert_edge(v1, v3)
    G.insert_edge(v3, v2)
    G.insert_edge(v3, v4)
    G.insert_edge(v4, v3)

    return G, v1


def klen_dfs(G: Graph, s: Graph.Vertex, k: int):
    """Report all paths (edge lists) less than length k.
    G - Adjacency Map Graph ADT
    s - Starting vertex
    k - Path length limit (actual limit = k-1)
    """
    discovery_edges: OrderedDict[Graph.Vertex, Graph.Edge] = OrderedDict()
    path_accummulator: List[Graph.Edge] = [ ]
    path_array: List[List[Graph.Edge]] = [ ]

    def dfs(g: Graph, u: Graph.Vertex, discovered, klen, p):
        """Traditional DFS with addtional params.
        u - A vertex in G
        p - Concurrent path array of less than k edges
        """
        if klen < k:
            for e in g.incident_edges(u, outgoing=True):
                v = e.opposite(u)
                if not discovered.get(v):   # Cyclic traversal prevention
                    discovered[v] = e
                    p.append(e)
                    path_array.append(p[:])
                    dfs(g, v, discovered, klen + 1, p)
                    p.pop()
                    discovered.pop(v)

    dfs(G, s, discovery_edges, 1, path_accummulator)
    return path_array


if __name__ == "__main__":
    # For graph produced by init_graph(), all k >= 3 has same results
    for K in range(1,4):
        digraph, start = init_graph()
        paths = klen_dfs(digraph, start, K)
        formatted_paths = []
        for path in paths:
            formatted_path = []
            for edge in path:
                endpoints = edge.endpoints()
                formatted_path.append((
                    endpoints[0].element(), endpoints[1].element()))
            formatted_paths.append(formatted_path)
        print(f"ALL paths when k = {K} (i.e of length {K-1} < k length)")
        print(*formatted_paths, sep="\n")
        print()

Output:

ALL paths when k = 1 (i.e of length 0 < k length)

ALL paths when k = 2 (i.e of length 1 < k length)
[(1, 2)]
[(1, 3)]

ALL paths when k = 3 (i.e of length 2 < k length)
[(1, 2)]
[(1, 3)]
[(1, 3), (3, 2)]
[(1, 3), (3, 4)]

Visualization of simple valid paths less than length k=3 when s=1

Valid Edges

Other considerations: strictly k-length paths

Following the modifications to trincot's answer, we can otherwise choose to report only those paths whose lengths are of size k by replacing the line in the nested dfs() function: path_array.append(p[:]) with if klen == k-1: path_array.append(p[:]).

This reduces the time complexity from O((nk× k2) + n) to O((nk× k) + n), and produces the following output:

ALL paths when k = 1 (i.e of length 0 < k length)

ALL paths when k = 2 (i.e of length 1 < k length)
[(1, 2)]
[(1, 3)]

ALL paths when k = 3 (i.e of length 2 < k length)
[(1, 3), (3, 2)]
[(1, 3), (3, 4)]

Upvotes: 1

ravenspoint
ravenspoint

Reputation: 20596

Here is pseudo code for Yen's algorithm

IMPORT graph, source and sink
CLEAR vShortestPaths
calculate shortest path from source to sink, add to vShortestPaths
WHILE TRUE
   CLEAR vPotentialPaths
   SET work = graph
   LOOP PF over vShortestPaths
       LOOP spur over PF
            REMOVE out edge from spur in work used by PF
            calculate spurPath, shortest path from spur to sink in work
            IF spurPath exists
                CONTINUE to next iteration of LOOP spur
            SET newPath to combination of source to spur in PF and spurPath
            IF newPath NOT in vShortestPaths
                ADD newPath to vPotentialPaths
            END LOOP spur
        END LOOP PF
    IF vPotentialPaths NOT empty
        ADD shortest path in vPotentialPaths to vShortestPaths
    END WHILE TRUE 

To find all the paths in a graph:

Set vertices into fixed order ( input order is fine )
LOOP I over vertices
   LOOP J over the (J+1)the to the last vertex
       Apply Yen to I,J vertex pair

Upvotes: 1

trincot
trincot

Reputation: 351019

You would use a depth-first traversal for this. A recursive function is the natural choice for that. It would recur up to depth k and then produce the path it had traversed. At each recursive call make sure you don't add a node to the path that is already on the path. And when you get out of a recursive call, reduce the path accordingly.

Here is an implementation in JavaScript, where the recursive function takes as argument an adjacency list (to represent the graph), the current node, the value of k and the path as a Set (an ordered Hash Set).

function* genPaths(adj, node, k, path=new Set) {
    if (path.has(node)) return; // cycle detected!
    if (k == 0) return yield [...path, node]; // We have a result
    path.add(node);
    for (const neighbor of adj[node]) {
        yield* genPaths(adj, neighbor, k - 1, path);
    }
    path.delete(node); // backtrack
}

// Demo
const adj = {
    "1": ["2", "3"],
    "2": [],
    "3": ["2", "4"],
    "4": ["3"]
};

// Finding paths with k=1, k=2 and k=3:
for (let k = 1; k <= 3; k++) {
    console.log("results for k =", k);
    for (const path of genPaths(adj, "1", k)) {
        console.log("  ", ...path);
    }
}
console.log("done");

Extending it...

The above function returns paths of exactly k lengths, and the main code uses it to get paths for several different lengths. If you are always interested in all shorter paths as well, even including the one with size 0 (just the starting node) then you could extend the function to do that all with one top-level call:

function* genPaths(adj, node, k, path=new Set) {
    if (path.has(node)) return; // cycle detected!
    yield [...path, node]; // We have a result (0 <= path size <= k)
    if (k == 0) return;
    path.add(node);
    for (const neighbor of adj[node]) {
        yield* genPaths(adj, neighbor, k - 1, path);
    }
    path.delete(node); // backtrack
}

// Demo
const adj = {
    "1": ["2", "3"],
    "2": [],
    "3": ["2", "4"],
    "4": ["3"]
};

// Finding paths with size <= 3:
for (const path of genPaths(adj, "1", 3)) {
    console.log("  ", ...path);
}
console.log("done");

Note that the size of a path is expressed as the number of edges on it, so [1] is a path of length 0, [1, 3] is a path of length 1, ...etc.

If you want to exclude the path with length 0, then just add a condition to the yield statement:

    if (path.length > 0) yield [...path, node];

Upvotes: 2

Related Questions