Allen H.
Allen H.

Reputation: 358

Detecting cycles in Topological sort using Kahn's algorithm (in degree / out degree)

I have been practicing graph questions lately.

https://leetcode.com/problems/course-schedule-ii/

https://leetcode.com/problems/alien-dictionary/

The current way I detect cycles is to use two hashsets. One for visiting nodes, and one for fully visited nodes. And I push the result onto a stack with DFS traversal.

If I ever visit a node that is currently in the visiting set, then it is a cycle.

The code is pretty verbose and the length is long.

Can anyone please explain how I can use a more standard top-sort algorithm (Kahn's) to detect cycles and generate the top sort sequence?

I just want my method to exit or set some global variable which flags that a cycle has been detected.

Many thanks.

Upvotes: 1

Views: 5414

Answers (1)

James Lawson
James Lawson

Reputation: 8618

Khan's algorithm with cycle detection (summary)

  • Step 1: Compute In-degree: First we create compute a lookup for the in-degrees of every node. In this particular Leetcode problem, each node has a unique integer identifier, so we can simply store all the in-degrees values using a list where indegree[i] tells us the in-degree of node i.
  • Step 2: Keep track of all nodes with in-degree of zero: If a node has an in-degree of zero it means it is a course that we can take right now. There are no other courses that it depends on. We create a queue q of all these nodes that have in-degree of zero. At any step of Khan's algorithm, if a node is in q then it is guaranteed that it's "safe to take this course" because it does not depend on any courses that "we have not taken yet".
  • Step 3: Delete node and edges, then repeat: We take one of these special safe courses x from the queue q and conceptually treat everything as if we have deleted the node x and all its outgoing edges from the graph g. In practice, we don't need to update the graph g, for Khan's algorithm it is sufficient to just update the in-degree value of its neighbours to reflect that this node no longer exists.
    This step is basically as if a person took and passed the exam for course x, and now we want to update the other courses dependencies to show that they don't need to worry about x anymore.
  • Step 4: Repeat: When we removing these edges from x, we are decreasing the in-degree of x's neighbours; this can introduce more nodes with an in-degree of zero. During this step, if any more nodes have their in-degree become zero then they are added to q. We repeat step 3 to process these nodes. Each time we remove a node from q we add it to the final topological sort list result.
  • Step 5. Detecting Cycle with Khan's Algorithm: If there is a cycle in the graph then result will not include all the nodes in the graph, result will return only some of the nodes. To check if there is a cycle, you just need to check whether the length of result is equal to the number of nodes in the graph, n.
    • Why does this work?: Suppose there is a cycle in the graph: x1 -> x2 -> ... -> xn -> x1, then none of these nodes will appear in the list because their in-degree will not reach 0 during Khan's algorithm. Each node xi in the cycle can't be put into the queue q because there is always some other predecessor node x_(i-1) with an edge going from x_(i-1) to xi preventing this from happening.

Full solution to Leetcode course-schedule-ii in Python 3:

from collections import defaultdict

def build_graph(edges, n):
    g = defaultdict(list)
    for i in range(n):
        g[i] = []
    for a, b in edges:
        g[b].append(a)
    return g

def topsort(g, n):
    # -- Step 1 --
    indeg = [0] * n
    for u in g:
        for v in g[u]:
            indeg[v] += 1


    # -- Step 2 --
    q = []
    for i in range(n):
        if indeg[i] == 0:
            q.append(i)

    # -- Step 3 and 4 --
    result = []
    while q:
        x = q.pop()
        result.append(x)
        for y in g[x]:
            indeg[y] -= 1
            if indeg[y] == 0:
                q.append(y)

    return result

def courses(n, edges):
    g = build_graph(edges, n)
    ordering = topsort(g, n)
    # -- Step 5 --
    has_cycle = len(ordering) < n
    return [] if has_cycle else ordering

Upvotes: 11

Related Questions