Cyberguille
Cyberguille

Reputation: 1602

Recursion depth issue using Python with DFS algorithm

The DFS algorithm is already working with small test cases, but when I run it with a huge sample it throws "RuntimeError: maximum recursion depth exceeded", so I included sys.setrecursionlimit(10 ** 6) and the system throws this message: "python.exe stopped working" and PyCharm throws the message: "Process finished with exit code -1073741571 (0xC00000FD)". You can download a zip file of the sample.

Code:

import sys
import threading

threading.stack_size(67108864)
sys.setrecursionlimit(10 ** 6)


def get_input(filename):
    graph_map = {}

    for line in open(filename, 'r').readlines():
        values = [int(val) for val in line.split()]
        key1 = values.pop(0)
        key2 = values.pop(0)

        if not key1 in graph_map:
            graph_map[key1] = []

        if not key2 in graph_map:
            graph_map[key2] = []

        graph_map[key1].extend([key2])

    return graph_map


def DFS(graph_map, start):
    global visited

    visited[start-1] = True

    for child in graph_map[start]:
        if visited[child-1] is False:
            DFS(graph_map, child)


def DFS_Loop(graph_map):
    global visited

    i = len(graph_map)  # max(graph_map.keys())
    for i in reversed(range(1, i+1)):
        if visited[i-1] is False:
            DFS(graph_map, i)


graph_map = get_input("SCC.txt")
visited = [False]*len(graph_map)  # size of the graph
DFS_Loop(graph_map)

Is there any way of achieving this without taking away the recursion?

Thanks in advance.

Upvotes: 2

Views: 3633

Answers (2)

rici
rici

Reputation: 241721

One of the cool things about Python is that iterators are just a normal datatype. Although one usually uses for loops and comprehensions to iterate, there is nothing to stop you from doing the iteration manually if that is convenient. One reason it might be convenient is to avoid deep recursion, by replacing it with an explicit stack.

While this does "take away the recursion", it does not significantly complicate the program, and the recursive structure is still evident. Python simply does not recurse gracefully, so this sort of transformation is often useful.

When you write

for child in graph_map[start]:
    ...

You're doing something very similar to the following:

it = iter(graph_map[start])
try:
  child = next(it)
  ...
except StopIteration:
  pass

[See Note 1]

The iter function returns an iterator for a collection (or other iterable object, such as a comprehension, a generator, or a range). The next method returns the next value, and advances the iterator; if there is no next value, a StopIteration exception is raised.

The DFS function simply calls itself recursively for every unvisited child of its argument. We can emulate that behaviour precisely using a stack of iterators. Instead of recursively calling with a node, we'll push an iterator to the node's children onto the iterator stack. When the iterator at the top of the stack terminates, we'll pop it off the stack.

def DFS(graph_map):
  visited = [False] * len(graph_map)
  # By initializing the stack with a range iterator, we do
  # the equivalent of DFS_Loop.
  # In Python2, you should use xrange instead of range
  stack = [iter(range(len(graph_map), 0, -1))]
  while stack:
    try:
      child = next(stack[-1])
      if not visited[child - 1]:
        visited[child - 1] = True
        # Do whatever you want to do in the visit
        stack.append(iter(graph_map[child]))
    except StopIteration:
      stack.pop()

The above function applied to the sample data in the file provided in the OP used a maximum of 62794 stack slots. On my Linux laptop, it took somewhere around 3 seconds, once the data was read in; I didn't time it precisely.

It's also interesting to note that the above can be changed to do a breadth-first search simply by changing the stack to a queue. For a depth search first, the stack must be a stack of iterators (or equivalent, in languages in which that is not so simple); for breadth-search first, the queue may be a queue of iterators, but it would work as well to use a queue of values.


Notes

  1. The actual virtual machine uses the defined iterator protocol, which varies slightly between Python2 and Python3. In both versions, the container (or other iterable) objection has a member function called __iter__ which returns a newly created iterator object. The iterator object has a method called next in Python2 and __next__ in Python3, which returns the current value and advances the iterator. Since 2.6, you can use the iter and next global functions to avoid having to worry about these details, and that's what I've done here.

Upvotes: 5

Jignesh
Jignesh

Reputation: 565

You need to change the system stack limit. On linux you can do it this way:

ulimit -s 1000000 (for windows check here)

Then I ran your program, it exited normally.

Upvotes: 0

Related Questions