not understanding the return order based on DFS in graph traversal

Question

I'm currently reviewing DFS and BFS in graphs, and came upon this question on code review.

Here is the code that is given as the "correct" solution

graph = {'A': set(['B', 'C']),
         'B': set(['A', 'D', 'E']),
         'C': set(['A', 'F']),
         'D': set(['B']),
         'E': set(['B', 'F']),
         'F': set(['C', 'E'])}

def dfs(graph, start):
    visited, stack = set(), [start]
    while stack:
        vertex = stack.pop()
        if vertex not in visited:
            visited.add(vertex)
            stack.extend(graph[vertex] - visited)
    return visited

dfs(graph, 'A') # {'E', 'D', 'F', 'A', 'C', 'B'}

Based on the definition of DFS, how can the solution be in the order of # {'E', 'D', 'F', 'A', 'C', 'B'} ? If I were to traverse from graph A, would the solution not be to go as deep as possible first, and then traverse up?

Is this an actual implenentation of DFS, or is it just a "searching algorithm" but is not DFS?

forkrul · Accepted Answer

That is a valid DFS algorithm. However, because visited is a set, the order of the returned visited nodes will not necessarily be the order that they were inserted. If you want to see the order of insertion, then the following would be more appropriate:

graph = {'A': set(['B', 'C']),
         'B': set(['A', 'D', 'E']),
         'C': set(['A', 'F']),
         'D': set(['B']),
         'E': set(['B', 'F']),
         'F': set(['C', 'E'])}

def dfs(graph, start):
    visited, stack = [], [start]
    while stack:
        vertex = stack.pop()
        if vertex not in visited:
            visited.append(vertex)
            stack.extend(graph[vertex] - set(visited))
    return visited

print dfs(graph, 'A') # ['A', 'B', 'D', 'E', 'F', 'C']

Also, just a note on the reason why the original author of the code used a set:

When you do the vertex not in visited the average amortized cost of using a set is O(1) whereas when using a list is O(n) (see https://wiki.python.org/moin/TimeComplexity). There may also be a significant cost associated with the set(visited) call (although I don't know how significant). Either way, this isn't a highly optimized example (and it's in python anyways), and for small graphs no one will care. If you wanted the best of both worlds, you might try maintaining both a list of visited nodes (for the ordering) and a set (for the actual operations), although that then comes at the cost of additional memory use.

not understanding the return order based on DFS in graph traversal

Answers (1)

Related Questions