Reputation: 21

Graph Traversal using DFS

I am learning graph traversal from The Algorithm Design Manual by Steven S. Skiena. In his book, he has provided the code for traversing the graph using dfs. Below is the code.

dfs(graph *g, int v)
{
       edgenode *p;
       int y;
       if (finished) return;
       discovered[v] = TRUE;
       time = time + 1;
       entry_time[v] = time;
       process_vertex_early(v);
       p = g->edges[v];
       while (p != NULL) {
            /* temporary pointer */
            /* successor vertex */
            /* allow for search termination */
            y = p->y;
            if (discovered[y] == FALSE) {
                parent[y] = v;
                process_edge(v,y);
                dfs(g,y);
            }
            else if ((!processed[y]) || (g->directed))
                 process_edge(v,y);
            }
            if (finished) return;
            p = p->next;
        }
        process_vertex_late(v);
        time = time + 1;
        exit_time[v] = time;
        processed[v] = TRUE;
}

In a undirected graph, it looks like below code is processing the edge twice (calling the method process_edge(v,y). One while traversing the vertex v and another at processing the vertex y) .
So I have added the condition parent[v]!=y in else if ((!processed[y]) || (g->directed)).
It processes the edge only once. However, I am not sure how to modify this code to work with the parallel edge and self-loop edge. The code should process the parallel edge and self-loop.

Upvotes: 1

Answers (2)

Lior Kogan

Reputation: 20668

You are correct.

Quoting the book's (2nd edition) errata:

(*) Page 171, line -2 -- The dfs code has a bug, where each tree edge is processed twice in undirected graphs. The test needs to be strengthed to be:
else if (((!processed[y]) && (parent[v]!=y)) || (g->directed))

As for cycles - see here

Upvotes: 1

et_l

Reputation: 1918

Short Answer:

Substitute your (parent[v]!=y) for (!processed[y]) instead of adding it to the condition.

Detailed Answer:

In my opinion there is a mistake in the implementation written in the book, which you discovered and fixed (except for parallel edges. More on that below). The implementation is supposed to be correct for both directed and undeirected graphs, with the distinction between them recorded in the g->directed boolean property.

In the book, just before the implementation the author writes:

The other important property of a depth-first search is that it partitions the edges of an undirected graph into exactly two classes: tree edges and back edges. The tree edges discover new vertices, and are those encoded in the parent relation. Back edges are those whose other endpoint is an ancestor of the vertex being expanded, so they point back into the tree.

So the condition (!processed[y]) is supposed to handle undirected graphs (as the condition (g->directed) is to handle directed graphs) by allowing the algorithm to process the edges that are back-edges and preventing it from re-process those that are tree edges (in the opposite direction). As you noticed, though, the tree-edges are treated as back-edges when read through the child with this condition so you should just replace this condition with your suggested (parent[v]!=y).

The condition (!processed[y]) will ALWAYS be true for an undirected graph when the algorithm reads it as long as there are no parallel edges (further details why this is true - *). If there are parallel edges - those parallel edges that are read after the first "copy" of them will yield false and the edge will not be processed, when it should be. Your suggested condition, however, will distinguish between tree-edges and the rest (back-edges, parallel edges and self-loops) and allow the algorithm to process only those that are not tree-edges in the opposite direction.

To refer to self-edges, they should be fine both with the new and old conditions: they are edges with y==v. Getting to them, y is discovered (because v is discovered before going through its edges), not processed (v is processed only as the last line - after going through its edges) and it is not v's parent (v is not its own parent).

*Going through v's edges, the algorithm reads this condition for y that has been discovered (so it doesn't go into the first conditional block). As quoted above (in the book there is a semi-proof for that as well which I will include at the end of this footnote), p is either a tree-edge or a back-edge. As y is discovered, it cannot be a tree-edge from v to y. It can be a back edge to an ancestor which means the call is in a recursion call that started processing this ancestor at some point, and so the ancestor's call has yet to reach the final line, marking it as processed (so it is still marked as not processed) and it can be a tree-edge from y to v, in which case the same situation holds - and y is still marked as not processed.

The semi-proof for every edge being a tree-edge or a back-edge:

Why can’t an edge go to a brother or cousin node instead of an ancestor? All nodes reachable from a given vertex v are expanded before we finish with the traversal from v, so such topologies are impossible for undirected graphs.

Upvotes: 2

Graph Traversal using DFS

Answers (2)

Short Answer:

Detailed Answer:

Related Questions