Jordan
Jordan

Reputation: 151

Two-directional spanning tree in a directed graph in subquadratic time

Here's the problem I'm trying to solve. Given a directed graph G, does it contain a connected subgraph that:

Intuitively, the subgraph I'm looking for consists of a downward pointing and an upward pointing tree that share the same root and together span G. I'm calling it the two-directional spanning tree problem, but it may have another name.

The dumb algorithm I've thought of is to cycle through each node in the graph, do a backwards and a forwards DFS starting at that node and then concatenate the search trees. If a two-directional spanning tree exists, I'm pretty sure this will find one on some iteration. However, it runs in O(V(V + E)) time. My intuition is that there should be a faster algorithm. Am I correct?

Upvotes: 1

Views: 186

Answers (1)

ruakh
ruakh

Reputation: 183456

Warning: This answer is incomplete; the algorithm here returns "unknown" in some cases. I'm posting it only because no one else has proposed any answers in the past few days, and it's still an improvement over the proposal inside your question itself, which is to try each vertex as a candidate root, perform "backward DFS" and "forward DFS" (where the "forward DFS" is not allowed to visit any nodes that were visited by the "backward DFS"), and make sure that the all vertices are discovered by one or the other.

Specifically, there are two ways in which the below answer improves over your proposal:

  • As you note in your question, your proposal requires O(V·(V + E)) time. The below answer requires only O(V + E) time.
  • If the digraph contains cycles, your proposal can fail to detect a two-directional spanning tree whose vertex is an element of such a cycle. Therefore, your proposal can spuriously return "failure". The below answer never spuriously returns "failure", but (as noted above) returns "unknown" in some cases.

To start with . . . if G is acyclic, then we can determine whether a two-directional spanning tree exists in O(V + E) time, and if so construct it in a further O(V + E) time. To see why, observe the following:

  • If there's a two-directional spanning tree rooted at v, then that means that for every other vertex w, there's a path either from v to w or vice versa.

  • If we have a topological ordering of the vertices (which we can construct in O(V + E) time), then for any given vertices v and w, there is certainly no path from v to w if w precedes v in the topological ordering, and vice versa.

  • This means that we can find the vertices (if any) that have paths to or from all other vertices by topologically sorting the digraph and then making two passes:

    • First, we iterate over the vertices in forward order. We want to find all "backward roots", meaning vertices that are reachable from all of their predecessors — or put another way, we want to eliminate any vertex that is not reachable from some predecessor. To do this, keep a collection of already-encountered vertices. As we iterate, we remove any vertices that have edges pointing to the current vertex. Whenever this collection is empty, the current vertex is a "backward root". (It's a bit tricky to do this iteration in strict O(V + E) time, but it's doable. The key insight is that we need to store a list of inbound edges for each vertex, which may require an O(V + E) preprocessing pass if that wasn't initially part of our graph representation.)
    • Then, we do the reverse, iterating over the vertices in reverse order to find all "forward roots".

    A vertex is the root of some two-directional spanning tree if and only if it's both a "forward root" and a "backward root".

  • Once we've found such a root v, we can construct the tree itself by doing "forward DFS" and "reverse DFS" from that root.

An important special case (whose importance will become clear below) is the case that two consecutive vertices ("consecutive" in the topological ordering, I mean) are both valid roots. In such a case, we can do "reverse DFS" from the first one and "forward DFS" from the second one to build a two-directional spanning tree where every vertex has either indegree ≤ 1 or outdegree ≤ 1.


OK, but what if G contains a cycle? Your proposed algorithm can never return "failure" in such cases, but it is at least guaranteed to find a two-directional spanning tree as long as the root itself isn't part of a cycle in G. The preceding section completely ignores that case.

To start addressing that case, note that we can use Tarjan's algorithm, which takes O(V + E) time, to derive a directed acyclic graph G′ of the strongly connected components of G. (If G is already acyclic, then G′ = G.)

To see why G′ is useful, observe the following:

  • If G has a two-directional spanning tree rooted at a vertex v, then G′ has a two-directional spanning tree rooted at the vertex representing the strongly connected component containing v.

    • I admit, this claim is not completely obvious. After all, it's possible that G has a two-directional spanning tree that has multiple branches that "pass through" a single strongly connected component, such that the component's overall indegree and outdegree are both > 1. However, in such a case, we can always "fix" the problem in G′ by removing one of the resulting outbound edges (if we're on the "sourceward" side of the root) or inbound edges (if we're on the "sinkward" side).
  • Given any strongly connected digraph H and any vertex v in H, we can do "forward DFS" or "reverse DFS" out from v to find a tree that spans H and has v as its only source or its only sink, respectively. So if H is one of the strongly connected components of G, and G′ has a two-directional spanning tree where the vertex representing H has either indegree ≤ 1 or outdegree ≤ 1, then we can straightforwardly (and efficiently) construct a subgraph of H that is a subgraph of a corresponding two-directional spanning tree of G, provided the rest of G's strongly connected components cooperate as well.

So the only remaining problem is with the root of the two-directional spanning tree of G′: just because it's the root of a two-directional spanning tree of G′, that doesn't necessarily mean that the corresponding strongly connected component of G contains the root of any two-directional spanning tree of G. For example, consider this graph:

A   B
↓   ↓
C ↔ D
↓   ↓
E   F

This graph doesn't have any two-directional spanning tree, but the corresponding graph of strongly connected components does (with root corresponding to {C, D}).


So, in other words, we have the following algorithm:

  • Use Tarjan's algorithm to derive a directed acyclic graph G′ of the strongly connected components of G.
  • Topologically sort G′ and identify all valid roots for two-directional spanning trees over G′.
  • If there are no such valid roots, return "failure".
  • If there are any two valid roots that are adjacent in the topological sort, return "success". G must contain an edge vw from some vertex in the one valid root to some vertex in the other valid root; we can obtain the two-directional spanning tree by doing "backward DFS" from v, plus the edge vw, plus doing "forward DFS" from w.
  • If there is any valid root corresponding to a strongly connected component that's just a single vertex of G, return "success". That single vertex is the root of a two-directional spanning tree of G that we can obtain by doing "backward DFS" plus "forward DFS".
  • Otherwise . . . return "unknown". We potentially have a much smaller problem in this case: in theory, for each valid root corresponding to some strongly connected component H, we need to examine H to see if it has a two-directional spanning tree that's suitable in terms of its inbound and outbound connections. But even that much-smaller problem still involves potentially massive numbers of possibilities, so exhaustive search seems infeasible.

So, as I mentioned at the outset, this algorithm requires O(V + E) time, and it deterministically returns a tree or "failure" in all cases where your proposal is deterministic plus some cases where your proposal is nondeterministic; but there are still some cases where this algorithm punts. :-/

Upvotes: 2

Related Questions