Reputation: 376
I am working on the following past paper question for an algorithms module:
Let G = (V, E) be a simple directed acyclic graph (DAG).
For a pair of vertices v, u in V, we say v is reachable from u if there is a (directed) path from u to v in G.
(We assume that every vertex is reachable from itself.)
For any vertex v in V, let R(v) be the reachability number of vertex v, which is the number of vertices u in V that are reachable from v.
Design an algorithm which, for a given DAG, G = (V, E), computes the values of R(v) for all vertices v in V.
Provide the analysis of your algorithm (i.e., correctness and running time
analysis).
(Optimally, one should try to design an algorithm running in
O(n + m) time.)
So, far I have the following thoughts:
The following algorithm for finding a topological sort of a DAG might be useful:
TopologicalSort(G)
1. Run DFS on G and compute a DFS-numbering, N // A DFS-numbering is a numbering (starting from 1) of the vertices of G, representing the point at which the DFS-call on a given vertex v finishes.
2. Let the topological sort be the function a(v) = n - N[v] + 1 // n is the number of nodes in G and N[v] is the DFS-number of v.
My second thought is that dynamic programming might be a useful approach, too.
However, I am currently not sure how to combine these two ideas into a solution.
I would appreciate any hints!
Upvotes: 2
Views: 1916
Reputation: 3038
EDIT: Unfortunately the approach below is not correct in general. It may count multiple times the nodes that can be reached via multiple paths.
The ideas below are valid if the DAG is a polytree, since this guarantees that there is at most one path between any two nodes.
You can use the following steps:
- find all nodes with 0 in-degree (i.e. no incoming edges).
This can be done in
O(n + m)
, e.g. by looping through all edges and marking those nodes that are the end of any edge. The nodes with 0 in-degree are those which have not been marked.
- Start a DFS from each node with 0 in-degree.
After the DFS call for a node ends, we want to have computed for that node the information of its reachability.
In order to achieve this, we need to add the reachability of the successors of this node. Some of these values might have already been computed (if the successor was already visited by DFS), therefore this is a dynamic programming solution.
The following pseudocode describes the DFS code:
function DFS(node) { visited[node] = true; reachability[node] = 1; for each successor of node { if (!visited[successor]) { DFS(successor); } reachability[node] += reachability[successor]; } }
After calling this for all nodes with 0 in-degree, the
reachability
array will contain the reachability for all nodes in the graph.The overall complexity is
O(n + m)
.
Upvotes: 2
Reputation: 2079
I'd suggest using a Breadth First Search approach.
For every node, add all the nodes that are connected to the queue. In addition to that, maintain a separate array for calculating the reachability.
For example, if a A->B, then
1.) Mark A as traversed
2.) B is added to the queue
3.) arr[B]+=1
This way, we can get R(v) for all vertices in O(|V| + |E|) time through arr[].
Upvotes: 0