Why DFS isn't fast enough in checking if a graph is a tree

Question

I tried to solve this problem (Problem Description). It seems the correct idea is to check whether the given graph has cycles (that is, whether it is a tree). However, my code couldn't pass Test 7 (always Time Limit Exceeded). Any idea how to make this faster? I used DFS. Many thanks.

Update: Yes, finally got accepted. The problem was running a full DFS from each vertex, which is unnecessary. The dfs function should look like this:

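// id identifies the current traversal; visited is assumed to hold integers here
function dfs(idx: integer; id: integer): boolean;
begin
  // Reaching a vertex already marked by this same traversal means a cycle
  if (visited[idx] = id) then
  begin
    Result := false;
    Exit;
  end;
  // Follow the single outgoing edge, if there is one
  if (tree[idx] <> 0) then
  begin
    visited[idx] := id;
    Result := dfs(tree[idx], id);
    Exit;
  end;
  // tree[idx] = 0: reached the root without revisiting anything
  Result := true;
end;

For reference, the original program that exceeded the time limit:
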
program Project2;

{$APPTYPE CONSOLE}

var
  i, m, j, n, k: integer;
  tree: array [1 .. 25001] of integer;
  visited: array [1 .. 25001] of boolean;

function dfs(idx: integer): boolean;
label
  fin;
var
  buf: array[1 .. 25001] of integer;
  i, cnt: integer;
begin
  cnt := 1;
  while (true) do
  begin
    if (visited[idx]) then
    begin
      Result := false;
      goto fin;
    end;
    if (tree[idx] <> 0) then
    begin
      visited[idx] := true;
      buf[cnt] := idx;
      Inc(cnt);
      idx := tree[idx];
    end
    else
    begin
      break;
    end;
  end;
  Result := true;
fin:
  for i := 1 to cnt - 1 do
  begin
    visited[buf[i]] := false;
  end;
end;

function chk(n: integer): boolean;
var
  i: integer;
begin
  for i := 1 to n do
  begin
    if (tree[i] = 0) then continue;
    if (visited[i]) then continue;
    if (dfs(i) = false) then
    begin
      Result := false;
      Exit;
    end;
  end;
  Result := true;
end;

begin
  Readln(m);
  for i := 1 to m do
  begin
    Readln(n);
    k := 0;
    for j := 1 to n do
    begin
      Read(tree[j]);
      if (tree[j] = 0) then
      begin
        Inc(k);
      end;
    end;
    if (k <> 1) then
    begin
      Writeln('NO');
    end
    else
    if (chk(n)) then
    begin
      Writeln('YES');
    end
    else
    begin
      Writeln('NO');
    end;
    Readln;
  end;
  //Readln;
end.

A. Webb · Accepted Answer

I know next to nothing about Pascal, so I could be misinterpreting what you are doing, but I think the main culprit is at fin where you unmark visited vertices. This forces you into doing a DFS from each vertex, whereas you only need to do one per component.
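
To get a feel for the cost, consider a single chain in which vertex k points to vertex k+1 (an illustrative worst case, not taken from the actual test data). Because every walk unmarks what it visited, the walk started at vertex k repeats about n - k steps, so the total work is roughly:

```latex
\sum_{k=1}^{n-1} (n - k) = \frac{n(n-1)}{2}
```

With n = 25000, the upper bound suggested by the array sizes in the question, that is about 3.1 x 10^8 steps per test case, compared with about n steps when the marks are kept.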

In the case where there is more than one connected component, the traversal will halt either

- because a vertex points to a vertex already marked, in which case we just halt because a cycle has been found, or
- because the vertex points to no one (but itself), in which case we need to find the next unmarked vertex and start another DFS from there.

You need not worry about bookkeeping for backtracking as each vertex at most points to one other vertex in this problem. There is also no need to worry about which DFS did which marking, as each will only work within its connected component anyway.

In the case where a vertex that points to itself is encountered first, it should not be marked yet, but skipped over.
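
The single-pass idea can be sketched in C roughly as follows. This is a minimal sketch, not the asker's or the judge's code: the names `is_tree`, `stamp`, and `MAXN` are made up for illustration, and the input is assumed to be a 1-indexed array where 0 means "points to no one". It stamps each walk with its own id, as in the asker's accepted fix, so that reaching a vertex finished by an earlier walk is not reported as a cycle.

```c
#define MAXN 25001  /* assumed bound, taken from the array sizes in the question */

/* stamp[v] == 0: never visited; otherwise the start vertex of the walk that marked v */
static int stamp[MAXN];

/* parent[1..n]: parent[v] is the vertex v points to, or 0 if it points to no one.
   Returns 1 if there is exactly one root and no cycle, 0 otherwise. */
int is_tree(const int *parent, int n)
{
    int start, v, roots = 0;

    for (v = 1; v <= n; v++) {
        stamp[v] = 0;
        if (parent[v] == 0) roots++;
    }
    if (roots != 1) return 0;

    for (start = 1; start <= n; start++) {
        if (stamp[start] != 0) continue;      /* already covered by an earlier walk */
        v = start;
        while (v != 0 && stamp[v] == 0) {     /* marks are never cleared */
            stamp[v] = start;
            v = parent[v];
        }
        /* Stopping on a vertex stamped by this same walk means we looped back. */
        if (v != 0 && stamp[v] == start) return 0;
    }
    return 1;
}
```

Each vertex is stamped at most once, so the whole check is linear in n. For instance, calling `is_tree` on `{0, 2, 3, 4, 1, 0}` with n = 5 (edges 1 to 2, 2 to 3, 3 to 4, 4 to 1) returns 0 because of the cycle.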

Alternate Solution Using Set Union and Vertex/Edge Count

Since a tree has the property that the number of edges is one less than the number of vertices, there is another way to think about the problem: (1) determine the connected components and (2) compare the edge and vertex counts in each component.

In many languages you have a Set data structure with near-constant time Union/Find methods readily available. In this case the solution is easy and fast - near-linear in the number of edges.

Create a Set for each vertex representing its connected component. Then process your edge list. For each edge, Union the Sets represented by the two vertices. As you go, keep track of the number of vertices and the number of edges in each Set. Example with five vertices and edges 1 to 2, 2 to 3, 3 to 4, and 4 to 1:

Initial Sets

Vertex         1  2  3  4  5
Belongs to     S1 S2 S3 S4 S5

Set            S1 S2 S3 S4 S5
Has # vertices 1  1  1  1  1
And # edges    0  0  0  0  0

Process edge from 1 to 2

Vertex         1  2  3  4  5
Belongs to     S1 S1 S3 S4 S5

Set            S1 S3 S4 S5
Has # vertices 2  1  1  1
And # edges    1  0  0  0

Process edge from 2 to 3

Vertex         1  2  3  4  5
Belongs to     S1 S1 S1 S4 S5


Set            S1 S4 S5
Has # vertices 3  1  1
And # edges    2  0  0

Process edge from 3 to 4

Vertex         1  2  3  4  5
Belongs to     S1 S1 S1 S1 S5

Set            S1 S5
Has # vertices 4  1
And # edges    3  0

Process edge from 4 to 1

Vertex         1  2  3  4  5
Belongs to     S1 S1 S1 S1 S5

Set            S1 S5
Has # vertices 4  1
And # edges    4  0

And we can stop here because S1 at this point violates the vertex versus edge count of trees. There is a cycle in S1. It does not matter if vertex 5 points to itself or to someone else.
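
The bookkeeping in the tables above could be sketched in C like this. It is only an illustration under assumptions: the names `all_components_are_trees`, `root_of`, `vertices`, and `edges` are hypothetical, the input is a 1-indexed array with 0 meaning "no edge", and union by size with path halving is swapped in for brevity instead of the rank-based variant.

```c
#define MAXN 25001  /* assumed bound, taken from the array sizes in the question */

static int root_of[MAXN];   /* parent pointer in the disjoint-set forest */
static int vertices[MAXN];  /* vertex count, meaningful at set representatives */
static int edges[MAXN];     /* edge count, meaningful at set representatives */

static int find(int x)
{
    while (root_of[x] != x) {
        root_of[x] = root_of[root_of[x]];  /* path halving */
        x = root_of[x];
    }
    return x;
}

/* target[1..n]: target[v] is the vertex v has an edge to, or 0 for no edge.
   Returns 0 as soon as a set has more than (vertices - 1) edges, else 1. */
int all_components_are_trees(const int *target, int n)
{
    int v, a, b;

    for (v = 1; v <= n; v++) {
        root_of[v] = v;
        vertices[v] = 1;
        edges[v] = 0;
    }
    for (v = 1; v <= n; v++) {
        if (target[v] == 0) continue;
        a = find(v);
        b = find(target[v]);
        if (a != b) {
            /* Union by size: attach the smaller set under the larger one. */
            if (vertices[a] < vertices[b]) { int t = a; a = b; b = t; }
            root_of[b] = a;
            vertices[a] += vertices[b];
            edges[a] += edges[b];
        }
        edges[a]++;                         /* count the edge just processed */
        if (edges[a] > vertices[a] - 1) return 0;
    }
    return 1;
}
```

On the example above (`{0, 2, 3, 4, 1, 0}` with n = 5) the function returns 0 when the edge from 4 to 1 brings S1 to 4 edges on 4 vertices. Note that this only checks that every component is a tree; for the original problem one would still require exactly one vertex with no outgoing edge.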

For posterity, here is an implementation in C. It's been a while, so forgive the sloppiness. It is not the fastest, but it does pass all tests within the time limit. The disjoint-set code is straight from Wikipedia's pseudocode.

#include <stdio.h>

struct ds_node
{
    struct ds_node *parent;
    int rank;
};

struct ds_node v[25001];

void ds_makeSet(struct ds_node *x)
{
    x->parent = x;
    x->rank = 0;
}

struct ds_node* ds_find(struct ds_node *x)
{
    if (x->parent != x) x->parent = ds_find(x->parent);
    return x->parent;
}

int ds_union(struct ds_node *x, struct ds_node *y)
{
    struct ds_node * xRoot;
    struct ds_node * yRoot;

    xRoot = ds_find(x);
    yRoot = ds_find(y);

    if (xRoot == yRoot) return 0;

    if (xRoot->rank < yRoot->rank) 
    {
        xRoot->parent = yRoot;
    }
    else if (xRoot->rank > yRoot->rank) 
    {
        yRoot->parent = xRoot;
    }
    else 
    {
        yRoot->parent = xRoot;
        xRoot->rank++;
    }
    return 1;
}

int test(int n)
{
    int i, e, z = 0;

    for(i=1;i<=n;i++)
    {
        ds_makeSet(&v[i]);
    }
    for(i=1;i<=n;i++)
    {
        scanf("%d",&e);
        if (e)
        {
            if ( !ds_union(&v[i],&v[e]) ) 
            {
                for(i++;i<=n;i++) scanf("%d",&e);
                return 0;
            }
        }
        else
        {
            z++;
        }
    }
    return (z == 1);
}
int main()
{
    int runs; int n;

    scanf("%d", &runs);
    while(runs--)
    {
        scanf("%d", &n); 
        getc(stdin);

        test(n) ? puts("YES") : puts("NO");
    }
}
