Daniel Martinez
Daniel Martinez

Reputation: 397

How to validate if a parent item has children in recursive LINQ function?

I'm doing a recursive LINQ function as described in tis question: Simulating CTE recursion in C#

My code is as follows:

private static IEnumerable<KeyValuePair<int, int>> getNet(List<DataRow> list, int? leader, int level)
{
    return list
        .Where(x => x.Field<int?>("LeaderID") == leader)
        .SelectMany(x =>
            new[] { 
               new KeyValuePair<int, int>(x.Field<int>("RepID"), level)
                  }.Concat(getNet(list, x.Field<int>("RepID"), level+ 1))
         );
}

I'd like to validate if a parent has children before entering the function again because every child gets evaluated again and that consumes a lot of time.

i.e Parent A has 5000 children, but only 5 of them have children, I need something to validate if A's children have children before executing the function for all of them.

Thanks!

Upvotes: 1

Views: 1063

Answers (1)

Servy
Servy

Reputation: 203827

So testing the children earlier really isn't going to help you. You're still doing the same work, conceptually. If each recursive call handles two depths at once, instead of one, you're greatly complicating the method itself, duplicating the work of the second depth whenever there are children, and gaining very, very little even when there are no children. The expensive part of the method is in the linear search through the list for the children, not in the method call that starts the search.

To improve performance you need to stop doing linear searches through a very large list many, many times. Fortunately, this is easy enough to do. Just create a lookup once, at the start of the method, of all children for each parent.

var lookup = list.ToLookup(x => x.Field<int?>("LeaderID"));

Now, what you're trying to do, conceptually, is traverse a tree. We can start out with a generalized "traverse" method capable of traversing any tree (using a non-recursive implementation, for performance improvements):

public static IEnumerable<T> Traverse<T>(IEnumerable<T> source, 
    Func<T, IEnumerable<T>> childSelector)
{
    var queue = new Queue<T>(source);
    while (queue.Any())
    {
        var item = queue.Dequeue();
        yield return item;
        foreach (var child in childSelector(item))
        {
            queue.Enqueue(child);
        }
    }
}

Now that we have these we can use the lookup to efficiently get all children for each node, greatly improving the implementation. Of course, it'd be prettier if you didn't also need the depth of each node, but it's still not that bad.

var roots = lookup[leader]
    .Select(row => Tuple.Create(row.Field<int>("RepID"), 0));

return Traverse(roots, node => lookup[node.Item1]
    .Select(row => Tuple.Create(row.Field<int>("RepID"), node.Item2 + 1)));

If you don't need to know the depth of each node you can simplify all of that down to this:

var lookup = list.ToLookup(
    row => row.Field<int?>("LeaderID"),
    row => row.Field<int>("RepID"));
return Traverse(lookup[leader], node => lookup[node]);

Upvotes: 3

Related Questions