Reputation: 40604
I am using a recursive method to go through a tree of items and add its children to a flat collection:
public class Thing
{
public int Id { get; set; }
public string Name { get; set; }
public int? ParentId { get; set; }
}
void Main()
{
var sampleData = new List<Thing>
{
new Thing { Id = 1, Name = "root1", ParentId = null },
new Thing { Id = 2, Name = "2", ParentId = 1 },
new Thing { Id = 3, Name = "3", ParentId = 1 },
new Thing { Id = 4, Name = "4", ParentId = 2 },
new Thing { Id = 5, Name = "5", ParentId = 2 },
new Thing { Id = 6, Name = "6", ParentId = 2 },
new Thing { Id = 7, Name = "7", ParentId = 6 },
new Thing { Id = 8, Name = "8", ParentId = 7 },
new Thing { Id = 9, Name = "9", ParentId = 8 },
new Thing { Id = 10, Name = "10", ParentId = 9 },
new Thing { Id = 11, Name = "11", ParentId = 10 },
new Thing { Id = 12, Name = "12", ParentId = 11 },
new Thing { Id = 13, Name = "13", ParentId = 12 },
new Thing { Id = 14, Name = "14", ParentId = 13 },
new Thing { Id = 15, Name = "root15", ParentId = null }
};
var subThings = new HashSet<Thing>();
var stopWatch = new Stopwatch();
stopWatch.Start();
//AddSubThings(subThings, sampleData, new List<int> { 1 });
AddSubThingsUsingList(subThings, sampleData, new List<int> { 1 });
stopWatch.Elapsed.Dump();
subThings.Dump();
}
private void AddSubThings(HashSet<Thing> resultThings, IEnumerable<Thing> sourceThings, IEnumerable<int> parentIds)
{
if (!sourceThings.Any() || !parentIds.Any())
{
return;
}
var subThings = sourceThings.Where(st => st.ParentId.HasValue && parentIds.Contains(st.ParentId.Value));
resultThings.UnionWith(subThings);
AddSubThings(resultThings, sourceThings.Except(subThings), subThings.Select(st => st.Id));
}
private void AddSubThingsUsingList(HashSet<Thing> resultThings, List<Thing> sourceThings, List<int> parentIds)
{
if (!sourceThings.Any() || !parentIds.Any())
{
return;
}
var subThings = sourceThings.Where(st => st.ParentId.HasValue && parentIds.Contains(st.ParentId.Value));
resultThings.UnionWith(subThings);
AddSubThingsUsingList(resultThings, sourceThings.Except(subThings).ToList(), subThings.Select(st => st.Id).ToList());
}
When I use the AddSubThings
method it takes around 90 seconds to process. However if I use the AddSubThingsUsingList
method it does not even take a second. Why is this?
Upvotes: 0
Views: 127
Reputation: 5513
Ok. This is a bit complex.
Operations on IEnumerable are lazy, i.e. they are not executed until you need the result. Now when you pass sourceThins
and subThings
to AddSubThings
, you've not sent a materialized collection of things, all you've done is you've defined how these collections are calculated from the original List
s.
Now when the method calls itself recursively, it adds more filtering and selection to the data it has received.
All these layers of selection and filtering will be called when you call Any()
.
On the other hand, in the case of List
s, things are materialized after calls to Where
, Except
and Select
, because you call ToList
.
Upvotes: 0
Reputation: 32286
The problem is because your create subThings
from sourceThings
like this
var subThings = sourceThings.Where(
st => st.ParentId.HasValue && parentIds.Contains(st.ParentId.Value));
Then you pass the following as sourceThings
to the recursive call.
sourceThings.Except(subThings)
Which is equivalent to
sourceThings.Except(
sourceThings.Where(
st => st.ParentId.HasValue && parentIds.Contains(st.ParentId.Value)))
That query when iterated with have to iterate over the original list twice. With each recursive call the query will build up and need to iterate the original list 2^n times where n is the recursion level. And your query is being iterated by the Any
and the HashSet.UnionWith
calls meaning it's more like 2^(n+1).
The other one immediately iterates the query before passing them and thus avoids this doubling problem.
You could pass the following to your recursive call for sourceThings
instead to make it faster as it doesn't double up the required iterating of the original list on each recursive call.
sourceThings.Where(st => !st.ParentId.HasValue || !parentIds.Contains(st.ParentId.Value))
Upvotes: 3