Reputation: 52952
I'm trying to get my head around this rather than just chalking it up to general voodoo.
I do an EF query and get some data back, and I .ToList()
it, like this:
IEnumerable<DatabaseMatch<CatName>> nameMatches = nameLogicMatcher.Match(myIQueryableOfCats).ToList();
Some cats appear twice in the database because they have multiple names, but each cat has a primary name. So in order to filter this down, I get all of the ids of the cats in a list:
List<int> catIds = nameMatches.Select(c => c.Match.CatId).ToList();
I then iterate through all of the distinct ids, get all of the matching cat names, and remove anything that isn't a primary name from the list, like this:
foreach (int catId in catIds.Distinct())
{
var allCatNameMatches = nameMatches.Where(c => c.Match.CatId == catId);
var primaryMatch = allCatNameMatches.FirstOrDefault(c => c.Match.NameType == "Primary Name");
nameMatches = nameMatches.Except(allCatNameMatches.Where(c => c != primaryMatch));
}
Now this code, when I first ran it, just hung. Which I thought was odd. I stepped through it, and it seemed to work but after 10 iterations (it is capped at 100 cats in total) it started to slow down and then eventually it was glacial and then hung completely.
I thought maybe it was doing some intensive database work by mistake, but the profiler shows no SQL executed except that which retrieves the initial list of cat names.
I decided to change it from IEnumerable
of nameMatches to a List
, and put the appropriate .ToList()
on the last line. It worked instantly and perfectly after I did this.
The question I'd like to ask is, why?
Upvotes: 2
Views: 167
Reputation: 21723
Without the ToList()
you are building up in nameMatches
a nested chain of IEnumerable
s awaiting delayed execution. This might not be so bad, except you are also calling FirstOrDefault
on each iteration which will execute the chain. So on iteration number n, you are executing the filter operations contained in the loop n-1 times. If you had 1000 distinct cats, the Linq chain is getting executed 1000 + 99 + ... + 1 times. (I think you have something that is O(n³)!)
The moral is, if you want to use delayed execution, make very sure that you're only executing your chain once.
Upvotes: 3
Reputation: 7475
Let's simplify your code a little:
foreach (int catId in catIds.Distinct())
{
var allCatNameMatches = nameMatches.Where(c => c.Match.CatId == catId);
var primaryMatch = null;
nameMatches = nameMatches.Except(allCatNameMatches.Where(c => c != primaryMatch));
}
And a little more:
foreach (int catId in catIds.Distinct())
{
nameMatches = nameMatches.Where(c => c.Match.CatId == catId);
var primaryMatch = null;
nameMatches = nameMatches.Except(nameMatches.Where(c => c != primaryMatch));
}
In the latter one it is obvious that due to deferred execution each pass of foreach
body lengthens the chain of Where
and Except
. Then remember var primaryMatch = allCatNameMatches.FirstOrDefault
. It is not deferred executed, so in each iteration of foreach
it should execute all chain. Therefore it hangs.
Upvotes: 0