haughtonomous

Reputation: 4850

Linq deferred operations

I mostly understand deferred execution, but I have a question about a particular case:

Given a code fragment such as

    var resultsOfInterest = from r in ...
                            select r;
    foreach (var x in resultsOfInterest)
    {
        // do something with x
    }

how many times is the query resultsOfInterest executed? Once when setting up the foreach loop, or once for every element 'x'? Would it be more efficient with

    foreach (var x in resultsOfInterest.ToArray())
    {
        // do something with x
    }

?

TIA

Upvotes: 5

Views: 272

Answers (3)

nmclean

Reputation: 7724

In both cases, the query is only executed once, but in the second case there are two enumerations.

Assuming 1000 items:

Case 1:

  1. Execute the select clause, assign result to x.
  2. Goto 1, repeat 1000 times.

Case 2:

  1. Create array.
  2. Execute the select clause, assign result to the array.
  3. Goto 2, repeat 1000 times.
  4. Access an element from the array, assign it to x.
  5. Goto 4, repeat 1000 times.

So generally creating an array is not desirable at all. But if you need to enumerate the same items multiple times yourself, and array access is faster than your select, then of course it would be more efficient to create the array.
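The counts above can be checked with a small sketch. It assumes a LINQ to Objects query; `selectorCalls` and the `Enumerable.Range` source are hypothetical stand-ins for the real query, which the question does not show.

```csharp
using System;
using System.Linq;

int selectorCalls = 0;

// Hypothetical stand-in for the real query: the selector counts its own invocations.
var query = Enumerable.Range(1, 1000).Select(n => { selectorCalls++; return n * 2; });

// Case 1: a single foreach over the deferred query.
foreach (var x in query) { }
int singlePass = selectorCalls;

// Enumerating the deferred query a second time runs the selector again.
foreach (var x in query) { }
int afterSecondPass = selectorCalls;

// Case 2: materialize once, then loop over the array as often as needed.
selectorCalls = 0;
var buffered = query.ToArray();
foreach (var x in buffered) { }
foreach (var x in buffered) { }
int bufferedPasses = selectorCalls;

Console.WriteLine($"{singlePass} {afterSecondPass} {bufferedPasses}"); // 1000 2000 1000
```

With a single loop the array buys nothing; it only pays off once the same results are enumerated more than once.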

Upvotes: 0

p.s.w.g

Reputation: 149010

In both cases, it only runs once.

In the first example (if this is a LINQ to Objects query), it runs just long enough to get the next x on each iteration. In the second example, it has to evaluate the entire result set at once and store it in an array.

Suppose this is an expensive query that takes 1 second to get each item, and there are 20 items in the list. Both versions will take about 20 seconds to process all items. However, the first one will be blocked for 1 second on each iteration while it gets the next item, whereas the second will be blocked for 20 seconds before the start of the loop and will then loop through all the items in the array fairly quickly.
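The difference in when the waiting happens can be made visible by logging the order of events. This is only a sketch for LINQ to Objects; the three-item `Enumerable.Range` source and the `log` list are hypothetical stand-ins for the expensive query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var log = new List<string>();

// Hypothetical stand-in for an expensive query: logs each item as it is produced.
var query = Enumerable.Range(1, 3).Select(n => { log.Add($"produce {n}"); return n; });

// Streaming: each item is produced just before the loop body consumes it.
foreach (var x in query) log.Add($"consume {x}");
string streaming = string.Join(", ", log);

log.Clear();

// Buffered: ToArray() produces every item before the loop body runs at all.
foreach (var x in query.ToArray()) log.Add($"consume {x}");
string buffered = string.Join(", ", log);

Console.WriteLine(streaming); // produce 1, consume 1, produce 2, consume 2, produce 3, consume 3
Console.WriteLine(buffered);  // produce 1, produce 2, produce 3, consume 1, consume 2, consume 3
```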

Neither is more efficient when it comes to actually evaluating the query. In general, however, you should avoid unnecessary calls to ToArray or ToList, since in addition to evaluating the query, they must allocate an array for the results (List<T> stores its items in an internal array). For a list of 20 items, this doesn't mean much, but when you have several thousand items, it can cause a noticeable slowdown. Of course, this doesn't mean that ToArray is always bad. If you had 5 foreach loops in the previous example, storing the results in an array and looping through the array rather than re-evaluating the query each time would actually speed up the code by about 80 seconds.

Upvotes: 3

Sergey Berezovskiy

Reputation: 236208

It will be executed once, just before the loop, when the GetEnumerator() method is called on the query variable. Here is what the foreach loop looks like:

var enumerator = resultsOfInterest.GetEnumerator(); // query executed here

while(enumerator.MoveNext()) // iterating over results of query execution
{
   var x = enumerator.Current;
   // do something with x
}

The second sample will not be more efficient. It simply stores the results of the query execution in an array and then uses the array's iterator:

var enumerator = resultsOfInterest.ToArray().GetEnumerator();
// loop stays same
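To see how often the underlying source is actually run, one can count it. The sketch below assumes LINQ to Objects; `Source` and `runs` are hypothetical names introduced for illustration. With an iterator-based source like this one, the work starts lazily on the first MoveNext() call rather than inside GetEnumerator() itself; other LINQ providers may issue their query at a different point.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int runs = 0;

// Hypothetical source: the iterator body starts once per enumeration.
IEnumerable<int> Source()
{
    runs++; // executed on the first MoveNext() of each enumeration
    for (int i = 1; i <= 3; i++) yield return i;
}

var resultsOfInterest = from r in Source() select r;
int afterDefinition = runs; // defining the query does not run it

int afterGetEnumerator, afterLoop;
var seen = new List<int>();

// Hand-written equivalent of the foreach loop.
using (var enumerator = resultsOfInterest.GetEnumerator())
{
    afterGetEnumerator = runs; // nothing has been pulled from the source yet
    while (enumerator.MoveNext())
    {
        seen.Add(enumerator.Current);
    }
    afterLoop = runs; // the source was run exactly once for the whole loop
}

Console.WriteLine($"{afterDefinition} {afterGetEnumerator} {afterLoop}"); // 0 0 1
```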

Upvotes: 5
