retide
retide

Reputation: 1190

C# LINQ First() faster than ToArray()[0]?

I am running a test.

It looks like:

method 1)

List<int> = new List<int>{1,2,4, .....} //assume 1000k
var  result ErrorCodes.Where(x => ReturnedErrorCodes.Contains(x)).First();

method 2)

List<int> = new List<int>{1,2,4, .....} //assume 1000k
var  result = ErrorCodes.Where(x => ReturnedErrorCodes.Contains(x)).ToArray()[0];

Why is method 2 is so slow compared to method 1?

Upvotes: 19

Views: 3296

Answers (10)

Eric Lippert
Eric Lippert

Reputation: 660573

You have a jar containing a thousand coins, many of which are dimes. You want a dime. Here are two methods for solving your problem:

  1. Pull coins out of the jar, one at a time, until you get a dime. Now you've got a dime.

  2. Pull coins out of the jar, one at a time, putting the dimes in another jar. If that jar turns out to be too small, move them, one at a time, to a larger jar. Keep on doing that until you have all of the dimes in the final jar. That jar is probably too big. Manufacture a jar that is exactly big enough to hold that many dimes, and then move the dimes, one at a time, to the new jar. Now start taking dimes out of that jar. Take out the first one. Now you've got a dime.

Is it now clear why method 1 is a whole lot faster than method 2?

Upvotes: 62

Guffa
Guffa

Reputation: 700910

Because of deferred execution.

The code ErrorCodes.Where(x => ReturnedErrorCodes.Contains(x)) doesn't return a collection of integers, instead it returns an expression that is capable of returning a stream of integers. It doesn't do any actual work until you start reading integers from it.

The ToArray method will consume the entire stream and put all the integers in an array. This means that every item in the entire list has to be compared to the error codes.

The First method on the other hand will only get the first item from the stream, and then stop reading from the stream. This will make it a lot faster, because it will stop comparing items from the list to the error codes as soon as it finds a match.

Upvotes: 4

benwasd
benwasd

Reputation: 1352

var @e = array.GetEnumerator();

// First
@e.MoveNext();
return @e.Current;

// ToArray (with yield [0] should as fast as First...)
while (@e.MoveNext() {
    yield return @e.Current;
}

Upvotes: 1

Maxim
Maxim

Reputation: 7348

First() is complexity of O(1)

ToArray()[0] is complexity O(n)+1

Upvotes: 1

svick
svick

Reputation: 245076

ToArray() walks through the whole sequence it has been given and creates and array out of it.

If you don't callt ToArray(), First() lets Where() return just the first item that matches and immediatelly returns.

Upvotes: 1

Aurojit Panda
Aurojit Panda

Reputation: 929

This makes sense, ToArray probably involves a copy, which is always going to be more expensive, since Linq can't make any guarantees about how you're going to use your array, while First() can just return the single element at the beginning.

Upvotes: 0

Andrew Cooper
Andrew Cooper

Reputation: 32596

Because ToArray() copies the entire sequence to an array.

Method 2 has to iterate over the whole sequence to build an array, and then returns the first element.

Method 1 just iterates over enough of the sequence to find the first matching element.

Upvotes: 1

Marc Gravell
Marc Gravell

Reputation: 1064294

Erm... because you are creating an extra array (rather than just using the iterator). The first approach stops after the first match (Where is a non-buffered streaming API). The second loads all the matches into an array (presumably with several re-sizes), then takes the first item.

As a side note; you can create infinite sequences; the first approach would still work, the second would run forever (or explode).

It could also be:

var  result ErrorCodes.First(x => ReturnedErrorCodes.Contains(x));

(that won't make it any faster, but is perhaps easier to read)

Upvotes: 46

Yuck
Yuck

Reputation: 50865

In method 2 the entire array must be converted to an array first. Also, it seems awkward to mix array access when First() is so much more readable.

Upvotes: 0

Kyle Trauberman
Kyle Trauberman

Reputation: 25694

Because in the second example, you are actually converting the IEnumerable<T> to an array, whereas in the first example, no conversion is taking place.

Upvotes: 0

Related Questions