Luke Baulch

Reputation: 3656

What is the best way to check and retrieve the first item of a collection?

I understand this is somewhat trivial but...

What is the best way to get a reference to the first item of a collection, if any exists? Assume the collection contains items of a reference type.

Code Sample 1:

if (collection.Any())
{
    var firstItem = collection.First();
    // add logic here
}

The sample above makes two separate calls on the collection, each starting an iteration that completes as soon as the first item is found.

Code Sample 2:

var firstItem = collection.FirstOrDefault();
if (firstItem != null)
{
    // add logic here
}

The sample above makes only a single call on the collection, but it introduces a variable with an unnecessarily wide scope.

Is there a best practice for this scenario? Is there a better solution?

Upvotes: 8

Views: 4228

Answers (7)

Luke Hutton

Reputation: 10722

I just ran a simple test on a nullable primitive type (int?), and it looks like your code sample #2 is the fastest in this case (updated):

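using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Diagnostics;
using System.Linq;
using NUnit.Framework;

[TestFixture] public class SandboxTesting {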
  #region Setup/Teardown
  [SetUp] public void SetUp() {
    _iterations = 10000000;
  }
  [TearDown] public void TearDown() {}
  #endregion
  private int _iterations;
  private void SetCollectionSize(int size) {
    _collection = new Collection<int?>();
    for(int i = 0; i < size; i++)
      _collection.Add(i);
  }
  private Collection<int?> _collection;
  private void AnyFirst() {
    if(_collection.Any()) {
      int? firstItem = _collection.First();
      var x = firstItem;
    }
  }
  private void NullCheck() {
    int? firstItem = _collection.FirstOrDefault();
    if (firstItem != null) {
      var x = firstItem;
    }
  }
  private void ForLoop() {
    foreach(int? firstItem in _collection) {
      var x = firstItem;
      break;
    }
  }
  private void TryGetFirst() {
    int? firstItem;
    if (_collection.TryGetFirst(out firstItem)) {
      var x = firstItem;
    }
  }    
  private TimeSpan AverageTimeMethodExecutes(Action func) {
    // clean up
    GC.Collect();
    GC.WaitForPendingFinalizers();
    GC.Collect();

    // warm up 
    func();

    var watch = Stopwatch.StartNew();
    for (int i = 0; i < _iterations; i++) {
      func();
    }
    watch.Stop();
    // Elapsed.Ticks is in TimeSpan units (100 ns); ElapsedTicks depends on Stopwatch.Frequency.
    return new TimeSpan(watch.Elapsed.Ticks / _iterations);
  }
  [Test] public void TimeAnyFirstWithEmptySet() {      
    SetCollectionSize(0);

    TimeSpan avgTime = AverageTimeMethodExecutes(AnyFirst);

    Console.WriteLine("Took an avg of {0} secs on empty set", avgTime);     
  }
  [Test] public void TimeAnyFirstWithLotsOfData() {
    SetCollectionSize(1000000);

    TimeSpan avgTime = AverageTimeMethodExecutes(AnyFirst);

    Console.WriteLine("Took an avg of {0} secs on non-empty set", avgTime);      
  }
  [Test] public void TimeForLoopWithEmptySet() {
    SetCollectionSize(0);

    TimeSpan avgTime = AverageTimeMethodExecutes(ForLoop);

    Console.WriteLine("Took an avg of {0} secs on empty set", avgTime);
  }
  [Test] public void TimeForLoopWithLotsOfData() {
    SetCollectionSize(1000000);

    TimeSpan avgTime = AverageTimeMethodExecutes(ForLoop);

    Console.WriteLine("Took an avg of {0} secs on non-empty set", avgTime);
  }
  [Test] public void TimeNullCheckWithEmptySet() {
    SetCollectionSize(0);

    TimeSpan avgTime = AverageTimeMethodExecutes(NullCheck);

    Console.WriteLine("Took an avg of {0} secs on empty set", avgTime);
  }
  [Test] public void TimeNullCheckWithLotsOfData() {
    SetCollectionSize(1000000);

    TimeSpan avgTime = AverageTimeMethodExecutes(NullCheck);

    Console.WriteLine("Took an avg of {0} secs on non-empty set", avgTime);
  }
  [Test] public void TimeTryGetFirstWithEmptySet() {
    SetCollectionSize(0);

    TimeSpan avgTime = AverageTimeMethodExecutes(TryGetFirst);

    Console.WriteLine("Took an avg of {0} secs on empty set", avgTime);
  }
  [Test] public void TimeTryGetFirstWithLotsOfData() {
    SetCollectionSize(1000000);

    TimeSpan avgTime = AverageTimeMethodExecutes(TryGetFirst);

    Console.WriteLine("Took an avg of {0} secs on non-empty set", avgTime);
  }
}
public static class Extensions {
  public static bool TryGetFirst<T>(this IEnumerable<T> seq, out T value) {
    foreach(T elem in seq) {
      value = elem;
      return true;
    }
    value = default(T);
    return false;
  }
}

AnyFirst
NonEmpty: 00:00:00.0000262 seconds
EmptySet: 00:00:00.0000174 seconds

ForLoop
NonEmpty: 00:00:00.0000158 seconds
EmptySet: 00:00:00.0000151 seconds

NullCheck
NonEmpty: 00:00:00.0000088 seconds
EmptySet: 00:00:00.0000064 seconds

TryGetFirst
NonEmpty: 00:00:00.0000177 seconds
EmptySet: 00:00:00.0000172 seconds

Upvotes: 0

Zooba

Reputation: 11448

The second sample doesn't work on non-nullable value types (edit: as you assumed; I missed that the first time), and the only direct alternative is the first sample, which has a race condition: the collection could change between the Any() and First() calls. Two other approaches are both suitable; which one to choose depends on how frequently you expect an empty sequence.

If an empty sequence is a common or expected case, a foreach loop is relatively neat:

foreach (var firstItem in collection)
{
    // add logic here
    break;
}

or if you really don't want the break in there (which is understandable):

foreach (var firstItem in collection.Take(1))
{
    // add logic here
}

If it is relatively unusual for it to be empty then a try/catch block should give the best performance (since exceptions are only expensive if they are actually raised - an unraised exception is practically free):

try
{
    var firstItem = collection.First();
    // add logic here
}
catch (InvalidOperationException) { }

A third option is to use an enumerator directly, though this should be identical to the foreach version and is slightly less clear:

using (var e = collection.GetEnumerator())
{
    if (e.MoveNext())
    {
        var firstItem = e.Current;
        // add logic here
    }
}
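
To illustrate the value-type limitation mentioned at the top of this answer, here is a minimal sketch. The class and method names are hypothetical, and the projection to int? is just one possible workaround, not something proposed in the question:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
public static class ValueTypeFirstDemo
{
    // For a non-nullable value type, FirstOrDefault returns default(int) == 0
    // both for an empty sequence and for a sequence that starts with 0.
    public static int FirstOrZero(IEnumerable<int> source)
    {
        return source.FirstOrDefault();
    }

    // Assumed workaround: project to int? so the empty case yields null.
    public static int? FirstOrNull(IEnumerable<int> source)
    {
        return source.Select(x => (int?)x).FirstOrDefault();
    }
}
```

With this sketch, FirstOrZero cannot tell an empty sequence apart from one whose first element is 0, while FirstOrNull returns null only for the empty sequence.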

Upvotes: 2

Dennis Smit

Reputation: 1178

Or, as an extension to the solution from Gabe, make it use a lambda so you can drop the if:

public static class EnumerableExtensions
{
    public static bool TryGetFirst<T>(this IEnumerable<T> seq, Action<T> action)
    {
        foreach (T elem in seq)
        {
            if (action != null)
            {
                action(elem);
            }

            return true;
        }

        return false;
    }
}

And use it like:

     List<int> ints = new List<int> { 1, 2, 3, 4, 5 };

     ints.TryGetFirst<int>(x => Console.WriteLine(x));

Upvotes: 1

Ian Dallas

Reputation: 12741

Since all generic collections (i.e., those in the System.Collections.ObjectModel namespace) have the Count member, my preferred way of doing this is as follows:

Item item = null;
if(collection.Count > 0)
{
    item = collection[0];
}

This is safe since all such collections have both the Count and Item properties. It's also very straightforward, and easy for any other programmer reading your code to understand your intent.
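
As a rough sketch of the same idea in reusable form, assuming the collection implements IList<T> (the helper name TryGetFirstByIndex is hypothetical), note that this only applies to indexable collections, not to an arbitrary IEnumerable<T>:

```csharp
using System.Collections.Generic;

// Hypothetical helper; not part of the .NET class library.
public static class ListExtensions
{
    // Returns true and the first element when the list is non-empty;
    // otherwise returns false and sets value to default(T).
    public static bool TryGetFirstByIndex<T>(this IList<T> list, out T value)
    {
        if (list != null && list.Count > 0)
        {
            value = list[0];   // uses the Item (indexer) property
            return true;
        }
        value = default(T);
        return false;
    }
}
```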

Upvotes: 0

Gabe

Reputation: 86768

You could create an extension method like this:

public static bool TryGetFirst<T>(this IEnumerable<T> seq, out T value)
{
    foreach (T elem in seq)
    {
        value = elem;
        return true;
    }
    value = default(T);
    return false;
}

Then you would use it like this:

int firstItem;
if (collection.TryGetFirst(out firstItem))
{
    // do something here
}

Upvotes: 3

JaredPar

Reputation: 755171

I prefer the second example because it's more efficient in the general case. It's possible that this collection is a combination of many deferred (lazily evaluated) LINQ queries, such that even getting the first element requires a non-trivial amount of work.

Imagine, for example, that this collection is built from the following LINQ query:

var collection = originalList.OrderBy(someComparingFunc);

Getting just the first element out of collection requires a full sort of the contents of originalList. This full sort will occur each time the elements of collection are evaluated.

The first sample causes the potentially expensive collection to be evaluated twice: once by the Any method and once by the First method. The second sample evaluates the collection only once, and hence I would choose it over the first.
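
The double evaluation can be made visible with a small sketch. The iterator below is a hypothetical stand-in for an expensive deferred query; it simply counts how many times enumeration starts:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
public static class EvaluationCountDemo
{
    // Counts how many times enumeration of the lazy sequence has started.
    private static int _starts;

    // Stand-in for an expensive deferred query: the body runs again
    // from the top for every new enumeration.
    private static IEnumerable<string> ExpensiveSequence()
    {
        _starts++;
        yield return "first";
        yield return "second";
    }

    // Code Sample 1 pattern: Any() and First() each start an enumeration.
    public static int StartsForAnyThenFirst()
    {
        _starts = 0;
        var collection = ExpensiveSequence();
        if (collection.Any())
        {
            var firstItem = collection.First();
        }
        return _starts;
    }

    // Code Sample 2 pattern: FirstOrDefault() starts a single enumeration.
    public static int StartsForFirstOrDefault()
    {
        _starts = 0;
        var collection = ExpensiveSequence();
        var firstItem = collection.FirstOrDefault();
        if (firstItem != null)
        {
            // add logic here
        }
        return _starts;
    }
}
```

Here StartsForAnyThenFirst() returns 2 and StartsForFirstOrDefault() returns 1, which matches the reasoning above: with a real query such as an OrderBy, each of those starts would repeat the expensive work.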

Upvotes: 6

linepogl

Reputation: 9355

Sometimes I use this pattern:

foreach (var firstItem in collection) {
    // add logic here
    break;
}

It initiates only one iteration (so it's better than Code Sample 1), and the scope of the variable firstItem is limited to the body of the loop (so it's better than Code Sample 2).

Upvotes: 1
