Matěj Pokorný
Matěj Pokorný

Reputation: 17845

Calling method with IEnumerable<T> sequence as argument, if that sequence is not empty

I have method Foo, which do some CPU intensive computations and returns IEnumerable<T> sequence. I need to check, if that sequence is empty. And if not, call method Bar with that sequence as argument.

I thought about three approaches...

Condition

var source = Foo();

if (!IsEmpty(ref source))
    Bar(source);

with IsEmpty implemented as

bool IsEmpty<T>(ref IEnumerable<T> source)
{
    var enumerator = source.GetEnumerator();

    if (enumerator.MoveNext())
    {
        source = CreateIEnumerable(enumerator);
        return false;
    }

    return true;

    IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
    {
        yield return usedEnumerator.Current;

        while (usedEnumerator.MoveNext())
        {
            yield return usedEnumerator.Current;
        }
    }
}

Also note, that calling Bar with empty sequence is not option...

EDIT: After some consideration, best answer for my case is from Olivier Jacot-Descombes - avoid that scenario completely. Accepted solution answers this question - if it is really no other way.

Upvotes: 4

Views: 587

Answers (5)

dbc
dbc

Reputation: 116670

You would like to call some function Bar<T>(IEnumerable<T> source) if and only if the enumerable source contains at least one element, but you're running into two problems:

  • There is no method T Peek() in IEnumerable<T> so you would need to actually begin to evaluate the enumerable to see if it's nonempty, but...

  • You don't want to even partially double-evaluate the enumerable since setting up the enumerable might be expensive.

In that case your approach looks reasonable. You do, however, have some issues with your imlementation:

  1. You need to dispose enumerator after using it.

  2. As pointed out by Ivan Stoev in comments, if the Bar() method attempts to evaluate the IEnumerable<T> more than once (e.g. by calling Any() then foreach (...)) then the results will be undefined because usedEnumerator will have been exhausted by the first enumeration.

To resolve these issues, I'd suggest modifying your API a little and create an extension method IfNonEmpty<T>(this IEnumerable<T> source, Action<IEnumerable<T>> func) that calls a specified method only if the sequence is nonempty, as shown below:

public static partial class EnumerableExtensions
{
    public static bool IfNonEmpty<T>(this IEnumerable<T> source, Action<IEnumerable<T>> func)
    {
        if (source == null|| func == null)
            throw new ArgumentNullException();
        using (var enumerator = source.GetEnumerator())
        {
            if (!enumerator.MoveNext())
                return false;
            func(new UsedEnumerator<T>(enumerator));
            return true;
        }
    }

    class UsedEnumerator<T> : IEnumerable<T>
    {
        IEnumerator<T> usedEnumerator;

        public UsedEnumerator(IEnumerator<T> usedEnumerator)
        {
            if (usedEnumerator == null)
                throw new ArgumentNullException();
            this.usedEnumerator = usedEnumerator;
        }

        public IEnumerator<T> GetEnumerator()
        {
            var localEnumerator = System.Threading.Interlocked.Exchange(ref usedEnumerator, null);
            if (localEnumerator == null)
                // An attempt has been made to enumerate usedEnumerator more than once; 
                // throw an exception since this is not allowed.
                throw new InvalidOperationException();
            yield return localEnumerator.Current;
            while (localEnumerator.MoveNext())
            {
                yield return localEnumerator.Current;
            }
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }
}

Demo fiddle with unit tests here.

Upvotes: 1

InBetween
InBetween

Reputation: 32750

The accepted answer is probably the best approach but, based on, and I quote:

Convert sequence to list, check if that list it empty... and pass it to Bar. This have also limitation. Bar will need only first x items, so Foo will be doing unnecessary work...

Another take would be creating an IEnumerable<T> that partially caches the underlying enumeration. Something along the following lines:

interface IDisposableEnumerable<T>
    :IEnumerable<T>, IDisposable
{
}

static class PartiallyCachedEnumerable
{
    public static IDisposableEnumerable<T> Create<T>(
        IEnumerable<T> source, 
        int cachedCount)
    {
        if (source == null)
            throw new NullReferenceException(
                nameof(source));

        if (cachedCount < 1)
            throw new ArgumentOutOfRangeException(
                nameof(cachedCount));

        return new partiallyCachedEnumerable<T>(
            source, cachedCount);
    }

    private class partiallyCachedEnumerable<T>
        : IDisposableEnumerable<T>
    {
        private readonly IEnumerator<T> enumerator;
        private bool disposed;
        private readonly List<T> cache;
        private readonly bool hasMoreItems;

        public partiallyCachedEnumerable(
            IEnumerable<T> source, 
            int cachedCount)
        {
            Debug.Assert(source != null);
            Debug.Assert(cachedCount > 0);
            enumerator = source.GetEnumerator();
            cache = new List<T>(cachedCount);
            var count = 0;

            while (enumerator.MoveNext() && 
                   count < cachedCount)
            {
                cache.Add(enumerator.Current);
                count += 1;
            }

            hasMoreItems = !(count < cachedCount);
        }

        public void Dispose()
        {
            if (disposed)
                return;

            enumerator.Dispose();
            disposed = true;
        }

        public IEnumerator<T> GetEnumerator()
        {
            foreach (var t in cache)
                yield return t;

            if (disposed)
                yield break;

            while (enumerator.MoveNext())
            {
                yield return enumerator.Current;
                cache.Add(enumerator.Current)
            }

            Dispose();
        }

        IEnumerator IEnumerable.GetEnumerator()
            => GetEnumerator();
    }
}

Upvotes: 0

Theraot
Theraot

Reputation: 40170

One improvement for your IsEmpty would be to check if source is ICollection<T>, and if it is, check .Count (also, dispose the enumerator):

bool IsEmpty<T>(ref IEnumerable<T> source)
{
    if (source is ICollection<T> collection)
    {
        return collection.Count == 0;
    }
    var enumerator = source.GetEnumerator();
    if (enumerator.MoveNext())
    {
        source = CreateIEnumerable(enumerator);
        return false;
    }
    enumerator.Dispose();
    return true;
    IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
    {
        yield return usedEnumerator.Current;
        while (usedEnumerator.MoveNext())
        {
            yield return usedEnumerator.Current;
        }
        usedEnumerator.Dispose();
    }
}

This will work for arrays and lists.

I would, however, rework IsEmpty to return:

IEnumerable<T> NotEmpty<T>(IEnumerable<T> source)
{
    if (source is ICollection<T> collection)
    {
        if (collection.Count == 0)
        {
           return null;
        }
        return source;
    }
    var enumerator = source.GetEnumerator();
    if (enumerator.MoveNext())
    {
        return CreateIEnumerable(enumerator);
    }
    enumerator.Dispose();
    return null;
    IEnumerable<T> CreateIEnumerable(IEnumerator<T> usedEnumerator)
    {
        yield return usedEnumerator.Current;
        while (usedEnumerator.MoveNext())
        {
            yield return usedEnumerator.Current;
        }
        usedEnumerator.Dispose();
    }
}

Now, you would check if it returned null.

Upvotes: 0

Olivier Jacot-Descombes
Olivier Jacot-Descombes

Reputation: 112324

I don't know whether your algorithm in Foo allows to determine if the enumeration will be empty without doing the calculations. But if this is the case, return null if the sequence would be empty:

public IEnumerable<T> Foo()
{
    if (<check if sequence will be empty>) {
        return null;
    }
    return GetSequence();
}

private IEnumerable<T> GetSequence()
{
    ...
    yield return item;
    ...
}

Note that if a method uses yield return, it cannot use a simple return to return null. Therefore a second method is needed.

var sequence = Foo();
if (sequence != null) {
    Bar(sequence);
}

After reading one of your comments

Foo need to initialize some resources, parse XML file and fill some HashSets, which will be used to filter (yield) returned data.

I suggest another approach. The time consuming part seems to be the initialization. To be able to separate it from the iteration, create a foo calculator class. Something like:

public class FooCalculator<T>
{
     private bool _isInitialized;
     private string _file;

     public FooCalculator(string file)
     {
         _file = file;
     }

     private EnsureInitialized()
     {
         if (_isInitialized) return;

         // Parse XML.
         // Fill some HashSets.

         _isInitialized = true;
     }

     public IEnumerable<T> Result
     {
         get {
             EnsureInitialized();
             ...
             yield return ...;
             ...
         }
     }
}

This ensures that the costly initialization stuff is executed only once. Now you can safely use Any().

Other optimizations are conceivable. The Result property could remember the position of the first returned element, so that if it is called again, it could skip to it immediately.

Upvotes: 2

Wanton
Wanton

Reputation: 840

If you can change Bar then how about change it to TryBar that returns false when IEnumerable<T> was empty?

bool TryBar(IEnumerable<Foo> source)
{
  var count = 0;
  foreach (var x in source)
  {
    count++;
  }
  return count > 0;
}

If that doesn't work for you could always create your own IEnumerable<T> wrapper that caches values after they have been iterated once.

Upvotes: 0

Related Questions