bradgonesurfing
bradgonesurfing

Reputation: 32192

Optimizing an IEnumerable extension MaxBy

I have an IEnumerable extension MaxBy to be used like

var longest = new [] {"cat", "dogs", "nit" }.MaxBy(x=>x.Count())

should be

"dogs"

with implementation*emphasized text*

public static T MaxBy<T,M>(this IEnumerable<T> This, Func<T,M> selector)
    where M : IComparable
{
    return This
        .Skip(1)
        .Aggregate
        ( new {t=This.First(), m = selector(This.First())}
        , (a, t) =>
            {
                var m = selector(t);
                if ( m.CompareTo(a.m) > 0)
                {
                    return new { t, m };
                }
                else
                {
                    return a;
                }
            }
        , a => a.t);
}

It's quite elegant and purely functional but I see a problem. I'm using anonymous objects which are reference types and require garbage collection. In the worst case when travesing an IEnumerable of length N I will make N memory allocations and N objects will need garbage collection.

I could write the code to use an external mutable accumulator but aesthetically I'd prefer to stick with the pattern I have.

However are my concerns in reality a problem? Does the .Net generational garbage collector identify that these objects are very short lived, only one at a time and optimize away what is happening? Or is it better that I create a custom value type ( struct ) to hold my accumulator instead of using an anonymous object.

** EDIT **

This is obviously the non functional way to do it.

public static T MaxBy<T,M>(this IEnumerable<T> This, Func<T,M> selector)
    where M : IComparable
{
    var t = This.First();
    var max = selector(t);
    foreach (var item in This.Skip(1))
    {
        var m = selector(item);
        if ( m.CompareTo(max) > 0)
        {
            max = m;
            t = item;
        }

    }
    return t;
}

Upvotes: 1

Views: 791

Answers (1)

Jeppe Stig Nielsen
Jeppe Stig Nielsen

Reputation: 62002

This one is better:

public static T MaxBy<T, M>(this IEnumerable<T> source, Func<T, M> selector)
  where M : IComparable
{
  return source.Aggregate((record, next) =>
     Comparer<M>.Default.Compare(selector(next), selector(record)) > 0
     ? next
     : record);
}
  • You do not have as much code (smarter use of .Aggregate<>)
  • You avoid the anonymous type
  • With Comparer<M>.Default you avoid boxing in the very, very common scenario where your M is a value type that is actually IComparable<M> (and not just IComparble). In your example where M was int, that applies! And with both value types and reference types you can bypass a type check if M really implements the generic IComparable<M>. But everything will still work if M is a "poor" type with only non-generic IComparable (the contraint where M : IComparable).

The behaviour with InvalidOperationException when the source is empty, is unchanged.

Upvotes: 2

Related Questions