johnny 5

Reputation: 21005

Why doesn't IOrderedEnumerable retain order after Where filtering?

I've created a simplification of the issue. I have an ordered IEnumerable, and I'm wondering why applying a Where filter could unorder the objects.

This does not compile, although it seems it should be able to:

IOrderedEnumerable<int> tmp = new List<int>().OrderBy(x => x);
// Error: Cannot implicitly convert IEnumerable<int> to IOrderedEnumerable<int>
tmp = tmp.Where(x => x > 1);

I understand that there would be no guaranteed execution order when coming from an IQueryable, such as when using LINQ with some DB provider.

However, when dealing with LINQ to Objects, what scenario could occur that would unorder your objects, or why wasn't this implemented?

EDIT

I understand how to order this properly; that is not the question. My question is more of a design question. A Where filter on LINQ to Objects should enumerate the given enumerable and apply filtering. So why is it that we can only return an IEnumerable instead of an IOrderedEnumerable?

EDIT

To clarify the scenario in which this would be useful: I'm building queries based on conditions in my code, and I want to reuse as much code as possible. I have a function that returns an IOrderedEnumerable; however, after applying the additional Where I would have to reorder the result, even though it would still be in its original ordered state.

Upvotes: 7

Views: 3987

Answers (4)

Ivan Stoev

Reputation: 205739

why wasn't this implemented?

Most likely because the LINQ designers decided that the effort to implement, test, document etc. wasn't worth it compared to the potential use cases. In fact, you are the first one I've heard complaining about that.

But if it's so important to you, you can add that missing functionality yourself (similar to Jon Skeet's MoreLINQ extension library). For instance, something like this:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace MyLinq
{
    public static class Extensions
    {
        public static IOrderedEnumerable<T> Where<T>(this IOrderedEnumerable<T> source, Func<T, bool> predicate)
        {
            return new WhereOrderedEnumerable<T>(source, predicate);
        }

        class WhereOrderedEnumerable<T> : IOrderedEnumerable<T>
        {
            readonly IOrderedEnumerable<T> source;
            readonly Func<T, bool> predicate;
            public WhereOrderedEnumerable(IOrderedEnumerable<T> source, Func<T, bool> predicate)
            {
                if (source == null) throw new ArgumentNullException(nameof(source));
                if (predicate == null) throw new ArgumentNullException(nameof(predicate));
                this.source = source;
                this.predicate = predicate;
            }
            public IOrderedEnumerable<T> CreateOrderedEnumerable<TKey>(Func<T, TKey> keySelector, IComparer<TKey> comparer, bool descending) =>
                new WhereOrderedEnumerable<T>(source.CreateOrderedEnumerable(keySelector, comparer, descending), predicate);
            public IEnumerator<T> GetEnumerator() => Enumerable.Where(source, predicate).GetEnumerator();
            IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
        }
    }
}

And putting it into action:

using System;
using System.Collections.Generic;
using System.Linq;
using MyLinq;

var test = Enumerable.Range(0, 100)
    .Select(n => new { Foo = 1 + (n / 20), Bar = 1 + n })
    .OrderByDescending(e => e.Foo)
    .Where(e => (e.Bar % 2) == 0)
    .ThenByDescending(e => e.Bar) // Note this compiles:)
    .ToList();

Upvotes: 1

Joel Coehoorn

Reputation: 416049

The tmp variable's type is IOrderedEnumerable.

Where() is a function just like any other with a return type, and that return type is IEnumerable. IEnumerable and IOrderedEnumerable are not the same.

So when you do this:

tmp = tmp.Where(x => x > 1);

You are trying to assign the result of a Where() function call, which is an IEnumerable, to the tmp variable, which is an IOrderedEnumerable. They are not directly compatible and there is no implicit conversion, so the compiler gives you an error.

The problem is you are being too specific with the tmp variable's type. You can make one simple change that will make this all work, just by being a little less specific with your tmp variable:

IEnumerable<int> tmp = new List<int>().OrderBy(x => x);
tmp = tmp.Where(x => x > 1);

Because IOrderedEnumerable inherits from IEnumerable, this code will all work. As long as you don't want to call ThenBy() later on, this should give you exactly the same results as you expect without any other loss of ability to use the tmp variable later.

If you really need an IOrderedEnumerable, you can always just call .OrderBy(x => x) again:

IOrderedEnumerable<int> tmp = new List<int>().OrderBy(x => x);
tmp = tmp.Where(x => x > 1).OrderBy(x => x);

And again, in most cases (not all, but most) you want to get your filtering out of the way before you start sorting. In other words, this is even better:

var tmp = new List<int>().Where(x => x > 1).OrderBy(x => x);

Upvotes: 3

René Vogt

Reputation: 43896

The signature of Where() is this:

public static IEnumerable<TSource> Where<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)

So this method takes an IEnumerable<int> as first argument. The IOrderedEnumerable<int> returned from OrderBy implements IEnumerable<int> so this is no problem.

But as you can see, Where returns an IEnumerable<int> and not an IOrderedEnumerable<int>, and the former cannot be implicitly converted to the latter.

Anyway, the objects in that sequence will still have the same order. So you could just do it like this:

IEnumerable<int> tmp = new List<int>().OrderBy(x => x).Where(x => x > 1);

and get the sequence you expected.
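To see the point about ordering concretely, here is a minimal sketch. The sample numbers are made up for illustration; the only thing it relies on is that Enumerable.Where yields matching elements in the order it receives them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data, chosen only for illustration.
var numbers = new List<int> { 5, 1, 4, 2, 3 };

// Sort first, then filter. Where passes elements through in the order
// it receives them, so the filtered result is still sorted.
IEnumerable<int> filtered = numbers.OrderBy(x => x).Where(x => x > 1);

Console.WriteLine(string.Join(", ", filtered)); // prints "2, 3, 4, 5"
```

Only the static type changes from IOrderedEnumerable<int> to IEnumerable<int>; the element order does not.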

But of course you should (for performance reasons) filter your objects first and sort them afterwards when there are fewer objects to sort:

IOrderedEnumerable<int> tmp = new List<int>().Where(x => x > 1).OrderBy(x => x);
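A rough way to observe the "fewer objects to sort" point is to count how often the key selector runs in each variant. This is only a sketch with made-up data, and the exact call counts are an implementation detail of LINQ to Objects rather than a documented guarantee.

```csharp
using System;
using System.Linq;

// Hypothetical input: the numbers 1 through 10.
var numbers = Enumerable.Range(1, 10).ToList();

int keyCallsSortFirst = 0;
int keyCallsFilterFirst = 0;

// Sort first: the key selector runs for every element, including the
// ones the filter throws away afterwards.
var sortFirst = numbers
    .OrderBy(x => { keyCallsSortFirst++; return x; })
    .Where(x => x > 5)
    .ToList();

// Filter first: only the surviving elements ever reach OrderBy.
var filterFirst = numbers
    .Where(x => x > 5)
    .OrderBy(x => { keyCallsFilterFirst++; return x; })
    .ToList();

Console.WriteLine($"sort first: {keyCallsSortFirst} key calls, filter first: {keyCallsFilterFirst} key calls");
```

Both variants produce the same sequence here; the difference is how many elements take part in the sort.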

Upvotes: 4

Eric Lippert

Reputation: 660327

René's answer is correct, but could use some additional explanation.

IOrderedEnumerable<T> does not mean "this is a sequence that is ordered". It means "this is a sequence that has had an ordering operation applied to it and you may now follow that up with a ThenBy to impose additional ordering requirements."

The result of Where does not allow you to follow it up with ThenBy, and therefore you may not use it in a context where an IOrderedEnumerable<T> is required.
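A small sketch of that distinction, using hypothetical (Name, Age) tuples that are not part of the question: ThenBy is available on the result of OrderBy, but not on the result of Where.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: (Name, Age) pairs.
var people = new[]
{
    (Name: "Bob", Age: 30),
    (Name: "Alice", Age: 30),
    (Name: "Carol", Age: 25),
};

// OrderBy returns IOrderedEnumerable<T>, so a ThenBy may follow it.
var sorted = people.OrderBy(p => p.Age).ThenBy(p => p.Name);

// Where returns a plain IEnumerable<T>, so ThenBy is no longer offered.
IEnumerable<(string Name, int Age)> adults =
    people.OrderBy(p => p.Age).Where(p => p.Age >= 30);
// adults.ThenBy(p => p.Name); // does not compile: ThenBy requires an IOrderedEnumerable<T>

Console.WriteLine(string.Join(", ", sorted.Select(p => p.Name))); // prints "Carol, Alice, Bob"
```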

Make sense?

But of course, as others have said, you almost always want to do the filtering first and then the ordering. That way you are not spending time putting items into order that you are just going to throw away.

There are of course times when you do have to order and then filter; for example, the query "songs in the top ten that were sung by a woman" and the query "the top ten songs that were sung by a woman" are potentially very different! The first one is sort the songs -> take the top ten -> apply the filter. The second is apply the filter -> sort the songs -> take the top ten.
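The two song queries can be sketched as follows. The chart data is invented for illustration, and it uses a "top two" instead of a top ten to keep the example short.

```csharp
using System;
using System.Linq;

// Hypothetical chart data: rank 1 is the highest-ranked song.
var songs = new[]
{
    (Title: "A", Rank: 1, SungByWoman: false),
    (Title: "B", Rank: 2, SungByWoman: true),
    (Title: "C", Rank: 3, SungByWoman: false),
    (Title: "D", Rank: 4, SungByWoman: true),
    (Title: "E", Rank: 5, SungByWoman: true),
};

// "Songs in the top two that were sung by a woman": sort -> take -> filter.
var inTopTwoByWomen = songs
    .OrderBy(s => s.Rank).Take(2).Where(s => s.SungByWoman)
    .Select(s => s.Title).ToList();

// "The top two songs that were sung by a woman": filter -> sort -> take.
var topTwoByWomen = songs
    .Where(s => s.SungByWoman).OrderBy(s => s.Rank).Take(2)
    .Select(s => s.Title).ToList();

Console.WriteLine(string.Join(", ", inTopTwoByWomen)); // prints "B"
Console.WriteLine(string.Join(", ", topTwoByWomen));   // prints "B, D"
```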

Upvotes: 23
