Greg Gum
Greg Gum

Reputation: 37905

How to Order an out of sequence sequence

Consider this list of elements which has three properties, Value and Sequence Number and Group:

Value    Sequence Number   Group   

Header,       1,           null
Invoice,      2,           1
Invoice Line, 3,           1
Trailer,      4,           null

The goal is only to sort by the sequence number. The value is irrelevant.

In the above, the obvious answer is to order by sequence number.

However, elements can repeat:

Header,     1,   null
InvoiceA,   2,   1
Line Item,  3,   1
InvoiceB,   2,   2
Line Item,  3,   2
Trailer,  4,   null

The above is the desired sequence. What Linq statement will produce the above?

Sorting by Sequence no longer works. Sorting by Group, then Sequence does not work.

The application of this is in EDI where the order of the data is significant.

Upvotes: 2

Views: 101

Answers (3)

adricadar
adricadar

Reputation: 10209

You can achieve this by sorting the elements by Sequence Number (OrderBy(x => x.SequenceNumber)).

After that you can sort elements with exising group number (.Where(x => x.Group != null).OrderBy(x => x.Group))

In the end you have to insert null elements in list at the corresponding index.

var elements = new List<Element>
{
    new Element{SequenceNumber = 1, Group = null}, new Element{SequenceNumber = 4, Group = null},new Element{SequenceNumber = 3, Group = 1},new Element{SequenceNumber = 3, Group = 3}, new Element{SequenceNumber = 3, Group = 2},new Element{SequenceNumber = 2, Group = 3},new Element{SequenceNumber = 2, Group = 1},new Element{SequenceNumber = 2, Group = 2}       
};

// first sort
var sortedElements = elements.OrderBy(x => x.SequenceNumber).ToList();

// save null elements
var elementsWithNull = sortedElements
    .Where(x => x.Group == null).ToList();

// group sorting
sortedElements = sortedElements
    .Where(x => x.Group != null)
    .OrderBy(x => x.Group).ToList();

// insert elements with null in sorted list
foreach (var element in elementsWithNull)
{
    var firstIndexOfSequence = 0;
    for (firstIndexOfSequence = 0;firstIndexOfSequence < sortedElements.Count && sortedElements[firstIndexOfSequence].SequenceNumber >= element.SequenceNumber; firstIndexOfSequence++)
    {
        // just to get index of the element with null group to know where to insert
    }

    sortedElements.Insert(firstIndexOfSequence, element);
}

Upvotes: 1

John Koerner
John Koerner

Reputation: 38077

If it doesn't have to be a linq query, you could write a single comparer that looks like this:

public class ValSeqGroupComparer : IComparer<ValSeqGroup>
{
    public int Compare(ValSeqGroup x, ValSeqGroup y)
    {
        if (x == y) return 0;

        // If only one has a group or there is no group in either
        if (x.Group.HasValue ^ y.Group.HasValue || !x.Group.HasValue)
            return x.Seq.CompareTo(y.Seq);

        if (x.Group.Value != y.Group.Value)
            return x.Group.Value.CompareTo(y.Group.Value);

        return x.Seq.CompareTo(y.Seq);
    }
}

Then using it like this:

[TestMethod]
public void One()
{
    List<ValSeqGroup> items = new List<ValSeqGroup>()
    {
        new ValSeqGroup("x", 1, null),
        new ValSeqGroup("x", 4, null),
        new ValSeqGroup("x", 2, 1),
        new ValSeqGroup("x", 2, 2),
        new ValSeqGroup("x", 3, 1),
        new ValSeqGroup("x", 3, 2)
    };

    items.Sort(new ValSeqGroupComparer());

    foreach (var item in items)
    {
        Console.WriteLine("{0} {1} {2}", item.Value, item.Seq,item.Group);
    }
}

Upvotes: 1

Servy
Servy

Reputation: 203817

So the first "trick" here is that you want all items with a null group to be separate groups, rather than having all null items combined into a single group.

This is actually fairly easy. We can just create an IEqualityComparer that compares items based on some other comparer, but that always considers two null items to be different, instead of being the same (typically two null items would be considered "equal").

public class SeparateNullComparer<T> : IEqualityComparer<T>
{
    private IEqualityComparer<T> comparer;
    public SeparateNullComparer(IEqualityComparer<T> comparer = null)
    {
        this.comparer = comparer ?? EqualityComparer<T>.Default;
    }

    public bool Equals(T x, T y)
    {
        if (x == null || y == null)
            return false;
        return comparer.Equals(x, y);
    }

    public int GetHashCode(T obj)
    {
        return comparer.GetHashCode(obj);
    }
}

We can now group the items using this comparer so that all non-null items will be grouped together, whereas all of the null items will have their own groups.

Now how do we order the groups? We need to order these groups based on their sequence numbers, but we have a sequence of them, not just one, so we need a way of comparing two sequences to see which sequence comes first. We do this by checking the first item in each sequence, and then continually checking the next until one comes first or one ends and the other doesn't:

public class SequenceComparer<T> : IComparer<IEnumerable<T>>
{
    private IComparer<T> comparer;
    public SequenceComparer(IComparer<T> compareer = null)
    {
        this.comparer = comparer ?? Comparer<T>.Default;
    }

    public int Compare(IEnumerable<T> x, IEnumerable<T> y)
    {
        using (var first = x.GetEnumerator())
        using (var second = x.GetEnumerator())
        {
            while (true)
            {
                var firstHasMore = first.MoveNext();
                var secondHasMore = second.MoveNext();
                if (!firstHasMore && !secondHasMore)
                    return 0;
                var lengthComparison = firstHasMore.CompareTo(secondHasMore);
                if (lengthComparison != 0)
                    return lengthComparison;
                var nextComparison = comparer.Compare(first.Current, second.Current);
                if (nextComparison != 0)
                    return nextComparison;
            }
        }
    }
}

Combine that with flattening all of the groups back out when we're done, and we just need to put it all together:

var query = data.GroupBy(item => item.Group, new SeparateNullComparer<int?>())
    .Select(group => group.OrderBy(item => item.SequenceNumber)
        .ToList())
    .OrderBy(group => group, new SequenceComparer<Foo>())
    .ThenBy(group => group.First().Group)
    .SelectMany(x => x);

You can also rely on the fact that GroupBy maintains the original order of items within groups, allowing you to order the data by SequenceNumber before grouping, instead of after. It'll do basically the same thing. It turns out to be a prettier query, but you just need to "know" that GroupBy maintains the proper ordering:

var query = data.OrderBy(item => item.SequenceNumber)
    .GroupBy(item => item.Group, new SeparateNullComparer<int?>())
    .OrderBy(group => group, new SequenceComparer<Foo>())
    .ThenBy(group => group.Key)
    .SelectMany(x => x);

Upvotes: 2

Related Questions