TheNextman
TheNextman

Reputation: 12566

LINQ GroupBy continuous time

Assuming I have a simple structure that looks like this:

public class Range
{
    public DateTime Start { get; set; }
    public DateTime End { get; set; }

    public Range(DateTime start, DateTime end)
    {
        this.Start = start;
        this.End = end;
    }
}

And I create an collection like so:

var dr1 = new Range(new DateTime(2011, 11, 1, 12, 0, 0), 
    new DateTime(2011, 11, 1, 13, 0, 0));
var dr2 = new Range(new DateTime(2011, 11, 1, 13, 0, 0), 
    new DateTime(2011, 11, 1, 14, 0, 0));
var dr3 = new Range(new DateTime(2011, 11, 1, 14, 0, 0), 
    new DateTime(2011, 11, 1, 15, 0, 0));
var dr4 = new Range(new DateTime(2011, 11, 1, 16, 0, 0), 
    new DateTime(2011, 11, 1, 17, 0, 0));

var ranges = new List<Range>() { dr1, dr2, dr3, dr4 };

What I want to do is group the ranges where they are continuous - i.e. they are continuous if the End value of the previous Range is the same as the Start of the next.

We can assume that there are no collisions/duplicates or overlaps in the Range values.

In the example posted, I would end up with two groups:

2011-11-1 12:00:00 - 2011-11-1 15:00:00

2011-11-1 16:00:00 - 2011-11-1 17:00:00

It's fairly easy to come up with an iterative solution for this. But is there some LINQ magic I can use to get this in a pretty one-liner?

Upvotes: 10

Views: 2149

Answers (7)

James Michael Hare
James Michael Hare

Reputation: 38397

You could use the Aggregate() method and a lambda to group them together. This is, as you say, assuming no duplicates or overlaps:

// build your set of continuous ranges for results
List<Range> continuousRanges = new List<Range>();

ranges.Aggregate(continuousRanges, (con, next) => {
{
    // pull last item (or default if none) - O(1) for List<T>
    var last = continuousRanges.FirstOrDefault(r => r.End == next.Start);

    if (last != null)
        last.End = next.End;
    else
        con.Add(next);

    return con;
});   

Now, if you know the ranges are ordered, you could get away with always comparing the next against the last one we processed, like so:

// build your set of continuous ranges for results
List<Range> continuousRanges = new List<Range>();

ranges.Aggregate(continuousRanges, (con, next) => {
{
    // pull last item (or default if none) - O(1) for List<T>
    var last = continuousRanges.LastOrDefault();

    if (last != null && last.End == next.Start)
        last.End = next.End;
    else
        con.Add(next);

    return con;
});   

Upvotes: 3

cwharris
cwharris

Reputation: 18125

A generic version of casperOne's extension method, used as such:

var items = new[]
    {
        // Range 1
        new { A = 0, B = 1 },
        new { A = 1, B = 2 },
        new { A = 2, B = 3 },
        new { A = 3, B = 4 },
        // Range 2
        new { A = 5, B = 6 },
        new { A = 6, B = 7 },
        new { A = 7, B = 8 },
        new { A = 8, B = 9 },
    };

var ranges = items.ContinousRange(
    x => x.A,
    x => x.B,
    (lower, upper) => new { A = lower, B = upper });

foreach(var range in ranges)
{
    Console.WriteLine("{0} - {1}", range.A, range.B);
}

Implementation of extension method

    /// <summary>
    /// Calculates continues ranges based on individual elements lower and upper selections. Cannot compensate for overlapping.
    /// </summary>
    /// <typeparam name="T">The type containing a range</typeparam>
    /// <typeparam name="T1">The type of range values</typeparam>
    /// <param name="source">The ranges to be combined</param>
    /// <param name="lowerSelector">The range's lower bound</param>
    /// <param name="upperSelector">The range's upper bound</param>
    /// <param name="factory">A factory to create a new range</param>
    /// <returns>An enumeration of continuous ranges</returns>
    public static IEnumerable<T> ContinousRange<T, T1>(this IEnumerable<T> source, Func<T, T1> lowerSelector, Func<T, T1> upperSelector, Func<T1, T1, T> factory)
    {
        //Validate parameters

        // Can order by start date, no overlaps, no collisions
        source = source.OrderBy(lowerSelector);

        // Get enumerator
        using(var enumerator = source.GetEnumerator())
        {
            // Move to the first item, if nothing, break.
            if (!enumerator.MoveNext()) yield break;

            // Set the previous range.
            var previous = enumerator.Current;

            // Cycle while there are more items
            while(enumerator.MoveNext())
            {
                // Get the current item.
                var current = enumerator.Current;

                // If the start date is equal to the end date
                // then merge with the previoud and continue
                if (lowerSelector(current).Equals(upperSelector(previous)))
                {
                    // Merge
                    previous = factory(lowerSelector(previous), upperSelector(current));

                    // Continue
                    continue;
                }

                // Yield the previous item.
                yield return previous;

                // The previous item is the current item.
                previous = current;
            }

            // Yield the previous item.
            yield return previous;
        }
    }

Upvotes: 5

Reza ArabQaeni
Reza ArabQaeni

Reputation: 4907

        var ranges = new List<Range>() { dr1, dr2, dr3, dr4 };
        var starts = ranges.Select(p => p.Start);
        var ends = ranges.Select(p => p.End);
        var discreet = starts.Union(ends).Except(starts.Intersect(ends)).OrderBy(p => p).ToList();
        List<Range> result = new List<Range>();
        for (int i = 0; i < discreet.Count; i = i + 2)
        {
            result.Add(new Range(discreet[i], discreet[i + 1]));
        }
        return result;

Upvotes: 0

Matthew Strawbridge
Matthew Strawbridge

Reputation: 20610

Here is another LINQ solution. It finds the start of each continuous range with one query, the end of each continuous range with another, and then goes through the pairs to construct new ranges.

var starts = ranges.Where((r, i) => i == 0 || r.Start != ranges[i - 1].End);
var ends = ranges.Where((r, i) => i == ranges.Count - 1 || r.End != ranges[i + 1].Start);
var result = starts.Zip(ends, (s, e) => new Range(s.Start, e.End));

It could be rewritten into a one-liner, but the separate version is clearer and easier to maintain:

var result = ranges.Where((r, i) => i == 0 || r.Start != ranges[i - 1].End).Zip(ranges.Where((r, i) => i == ranges.Count - 1 || r.End != ranges[i + 1].Start), (s, e) => new Range(s.Start, e.End));

Upvotes: 2

Rune FS
Rune FS

Reputation: 21742

The below code does it in query comprehension syntax.

public static List<Range> Group(List<Range> dates){
    if(dates.Any()){
        var previous = dates.FirstOrDefault();
        previous = new Range(previous.Start,previous.Start);
        var last = dates.Last();
        var result = (from dt in dates
                     let isDone = dt.Start != previous.End
                     let prev = isDone || last == dt ? 
                                                previous :  
                                               (previous = new Range(previous.Start,dt.End))
                     where isDone || last == dt
                     let t = (previous = dt)
                     select prev).ToList();
        if(result.Last().End != last.End)
           result.Add(last);
        return result;
    }
    return Enumerable.Empty<Range>().ToList();
}

I don't think I would actually do this in production code because I feel it violates the rule of least surprise. Where linq statements usually has no side effects, this works because it has side effects. However I thought it worthwhile to post to show that indeed it can be solved using query comprehension syntax in O(n)

Upvotes: 1

casperOne
casperOne

Reputation: 74530

Your best bet is to use yield and an extension method:

static IEnumerable<Range> GroupContinuous(this IEnumerable<Range> ranges)
{
    // Validate parameters.

    // Can order by start date, no overlaps, no collisions
    ranges = ranges.OrderBy(r => r.Start);

    // Get the enumerator.
    using (IEnumerator<Range> enumerator = ranges.GetEnumerator();
    {
        // Move to the first item, if nothing, break.
        if (!enumerator.MoveNext()) yield break;

        // Set the previous range.
        Range previous = enumerator.Current;

        // Cycle while there are more items.
        while (enumerator.MoveNext())
        {
            // Get the current item.
            Range current = enumerator.Current;

            // If the start date is equal to the end date
            // then merge with the previous and continue.
            if (current.Start == previous.End)
            {
                // Merge.
                previous = new Range(previous.Start, current.End);

                // Continue.
                continue;
            }

            // Yield the previous item.
            yield return previous;

            // The previous item is the current item.
            previous = current;
        }

        // Yield the previous item.
        yield return previous;
    }
}

Granted, the call to OrderBy is going to cause a full iteration of the ranges sequence, but there's no avoiding that. Once you have it ordered, you can prevent having to materialize your results before returning them; you simply yield the results if the conditions dictate.

If you know that the sequence is ordered, however, then you don't have to call OrderBy at all, and you can yield items as you traverse the list and break on different collapsed Range instances.

Ultimately, if the sequence is unordered, then you have two options:

  • Order the list and then process (remember, OrderBy is deferred as well, but will have to use one full iteration to order the sequence), using yield to return the item when you have one to process
  • Process the entire sequence at once and return as an entire materialized sequence

Upvotes: 9

Daniel Hilgarth
Daniel Hilgarth

Reputation: 174299

The following works but it really is an abuse of LINQ:

// Dummy at the end to get the last continues range
ranges.Add(new Range(default(DateTime), default(DateTime)));
// The previous element in the list
Range previous = ranges.FirstOrDefault();
// The start element of the current continuous range
Range start = previous;
ranges.Skip(1).Select(x => {var result = new {current = x, previous = previous};
                            previous = x; return result;})
              .Where(x => x.previous.End != x.current.Start)
              .Select(x => { var result = new Range(start.Start, x.previous.End);
                             start = x.current; return result; });

The code does the following:

First select:

  1. Save the current value and the current previous value in an instance of a new anonymous type
  2. Set the previous value to the current value for the next iteration

Where: Only select those elements that mark the start of a new continuous range

Second select:

  1. Create a Range object with the start date of the saved start value and the end value of the previous item.
  2. Set the start value to the current item as it marks the start of a new continuous range.
  3. Return the range created in (1)

Please note:
I would stick with the iterative solution, because the above code is unreadable, unmaintainable and it took me significantly more time than just typing a loop and an if...

Upvotes: 1

Related Questions