Reputation: 116810
I have the following:
public class Interval
{
    public DateTime Start;
    public DateTime End;
}
I have a List<Interval> object containing multiple intervals. I am trying to achieve the following (I used numbers to make it easy to understand):
[(1, 5), (2, 4), (3, 6)] ---> [(1,6)]
[(1, 3), (2, 4), (5, 8)] ---> [(1, 4), (5,8)]
I currently do this in Python as follows:
def merge(times):
    # Normalize each pair to (start, end), then order the pairs by start.
    times = sorted(sorted(t) for t in times)
    saved = list(times[0])
    for st, en in times:
        if st <= saved[1]:
            saved[1] = max(saved[1], en)
        else:
            yield tuple(saved)
            saved[0] = st
            saved[1] = en
    yield tuple(saved)
but am trying to achieve the same in C# (LINQ would be best but optional). Any suggestions on how to do this efficiently?
Upvotes: 12
Views: 8313
Reputation: 6259
Here's a version using yield return - I find it easier to read than an Aggregate query, and it is still lazily evaluated. This assumes you've already ordered the list by Start; if not, just add that step.
// Assumes a non-empty sequence that is already sorted by Start.
IEnumerable<Interval> MergeOverlappingIntervals(IEnumerable<Interval> intervals)
{
    var accumulator = intervals.First();
    intervals = intervals.Skip(1);
    foreach (var interval in intervals)
    {
        if (interval.Start <= accumulator.End)
        {
            // Overlapping (or touching): extend the current range.
            accumulator = Combine(accumulator, interval);
        }
        else
        {
            // Gap found: emit the finished range and start a new one.
            yield return accumulator;
            accumulator = interval;
        }
    }
    yield return accumulator;
}

Interval Combine(Interval start, Interval end)
{
    return new Interval
    {
        Start = start.Start,
        End = Max(start.End, end.End),
    };
}

private static DateTime Max(DateTime left, DateTime right)
{
    return (left > right) ? left : right;
}
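If the input is not already sorted, the extra step mentioned above could look something like the sketch below. The names IntervalMerging and MergeUnordered are made up for illustration and are not part of the code above. The sketch also assumes that empty input should produce an empty result instead of throwing, and it treats touching intervals as overlapping, as the <= comparison above does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Interval
{
    public DateTime Start;
    public DateTime End;
}

public static class IntervalMerging
{
    // Hypothetical helper: sorts by Start, then merges in a single lazy pass.
    public static IEnumerable<Interval> MergeUnordered(IEnumerable<Interval> intervals)
    {
        Interval accumulator = null;
        foreach (var interval in intervals.OrderBy(i => i.Start))
        {
            if (accumulator == null)
            {
                // First element: copy it so the caller's objects are never mutated.
                accumulator = new Interval { Start = interval.Start, End = interval.End };
            }
            else if (interval.Start <= accumulator.End)
            {
                // Overlapping or touching: keep the later of the two ends.
                if (interval.End > accumulator.End)
                {
                    accumulator.End = interval.End;
                }
            }
            else
            {
                // Gap found: emit the finished range and start a new one.
                yield return accumulator;
                accumulator = new Interval { Start = interval.Start, End = interval.End };
            }
        }
        if (accumulator != null)
        {
            yield return accumulator;
        }
    }
}
```

Because OrderBy buffers its input, the sort itself is not streamed, but nothing runs until the result is enumerated.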
Upvotes: 15
Reputation: 8774
I was beset by "Not Created Here" syndrome tonight, so here's mine. Using an Enumerator directly saved me a couple lines of code, made it clearer (IMO), and handled the case with no records. I suppose it might run a smidge faster as well if you care about that...
public IEnumerable<Tuple<DateTime, DateTime>> Merge(IEnumerable<Tuple<DateTime, DateTime>> ranges)
{
    DateTime extentStart, extentEnd;
    using (var enumerator = ranges.OrderBy(r => r.Item1).GetEnumerator()) {
        bool recordsRemain = enumerator.MoveNext();
        while (recordsRemain)
        {
            extentStart = enumerator.Current.Item1;
            extentEnd = enumerator.Current.Item2;
            while ((recordsRemain = enumerator.MoveNext()) && enumerator.Current.Item1 < extentEnd)
            {
                if (enumerator.Current.Item2 > extentEnd)
                {
                    extentEnd = enumerator.Current.Item2;
                }
            }
            yield return Tuple.Create(extentStart, extentEnd);
        }
    }
}
In my own implementation, I use a TimeRange type in place of each Tuple<DateTime, DateTime>, as others here do. I left it out simply to stay focused and on-topic.
Upvotes: 4
Reputation: 189
I used TimeRange as a container storing the ranges:
public class TimeRange
{
    public TimeRange(DateTime s, DateTime e) { start = s; end = e; }
    public DateTime start;
    public DateTime end;
}
It reduces the problem to combining two time ranges. The current time range (work) is compared against the time ranges merged so far. If it overlaps one of them, that earlier range is dropped and replaced by the combination of work and the matching range. The cases I worked out for two ranges, written () and [], are as follows:
()[]
public static IEnumerable<TimeRange> Merge(IEnumerable<TimeRange> timeRanges)
{
    List<TimeRange> mergedData = new List<TimeRange>();
    foreach (var work in timeRanges)
    {
        Debug.Assert(work.start <= work.end, "start date has to be smaller or equal to end date to be a valid TimeRange");
        var tr = new TimeRange(work.start, work.end);
        int idx = -1;
        for (int i = 0; i < mergedData.Count; i++)
        {
            if (tr.start < mergedData[i].start)
            {
                if (tr.end < mergedData[i].start)
                    continue;
                if (tr.end < mergedData[i].end)
                    tr.end = mergedData[i].end;
            }
            else if (tr.start < mergedData[i].end)
            {
                tr.start = mergedData[i].start;
                if (tr.end < mergedData[i].end)
                    tr.end = mergedData[i].end;
            }
            else
                continue;
            idx = i;
            mergedData.RemoveAt(i);
            i--;
        }
        if (idx < 0)
            idx = mergedData.Count;
        mergedData.Insert(idx, tr);
    }
    return mergedData;
}
Upvotes: 0
Reputation: 7692
This may not be the prettiest solution, but it may work as well
public static List<Interval> Merge(List<Interval> intervals)
{
    var mergedIntervals = new List<Interval>();
    if (intervals.Count == 0)
    {
        return mergedIntervals;
    }
    var orderedIntervals = intervals.OrderBy(x => x.Start).ToList();
    DateTime start = orderedIntervals.First().Start;
    DateTime end = orderedIntervals.First().End;
    Interval currentInterval;
    for (int i = 1; i < orderedIntervals.Count; i++)
    {
        currentInterval = orderedIntervals[i];
        if (currentInterval.Start < end)
        {
            // Keep the later end so a nested interval cannot shrink the range.
            if (currentInterval.End > end)
            {
                end = currentInterval.End;
            }
        }
        else
        {
            mergedIntervals.Add(new Interval()
            {
                Start = start,
                End = end
            });
            start = currentInterval.Start;
            end = currentInterval.End;
        }
    }
    mergedIntervals.Add(new Interval()
    {
        Start = start,
        End = end
    });
    return mergedIntervals;
}
Any feedback will be appreciated.
Regards
Upvotes: 3
Reputation: 6155
This kind of merging is typically expressed as a fold in functional languages. The LINQ equivalent is Aggregate.
IEnumerable<Interval<T>> Merge<T>(IEnumerable<Interval<T>> intervals)
    where T : IComparable<T>
{
    //error check parameters
    var ret = new List<Interval<T>>(intervals);
    int lastCount;
    do
    {
        lastCount = ret.Count;
        ret = ret.Aggregate(new List<Interval<T>>(),
            (agg, cur) =>
            {
                for (int i = 0; i < agg.Count; i++)
                {
                    var a = agg[i];
                    if (a.Contains(cur.Start))
                    {
                        if (a.End.CompareTo(cur.End) <= 0)
                        {
                            agg[i] = new Interval<T>(a.Start, cur.End);
                        }
                        return agg;
                    }
                    else if (a.Contains(cur.End))
                    {
                        if (a.Start.CompareTo(cur.Start) >= 0)
                        {
                            agg[i] = new Interval<T>(cur.Start, a.End);
                        }
                        return agg;
                    }
                    else if (cur.Contains(a.Start))
                    {
                        // Neither endpoint of cur lies in a, but cur covers a entirely.
                        agg[i] = cur;
                        return agg;
                    }
                }
                agg.Add(cur);
                return agg;
            });
    } while (ret.Count != lastCount); // repeat until a pass merges nothing
    return ret;
}
I made the Interval class generic (Interval<T> where T : IComparable<T>), added a bool Contains(T value) method, and made it immutable, but you should not need to change it much if you want to use the class definition as you have it now.
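The generic class itself is not shown above, so here is one possible shape for it, as a rough sketch only. The inclusive bounds in Contains and the constructor check are assumptions on my part; the description above only says that the class is generic, immutable, and has a bool Contains(T value) method.

```csharp
using System;

// Sketch of an immutable generic interval (assumed shape, not the original class).
public sealed class Interval<T> where T : IComparable<T>
{
    public T Start { get; }
    public T End { get; }

    public Interval(T start, T end)
    {
        // Assumption: an interval whose start is after its end is rejected.
        if (start.CompareTo(end) > 0)
        {
            throw new ArgumentException("start must not be greater than end");
        }
        Start = start;
        End = end;
    }

    // Assumption: both bounds are inclusive, so touching intervals count as overlapping.
    public bool Contains(T value)
    {
        return Start.CompareTo(value) <= 0 && value.CompareTo(End) <= 0;
    }
}
```

With inclusive bounds, new Interval<int>(1, 5).Contains(5) is true; switching either comparison to a strict one would make touching intervals stay separate.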
Upvotes: 1