Reputation:
I'd like to 'collapse' the values in some IEnumerable
together such that adjacent elements that are the same get collapsed to a single element.
I can't think of a better way to describe the problem other than an example:
The array [0,0,2,0,1,1,2,2,2,1,0,0,2,1,1,0,1,1,1] Should become [0,2,0,1,2,1,0,2,1,0,1]
In my use case, this needs to happen in a critical loop and therefore must be as fast as possible. I could loop through the array and check each element against the previous, removing if it's a duplicate, but I'm hoping there's a faster way.
My use will only be for relatively short arrays (<100 elements) and only uses int
, however a general purpose solutions would be appreciated.
EDIT: As pointed out below, the problem is fundamentally O(n) complexity, but I was hoping something linqy might beat my (probably clumsy) implementation.
Upvotes: 3
Views: 334
Reputation: 51390
If you want a general-purpose solution, write an extension method:
This should do the job just fine:
public static IEnumerable<T> DistinctConsecutive<T>(this IEnumerable<T> sequence)
=> sequence.DistinctConsecutive(EqualityComparer<T>.Default);
public static IEnumerable<T> DistinctConsecutive<T>(this IEnumerable<T> sequence, IEqualityComparer<T> comparer)
{
if (sequence == null)
throw new ArgumentNullException(nameof(sequence));
if (comparer == null)
throw new ArgumentNullException(nameof(comparer));
return DistinctConsecutiveImpl(sequence, comparer);
}
private static IEnumerable<T> DistinctConsecutiveImpl<T>(IEnumerable<T> sequence, IEqualityComparer<T> comparer)
{
using (var enumerator = sequence.GetEnumerator())
{
if (!enumerator.MoveNext())
yield break;
var lastValue = enumerator.Current;
yield return lastValue;
while (enumerator.MoveNext())
{
var value = enumerator.Current;
if (comparer.Equals(lastValue, value))
continue;
yield return value;
lastValue = value;
}
}
}
Or, the lazier approach:
public static IEnumerable<T> DistinctConsecutive<T>(this IEnumerable<T> sequence, IEqualityComparer<T> comparer = null)
{
if (comparer == null)
comparer = EqualityComparer<T>.Default;
using (var enumerator = sequence.GetEnumerator())
{
if (!enumerator.MoveNext())
yield break;
var lastValue = enumerator.Current;
yield return lastValue;
while (enumerator.MoveNext())
{
var value = enumerator.Current;
if (comparer.Equals(lastValue, value))
continue;
yield return value;
lastValue = value;
}
}
}
If you need an optimized solution, throw out generics and use ==
instead of the IEqualityComparer<T>
. And if this is still a bottleneck, do it with a plain old for
loop.
Upvotes: 4
Reputation: 203827
I could loop through the array and check each element against the previous, removing if it's a duplicate, but I'm hoping there's a faster way.
Fundamentally there's never going to be a way that's faster algorithmically. There can be subtle differences in implementations using that same algorithm, but that's the best that you can get. There's no way to avoid checking each item, so the operation will be O(n) no matter what you do.
Upvotes: 4
Reputation: 120498
You can use the ChunkBy
extension provided on MSDN. Then it's easy:
var src = new[]{0, 0, 2, 0, 1, 1, 2, 2, 2, 1, 0, 0, 2, 1, 1, 0, 1, 1, 1};
var pruned = src.ChunkBy(x => x).Select(c => c.First());
Upvotes: 0