Reputation: 1061
I have a list of integers. I want to find all runs of consecutive numbers on that list, defined by the start index and length. So for example, for input list of [1,2,3,5,7,8]
, the output would be [{1,3}, {5,1}, {7,2}]
. This is easy enough to do using a loop, something like this (untested pseudocode):
for(i=1, i < maxNum; i++)
{
number = list[i];
previousNumber = list[i-1];
if(number - previousNumber == 1)
{
runLength++;
}
else
{
result.Add(startingNumber, runLength);
runLength = 1;
startingNumber = number;
}
}
But I thought it would be possible to do using LINQ. Any ideas how to do that?
Upvotes: 8
Views: 12993
Reputation: 50104
Did someone ask to shoehorn a solution into a LINQ query of dubious readability?
var serieses = input.Aggregate(
new List<Tuple<int, int>>(),
(l, i) =>
{
var last = l.LastOrDefault();
if (last == null ||
last.Item1 + last.Item2 != i)
{
l.Add(new Tuple<int, int>(i, 1));
}
else if (last.Item1 + last.Item2 == i)
{
l.RemoveAt(l.Count - 1);
l.Add(new Tuple<int, int>(last.Item1, last.Item2 + 1));
}
return l;
});
Upvotes: 3
Reputation: 995
Try with this:
// Static class to contain the extension methods.
public static class MyExtensions
{
public static IEnumerable<IGrouping<TKey, TSource>> ChunkBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
return source.ChunkBy(keySelector, EqualityComparer<TKey>.Default);
}
public static IEnumerable<IGrouping<TKey, TSource>> ChunkBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector, IEqualityComparer<TKey> comparer)
{
// Flag to signal end of source sequence.
const bool noMoreSourceElements = true;
// Auto-generated iterator for the source array.
var enumerator = source.GetEnumerator();
// Move to the first element in the source sequence.
if (!enumerator.MoveNext()) yield break;
// Iterate through source sequence and create a copy of each Chunk.
// On each pass, the iterator advances to the first element of the next "Chunk"
// in the source sequence. This loop corresponds to the outer foreach loop that
// executes the query.
Chunk<TKey, TSource> current = null;
while (true)
{
// Get the key for the current Chunk. The source iterator will churn through
// the source sequence until it finds an element with a key that doesn't match.
var key = keySelector(enumerator.Current);
// Make a new Chunk (group) object that initially has one GroupItem, which is a copy of the current source element.
current = new Chunk<TKey, TSource>(key, enumerator, value => comparer.Equals(key, keySelector(value)));
// Return the Chunk. A Chunk is an IGrouping<TKey,TSource>, which is the return value of the ChunkBy method.
// At this point the Chunk only has the first element in its source sequence. The remaining elements will be
// returned only when the client code foreach's over this chunk. See Chunk.GetEnumerator for more info.
yield return current;
// Check to see whether (a) the chunk has made a copy of all its source elements or
// (b) the iterator has reached the end of the source sequence. If the caller uses an inner
// foreach loop to iterate the chunk items, and that loop ran to completion,
// then the Chunk.GetEnumerator method will already have made
// copies of all chunk items before we get here. If the Chunk.GetEnumerator loop did not
// enumerate all elements in the chunk, we need to do it here to avoid corrupting the iterator
// for clients that may be calling us on a separate thread.
if (current.CopyAllChunkElements() == noMoreSourceElements)
{
yield break;
}
}
}
// A Chunk is a contiguous group of one or more source elements that have the same key. A Chunk
// has a key and a list of ChunkItem objects, which are copies of the elements in the source sequence.
class Chunk<TKey, TSource> : IGrouping<TKey, TSource>
{
// INVARIANT: DoneCopyingChunk == true ||
// (predicate != null && predicate(enumerator.Current) && current.Value == enumerator.Current)
// A Chunk has a linked list of ChunkItems, which represent the elements in the current chunk. Each ChunkItem
// has a reference to the next ChunkItem in the list.
class ChunkItem
{
public ChunkItem(TSource value)
{
Value = value;
}
public readonly TSource Value;
public ChunkItem Next = null;
}
// The value that is used to determine matching elements
private readonly TKey key;
// Stores a reference to the enumerator for the source sequence
private IEnumerator<TSource> enumerator;
// A reference to the predicate that is used to compare keys.
private Func<TSource, bool> predicate;
// Stores the contents of the first source element that
// belongs with this chunk.
private readonly ChunkItem head;
// End of the list. It is repositioned each time a new
// ChunkItem is added.
private ChunkItem tail;
// Flag to indicate the source iterator has reached the end of the source sequence.
internal bool isLastSourceElement = false;
// Private object for thread syncronization
private object m_Lock;
public Chunk(TKey key, IEnumerator<TSource> enumerator, Func<TSource, bool> predicate)
{
this.key = key;
this.enumerator = enumerator ?? throw new NullReferenceException(nameof(enumerator));
this.predicate = predicate ?? throw new NullReferenceException(nameof(predicate));
// A Chunk always contains at least one element.
head = new ChunkItem(enumerator.Current);
// The end and beginning are the same until the list contains > 1 elements.
tail = head;
m_Lock = new object();
}
// Indicates that all chunk elements have been copied to the list of ChunkItems,
// and the source enumerator is either at the end, or else on an element with a new key.
// the tail of the linked list is set to null in the CopyNextChunkElement method if the
// key of the next element does not match the current chunk's key, or there are no more elements in the source.
private bool DoneCopyingChunk => tail == null;
// Adds one ChunkItem to the current group
// REQUIRES: !DoneCopyingChunk && lock(this)
private void CopyNextChunkElement()
{
// Try to advance the iterator on the source sequence.
// If MoveNext returns false we are at the end, and isLastSourceElement is set to true
isLastSourceElement = !enumerator.MoveNext();
// If we are (a) at the end of the source, or (b) at the end of the current chunk
// then null out the enumerator and predicate for reuse with the next chunk.
if (isLastSourceElement || !predicate(enumerator.Current))
{
enumerator = null;
predicate = null;
}
else
{
tail.Next = new ChunkItem(enumerator.Current);
}
// tail will be null if we are at the end of the chunk elements
// This check is made in DoneCopyingChunk.
tail = tail.Next;
}
// Called after the end of the last chunk was reached. It first checks whether
// there are more elements in the source sequence. If there are, it
// Returns true if enumerator for this chunk was exhausted.
internal bool CopyAllChunkElements()
{
while (true)
{
lock (m_Lock)
{
if (DoneCopyingChunk)
{
// If isLastSourceElement is false,
// it signals to the outer iterator
// to continue iterating.
return isLastSourceElement;
}
else
{
CopyNextChunkElement();
}
}
}
}
public TKey Key => key;
// Invoked by the inner foreach loop. This method stays just one step ahead
// of the client requests. It adds the next element of the chunk only after
// the clients requests the last element in the list so far.
public IEnumerator<TSource> GetEnumerator()
{
//Specify the initial element to enumerate.
ChunkItem current = head;
// There should always be at least one ChunkItem in a Chunk.
while (current != null)
{
// Yield the current item in the list.
yield return current.Value;
// Copy the next item from the source sequence,
// if we are at the end of our local list.
lock (m_Lock)
{
if (current == tail)
{
CopyNextChunkElement();
}
}
// Move to the next ChunkItem in the list.
current = current.Next;
}
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() => GetEnumerator();
}
}
// A simple named type is used for easier viewing in the debugger. Anonymous types
// work just as well with the ChunkBy operator.
public class KeyValPair
{
public string Key { get; set; }
public string Value { get; set; }
}
class Program
{
// The source sequence.
public static IEnumerable<KeyValPair> list;
// Query variable declared as class member to be available
// on different threads.
static IEnumerable<IGrouping<string, KeyValPair>> query;
static void Main(string[] args)
{
// Initialize the source sequence with an array initializer.
list = new[]
{
new KeyValPair{ Key = "A", Value = "We" },
new KeyValPair{ Key = "A", Value = "think" },
new KeyValPair{ Key = "A", Value = "that" },
new KeyValPair{ Key = "B", Value = "Linq" },
new KeyValPair{ Key = "C", Value = "is" },
new KeyValPair{ Key = "A", Value = "really" },
new KeyValPair{ Key = "B", Value = "cool" },
new KeyValPair{ Key = "B", Value = "!" }
};
// Create the query by using our user-defined query operator.
query = list.ChunkBy(p => p.Key);
// ChunkBy returns IGrouping objects, therefore a nested
// foreach loop is required to access the elements in each "chunk".
foreach (var item in query)
{
Console.WriteLine($"Group key = {item.Key}");
foreach (var inner in item)
{
Console.WriteLine($"\t{inner.Value}");
}
}
Console.WriteLine("Press any key to exit");
Console.ReadKey();
}
}
The extension method implements the native IGrouping interface (same as GroupBy). The storage is done with a linked element list instead of a List. The execution of the extension method is thread safe and lazy.
Upvotes: 1
Reputation: 1871
There is no such out of box extension method, but you can create you own:
public static class LinqUtils{
public class ConsecutiveGroup<T>
{
public T Left { get; internal set; }
public T Right { get; internal set; }
public long Count { get; internal set; }
}
public static IEnumerable<ConsecutiveGroup<T>> ConsecutiveCounts<T>(this IEnumerable<T> src, Func<T, T, bool> consecutive)
{
ConsecutiveGroup<T> current = null;
foreach (var s in src)
{
if (current==null)
{
current = new ConsecutiveGroup<T>
{
Left = s,
Right = s,
Count = 1
};
continue;
}
if(consecutive(current.Right, s))
{
current.Right = s;
current.Count += 1;
continue;
}
yield return current;
current = new ConsecutiveGroup<T>
{
Left = s,
Right = s,
Count = 1
};
}
if (current!=null)
{
yield return current;
}
}
}
[TestFixture]
public static class LinqUtilsTests
{
[Test]
public void TestConsecutiveCounts()
{
var src = new[] {1,2,3,5,7,8};
var expected = new[]
{
Tuple.Create<int,long>(1, 3),
Tuple.Create<int,long>(5, 1),
Tuple.Create<int,long>(7, 2)
};
var result = src
.ConsecutiveCounts((prev, current) => current == (prev + 1))
.Select(c=>Tuple.Create(c.Left, c.Count));
Assert.IsTrue(expected.SequenceEqual(result));
}
}
Upvotes: 1
Reputation: 32094
You need some way to scan through the sequence, accummulating results and then grouping them. So first a simple Scan
extension method (similar to Aggregate
but it outputs intermediate results) (a Scan
method exists for IObservable
but not for IEnumerable
):
public static IEnumerable<U> Scan<T, U>(this IEnumerable<T> input,
Func<U, T, U> next,
U state)
{
yield return state;
foreach (var item in input)
{
state = next(state, item);
yield return state;
}
}
And using this method and the Zip
extension method, do the following:
var ints = new[] { 1, 2, 3, 5, 7, 8, 10 };
var result = ints
// Zip the list with itself shifted 1 to the left (add dummy value at the end)
// and calculate the difference between each consecutive value.
.Zip(ints
.Skip(1)
.Concat(new[] { int.MaxValue }), (i0, i1) => new { i = i0, diff = i1 - i0 })
// Reverse because it's far easier keeping track of the group we're at
.Reverse()
// Scan through the list, assigning an incremental group number to each
// consecutive sequence
.Scan((state, z) => state == null
? Tuple.Create(z.i, z.diff, 0)
: Tuple.Create(z.i, z.diff,
z.diff > 1 ? state.Item3 + 1 : state.Item3),
(Tuple<int, int, int>) null) // <-- dummy starting state.
// Skip the dummy starting state we started the scan with
.Skip(1)
// Reverse back
.Reverse()
// Group by the group numbers we assigned during the scan
.GroupBy(t => t.Item3, (i, l) => new { l.First().Item1, l = l.Count() });
Upvotes: 0
Reputation: 116098
A linqish way can be writing an extension method GroupWhile
like below (All checks omitted. not optimized to understand easily.)
int[] list = new int[] { 1, 2, 3, 5, 7, 8 };
var result = list.GroupWhile((x, y) => y - x == 1)
.Select(x => new {i = x.First(), len = x.Count() })
.ToList();
public static IEnumerable<IEnumerable<T>> GroupWhile<T>(this IEnumerable<T> seq, Func<T,T,bool> condition)
{
T prev = seq.First();
List<T> list = new List<T>() { prev };
foreach(T item in seq.Skip(1))
{
if(condition(prev,item)==false)
{
yield return list;
list = new List<T>();
}
list.Add(item);
prev = item;
}
yield return list;
}
TODO: use IGrouping
:)
Upvotes: 22
Reputation: 13177
This seems like a reasonable approach:
Range
, so each element is tupled with its indexvar list = new int[] { 1, 2, 3, 5, 7, 8 };
var filtered = list.Zip(Enumerable.Range(0, list.Length), Tuple.Create)
.Where((x, i) => i == 0 || list[i - 1] != x.Item1 - 1).ToArray();
var result = filtered.Select((x, i) => i == filtered.Length - 1
? Tuple.Create(x.Item1, list.Length - x.Item2)
: Tuple.Create(x.Item1, filtered[i + 1].Item2 - x.Item2));
foreach (var t in result)
{
Console.WriteLine(t);
}
This results in
(1, 3)
(5, 1)
(7, 2)
Upvotes: 6