Reputation: 4573
Given the sequence:
["1","A","B","C","2","F","K","L","5","6","P","I","E"]
The numbers represent items that I identify as headers, whereas the letters represent items that I identify as data. I want to associate them into groups like this.
1:A,B,C
2:F,K,L
5:
6:P,I,E
I can easily achieve this using a foreach or while loop on the enumerator, but is there a LINQ'ish way to achieve this? This is a recurring pattern in my domain.
Upvotes: 1
Views: 1696
Reputation: 38468
Here's a solution with LINQ. It's a little complicated, though, and there may be room for some tricks. It doesn't look terrible, but a foreach loop would be more readable.
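// Index of the most recent header seen; updated as a side effect inside Select.
int lastHeaderIndex = default(int);
Dictionary<string, IEnumerable<string>> groupedItems =
    items.Select((text, index) =>
    {
        int number;
        if (int.TryParse(text, out number))
        {
            lastHeaderIndex = index;
        }
        // Tag every item with the index of the header it belongs to.
        return new { HeaderIndex = lastHeaderIndex, Value = text };
    })
    .GroupBy(item => item.HeaderIndex)
    // The first element of each group is the header; the rest are its data.
    .ToDictionary(item => item.First().Value,
                  item => item.Skip(1).Select(arg => arg.Value));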
Upvotes: 3
Reputation: 6983
Since this is a common pattern in your domain, consider streaming the results instead of gathering them all into a large in-memory object.
public static IEnumerable<IList<string>> SplitOnToken(IEnumerable<string> input, Func<string,bool> isSplitToken)
{
    var set = new List<string>();
    foreach(var item in input)
    {
        if (isSplitToken(item) && set.Any())
        {
            yield return set;
            set = new List<string>();
        }
        set.Add(item);
    }
    if (set.Any())
    {
        yield return set;
    }
}
Sample usage:
var sequence = new[] { "1", "A", "B", "C", "2", "F", "K", "L", "5", "6", "P", "I", "E" };
var groups = SplitOnToken(sequence, x => Char.IsDigit(x[0]));
foreach (var @group in groups)
{
    Console.WriteLine("{0}: {1}", @group[0], String.Join(" ", @group.Skip(1).ToArray()));
}
output:
1: A B C
2: F K L
5:
6: P I E
Upvotes: 2
Reputation: 4573
Here's what I ended up using. Pretty much the same structure as phg's answer.
Basically, it is an aggregate function that maintains a Tuple containing: 1: the state of the parser. 2: the accumulated data.
The aggregating function does an if-else to check if currently examined item is a group header or a regular item. Based on this, it updates the datastore (last part of the tuple) and/or changes the parser state (first part of the tuple).
In my case, the parser state is the currently active list (that upcoming items shall be inserted into).
var sequence = new[]{ "1","A","B","C","2","F","K","L","5","6","P","I","E"};
var aggr = Tuple.Create(new List<string>(), new Dictionary<int,List<string>>());
var res = sequence.Aggregate(aggr, (d, x) => {
    int i;
    if (Int32.TryParse(x, out i))
    {
        var newList = new List<string>();
        d.Item2.Add(i,newList);
        return Tuple.Create(newList,d.Item2);
    } else
    {
        d.Item1.Add(x);
        return d;
    }
},d=>d.Item2);
Upvotes: 1
Reputation: 20950
You can make use of a fold:
var aggr = new List<Tuple<int, List<string>>>();
var res = sequence.Aggregate(aggr, (d, x) => {
    int i;
    if (Int32.TryParse(x, out i)) {
        // Header: start a new, empty group.
        d.Add(Tuple.Create(i, new List<string>()));
    }
    else {
        // Data: append to the most recently started group.
        d[d.Count - 1].Item2.Add(x);
    }
    // List<T>.Add returns void, so hand back the accumulator itself.
    return d;
}).ToDictionary(x => x.Item1, x => x.Item2);
However, this doesn't look so nice, since support for immutable values is lacking. Also, I couldn't test this right now.
Upvotes: 2
Reputation: 3227
A foreach loop with int.TryParse should help. GroupBy from LINQ won't help much here.
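A minimal sketch of what that loop might look like, in case it helps. The class name HeaderGrouping and the method GroupByHeader are made up for illustration, and it assumes that any data items appearing before the first header can simply be dropped. It returns a list of pairs rather than a Dictionary so that the original header order is kept.

```csharp
using System;
using System.Collections.Generic;

public static class HeaderGrouping
{
    // Hypothetical helper: groups data items under the most recent numeric header.
    public static List<KeyValuePair<string, List<string>>> GroupByHeader(IEnumerable<string> items)
    {
        var groups = new List<KeyValuePair<string, List<string>>>();
        List<string> current = null;

        foreach (var item in items)
        {
            int number;
            if (int.TryParse(item, out number))
            {
                // Header: open a new, initially empty group.
                current = new List<string>();
                groups.Add(new KeyValuePair<string, List<string>>(item, current));
            }
            else if (current != null)
            {
                // Data: attach to the most recent header.
                current.Add(item);
            }
            // Assumption: data seen before any header is silently ignored.
        }

        return groups;
    }
}
```

With the sequence from the question, iterating the result and printing each pair with Console.WriteLine("{0}:{1}", g.Key, string.Join(",", g.Value)) should give the four groups in the requested format, including the empty group for header 5.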
Upvotes: 2