oden

Reputation: 3

How to use LINQ to group a list of strings on only certain strings

example list of strings:

var test = new List<string>{
    "hdr1","abc","def","ghi","hdr2","lmn","opq","hdr3","rst","xyz"
};

I want to partition this list by "hdr*" so that each group contains elements...

"hdr1","abc","def","ghi"  
"hdr2","lmn","opq"  
"hdr3","rst","xyz"  

I tried:

var result = test.GroupBy(g => g.StartsWith("hdr"));

but this gives me two groups

"hdr1","hdr2","hdr3"  
"abc","def"..."xyz"  

What is the proper LINQ statement I should use? Let me emphasize that the strings following "hdr*" could be anything. The only thing they have in common is that they follow "hdr*".

Upvotes: 0

Views: 409

Answers (3)

Olivier Jacot-Descombes

Reputation: 112342

You get two groups because one group is the group of elements starting with "hdr" and the other group is the group of elements not starting with "hdr". StartsWith returns a bool, so this results in two groups having the Keys false and true.
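To make the two keys visible, here is a minimal sketch that prints each key next to its group. It assumes C# 9 or later (top-level statements); the variable name byPrefix is only illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var test = new List<string> {
    "hdr1","abc","def","ghi","hdr2","lmn","opq","hdr3","rst","xyz"
};

// The key selector returns a bool, so there can be at most two keys.
var byPrefix = test.GroupBy(s => s.StartsWith("hdr"));

foreach (var g in byPrefix) {
    // g.Key is True for the headers and False for everything else.
    Console.WriteLine(g.Key + ": " + string.Join(",", g));
}
```

With the list from the question, the True group holds the three headers and the False group holds the seven remaining strings.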

LINQ method syntax accepts lambdas with statement blocks. This enables us to do:

string header = null;
var groups = test
    .Select(s => {
        if (s.StartsWith("hdr")) header = s;
        return s;
    })
    .Where(s => header != s)
    .GroupBy(s => header);

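We store the last header in header. The Where clause eliminates the header itself, since the header is the group key. This relies on LINQ's lazy evaluation: each element passes through Select, Where and GroupBy before the next one is read, so header always holds the most recent header when the key selector runs.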

The following test...

foreach (var g in groups) {
    Console.WriteLine(g.Key);
    foreach (var item in g) {
        Console.WriteLine("    " + item);
    }
}

... prints this with the given list:

hdr1
    abc
    def
    ghi
hdr2
    lmn
    opq
hdr3
    rst
    xyz

Alternatively, we can create lists with the header as the first element:

string header = null;
IEnumerable<List<string>> lists = test
    .Select(s => {
        if (s.StartsWith("hdr")) {
            header = s;
        }
        return s;
    })
    .GroupBy(s => header)
    .Select(g => g.ToList());

This test...

foreach (var l in lists) {
    foreach (var item in l) {
        Console.Write(item + " ");
    }
    Console.WriteLine();
}

... prints:

hdr1 abc def ghi
hdr2 lmn opq
hdr3 rst xyz

Upvotes: 1

EdgeNeko

Reputation: 61

Yes, you can do it with LINQ expressions. But I don't think it's much more readable than a foreach loop.

var test = new List<string>{
    "hdr1","abc","def","ghi","hdr2","lmn","opq","hdr3","rst","xyz"
};

int groupid = 0;

var result = test.GroupBy(t =>
{
    if (t.StartsWith("hdr")) ++groupid;
    return groupid;
}).ToList();

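// A string separator (" ") also compiles on .NET Framework, which lacks the char overload of string.Join
result.Select(t => string.Join(" ", t)).ToList().ForEach(Console.WriteLine);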

/*
 Outputs:
hdr1 abc def ghi
hdr2 lmn opq
hdr3 rst xyz
 */
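For comparison, a plain foreach version might look like the sketch below. It assumes C# 9 or later (top-level statements). Starting a group when none exists yet is an added assumption about how a list that does not begin with a header should be handled.

```csharp
using System;
using System.Collections.Generic;

var test = new List<string> {
    "hdr1","abc","def","ghi","hdr2","lmn","opq","hdr3","rst","xyz"
};

var groups = new List<List<string>>();
foreach (var s in test) {
    // Start a new group at every header. The Count check is an assumption:
    // it keeps items that appear before the first header instead of failing.
    if (s.StartsWith("hdr") || groups.Count == 0) {
        groups.Add(new List<string>());
    }
    // Append the current string to the most recently started group.
    groups[groups.Count - 1].Add(s);
}

foreach (var g in groups) {
    Console.WriteLine(string.Join(" ", g));
}
```

Unlike the GroupBy version, this needs no lambda with side effects, and the groups are ordinary lists.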

Upvotes: 1

Xerillio

Reputation: 5261

You could make a fancy extension method GroupWhen that starts a new group whenever it finds a matching item. Just like Enumerable.GroupBy, it returns a "list" of groups:

public static IEnumerable<IGrouping<int, T>> GroupWhen<T>(this IEnumerable<T> source, Func<T, bool> predicate)
{
    var i = 0;
    // This method "marks" which group each item belongs in
    // by creating a Tuple with the item and group number
    IEnumerable<(T Item, int GroupNum)> Iterate()
    {
        foreach (var item in source)
        {
            if (predicate(item)) i++; // Start new group
            yield return (item, i);
        }
    }
    // Group items by the "mark" from above and only
    // output the Item from the Tuple, since the
    // GroupNum will be the 'int' key of the group
    return Iterate().GroupBy(tup => tup.GroupNum, tup => tup.Item);
}

// Use like so:
var list = new List<string> {"hdr1","abc","def","ghi","hdr2","lmn","opq","hdr3","rst","xyz"};
var groups = list.GroupWhen(s => s.StartsWith("hdr"));
Console.WriteLine(string.Join(",", groups.First()));
// hdr1,abc,def,ghi

Check out this fiddle for a test run.

Upvotes: 1
