stask

Reputation: 83

How do I count the number of items in each consecutive group of a sequence using LINQ?

For example, I have a sequence of integers

1122211121

I'd like to get some dictionary/anonymous class showing:

item | count
1    | 2
2    | 3
1    | 3
2    | 1
1    | 1

Upvotes: 2

Views: 851

Answers (3)

Ssithra

Reputation: 710

Well... a bit shorter (note the double Separate call: each regex match consumes two digits, so a single pass can miss a boundary that begins at the second digit of the previous match):

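    using System;
    using System.Linq;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main(string[] args)
        {
            // Two passes: one pass can skip a boundary that starts on a digit it already consumed
            string separatedDigits = Separate(Separate("1122211121"));

            // Each block between '|' characters is one run of identical digits
            foreach (var ano in separatedDigits.Split('|').Select(block => new { item = block.Substring(0, 1), count = block.Length }))
                Console.WriteLine(ano);

            Console.ReadKey();
        }

        static string Separate(string input)
        {
            // Insert '|' between two adjacent digits that differ
            return Regex.Replace(input, @"(\d)(?!\1)(\d)", "$1|$2");
        }
    }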
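
To see why the second pass matters, here is a small sketch that runs the same regex once and then twice on a short input. The class name SeparateDemo and the input "1121" are illustrative assumptions, not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

static class SeparateDemo
{
    // Same pattern as in the answer: put '|' between two adjacent digits that differ.
    public static string Separate(string input)
    {
        return Regex.Replace(input, @"(\d)(?!\1)(\d)", "$1|$2");
    }

    static void Main()
    {
        string once = Separate("1121");
        string twice = Separate(once);

        // The first pass matches "12" and consumes the '2', so the 2/1 boundary is skipped.
        Console.WriteLine(once);   // 11|21
        // The second pass picks up the boundary the first pass skipped.
        Console.WriteLine(twice);  // 11|2|1
    }
}
```

Because every boundary missed by the first pass sits right after a separator that pass inserted, two passes should be enough for any input of digits.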

Upvotes: 0

marr75

Reputation: 5715

You're looking to do something like the "Batch" operator in the morelinq project, then output the count of the groups.

Unfortunately, the batch operator from morelinq just takes a size and returns buckets batched by that size (or it did when I was looking at morelinq). To correct this deficiency, I had to write my own batch implementation.

private static IEnumerable<TResult> BatchImplementation<TSource, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, TSource, int, bool> breakCondition,
        Func<IEnumerable<TSource>, TResult> resultSelector
    )
{
    List<TSource> bucket = null;
    var lastItem = default(TSource);
    var count = 0;

    foreach (var item in source)
    {
        // The first item always opens a bucket, even if breakCondition returns false for it
        if (breakCondition(item, lastItem, count++) || bucket == null)
        {
            if (bucket != null)
            {
                yield return resultSelector(bucket.Select(x => x));
            }

            bucket = new List<TSource>();
        }
        bucket.Add(item);
        lastItem = item;
    }

    // Return the last bucket with all remaining elements (bucket is null when source is empty)
    if (bucket != null && bucket.Count > 0)
    {
        yield return resultSelector(bucket.Select(x => x));
    }
}

This is the private version; I expose it through various public overloads that validate the input parameters. You would want your breakCondition to be something of the form:

Func<int, int, int, bool> breakCondition = (x, y, z) => x != y;

This should give you, for your example sequence: {1, 1}, {2, 2, 2}, {1, 1, 1}, {2}, {1}

From here, grabbing the first item of each sequence and then counting the sequence are trivial.

Edit: To assist in implementation -

public static IEnumerable<IEnumerable<TSource>> Batch<TSource>(
        this IEnumerable<TSource> source,
        Func<TSource, TSource, int, bool> breakCondition
    )
{
    // Validate that source and breakCondition are not null
    if (source == null) throw new ArgumentNullException("source");
    if (breakCondition == null) throw new ArgumentNullException("breakCondition");
    return BatchImplementation(source, breakCondition, x => x);
}

Your code would then be:

var sequence = new[] {1, 1, 2, 2, 2, 1, 1, 1, 2, 1};
var batchedSequence = sequence.Batch((x, y, z) => x != y);
//batchedSequence = {{1, 1}, {2, 2, 2}, {1, 1, 1}, {2}, {1}}
var counts = batchedSequence.Select(x => x.Count());
//counts = {2, 3, 3, 1, 1}
var items = batchedSequence.Select(x => x.First());
//items = {1, 2, 1, 2, 1}
var final = counts.Zip(items, (c, i) => new { Item = i, Count = c });

I haven't compiled and tested any of this except the private method and its overloads that I use in my own codebase, but this should solve your problem and any similar problems you have.

Upvotes: 2

Steves

Reputation: 3234

        var test = new[] { 1, 2, 2, 2, 2, 1, 1, 3 };
        int previous = test.First();
        int idx = 0;
        // idx is bumped every time the value changes, so each run gets its own helper key
        var result = test.Select(x =>
                x == previous ?
                new { orig = x, helper = idx } :
                new { orig = previous = x, helper = ++idx })
            .GroupBy(x => x.helper)
            .Select(group => new { number = group.First().orig, count = group.Count() });

The initialization of previous and idx could be done in a let clause if you want to be even more LINQ-y.

       from whatever in new[] { "i want to use linq everywhere" }
       let previous = test.First()
       let idx = 0
       from x in test
       ...

Functional programming is nice, but IMHO this is a case where, in C#, I would choose a procedural approach instead.
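
A minimal sketch of what such a procedural version might look like. The helper name CountRuns and the class RunCounter are hypothetical (they are not part of LINQ or of any answer here); the helper walks the sequence once and yields one (item, count) pair per run of equal adjacent values.

```csharp
using System;
using System.Collections.Generic;

static class RunCounter
{
    // Hypothetical helper: yields one pair per run of equal adjacent items.
    public static IEnumerable<KeyValuePair<T, int>> CountRuns<T>(IEnumerable<T> source)
    {
        var comparer = EqualityComparer<T>.Default;
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext())
                yield break; // empty input produces no runs

            T current = e.Current;
            int count = 1;
            while (e.MoveNext())
            {
                if (comparer.Equals(e.Current, current))
                {
                    count++; // still inside the same run
                }
                else
                {
                    // The value changed: emit the finished run and start a new one
                    yield return new KeyValuePair<T, int>(current, count);
                    current = e.Current;
                    count = 1;
                }
            }
            yield return new KeyValuePair<T, int>(current, count); // last run
        }
    }

    static void Main()
    {
        var sequence = new[] { 1, 1, 2, 2, 2, 1, 1, 1, 2, 1 };
        foreach (var run in CountRuns(sequence))
            Console.WriteLine("{0} | {1}", run.Key, run.Value);
    }
}
```

For the sequence from the question this yields the pairs (1, 2), (2, 3), (1, 3), (2, 1), (1, 1). It enumerates the source only once and does not rely on side effects inside a lambda.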

Upvotes: 6
