vfrank66

Reputation: 1458

Finding consecutive rows to group based on arbitrary

I need to group a list (really a DataTable, but a list for simplicity) on 3 columns, where the first column is a fiscal year in integer form. I need to group all rows where:

  1. the current fiscal year has the second column (Service) equal to 1 and
  2. the following fiscal year also has Service equal to 1, as do any further consecutive fiscal years

The resulting group merges all consecutive fiscal years that have 1 as the Service: it spans from the first year of the group to the last year, and sums the Service and Earnings.

var list = new List(){
{20002001, 1 , 100.00},
{20012002, .5  , 100.00},
{20022003, 1.0 , 100.00},
{20042005, 1.0 , 50.00},
{20052006, 1.0 , 50.00}
};

Should produce the results:

20002001, 1 , 100.00
20012002, 1 , 100.00
20022006, 3 , 200.00

This is not a legible example to work off of but maybe it will shed light on what I am attempting to work towards:

var test = from r in list
           where r.Item("FiscalYear") + 10001 = list.SkipWhile(r2 => !r.Equals(current)).Skip(1).FirstOrDefault(r3 => r3.Item("FiscalYear")) &&
                 r.Item("Service") = 1 &&
                 list.SkipWhile(r2 => !r.Equals(current)).Skip(1).FirstOrDefault(r => r.Item("Service") = 1D) 
           Select New { FiscalYear = $"{r.Item("FiscalYear") % 1000}{I have NO IDEA HOW TO DETERMINE THIS PART}", Service = list.Sum(r => r.Item("FiscalYear"), Earnings = list.Sum(r => r.Item("Earnings"))

I assumed the get-next approach from http://www.herlitz.nu/2011/12/01/getting-the-previous-and-next-record-from-list-using-linq/ would work, but it does not work in my scenario.

I have considered a plain group by, which I could write, but then I would lose the correct counts needed to group the fiscal years.

Upvotes: 0

Views: 217

Answers (1)

gnalck

Reputation: 972

Using the GroupAdjacentBy function described at https://stackoverflow.com/a/4682163/6137718 :

using System;
using System.Collections.Generic;

public static class LinqExtensions
{
    public static IEnumerable<IEnumerable<T>> GroupAdjacentBy<T>(
        this IEnumerable<T> source, Func<T, T, bool> predicate)
    {
        using (var e = source.GetEnumerator())
        {
            if (e.MoveNext())
            {
                var list = new List<T> { e.Current };
                var pred = e.Current;
                while (e.MoveNext())
                {
                    if (predicate(pred, e.Current))
                    {
                        list.Add(e.Current);
                    }
                    else
                    {
                        yield return list;
                        list = new List<T> { e.Current };
                    }
                    pred = e.Current;
                }
                yield return list;
            }
        }
    }
}

You can get the result you want as follows. Note that I use string for the year, but this can be changed to int if needed. Using the following class structure:

class Entry
{
    public Entry(string year, double service, double earnings)
    {
        this.Year = year;
        this.Service = service;
        this.Earnings = earnings;
    }

    public string Year;
    public double Service;
    public double Earnings;
}

You can get the result you desire by doing something like this:

var result = list.GroupAdjacentBy((x, y) => x.Service == 1 && y.Service == 1).
        Select(g => new Entry(
            g.First().Year.Substring(0,4) + g.Last().Year.Substring(4,4),
            g.Sum(e => e.Service),
            g.Sum(e => e.Earnings)));

An example of my code can be found at https://dotnetfiddle.net/RqmYa9 .
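
In case the fiddle is unavailable, here is a rough self-contained sketch of the same idea with the year kept as an int, as mentioned above. The names FiscalYearDemo and Merge and the tuple layout are my own assumptions for illustration, not part of the original code; GroupAdjacentBy is repeated only so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinqExtensions
{
    // Same adjacency grouping as above, repeated so this sketch is self-contained.
    public static IEnumerable<IEnumerable<T>> GroupAdjacentBy<T>(
        this IEnumerable<T> source, Func<T, T, bool> predicate)
    {
        using (var e = source.GetEnumerator())
        {
            if (!e.MoveNext()) yield break;
            var list = new List<T> { e.Current };
            var pred = e.Current;
            while (e.MoveNext())
            {
                if (predicate(pred, e.Current))
                {
                    list.Add(e.Current);
                }
                else
                {
                    yield return list;
                    list = new List<T> { e.Current };
                }
                pred = e.Current;
            }
            yield return list;
        }
    }
}

// Hypothetical demo class; the name and the tuple shape are assumptions.
public static class FiscalYearDemo
{
    public static List<(int Year, double Service, double Earnings)> Merge(
        IEnumerable<(int Year, double Service, double Earnings)> rows)
    {
        return rows
            .GroupAdjacentBy((x, y) => x.Service == 1 && y.Service == 1)
            .Select(g => (
                // Start year of the first row plus end year of the last row,
                // e.g. 20022003 and 20052006 give 20022006.
                Year: g.First().Year / 10000 * 10000 + g.Last().Year % 10000,
                Service: g.Sum(e => e.Service),
                Earnings: g.Sum(e => e.Earnings)))
            .ToList();
    }

    public static void Main()
    {
        // Rows from the question: (FiscalYear, Service, Earnings).
        var list = new List<(int Year, double Service, double Earnings)>
        {
            (20002001, 1.0, 100.00),
            (20012002, 0.5, 100.00),
            (20022003, 1.0, 100.00),
            (20042005, 1.0, 50.00),
            (20052006, 1.0, 50.00),
        };

        foreach (var r in Merge(list))
            Console.WriteLine($"{r.Year}, {r.Service}, {r.Earnings}");
    }
}
```

With the question's data this yields three rows: 20002001 and 20012002 unchanged (the latter keeping 0.5 for Service), and a merged 20022006 row with Service 3 and Earnings 200. The predicate only looks at Service on neighbouring rows, so it does not check that the years themselves are contiguous; that appears to match the expected output in the question, where 20032004 is absent.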

I'm unsure why, in your example result, the second entry has 1 for Service instead of 0.5. If you want every Service to be at least 1, you can use a ternary in the query where you select the sum of Service.
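
As a rough illustration of that ternary, assuming "at least one" means raising any summed Service below 1 up to 1 (this is my guess at the intent, not something the question states). The helper name SumAtLeastOne is hypothetical; in the query itself the same expression would replace g.Sum(e => e.Service).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ServiceFloor
{
    // Hypothetical helper: sums a group's Service values and
    // raises any total below 1 up to 1 using a ternary.
    public static double SumAtLeastOne(IEnumerable<double> services)
    {
        var total = services.Sum();
        return total < 1 ? 1 : total;
    }

    public static void Main()
    {
        // A group holding only the 0.5 row is raised to 1.
        Console.WriteLine(SumAtLeastOne(new[] { 0.5 }));
        // A merged run of three rows keeps its real sum.
        Console.WriteLine(SumAtLeastOne(new[] { 1.0, 1.0, 1.0 }));
    }
}
```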

Upvotes: 1
